DPDK patches and discussions
* Re: [dpdk-dev] [PATCH v11 08/18] lib: add symbol versioning to distributor
  2017-03-20 10:08  2%                   ` [dpdk-dev] [PATCH v11 08/18] lib: add symbol versioning to distributor David Hunt
@ 2017-03-27 13:02  3%                     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-03-27 13:02 UTC (permalink / raw)
  To: David Hunt; +Cc: dev, bruce.richardson

2017-03-20 10:08, David Hunt:
> Also bumped up the ABI version number in the Makefile

It would be good to explain the intent of versioning here.

> Signed-off-by: David Hunt <david.hunt@intel.com>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  lib/librte_distributor/Makefile                    |  2 +-
>  lib/librte_distributor/rte_distributor.c           | 57 +++++++++++---
>  lib/librte_distributor/rte_distributor_v1705.h     | 89 ++++++++++++++++++++++
>  lib/librte_distributor/rte_distributor_v20.c       | 10 +++
>  lib/librte_distributor/rte_distributor_version.map | 14 ++++
>  5 files changed, 162 insertions(+), 10 deletions(-)
>  create mode 100644 lib/librte_distributor/rte_distributor_v1705.h
> 
> diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
> index 2b28eff..2f05cf3 100644
> --- a/lib/librte_distributor/Makefile
> +++ b/lib/librte_distributor/Makefile
> @@ -39,7 +39,7 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
>  
>  EXPORT_MAP := rte_distributor_version.map
>  
> -LIBABIVER := 1
> +LIBABIVER := 2

Why keep ABI compatibility if you bump LIBABIVER?

I guess you do not really want to bump it now.
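
For reference, the usual pattern with the rte_compat.h macros is to keep the
old implementation exported at the old ABI node and bind the new one as the
default. A minimal sketch, reusing the _v20/_v1705 naming from this patch
(bodies elided and signatures abbreviated, so treat it as illustrative only):

	/* old copy, kept so binaries linked against DPDK 2.0 keep working */
	void
	rte_distributor_request_pkt_v20(struct rte_distributor_v20 *d,
			unsigned worker_id, struct rte_mbuf *oldpkt)
	{
		/* ... 2.0 implementation ... */
	}
	VERSION_SYMBOL(rte_distributor_request_pkt, _v20, 2.0);

	/* new copy, bound as the default for newly built applications */
	void
	rte_distributor_request_pkt_v1705(struct rte_distributor *d,
			unsigned worker_id, struct rte_mbuf *oldpkt)
	{
		/* ... 17.05 implementation ... */
	}
	BIND_DEFAULT_SYMBOL(rte_distributor_request_pkt, _v1705, 17.05);

with rte_distributor_version.map gaining a DPDK_17.05 node that inherits
from DPDK_2.0. Doing that and bumping LIBABIVER at the same time is what
looks contradictory.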


* Re: [dpdk-dev] [PATCH v3 03/14] ring: eliminate duplication of size and mask fields
  @ 2017-03-27 10:13  3%         ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-27 10:13 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: olivier.matz, dev, jerin.jacob

On Mon, Mar 27, 2017 at 11:52:58AM +0200, Thomas Monjalon wrote:
> 2017-03-24 17:09, Bruce Richardson:
> > The size and mask fields are duplicated in both the producer and
> > consumer data structures. Move them out of that into the top level
> > structure so they are not duplicated.
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > Acked-by: Olivier Matz <olivier.matz@6wind.com>
> 
> Sorry Bruce, I encountered this error:
> 
> fatal error: no member named 'size' in 'struct rte_ring_headtail'
>                 if (r->prod.size >= ring_size) {
>                     ~~~~~~~ ^
>
Hi Thomas,

Again, I need more information here. I've just re-validated these first
three patches, doing 7 builds with each one (gcc, clang, debug, shared
library, old ABI, default machine and 32-bit), as well as compiling the
apps with gcc and clang, and I see no errors.
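
One thing worth checking on your side: since patch 3 moves those fields to
the top-level structure, any code still referencing the old location has to
be updated, e.g. (a sketch, assuming the relocated field is plain r->size):

	-	if (r->prod.size >= ring_size) {
	+	if (r->size >= ring_size) {

If the file that fails is not touched by these three patches, that would
explain the error.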

/Bruce


* Re: [dpdk-dev] [PATCH v4 1/5] net/i40e: add pipeline personalization profile processing
  2017-03-25 21:03  3%         ` Chilikin, Andrey
@ 2017-03-27  2:09  0%           ` Xing, Beilei
  0 siblings, 0 replies; 200+ results
From: Xing, Beilei @ 2017-03-27  2:09 UTC (permalink / raw)
  To: Chilikin, Andrey, Wu, Jingjing; +Cc: Zhang, Helin, dev

Hi Andrey,

> -----Original Message-----
> From: Chilikin, Andrey
> Sent: Sunday, March 26, 2017 5:04 AM
> To: Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v4 1/5] net/i40e: add pipeline
> personalization profile processing
> 
> Hi Beilei
> 
> > -----Original Message-----
> > From: Xing, Beilei
> > Sent: Saturday, March 25, 2017 4:04 AM
> > To: Chilikin, Andrey <andrey.chilikin@intel.com>; Wu, Jingjing
> > <jingjing.wu@intel.com>
> > Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v4 1/5] net/i40e: add pipeline
> > personalization profile processing
> >
> > Hi Andrey,
> >
> > > -----Original Message-----
> > > From: Chilikin, Andrey
> > > Sent: Friday, March 24, 2017 10:53 PM
> > > To: Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> > > <jingjing.wu@intel.com>
> > > Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v4 1/5] net/i40e: add pipeline
> > > personalization profile processing
> > >
> > > Hi Beilei,
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Beilei Xing
> > > > Sent: Friday, March 24, 2017 10:19 AM
> > > > To: Wu, Jingjing <jingjing.wu@intel.com>
> > > > Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org
> > > > Subject: [dpdk-dev] [PATCH v4 1/5] net/i40e: add pipeline
> > > > personalization profile processing
> > > >
> > > > Add support for adding a pipeline personalization profile package.
> > > >
> > > > Signed-off-by: Beilei Xing <beilei.xing@intel.com>
> > > > ---
> > > >  app/test-pmd/cmdline.c                    |   1 +
> > > >  drivers/net/i40e/i40e_ethdev.c            | 198
> > > > ++++++++++++++++++++++++++++++
> > > >  drivers/net/i40e/rte_pmd_i40e.h           |  51 ++++++++
> > > >  drivers/net/i40e/rte_pmd_i40e_version.map |   6 +
> > > >  4 files changed, 256 insertions(+)
> > > >
> > > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> > > > 47f935d..6e0625d 100644
> > > > --- a/app/test-pmd/cmdline.c
> > > > +++ b/app/test-pmd/cmdline.c
> > > > @@ -37,6 +37,7 @@
> > > >  #include <stdio.h>
> > > >  #include <stdint.h>
> > > >  #include <stdarg.h>
> > > > +#include <stdbool.h>
> > > >  #include <string.h>
> > > >  #include <termios.h>
> > > >  #include <unistd.h>
> > > > diff --git a/drivers/net/i40e/i40e_ethdev.c
> > > > b/drivers/net/i40e/i40e_ethdev.c index 3702214..bea593f 100644
> > > > --- a/drivers/net/i40e/i40e_ethdev.c
> > > > +++ b/drivers/net/i40e/i40e_ethdev.c
> > > > @@ -11259,3 +11259,201 @@ rte_pmd_i40e_reset_vf_stats(uint8_t
> > > port,
> > > >
> > > >  	return 0;
> > > >  }
> > > > +
> > > > +static void
> > > > +i40e_generate_profile_info_sec(char *name, struct
> > > > +i40e_ppp_version
> > > > *version,
> > > > +			       uint32_t track_id, uint8_t *profile_info_sec,
> > > > +			       bool add)
> > > > +{
> > > > +	struct i40e_profile_section_header *sec = NULL;
> > > > +	struct i40e_profile_info *pinfo;
> > > > +
> > > > +	sec = (struct i40e_profile_section_header *)profile_info_sec;
> > > > +	sec->tbl_size = 1;
> > > > +	sec->data_end = sizeof(struct i40e_profile_section_header) +
> > > > +		sizeof(struct i40e_profile_info);
> > > > +	sec->section.type = SECTION_TYPE_INFO;
> > > > +	sec->section.offset = sizeof(struct i40e_profile_section_header);
> > > > +	sec->section.size = sizeof(struct i40e_profile_info);
> > > > +	pinfo = (struct i40e_profile_info *)(profile_info_sec +
> > > > +					     sec->section.offset);
> > > > +	pinfo->track_id = track_id;
> > > > +	memcpy(pinfo->name, name, I40E_PPP_NAME_SIZE);
> > > > +	memcpy(&pinfo->version, version, sizeof(struct i40e_ppp_version));
> > > > +	if (add)
> > > > +		pinfo->op = I40E_PPP_ADD_TRACKID;
> > > > +	else
> > > > +		pinfo->op = I40E_PPP_REMOVE_TRACKID; }
> > > > +
> > > > +static enum i40e_status_code
> > > > +i40e_add_rm_profile_info(struct i40e_hw *hw, uint8_t
> > > > +*profile_info_sec) {
> > > > +	enum i40e_status_code status = I40E_SUCCESS;
> > > > +	struct i40e_profile_section_header *sec;
> > > > +	uint32_t track_id;
> > > > +	uint32_t offset = 0, info = 0;
> > > > +
> > > > +	sec = (struct i40e_profile_section_header *)profile_info_sec;
> > > > +	track_id = ((struct i40e_profile_info *)(profile_info_sec +
> > > > +					 sec->section.offset))->track_id;
> > > > +
> > > > +	status = i40e_aq_write_ppp(hw, (void *)sec, sec->data_end,
> > > > +				   track_id, &offset, &info, NULL);
> > > > +	if (status)
> > > > +		PMD_DRV_LOG(ERR, "Failed to add/remove profile info: "
> > > > +			    "offset %d, info %d",
> > > > +			    offset, info);
> > > > +
> > > > +	return status;
> > > > +}
> > > > +
> > > > +#define I40E_PROFILE_INFO_SIZE 48
> > > > +#define I40E_MAX_PROFILE_NUM 16
> > > > +
> > > > +/* Check if the profile info exists */ static int
> > > > +i40e_check_profile_info(uint8_t port, uint8_t *profile_info_sec) {
> > > > +	struct rte_eth_dev *dev = &rte_eth_devices[port];
> > > > +	struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data-
> > > > >dev_private);
> > > > +	uint8_t *buff;
> > > > +	struct rte_pmd_i40e_profile_list *p_list;
> > > > +	struct rte_pmd_i40e_profile_info *pinfo, *p;
> > > > +	uint32_t i;
> > > > +	int ret;
> > > > +
> > > > +	buff = rte_zmalloc("pinfo_list",
> > > > +			   (I40E_PROFILE_INFO_SIZE *
> > > > I40E_MAX_PROFILE_NUM + 4),
> > > > +			   0);
> > > > +	if (!buff) {
> > > > +		PMD_DRV_LOG(ERR, "failed to allocate memory");
> > > > +		return -1;
> > > > +	}
> > > > +
> > > > +	ret = i40e_aq_get_ppp_list(hw, (void *)buff,
> > > > +		      (I40E_PROFILE_INFO_SIZE * I40E_MAX_PROFILE_NUM +
> > > > 4),
> > > > +		      0, NULL);
> > > > +	if (ret) {
> > > > +		PMD_DRV_LOG(ERR, "Failed to get profile info list.");
> > > > +		rte_free(buff);
> > > > +		return -1;
> > > > +	}
> > > > +	p_list = (struct rte_pmd_i40e_profile_list *)buff;
> > > > +	pinfo = (struct rte_pmd_i40e_profile_info *)(profile_info_sec +
> > > > +			     sizeof(struct i40e_profile_section_header));
> > > > +	for (i = 0; i < p_list->p_count; i++) {
> > > > +		p = &p_list->p_info[i];
> > > > +		if ((pinfo->track_id == p->track_id) &&
> > > > +		    !memcmp(&pinfo->version, &p->version,
> > > > +			    sizeof(struct i40e_ppp_version)) &&
> > > > +		    !memcmp(&pinfo->name, &p->name,
> > > > +			    I40E_PPP_NAME_SIZE)) {
> > > > +			PMD_DRV_LOG(INFO, "Profile exists.");
> > > > +			rte_free(buff);
> > > > +			return 1;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	rte_free(buff);
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +int
> > > > +rte_pmd_i40e_process_ppp_package(uint8_t port, uint8_t *buff,
> > > > +				 uint32_t size, bool add)
> > >
> > > To make this function future-proof it is better not to use 'bool add',
> > > as there are at least three possible processing actions:
> > >
> > > 1. Process the package and add it to the list of applied profiles
> > >    (applying a new personalization)
> > > 2. Process the package and remove it from the list
> > >    (restoring the original configuration)
> > > 3. Process the package and do not update the list
> > >    (updating an already applied personalization)
> >
> > Thanks for your comments. I considered your suggestion before, but as
> > this is a private API, I think there are only "add" and "remove" from
> > the user's point of view; the user should not need to know whether the
> > profile info list has to be updated. So I think the first and the third
> > items above should be distinguished by the driver instead of the
> > application. The driver can distinguish them when it parses the package,
> > by checking whether track_id is 0.
> > What do you think?
> >
> > In my opinion, the first item means adding a profile, the second item
> > means removing a profile, and the third means adding a read-only profile.
> > Please correct me if I'm wrong. In fact we only support the first in this
> > release; the second and third items will need to be supported after this
> > release.
> 
> You already have the "Action not supported temporarily" message now, so it
> will not break anything, but it will save users from updating their apps'
> ABI in the future.
> In addition to read-only profiles, option 3 can be used for profiles which
> call generic admin commands, where only the first call needs to be
> registered. In this case TrackId will not be 0 and the driver cannot use it
> directly.
> This is a very minor change, and I do not see why we should not declare the
> parameters properly from the first release if support for all three options
> is needed anyway.

OK, I misunderstood option 3; I thought the only difference between option 1 and option 3 was the track_id.
Will update in the next version, thanks for the explanation.
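
For illustration, the signature could then take an operation enum instead of
a bool, along these lines (a sketch only; the enum name and values are
illustrative, not taken from the patch):

	enum rte_pmd_i40e_package_op {
		RTE_PMD_I40E_PKG_OP_WR_ADD,  /* write profile, add to info list */
		RTE_PMD_I40E_PKG_OP_WR_DEL,  /* write profile, remove from list */
		RTE_PMD_I40E_PKG_OP_WR_ONLY, /* write profile, leave list as is */
	};

	int rte_pmd_i40e_process_ppp_package(uint8_t port, uint8_t *buff,
					     uint32_t size,
					     enum rte_pmd_i40e_package_op op);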

> >
> > Beilei
> >
> > >
> > > > +{
> > > > +	struct rte_eth_dev *dev;
> > > > +	struct i40e_hw *hw;
> > > > +	struct i40e_package_header *pkg_hdr;
> > > > +	struct i40e_generic_seg_header *profile_seg_hdr;
> > > > +	struct i40e_generic_seg_header *metadata_seg_hdr;
> > > > +	uint32_t track_id;
> > > > +	uint8_t *profile_info_sec;
> > > > +	int is_exist;
> > > > +	enum i40e_status_code status = I40E_SUCCESS;
> > > > +
> > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
> > > > +
> > > > +	dev = &rte_eth_devices[port];
> > > > +
> > > > +	if (!is_device_supported(dev, &rte_i40e_pmd))
> > > > +		return -ENOTSUP;
> > > > +
> > > > +	hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> > > > +
> > > > +	if (size < (sizeof(struct i40e_package_header) +
> > > > +		    sizeof(struct i40e_metadata_segment) +
> > > > +		    sizeof(uint32_t) * 2)) {
> > > > +		PMD_DRV_LOG(ERR, "Buff is invalid.");
> > > > +		return -EINVAL;
> > > > +	}
> > > > +
> > > > +	pkg_hdr = (struct i40e_package_header *)buff;
> > > > +
> > > > +	if (!pkg_hdr) {
> > > > +		PMD_DRV_LOG(ERR, "Failed to fill the package structure");
> > > > +		return -EINVAL;
> > > > +	}
> > > > +
> > > > +	if (pkg_hdr->segment_count < 2) {
> > > > +		PMD_DRV_LOG(ERR, "Segment_count should be 2 at least.");
> > > > +		return -EINVAL;
> > > > +	}
> > > > +
> > > > +	/* Find metadata segment */
> > > > +	metadata_seg_hdr =
> > > > i40e_find_segment_in_package(SEGMENT_TYPE_METADATA,
> > > > +							pkg_hdr);
> > > > +	if (!metadata_seg_hdr) {
> > > > +		PMD_DRV_LOG(ERR, "Failed to find metadata segment
> > > > header");
> > > > +		return -EINVAL;
> > > > +	}
> > > > +	track_id = ((struct i40e_metadata_segment
> > > > +*)metadata_seg_hdr)->track_id;
> > > > +
> > > > +	/* Find profile segment */
> > > > +	profile_seg_hdr =
> > > > i40e_find_segment_in_package(SEGMENT_TYPE_I40E,
> > > > +						       pkg_hdr);
> > > > +	if (!profile_seg_hdr) {
> > > > +		PMD_DRV_LOG(ERR, "Failed to find profile segment
> header");
> > > > +		return -EINVAL;
> > > > +	}
> > > > +
> > > > +	profile_info_sec = rte_zmalloc("i40e_profile_info",
> > > > +			       sizeof(struct i40e_profile_section_header) +
> > > > +			       sizeof(struct i40e_profile_info),
> > > > +			       0);
> > > > +	if (!profile_info_sec) {
> > > > +		PMD_DRV_LOG(ERR, "Failed to allocate memory");
> > > > +		return -EINVAL;
> > > > +	}
> > > > +
> > > > +	if (add) {
> > > > +		/* Check if the profile exists */
> > > > +		i40e_generate_profile_info_sec(
> > > > +		     ((struct i40e_profile_segment *)profile_seg_hdr)->name,
> > > > +		     &((struct i40e_profile_segment *)profile_seg_hdr)-
> > > > >version,
> > > > +		     track_id, profile_info_sec, 1);
> > > > +		is_exist = i40e_check_profile_info(port, profile_info_sec);
> > > > +		if (is_exist) {
> > > > +			PMD_DRV_LOG(ERR, "Profile already exists.");
> > > > +			rte_free(profile_info_sec);
> > > > +			return 1;
> > > > +		}
> > > > +
> > > > +		/* Write profile to HW */
> > > > +		status = i40e_write_profile(hw,
> > > > +				 (struct i40e_profile_segment
> > > > *)profile_seg_hdr,
> > > > +				 track_id);
> > > > +		if (status)
> > > > +			PMD_DRV_LOG(ERR, "Failed to write profile.");
> > > > +
> > > > +		/* Add profile info to info list */
> > > > +		status = i40e_add_rm_profile_info(hw, profile_info_sec);
> > > > +		if (status)
> > > > +			PMD_DRV_LOG(ERR, "Failed to add profile info.");
> > > > +	} else
> > > > +		PMD_DRV_LOG(ERR, "Action not supported temporarily.");
> > > > +
> > > > +	rte_free(profile_info_sec);
> > > > +	return status;
> > > > +}
> > > > diff --git a/drivers/net/i40e/rte_pmd_i40e.h
> > > > b/drivers/net/i40e/rte_pmd_i40e.h index a0ad88c..4f6cdb5 100644
> > > > --- a/drivers/net/i40e/rte_pmd_i40e.h
> > > > +++ b/drivers/net/i40e/rte_pmd_i40e.h
> > > > @@ -65,6 +65,36 @@ struct rte_pmd_i40e_mb_event_param {
> > > >  	uint16_t msglen;   /**< length of the message */
> > > >  };
> > > >
> > > > +#define RTE_PMD_I40E_PPP_NAME_SIZE 32
> > > > +
> > > > +/**
> > > > + * Pipeline personalization profile version  */ struct
> > > > +rte_pmd_i40e_ppp_version {
> > > > +	uint8_t major;
> > > > +	uint8_t minor;
> > > > +	uint8_t update;
> > > > +	uint8_t draft;
> > > > +};
> > > > +
> > > > +/**
> > > > + * Structure of profile information  */ struct
> > > > +rte_pmd_i40e_profile_info {
> > > > +	uint32_t track_id;
> > > > +	struct rte_pmd_i40e_ppp_version version;
> > > > +	uint8_t reserved[8];
> > >
> > > Instead of uint8_t reserved[8] it should be
> > >     uint8_t owner;
> > >     uint8_t reserved[7];
> >
> > Actually I defined this for getting profile info, not for updating it,
> > so I thought there would be no "owner = REMOVE" in the info list. Please
> > correct me if I'm wrong. In the base driver, there is a
> > "struct i40e_profile_info" used for updating profile info.
> > All the structures defined in "rte_pmd_i40e.h" are used by the
> > application for getting info.
> > Do you think the "owner" member is useful to the user? If yes, I will
> > update it. If no, I think the user will be confused by "owner".
> 
> After calling "get profile info list", 'owner' contains the PF function
> which was used to execute a profile, not an 'operation'.

OK, will update it.
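
That is, something like the following (a sketch of the struct after the
suggested change, not the final layout):

	struct rte_pmd_i40e_profile_info {
		uint32_t track_id;
		struct rte_pmd_i40e_ppp_version version;
		uint8_t owner;      /* PF function that loaded the profile */
		uint8_t reserved[7];
		uint8_t name[RTE_PMD_I40E_PPP_NAME_SIZE];
	};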

> 
> >
> > >
> > > > +	uint8_t name[RTE_PMD_I40E_PPP_NAME_SIZE]; };
> > > > +
> > > > +/**
> > > > + * Structure of profile information list  */ struct
> > > > +rte_pmd_i40e_profile_list {
> > > > +	uint32_t p_count;
> > > > +	struct rte_pmd_i40e_profile_info p_info[1]; };
> > > > +
> > > >  /**
> > > >   * Notify VF when PF link status changes.
> > > >   *
> > > > @@ -332,4 +362,25 @@ int rte_pmd_i40e_get_vf_stats(uint8_t port,
> > > int
> > > > rte_pmd_i40e_reset_vf_stats(uint8_t port,
> > > >  				uint16_t vf_id);
> > > >
> > > > +/**
> > > > + * Load/Unload a ppp package
> > > > + *
> > > > + * @param port
> > > > + *    The port identifier of the Ethernet device.
> > > > + * @param buff
> > > > + *    buffer of package.
> > > > + * @param size
> > > > + *    size of buffer.
> > > > + * @param add
> > > > + *   - (1) write profile.
> > > > + *   - (0) remove profile.
> > > > + * @return
> > > > + *   - (0) if successful.
> > > > + *   - (-ENODEV) if *port* invalid.
> > > > + *   - (-EINVAL) if bad parameter.
> > > > + *   - (1) if profile exists.
> > > > + */
> > > > +int rte_pmd_i40e_process_ppp_package(uint8_t port, uint8_t *buff,
> > > > +				     uint32_t size, bool add);
> > > > +
> > > >  #endif /* _PMD_I40E_H_ */
> > > > diff --git a/drivers/net/i40e/rte_pmd_i40e_version.map
> > > > b/drivers/net/i40e/rte_pmd_i40e_version.map
> > > > index 7a5d211..01c4a90 100644
> > > > --- a/drivers/net/i40e/rte_pmd_i40e_version.map
> > > > +++ b/drivers/net/i40e/rte_pmd_i40e_version.map
> > > > @@ -22,3 +22,9 @@ DPDK_17.02 {
> > > >  	rte_pmd_i40e_set_vf_vlan_tag;
> > > >
> > > >  } DPDK_2.0;
> > > > +
> > > > +DPDK_17.05 {
> > > > +	global:
> > > > +
> > > > +	rte_pmd_i40e_process_ppp_package;
> > > > +};
> > > > --
> > > > 2.5.5
> > >
> > > Regards,
> > > Andrey


* Re: [dpdk-dev] [PATCH v4 1/5] net/i40e: add pipeline personalization profile processing
  @ 2017-03-25 21:03  3%         ` Chilikin, Andrey
  2017-03-27  2:09  0%           ` Xing, Beilei
  0 siblings, 1 reply; 200+ results
From: Chilikin, Andrey @ 2017-03-25 21:03 UTC (permalink / raw)
  To: Xing, Beilei, Wu, Jingjing; +Cc: Zhang, Helin, dev

Hi Beilei

> -----Original Message-----
> From: Xing, Beilei
> Sent: Saturday, March 25, 2017 4:04 AM
> To: Chilikin, Andrey <andrey.chilikin@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>
> Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v4 1/5] net/i40e: add pipeline personalization
> profile processing
> 
> Hi Andrey,
> 
> > -----Original Message-----
> > From: Chilikin, Andrey
> > Sent: Friday, March 24, 2017 10:53 PM
> > To: Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> > <jingjing.wu@intel.com>
> > Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v4 1/5] net/i40e: add pipeline
> > personalization profile processing
> >
> > Hi Beilei,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Beilei Xing
> > > Sent: Friday, March 24, 2017 10:19 AM
> > > To: Wu, Jingjing <jingjing.wu@intel.com>
> > > Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org
> > > Subject: [dpdk-dev] [PATCH v4 1/5] net/i40e: add pipeline
> > > personalization profile processing
> > >
> > > Add support for adding a pipeline personalization profile package.
> > >
> > > Signed-off-by: Beilei Xing <beilei.xing@intel.com>
> > > ---
> > >  app/test-pmd/cmdline.c                    |   1 +
> > >  drivers/net/i40e/i40e_ethdev.c            | 198
> > > ++++++++++++++++++++++++++++++
> > >  drivers/net/i40e/rte_pmd_i40e.h           |  51 ++++++++
> > >  drivers/net/i40e/rte_pmd_i40e_version.map |   6 +
> > >  4 files changed, 256 insertions(+)
> > >
> > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> > > 47f935d..6e0625d 100644
> > > --- a/app/test-pmd/cmdline.c
> > > +++ b/app/test-pmd/cmdline.c
> > > @@ -37,6 +37,7 @@
> > >  #include <stdio.h>
> > >  #include <stdint.h>
> > >  #include <stdarg.h>
> > > +#include <stdbool.h>
> > >  #include <string.h>
> > >  #include <termios.h>
> > >  #include <unistd.h>
> > > diff --git a/drivers/net/i40e/i40e_ethdev.c
> > > b/drivers/net/i40e/i40e_ethdev.c index 3702214..bea593f 100644
> > > --- a/drivers/net/i40e/i40e_ethdev.c
> > > +++ b/drivers/net/i40e/i40e_ethdev.c
> > > @@ -11259,3 +11259,201 @@ rte_pmd_i40e_reset_vf_stats(uint8_t
> > port,
> > >
> > >  	return 0;
> > >  }
> > > +
> > > +static void
> > > +i40e_generate_profile_info_sec(char *name, struct i40e_ppp_version
> > > *version,
> > > +			       uint32_t track_id, uint8_t *profile_info_sec,
> > > +			       bool add)
> > > +{
> > > +	struct i40e_profile_section_header *sec = NULL;
> > > +	struct i40e_profile_info *pinfo;
> > > +
> > > +	sec = (struct i40e_profile_section_header *)profile_info_sec;
> > > +	sec->tbl_size = 1;
> > > +	sec->data_end = sizeof(struct i40e_profile_section_header) +
> > > +		sizeof(struct i40e_profile_info);
> > > +	sec->section.type = SECTION_TYPE_INFO;
> > > +	sec->section.offset = sizeof(struct i40e_profile_section_header);
> > > +	sec->section.size = sizeof(struct i40e_profile_info);
> > > +	pinfo = (struct i40e_profile_info *)(profile_info_sec +
> > > +					     sec->section.offset);
> > > +	pinfo->track_id = track_id;
> > > +	memcpy(pinfo->name, name, I40E_PPP_NAME_SIZE);
> > > +	memcpy(&pinfo->version, version, sizeof(struct i40e_ppp_version));
> > > +	if (add)
> > > +		pinfo->op = I40E_PPP_ADD_TRACKID;
> > > +	else
> > > +		pinfo->op = I40E_PPP_REMOVE_TRACKID; }
> > > +
> > > +static enum i40e_status_code
> > > +i40e_add_rm_profile_info(struct i40e_hw *hw, uint8_t
> > > +*profile_info_sec) {
> > > +	enum i40e_status_code status = I40E_SUCCESS;
> > > +	struct i40e_profile_section_header *sec;
> > > +	uint32_t track_id;
> > > +	uint32_t offset = 0, info = 0;
> > > +
> > > +	sec = (struct i40e_profile_section_header *)profile_info_sec;
> > > +	track_id = ((struct i40e_profile_info *)(profile_info_sec +
> > > +					 sec->section.offset))->track_id;
> > > +
> > > +	status = i40e_aq_write_ppp(hw, (void *)sec, sec->data_end,
> > > +				   track_id, &offset, &info, NULL);
> > > +	if (status)
> > > +		PMD_DRV_LOG(ERR, "Failed to add/remove profile info: "
> > > +			    "offset %d, info %d",
> > > +			    offset, info);
> > > +
> > > +	return status;
> > > +}
> > > +
> > > +#define I40E_PROFILE_INFO_SIZE 48
> > > +#define I40E_MAX_PROFILE_NUM 16
> > > +
> > > +/* Check if the profile info exists */ static int
> > > +i40e_check_profile_info(uint8_t port, uint8_t *profile_info_sec) {
> > > +	struct rte_eth_dev *dev = &rte_eth_devices[port];
> > > +	struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data-
> > > >dev_private);
> > > +	uint8_t *buff;
> > > +	struct rte_pmd_i40e_profile_list *p_list;
> > > +	struct rte_pmd_i40e_profile_info *pinfo, *p;
> > > +	uint32_t i;
> > > +	int ret;
> > > +
> > > +	buff = rte_zmalloc("pinfo_list",
> > > +			   (I40E_PROFILE_INFO_SIZE *
> > > I40E_MAX_PROFILE_NUM + 4),
> > > +			   0);
> > > +	if (!buff) {
> > > +		PMD_DRV_LOG(ERR, "failed to allocate memory");
> > > +		return -1;
> > > +	}
> > > +
> > > +	ret = i40e_aq_get_ppp_list(hw, (void *)buff,
> > > +		      (I40E_PROFILE_INFO_SIZE * I40E_MAX_PROFILE_NUM +
> > > 4),
> > > +		      0, NULL);
> > > +	if (ret) {
> > > +		PMD_DRV_LOG(ERR, "Failed to get profile info list.");
> > > +		rte_free(buff);
> > > +		return -1;
> > > +	}
> > > +	p_list = (struct rte_pmd_i40e_profile_list *)buff;
> > > +	pinfo = (struct rte_pmd_i40e_profile_info *)(profile_info_sec +
> > > +			     sizeof(struct i40e_profile_section_header));
> > > +	for (i = 0; i < p_list->p_count; i++) {
> > > +		p = &p_list->p_info[i];
> > > +		if ((pinfo->track_id == p->track_id) &&
> > > +		    !memcmp(&pinfo->version, &p->version,
> > > +			    sizeof(struct i40e_ppp_version)) &&
> > > +		    !memcmp(&pinfo->name, &p->name,
> > > +			    I40E_PPP_NAME_SIZE)) {
> > > +			PMD_DRV_LOG(INFO, "Profile exists.");
> > > +			rte_free(buff);
> > > +			return 1;
> > > +		}
> > > +	}
> > > +
> > > +	rte_free(buff);
> > > +	return 0;
> > > +}
> > > +
> > > +int
> > > +rte_pmd_i40e_process_ppp_package(uint8_t port, uint8_t *buff,
> > > +				 uint32_t size, bool add)
> >
> > To make this function future-proof it is better not to use 'bool add',
> > as there are at least three possible processing actions:
> >
> > 1. Process the package and add it to the list of applied profiles
> >    (applying a new personalization)
> > 2. Process the package and remove it from the list
> >    (restoring the original configuration)
> > 3. Process the package and do not update the list
> >    (updating an already applied personalization)
> 
> Thanks for your comments. I considered your suggestion before, but as this
> is a private API, I think there are only "add" and "remove" from the user's
> point of view; the user should not need to know whether the profile info
> list has to be updated. So I think the first and the third items above
> should be distinguished by the driver instead of the application. The
> driver can distinguish them when it parses the package, by checking whether
> track_id is 0.
> What do you think?
>
> In my opinion, the first item means adding a profile, the second item means
> removing a profile, and the third means adding a read-only profile. Please
> correct me if I'm wrong. In fact we only support the first in this release;
> the second and third items will need to be supported after this release.

You already have the "Action not supported temporarily" message now, so it will not break anything, but it will save users from updating their apps' ABI in the future.
In addition to read-only profiles, option 3 can be used for profiles which call generic admin commands, where only the first call needs to be registered. In this case TrackId will not be 0 and the driver cannot use it directly.
This is a very minor change, and I do not see why we should not declare the parameters properly from the first release if support for all three options is needed anyway.
> 
> Beilei
> 
> >
> > > +{
> > > +	struct rte_eth_dev *dev;
> > > +	struct i40e_hw *hw;
> > > +	struct i40e_package_header *pkg_hdr;
> > > +	struct i40e_generic_seg_header *profile_seg_hdr;
> > > +	struct i40e_generic_seg_header *metadata_seg_hdr;
> > > +	uint32_t track_id;
> > > +	uint8_t *profile_info_sec;
> > > +	int is_exist;
> > > +	enum i40e_status_code status = I40E_SUCCESS;
> > > +
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
> > > +
> > > +	dev = &rte_eth_devices[port];
> > > +
> > > +	if (!is_device_supported(dev, &rte_i40e_pmd))
> > > +		return -ENOTSUP;
> > > +
> > > +	hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> > > +
> > > +	if (size < (sizeof(struct i40e_package_header) +
> > > +		    sizeof(struct i40e_metadata_segment) +
> > > +		    sizeof(uint32_t) * 2)) {
> > > +		PMD_DRV_LOG(ERR, "Buff is invalid.");
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	pkg_hdr = (struct i40e_package_header *)buff;
> > > +
> > > +	if (!pkg_hdr) {
> > > +		PMD_DRV_LOG(ERR, "Failed to fill the package structure");
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	if (pkg_hdr->segment_count < 2) {
> > > +		PMD_DRV_LOG(ERR, "Segment_count should be 2 at least.");
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	/* Find metadata segment */
> > > +	metadata_seg_hdr =
> > > i40e_find_segment_in_package(SEGMENT_TYPE_METADATA,
> > > +							pkg_hdr);
> > > +	if (!metadata_seg_hdr) {
> > > +		PMD_DRV_LOG(ERR, "Failed to find metadata segment
> > > header");
> > > +		return -EINVAL;
> > > +	}
> > > +	track_id = ((struct i40e_metadata_segment
> > > +*)metadata_seg_hdr)->track_id;
> > > +
> > > +	/* Find profile segment */
> > > +	profile_seg_hdr =
> > > i40e_find_segment_in_package(SEGMENT_TYPE_I40E,
> > > +						       pkg_hdr);
> > > +	if (!profile_seg_hdr) {
> > > +		PMD_DRV_LOG(ERR, "Failed to find profile segment header");
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	profile_info_sec = rte_zmalloc("i40e_profile_info",
> > > +			       sizeof(struct i40e_profile_section_header) +
> > > +			       sizeof(struct i40e_profile_info),
> > > +			       0);
> > > +	if (!profile_info_sec) {
> > > +		PMD_DRV_LOG(ERR, "Failed to allocate memory");
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	if (add) {
> > > +		/* Check if the profile exists */
> > > +		i40e_generate_profile_info_sec(
> > > +		     ((struct i40e_profile_segment *)profile_seg_hdr)->name,
> > > +		     &((struct i40e_profile_segment *)profile_seg_hdr)-
> > > >version,
> > > +		     track_id, profile_info_sec, 1);
> > > +		is_exist = i40e_check_profile_info(port, profile_info_sec);
> > > +		if (is_exist) {
> > > +			PMD_DRV_LOG(ERR, "Profile already exists.");
> > > +			rte_free(profile_info_sec);
> > > +			return 1;
> > > +		}
> > > +
> > > +		/* Write profile to HW */
> > > +		status = i40e_write_profile(hw,
> > > +				 (struct i40e_profile_segment
> > > *)profile_seg_hdr,
> > > +				 track_id);
> > > +		if (status)
> > > +			PMD_DRV_LOG(ERR, "Failed to write profile.");
> > > +
> > > +		/* Add profile info to info list */
> > > +		status = i40e_add_rm_profile_info(hw, profile_info_sec);
> > > +		if (status)
> > > +			PMD_DRV_LOG(ERR, "Failed to add profile info.");
> > > +	} else
> > > +		PMD_DRV_LOG(ERR, "Action not supported temporarily.");
> > > +
> > > +	rte_free(profile_info_sec);
> > > +	return status;
> > > +}
> > > diff --git a/drivers/net/i40e/rte_pmd_i40e.h
> > > b/drivers/net/i40e/rte_pmd_i40e.h index a0ad88c..4f6cdb5 100644
> > > --- a/drivers/net/i40e/rte_pmd_i40e.h
> > > +++ b/drivers/net/i40e/rte_pmd_i40e.h
> > > @@ -65,6 +65,36 @@ struct rte_pmd_i40e_mb_event_param {
> > >  	uint16_t msglen;   /**< length of the message */
> > >  };
> > >
> > > +#define RTE_PMD_I40E_PPP_NAME_SIZE 32
> > > +
> > > +/**
> > > + * Pipeline personalization profile version  */ struct
> > > +rte_pmd_i40e_ppp_version {
> > > +	uint8_t major;
> > > +	uint8_t minor;
> > > +	uint8_t update;
> > > +	uint8_t draft;
> > > +};
> > > +
> > > +/**
> > > + * Structure of profile information  */ struct
> > > +rte_pmd_i40e_profile_info {
> > > +	uint32_t track_id;
> > > +	struct rte_pmd_i40e_ppp_version version;
> > > +	uint8_t reserved[8];
> >
> > Instead of uint8_t reserved[8] it should be
> >     uint8_t owner;
> >     uint8_t reserved[7];
> 
> Actually I defined this for getting profile info, not for updating it, so I
> thought there would be no "owner = REMOVE" in the info list. Please correct
> me if I'm wrong. In the base driver, there is a "struct i40e_profile_info"
> used for updating profile info.
> All the structures defined in "rte_pmd_i40e.h" are used by the application
> for getting info.
> Do you think the "owner" member is useful to the user? If yes, I will
> update it. If no, I think the user will be confused by "owner".

After calling "get profile info list", 'owner' contains the PF function which was used to execute a profile, not an 'operation'.

> 
> >
> > > +	uint8_t name[RTE_PMD_I40E_PPP_NAME_SIZE]; };
> > > +
> > > +/**
> > > + * Structure of profile information list  */ struct
> > > +rte_pmd_i40e_profile_list {
> > > +	uint32_t p_count;
> > > +	struct rte_pmd_i40e_profile_info p_info[1]; };
> > > +
> > >  /**
> > >   * Notify VF when PF link status changes.
> > >   *
> > > @@ -332,4 +362,25 @@ int rte_pmd_i40e_get_vf_stats(uint8_t port,
> > int
> > > rte_pmd_i40e_reset_vf_stats(uint8_t port,
> > >  				uint16_t vf_id);
> > >
> > > +/**
> > > + * Load/Unload a ppp package
> > > + *
> > > + * @param port
> > > + *    The port identifier of the Ethernet device.
> > > + * @param buff
> > > + *    buffer of package.
> > > + * @param size
> > > + *    size of buffer.
> > > + * @param add
> > > + *   - (1) write profile.
> > > + *   - (0) remove profile.
> > > + * @return
> > > + *   - (0) if successful.
> > > + *   - (-ENODEV) if *port* invalid.
> > > + *   - (-EINVAL) if bad parameter.
> > > + *   - (1) if profile exists.
> > > + */
> > > +int rte_pmd_i40e_process_ppp_package(uint8_t port, uint8_t *buff,
> > > +				     uint32_t size, bool add);
> > > +
> > >  #endif /* _PMD_I40E_H_ */
> > > diff --git a/drivers/net/i40e/rte_pmd_i40e_version.map
> > > b/drivers/net/i40e/rte_pmd_i40e_version.map
> > > index 7a5d211..01c4a90 100644
> > > --- a/drivers/net/i40e/rte_pmd_i40e_version.map
> > > +++ b/drivers/net/i40e/rte_pmd_i40e_version.map
> > > @@ -22,3 +22,9 @@ DPDK_17.02 {
> > >  	rte_pmd_i40e_set_vf_vlan_tag;
> > >
> > >  } DPDK_2.0;
> > > +
> > > +DPDK_17.05 {
> > > +	global:
> > > +
> > > +	rte_pmd_i40e_process_ppp_package;
> > > +};
> > > --
> > > 2.5.5
> >
> > Regards,
> > Andrey


* [dpdk-dev] [PATCH v3 09/14] ring: allow dequeue fns to return remaining entry count
                         ` (5 preceding siblings ...)
  2017-03-24 17:10  2%     ` [dpdk-dev] [PATCH v3 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
@ 2017-03-24 17:10  2%     ` Bruce Richardson
  6 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-24 17:10 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, jerin.jacob, thomas.monjalon, Bruce Richardson

Add an extra parameter to the ring dequeue burst/bulk functions so that
those functions can optionally return the number of objects remaining in
the ring. This information can be used by applications in a number of
ways; for instance, with single-consumer queues, it provides a maximum
dequeue size which is guaranteed to succeed.
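
For example, a caller can then do the following (an illustrative sketch;
flush_batch() stands in for application logic and is not part of this
patch):

	unsigned int remaining;
	void *objs[BURST_SIZE];

	/* dequeue up to BURST_SIZE objects; 'remaining' reports how many
	 * entries are left in the ring afterwards (pass NULL to ignore it) */
	unsigned int nb = rte_ring_dequeue_burst(r, objs, BURST_SIZE,
			&remaining);
	if (nb > 0 && remaining == 0)
		flush_batch(); /* ring drained, flush any pending work */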

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/pdump/main.c                                   |  2 +-
 doc/guides/rel_notes/release_17_05.rst             |  8 ++
 drivers/crypto/null/null_crypto_pmd.c              |  2 +-
 drivers/net/bonding/rte_eth_bond_pmd.c             |  3 +-
 drivers/net/ring/rte_eth_ring.c                    |  2 +-
 examples/distributor/main.c                        |  2 +-
 examples/load_balancer/runtime.c                   |  6 +-
 .../client_server_mp/mp_client/client.c            |  3 +-
 examples/packet_ordering/main.c                    |  6 +-
 examples/qos_sched/app_thread.c                    |  6 +-
 examples/quota_watermark/qw/main.c                 |  5 +-
 examples/server_node_efd/node/node.c               |  2 +-
 lib/librte_hash/rte_cuckoo_hash.c                  |  3 +-
 lib/librte_mempool/rte_mempool_ring.c              |  4 +-
 lib/librte_port/rte_port_frag.c                    |  3 +-
 lib/librte_port/rte_port_ring.c                    |  6 +-
 lib/librte_ring/rte_ring.h                         | 90 +++++++++++-----------
 test/test-pipeline/runtime.c                       |  6 +-
 test/test/test_link_bonding_mode4.c                |  3 +-
 test/test/test_pmd_ring_perf.c                     |  7 +-
 test/test/test_ring.c                              | 54 ++++++-------
 test/test/test_ring_perf.c                         | 20 +++--
 test/test/test_table_acl.c                         |  2 +-
 test/test/test_table_pipeline.c                    |  2 +-
 test/test/test_table_ports.c                       |  8 +-
 test/test/virtual_pmd.c                            |  4 +-
 26 files changed, 145 insertions(+), 114 deletions(-)

diff --git a/app/pdump/main.c b/app/pdump/main.c
index b88090d..3b13753 100644
--- a/app/pdump/main.c
+++ b/app/pdump/main.c
@@ -496,7 +496,7 @@ pdump_rxtx(struct rte_ring *ring, uint8_t vdev_id, struct pdump_stats *stats)
 
 	/* first dequeue packets from ring of primary process */
 	const uint16_t nb_in_deq = rte_ring_dequeue_burst(ring,
-			(void *)rxtx_bufs, BURST_SIZE);
+			(void *)rxtx_bufs, BURST_SIZE, NULL);
 	stats->dequeue_pkts += nb_in_deq;
 
 	if (nb_in_deq) {
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index dc1749b..f0eeac2 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -133,6 +133,8 @@ API Changes
   * added an extra parameter to the burst/bulk enqueue functions to
     return the number of free spaces in the ring after enqueue. This can
     be used by an application to implement its own watermark functionality.
+  * added an extra parameter to the burst/bulk dequeue functions to return
+    the number of elements remaining in the ring after dequeue.
   * changed the return value of the enqueue and dequeue bulk functions to
     match that of the burst equivalents. In all cases, ring functions which
     operate on multiple packets now return the number of elements enqueued
@@ -145,6 +147,12 @@ API Changes
     - ``rte_ring_sc_dequeue_bulk``
     - ``rte_ring_dequeue_bulk``
 
+    NOTE: the above functions all have different parameters as well as
+    different return values, due to the other listed changes above. This
+    means that all instances of the functions in existing code will be
+    flagged by the compiler. The return value usage should be checked
+    while fixing the compiler error due to the extra parameter.
+
 ABI Changes
 -----------
 
diff --git a/drivers/crypto/null/null_crypto_pmd.c b/drivers/crypto/null/null_crypto_pmd.c
index ed5a9fc..f68ec8d 100644
--- a/drivers/crypto/null/null_crypto_pmd.c
+++ b/drivers/crypto/null/null_crypto_pmd.c
@@ -155,7 +155,7 @@ null_crypto_pmd_dequeue_burst(void *queue_pair, struct rte_crypto_op **ops,
 	unsigned nb_dequeued;
 
 	nb_dequeued = rte_ring_dequeue_burst(qp->processed_pkts,
-			(void **)ops, nb_ops);
+			(void **)ops, nb_ops, NULL);
 	qp->qp_stats.dequeued_count += nb_dequeued;
 
 	return nb_dequeued;
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index f3ac9e2..96638af 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1008,7 +1008,8 @@ bond_ethdev_tx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 		struct port *port = &mode_8023ad_ports[slaves[i]];
 
 		slave_slow_nb_pkts[i] = rte_ring_dequeue_burst(port->tx_ring,
-				slow_pkts, BOND_MODE_8023AX_SLAVE_TX_PKTS);
+				slow_pkts, BOND_MODE_8023AX_SLAVE_TX_PKTS,
+				NULL);
 		slave_nb_pkts[i] = slave_slow_nb_pkts[i];
 
 		for (j = 0; j < slave_slow_nb_pkts[i]; j++)
diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index adbf478..77ef3a1 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -88,7 +88,7 @@ eth_ring_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
 	void **ptrs = (void *)&bufs[0];
 	struct ring_queue *r = q;
 	const uint16_t nb_rx = (uint16_t)rte_ring_dequeue_burst(r->rng,
-			ptrs, nb_bufs);
+			ptrs, nb_bufs, NULL);
 	if (r->rng->flags & RING_F_SC_DEQ)
 		r->rx_pkts.cnt += nb_rx;
 	else
diff --git a/examples/distributor/main.c b/examples/distributor/main.c
index bb84f13..90c9613 100644
--- a/examples/distributor/main.c
+++ b/examples/distributor/main.c
@@ -330,7 +330,7 @@ lcore_tx(struct rte_ring *in_r)
 
 			struct rte_mbuf *bufs[BURST_SIZE];
 			const uint16_t nb_rx = rte_ring_dequeue_burst(in_r,
-					(void *)bufs, BURST_SIZE);
+					(void *)bufs, BURST_SIZE, NULL);
 			app_stats.tx.dequeue_pkts += nb_rx;
 
 			/* if we get no traffic, flush anything we have */
diff --git a/examples/load_balancer/runtime.c b/examples/load_balancer/runtime.c
index 1645994..8192c08 100644
--- a/examples/load_balancer/runtime.c
+++ b/examples/load_balancer/runtime.c
@@ -349,7 +349,8 @@ app_lcore_io_tx(
 			ret = rte_ring_sc_dequeue_bulk(
 				ring,
 				(void **) &lp->tx.mbuf_out[port].array[n_mbufs],
-				bsz_rd);
+				bsz_rd,
+				NULL);
 
 			if (unlikely(ret == 0))
 				continue;
@@ -504,7 +505,8 @@ app_lcore_worker(
 		ret = rte_ring_sc_dequeue_bulk(
 			ring_in,
 			(void **) lp->mbuf_in.array,
-			bsz_rd);
+			bsz_rd,
+			NULL);
 
 		if (unlikely(ret == 0))
 			continue;
diff --git a/examples/multi_process/client_server_mp/mp_client/client.c b/examples/multi_process/client_server_mp/mp_client/client.c
index dca9eb9..01b535c 100644
--- a/examples/multi_process/client_server_mp/mp_client/client.c
+++ b/examples/multi_process/client_server_mp/mp_client/client.c
@@ -279,7 +279,8 @@ main(int argc, char *argv[])
 		uint16_t i, rx_pkts;
 		uint8_t port;
 
-		rx_pkts = rte_ring_dequeue_burst(rx_ring, pkts, PKT_READ_SIZE);
+		rx_pkts = rte_ring_dequeue_burst(rx_ring, pkts,
+				PKT_READ_SIZE, NULL);
 
 		if (unlikely(rx_pkts == 0)){
 			if (need_flush)
diff --git a/examples/packet_ordering/main.c b/examples/packet_ordering/main.c
index 569b6da..49ae35b 100644
--- a/examples/packet_ordering/main.c
+++ b/examples/packet_ordering/main.c
@@ -462,7 +462,7 @@ worker_thread(void *args_ptr)
 
 		/* dequeue the mbufs from rx_to_workers ring */
 		burst_size = rte_ring_dequeue_burst(ring_in,
-				(void *)burst_buffer, MAX_PKTS_BURST);
+				(void *)burst_buffer, MAX_PKTS_BURST, NULL);
 		if (unlikely(burst_size == 0))
 			continue;
 
@@ -510,7 +510,7 @@ send_thread(struct send_thread_args *args)
 
 		/* deque the mbufs from workers_to_tx ring */
 		nb_dq_mbufs = rte_ring_dequeue_burst(args->ring_in,
-				(void *)mbufs, MAX_PKTS_BURST);
+				(void *)mbufs, MAX_PKTS_BURST, NULL);
 
 		if (unlikely(nb_dq_mbufs == 0))
 			continue;
@@ -595,7 +595,7 @@ tx_thread(struct rte_ring *ring_in)
 
 		/* deque the mbufs from workers_to_tx ring */
 		dqnum = rte_ring_dequeue_burst(ring_in,
-				(void *)mbufs, MAX_PKTS_BURST);
+				(void *)mbufs, MAX_PKTS_BURST, NULL);
 
 		if (unlikely(dqnum == 0))
 			continue;
diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index 0c81a15..15f117f 100644
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -179,7 +179,7 @@ app_tx_thread(struct thread_conf **confs)
 
 	while ((conf = confs[conf_idx])) {
 		retval = rte_ring_sc_dequeue_bulk(conf->tx_ring, (void **)mbufs,
-					burst_conf.qos_dequeue);
+					burst_conf.qos_dequeue, NULL);
 		if (likely(retval != 0)) {
 			app_send_packets(conf, mbufs, burst_conf.qos_dequeue);
 
@@ -218,7 +218,7 @@ app_worker_thread(struct thread_conf **confs)
 
 		/* Read packet from the ring */
 		nb_pkt = rte_ring_sc_dequeue_burst(conf->rx_ring, (void **)mbufs,
-					burst_conf.ring_burst);
+					burst_conf.ring_burst, NULL);
 		if (likely(nb_pkt)) {
 			int nb_sent = rte_sched_port_enqueue(conf->sched_port, mbufs,
 					nb_pkt);
@@ -254,7 +254,7 @@ app_mixed_thread(struct thread_conf **confs)
 
 		/* Read packet from the ring */
 		nb_pkt = rte_ring_sc_dequeue_burst(conf->rx_ring, (void **)mbufs,
-					burst_conf.ring_burst);
+					burst_conf.ring_burst, NULL);
 		if (likely(nb_pkt)) {
 			int nb_sent = rte_sched_port_enqueue(conf->sched_port, mbufs,
 					nb_pkt);
diff --git a/examples/quota_watermark/qw/main.c b/examples/quota_watermark/qw/main.c
index 57df8ef..2dcddea 100644
--- a/examples/quota_watermark/qw/main.c
+++ b/examples/quota_watermark/qw/main.c
@@ -247,7 +247,8 @@ pipeline_stage(__attribute__((unused)) void *args)
 			}
 
 			/* Dequeue up to quota mbuf from rx */
-			nb_dq_pkts = rte_ring_dequeue_burst(rx, pkts, *quota);
+			nb_dq_pkts = rte_ring_dequeue_burst(rx, pkts,
+					*quota, NULL);
 			if (unlikely(nb_dq_pkts < 0))
 				continue;
 
@@ -305,7 +306,7 @@ send_stage(__attribute__((unused)) void *args)
 
 			/* Dequeue packets from tx and send them */
 			nb_dq_pkts = (uint16_t) rte_ring_dequeue_burst(tx,
-					(void *) tx_pkts, *quota);
+					(void *) tx_pkts, *quota, NULL);
 			rte_eth_tx_burst(dest_port_id, 0, tx_pkts, nb_dq_pkts);
 
 			/* TODO: Check if nb_dq_pkts == nb_tx_pkts? */
diff --git a/examples/server_node_efd/node/node.c b/examples/server_node_efd/node/node.c
index 9ec6a05..f780b92 100644
--- a/examples/server_node_efd/node/node.c
+++ b/examples/server_node_efd/node/node.c
@@ -392,7 +392,7 @@ main(int argc, char *argv[])
 		 */
 		while (rx_pkts > 0 &&
 				unlikely(rte_ring_dequeue_bulk(rx_ring, pkts,
-					rx_pkts) == 0))
+					rx_pkts, NULL) == 0))
 			rx_pkts = (uint16_t)RTE_MIN(rte_ring_count(rx_ring),
 					PKT_READ_SIZE);
 
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 6552199..645c0cf 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -536,7 +536,8 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
 		if (cached_free_slots->len == 0) {
 			/* Need to get another burst of free slots from global ring */
 			n_slots = rte_ring_mc_dequeue_burst(h->free_slots,
-					cached_free_slots->objs, LCORE_CACHE_SIZE);
+					cached_free_slots->objs,
+					LCORE_CACHE_SIZE, NULL);
 			if (n_slots == 0)
 				return -ENOSPC;
 
diff --git a/lib/librte_mempool/rte_mempool_ring.c b/lib/librte_mempool/rte_mempool_ring.c
index 9b8fd2b..5c132bf 100644
--- a/lib/librte_mempool/rte_mempool_ring.c
+++ b/lib/librte_mempool/rte_mempool_ring.c
@@ -58,14 +58,14 @@ static int
 common_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
 	return rte_ring_mc_dequeue_bulk(mp->pool_data,
-			obj_table, n) == 0 ? -ENOBUFS : 0;
+			obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_sc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
 	return rte_ring_sc_dequeue_bulk(mp->pool_data,
-			obj_table, n) == 0 ? -ENOBUFS : 0;
+			obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
 }
 
 static unsigned
diff --git a/lib/librte_port/rte_port_frag.c b/lib/librte_port/rte_port_frag.c
index 0fcace9..320407e 100644
--- a/lib/librte_port/rte_port_frag.c
+++ b/lib/librte_port/rte_port_frag.c
@@ -186,7 +186,8 @@ rte_port_ring_reader_frag_rx(void *port,
 		/* If "pkts" buffer is empty, read packet burst from ring */
 		if (p->n_pkts == 0) {
 			p->n_pkts = rte_ring_sc_dequeue_burst(p->ring,
-				(void **) p->pkts, RTE_PORT_IN_BURST_SIZE_MAX);
+				(void **) p->pkts, RTE_PORT_IN_BURST_SIZE_MAX,
+				NULL);
 			RTE_PORT_RING_READER_FRAG_STATS_PKTS_IN_ADD(p, p->n_pkts);
 			if (p->n_pkts == 0)
 				return n_pkts_out;
diff --git a/lib/librte_port/rte_port_ring.c b/lib/librte_port/rte_port_ring.c
index c5dbe07..85fad44 100644
--- a/lib/librte_port/rte_port_ring.c
+++ b/lib/librte_port/rte_port_ring.c
@@ -111,7 +111,8 @@ rte_port_ring_reader_rx(void *port, struct rte_mbuf **pkts, uint32_t n_pkts)
 	struct rte_port_ring_reader *p = (struct rte_port_ring_reader *) port;
 	uint32_t nb_rx;
 
-	nb_rx = rte_ring_sc_dequeue_burst(p->ring, (void **) pkts, n_pkts);
+	nb_rx = rte_ring_sc_dequeue_burst(p->ring, (void **) pkts,
+			n_pkts, NULL);
 	RTE_PORT_RING_READER_STATS_PKTS_IN_ADD(p, nb_rx);
 
 	return nb_rx;
@@ -124,7 +125,8 @@ rte_port_ring_multi_reader_rx(void *port, struct rte_mbuf **pkts,
 	struct rte_port_ring_reader *p = (struct rte_port_ring_reader *) port;
 	uint32_t nb_rx;
 
-	nb_rx = rte_ring_mc_dequeue_burst(p->ring, (void **) pkts, n_pkts);
+	nb_rx = rte_ring_mc_dequeue_burst(p->ring, (void **) pkts,
+			n_pkts, NULL);
 	RTE_PORT_RING_READER_STATS_PKTS_IN_ADD(p, nb_rx);
 
 	return nb_rx;
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 61a4dc8..b05fecb 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -488,7 +488,8 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 
 static inline unsigned int __attribute__((always_inline))
 __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
-		 unsigned n, enum rte_ring_queue_behavior behavior)
+		 unsigned int n, enum rte_ring_queue_behavior behavior,
+		 unsigned int *available)
 {
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
@@ -497,11 +498,6 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	unsigned int i;
 	uint32_t mask = r->mask;
 
-	/* Avoid the unnecessary cmpset operation below, which is also
-	 * potentially harmful when n equals 0. */
-	if (n == 0)
-		return 0;
-
 	/* move cons.head atomically */
 	do {
 		/* Restore n as it may change every loop */
@@ -516,15 +512,11 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		entries = (prod_tail - cons_head);
 
 		/* Set the actual entries for dequeue */
-		if (n > entries) {
-			if (behavior == RTE_RING_QUEUE_FIXED)
-				return 0;
-			else {
-				if (unlikely(entries == 0))
-					return 0;
-				n = entries;
-			}
-		}
+		if (n > entries)
+			n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : entries;
+
+		if (unlikely(n == 0))
+			goto end;
 
 		cons_next = cons_head + n;
 		success = rte_atomic32_cmpset(&r->cons.head, cons_head,
@@ -543,7 +535,9 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		rte_pause();
 
 	r->cons.tail = cons_next;
-
+end:
+	if (available != NULL)
+		*available = entries - n;
 	return n;
 }
 
@@ -567,7 +561,8 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
  */
 static inline unsigned int __attribute__((always_inline))
 __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
-		 unsigned n, enum rte_ring_queue_behavior behavior)
+		 unsigned int n, enum rte_ring_queue_behavior behavior,
+		 unsigned int *available)
 {
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
@@ -582,15 +577,11 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	 * and size(ring)-1. */
 	entries = prod_tail - cons_head;
 
-	if (n > entries) {
-		if (behavior == RTE_RING_QUEUE_FIXED)
-			return 0;
-		else {
-			if (unlikely(entries == 0))
-				return 0;
-			n = entries;
-		}
-	}
+	if (n > entries)
+		n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : entries;
+
+	if (unlikely(entries == 0))
+		goto end;
 
 	cons_next = cons_head + n;
 	r->cons.head = cons_next;
@@ -600,6 +591,9 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	rte_smp_rmb();
 
 	r->cons.tail = cons_next;
+end:
+	if (available != NULL)
+		*available = entries - n;
 	return n;
 }
 
@@ -746,9 +740,11 @@ rte_ring_enqueue(struct rte_ring *r, void *obj)
  *   The number of objects dequeued, either 0 or n
  */
 static inline unsigned int __attribute__((always_inline))
-rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
+	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED,
+			available);
 }
 
 /**
@@ -765,9 +761,11 @@ rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  *   The number of objects dequeued, either 0 or n
  */
 static inline unsigned int __attribute__((always_inline))
-rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
+	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED,
+			available);
 }
 
 /**
@@ -787,12 +785,13 @@ rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  *   The number of objects dequeued, either 0 or n
  */
 static inline unsigned int __attribute__((always_inline))
-rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned int n,
+		unsigned int *available)
 {
 	if (r->cons.single)
-		return rte_ring_sc_dequeue_bulk(r, obj_table, n);
+		return rte_ring_sc_dequeue_bulk(r, obj_table, n, available);
 	else
-		return rte_ring_mc_dequeue_bulk(r, obj_table, n);
+		return rte_ring_mc_dequeue_bulk(r, obj_table, n, available);
 }
 
 /**
@@ -813,7 +812,7 @@ rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 static inline int __attribute__((always_inline))
 rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_mc_dequeue_bulk(r, obj_p, 1)  ? 0 : -ENOBUFS;
+	return rte_ring_mc_dequeue_bulk(r, obj_p, 1, NULL)  ? 0 : -ENOBUFS;
 }
 
 /**
@@ -831,7 +830,7 @@ rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_sc_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOBUFS;
+	return rte_ring_sc_dequeue_bulk(r, obj_p, 1, NULL) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -853,7 +852,7 @@ rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOBUFS;
+	return rte_ring_dequeue_bulk(r, obj_p, 1, NULL) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -1043,9 +1042,11 @@ rte_ring_enqueue_burst(struct rte_ring *r, void * const *obj_table,
  *   - n: Actual number of objects dequeued, 0 if ring is empty
  */
 static inline unsigned __attribute__((always_inline))
-rte_ring_mc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_mc_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_VARIABLE);
+	return __rte_ring_mc_do_dequeue(r, obj_table, n,
+			RTE_RING_QUEUE_VARIABLE, available);
 }
 
 /**
@@ -1063,9 +1064,11 @@ rte_ring_mc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
  *   - n: Actual number of objects dequeued, 0 if ring is empty
  */
 static inline unsigned __attribute__((always_inline))
-rte_ring_sc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_sc_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_VARIABLE);
+	return __rte_ring_sc_do_dequeue(r, obj_table, n,
+			RTE_RING_QUEUE_VARIABLE, available);
 }
 
 /**
@@ -1085,12 +1088,13 @@ rte_ring_sc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
  *   - Number of objects dequeued
  */
 static inline unsigned __attribute__((always_inline))
-rte_ring_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
 	if (r->cons.single)
-		return rte_ring_sc_dequeue_burst(r, obj_table, n);
+		return rte_ring_sc_dequeue_burst(r, obj_table, n, available);
 	else
-		return rte_ring_mc_dequeue_burst(r, obj_table, n);
+		return rte_ring_mc_dequeue_burst(r, obj_table, n, available);
 }
 
 #ifdef __cplusplus
diff --git a/test/test-pipeline/runtime.c b/test/test-pipeline/runtime.c
index c06ff54..8970e1c 100644
--- a/test/test-pipeline/runtime.c
+++ b/test/test-pipeline/runtime.c
@@ -121,7 +121,8 @@ app_main_loop_worker(void) {
 		ret = rte_ring_sc_dequeue_bulk(
 			app.rings_rx[i],
 			(void **) worker_mbuf->array,
-			app.burst_size_worker_read);
+			app.burst_size_worker_read,
+			NULL);
 
 		if (ret == 0)
 			continue;
@@ -151,7 +152,8 @@ app_main_loop_tx(void) {
 		ret = rte_ring_sc_dequeue_bulk(
 			app.rings_tx[i],
 			(void **) &app.mbuf_tx[i].array[n_mbufs],
-			app.burst_size_tx_read);
+			app.burst_size_tx_read,
+			NULL);
 
 		if (ret == 0)
 			continue;
diff --git a/test/test/test_link_bonding_mode4.c b/test/test/test_link_bonding_mode4.c
index 8df28b4..15091b1 100644
--- a/test/test/test_link_bonding_mode4.c
+++ b/test/test/test_link_bonding_mode4.c
@@ -193,7 +193,8 @@ static uint8_t lacpdu_rx_count[RTE_MAX_ETHPORTS] = {0, };
 static int
 slave_get_pkts(struct slave_conf *slave, struct rte_mbuf **buf, uint16_t size)
 {
-	return rte_ring_dequeue_burst(slave->tx_queue, (void **)buf, size);
+	return rte_ring_dequeue_burst(slave->tx_queue, (void **)buf,
+			size, NULL);
 }
 
 /*
diff --git a/test/test/test_pmd_ring_perf.c b/test/test/test_pmd_ring_perf.c
index 045a7f2..004882a 100644
--- a/test/test/test_pmd_ring_perf.c
+++ b/test/test/test_pmd_ring_perf.c
@@ -67,7 +67,7 @@ test_empty_dequeue(void)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0]);
+		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0], NULL);
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t eth_start = rte_rdtsc();
@@ -99,7 +99,7 @@ test_single_enqueue_dequeue(void)
 	rte_compiler_barrier();
 	for (i = 0; i < iterations; i++) {
 		rte_ring_enqueue_bulk(r, &burst, 1, NULL);
-		rte_ring_dequeue_bulk(r, &burst, 1);
+		rte_ring_dequeue_bulk(r, &burst, 1, NULL);
 	}
 	const uint64_t sc_end = rte_rdtsc_precise();
 	rte_compiler_barrier();
@@ -133,7 +133,8 @@ test_bulk_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_sp_enqueue_bulk(r, (void *)burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_sc_dequeue_bulk(r, (void *)burst, bulk_sizes[sz]);
+			rte_ring_sc_dequeue_bulk(r, (void *)burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t sc_end = rte_rdtsc();
 
diff --git a/test/test/test_ring.c b/test/test/test_ring.c
index b0ca88b..858ebc1 100644
--- a/test/test/test_ring.c
+++ b/test/test/test_ring.c
@@ -119,7 +119,8 @@ test_ring_basic_full_empty(void * const src[], void *dst[])
 		    __func__, i, rand);
 		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rand,
 				NULL) != 0);
-		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rand) == rand);
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rand,
+				NULL) == rand);
 
 		/* fill the ring */
 		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rsz, NULL) != 0);
@@ -129,7 +130,8 @@ test_ring_basic_full_empty(void * const src[], void *dst[])
 		TEST_RING_VERIFY(0 == rte_ring_empty(r));
 
 		/* empty the ring */
-		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rsz) == rsz);
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rsz,
+				NULL) == rsz);
 		TEST_RING_VERIFY(rsz == rte_ring_free_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_full(r));
@@ -186,19 +188,19 @@ test_ring_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 1);
+	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 2);
+	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, MAX_BULK);
+	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if (ret == 0)
 		goto fail;
@@ -232,19 +234,19 @@ test_ring_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 1);
+	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 2);
+	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
+	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if (ret == 0)
 		goto fail;
@@ -265,7 +267,7 @@ test_ring_basic(void)
 		cur_src += MAX_BULK;
 		if (ret == 0)
 			goto fail;
-		ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
+		ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if (ret == 0)
 			goto fail;
@@ -303,13 +305,13 @@ test_ring_basic(void)
 		printf("Cannot enqueue\n");
 		goto fail;
 	}
-	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
+	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems, NULL);
 	cur_dst += num_elems;
 	if (ret == 0) {
 		printf("Cannot dequeue\n");
 		goto fail;
 	}
-	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
+	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems, NULL);
 	cur_dst += num_elems;
 	if (ret == 0) {
 		printf("Cannot dequeue2\n");
@@ -390,19 +392,19 @@ test_ring_burst_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 1) ;
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if ((ret & RTE_RING_SZ_MASK) != 1)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 		goto fail;
@@ -451,19 +453,19 @@ test_ring_burst_basic(void)
 
 	printf("Test dequeue without enough objects \n");
 	for (i = 0; i<RING_SIZE/MAX_BULK - 1; i++) {
-		ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
+		ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
 	}
 
 	/* Available memory space for the exact MAX_BULK entries */
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK - 3;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK - 3)
 		goto fail;
@@ -505,19 +507,19 @@ test_ring_burst_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 1);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if ((ret & RTE_RING_SZ_MASK) != 1)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 		goto fail;
@@ -539,7 +541,7 @@ test_ring_burst_basic(void)
 		cur_src += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
-		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
@@ -578,19 +580,19 @@ test_ring_burst_basic(void)
 
 	printf("Test dequeue without enough objects \n");
 	for (i = 0; i<RING_SIZE/MAX_BULK - 1; i++) {
-		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
 	}
 
 	/* Available objects - the exact MAX_BULK */
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK - 3;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK - 3)
 		goto fail;
@@ -613,7 +615,7 @@ test_ring_burst_basic(void)
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
-	ret = rte_ring_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if (ret != 2)
 		goto fail;
@@ -753,7 +755,7 @@ test_ring_basic_ex(void)
 		goto fail_test;
 	}
 
-	ret = rte_ring_dequeue_burst(rp, obj, 2);
+	ret = rte_ring_dequeue_burst(rp, obj, 2, NULL);
 	if (ret != 2) {
 		printf("test_ring_basic_ex: rte_ring_dequeue_burst fails \n");
 		goto fail_test;
diff --git a/test/test/test_ring_perf.c b/test/test/test_ring_perf.c
index f95a8e9..ed89896 100644
--- a/test/test/test_ring_perf.c
+++ b/test/test/test_ring_perf.c
@@ -152,12 +152,12 @@ test_empty_dequeue(void)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0]);
+		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0], NULL);
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t mc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		rte_ring_mc_dequeue_bulk(r, burst, bulk_sizes[0]);
+		rte_ring_mc_dequeue_bulk(r, burst, bulk_sizes[0], NULL);
 	const uint64_t mc_end = rte_rdtsc();
 
 	printf("SC empty dequeue: %.2F\n",
@@ -230,13 +230,13 @@ dequeue_bulk(void *p)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_sc_dequeue_bulk(r, burst, size) == 0)
+		while (rte_ring_sc_dequeue_bulk(r, burst, size, NULL) == 0)
 			rte_pause();
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t mc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_mc_dequeue_bulk(r, burst, size) == 0)
+		while (rte_ring_mc_dequeue_bulk(r, burst, size, NULL) == 0)
 			rte_pause();
 	const uint64_t mc_end = rte_rdtsc();
 
@@ -325,7 +325,8 @@ test_burst_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_sp_enqueue_burst(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_sc_dequeue_burst(r, burst, bulk_sizes[sz]);
+			rte_ring_sc_dequeue_burst(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t sc_end = rte_rdtsc();
 
@@ -333,7 +334,8 @@ test_burst_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_mp_enqueue_burst(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_mc_dequeue_burst(r, burst, bulk_sizes[sz]);
+			rte_ring_mc_dequeue_burst(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t mc_end = rte_rdtsc();
 
@@ -361,7 +363,8 @@ test_bulk_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_sp_enqueue_bulk(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[sz]);
+			rte_ring_sc_dequeue_bulk(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t sc_end = rte_rdtsc();
 
@@ -369,7 +372,8 @@ test_bulk_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_mp_enqueue_bulk(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_mc_dequeue_bulk(r, burst, bulk_sizes[sz]);
+			rte_ring_mc_dequeue_bulk(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t mc_end = rte_rdtsc();
 
diff --git a/test/test/test_table_acl.c b/test/test/test_table_acl.c
index b3bfda4..4d43be7 100644
--- a/test/test/test_table_acl.c
+++ b/test/test/test_table_acl.c
@@ -713,7 +713,7 @@ test_pipeline_single_filter(int expected_count)
 		void *objs[RING_TX_SIZE];
 		struct rte_mbuf *mbuf;
 
-		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10);
+		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10, NULL);
 		if (ret <= 0) {
 			printf("Got no objects from ring %d - error code %d\n",
 				i, ret);
diff --git a/test/test/test_table_pipeline.c b/test/test/test_table_pipeline.c
index 36bfeda..b58aa5d 100644
--- a/test/test/test_table_pipeline.c
+++ b/test/test/test_table_pipeline.c
@@ -494,7 +494,7 @@ test_pipeline_single_filter(int test_type, int expected_count)
 		void *objs[RING_TX_SIZE];
 		struct rte_mbuf *mbuf;
 
-		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10);
+		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10, NULL);
 		if (ret <= 0)
 			printf("Got no objects from ring %d - error code %d\n",
 				i, ret);
diff --git a/test/test/test_table_ports.c b/test/test/test_table_ports.c
index 395f4f3..39592ce 100644
--- a/test/test/test_table_ports.c
+++ b/test/test/test_table_ports.c
@@ -163,7 +163,7 @@ test_port_ring_writer(void)
 	rte_port_ring_writer_ops.f_flush(port);
 	expected_pkts = 1;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -7;
@@ -178,7 +178,7 @@ test_port_ring_writer(void)
 
 	expected_pkts = RTE_PORT_IN_BURST_SIZE_MAX;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -8;
@@ -193,7 +193,7 @@ test_port_ring_writer(void)
 
 	expected_pkts = RTE_PORT_IN_BURST_SIZE_MAX;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -8;
@@ -208,7 +208,7 @@ test_port_ring_writer(void)
 
 	expected_pkts = RTE_PORT_IN_BURST_SIZE_MAX;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -9;
diff --git a/test/test/virtual_pmd.c b/test/test/virtual_pmd.c
index 39e070c..b209355 100644
--- a/test/test/virtual_pmd.c
+++ b/test/test/virtual_pmd.c
@@ -342,7 +342,7 @@ virtual_ethdev_rx_burst_success(void *queue __rte_unused,
 	dev_private = vrtl_eth_dev->data->dev_private;
 
 	rx_count = rte_ring_dequeue_burst(dev_private->rx_queue, (void **) bufs,
-			nb_pkts);
+			nb_pkts, NULL);
 
 	/* increments ipackets count */
 	dev_private->eth_stats.ipackets += rx_count;
@@ -508,7 +508,7 @@ virtual_ethdev_get_mbufs_from_tx_queue(uint8_t port_id,
 
 	dev_private = vrtl_eth_dev->data->dev_private;
 	return rte_ring_dequeue_burst(dev_private->tx_queue, (void **)pkt_burst,
-		burst_length);
+		burst_length, NULL);
 }
 
 static uint8_t
-- 
2.9.3

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v3 07/14] ring: make bulk and burst fn return vals consistent
                         ` (4 preceding siblings ...)
  2017-03-24 17:10  2%     ` [dpdk-dev] [PATCH v3 06/14] ring: remove watermark support Bruce Richardson
@ 2017-03-24 17:10  2%     ` Bruce Richardson
  2017-03-24 17:10  2%     ` [dpdk-dev] [PATCH v3 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
  6 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-24 17:10 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, jerin.jacob, thomas.monjalon, Bruce Richardson

The bulk functions for rings return 0 when all elements are enqueued
and a negative value when there is no space. Change that to make them
consistent with the burst functions by returning the number of
elements enqueued/dequeued, i.e. 0 or N. This change also allows the
return value from enq/deq to be used directly, without a branch for
error checking.
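
To illustrate, a minimal usage sketch of the new convention (the ring
"r", the "objs" array, BURST, enq_stats and the handle_full() recovery
path are made-up names; the three-argument prototype is the one in
place after this patch):

	unsigned int n;

	/* with fixed-size semantics n is either 0 or BURST, so the
	 * return value can be consumed directly, no errno branch */
	n = rte_ring_enqueue_bulk(r, (void **)objs, BURST);
	enq_stats += n;
	if (n == 0)
		handle_full(r);	/* nothing was enqueued: ring is full */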

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/rel_notes/release_17_05.rst             |  11 +++
 doc/guides/sample_app_ug/server_node_efd.rst       |   2 +-
 examples/load_balancer/runtime.c                   |  16 ++-
 .../client_server_mp/mp_client/client.c            |   8 +-
 .../client_server_mp/mp_server/main.c              |   2 +-
 examples/qos_sched/app_thread.c                    |   8 +-
 examples/server_node_efd/node/node.c               |   2 +-
 examples/server_node_efd/server/main.c             |   2 +-
 lib/librte_mempool/rte_mempool_ring.c              |  12 ++-
 lib/librte_ring/rte_ring.h                         | 109 +++++++--------------
 test/test-pipeline/pipeline_hash.c                 |   2 +-
 test/test-pipeline/runtime.c                       |   8 +-
 test/test/test_ring.c                              |  46 +++++----
 test/test/test_ring_perf.c                         |   8 +-
 14 files changed, 106 insertions(+), 130 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index af907b8..a465c69 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -130,6 +130,17 @@ API Changes
   * removed the build-time setting ``CONFIG_RTE_RING_PAUSE_REP_COUNT``
   * removed the function ``rte_ring_set_water_mark`` as part of a general
     removal of watermarks support in the library.
+  * changed the return value of the enqueue and dequeue bulk functions to
+    match that of the burst equivalents. In all cases, ring functions which
+    operate on multiple packets now return the number of elements enqueued
+    or dequeued, as appropriate. The updated functions are:
+
+    - ``rte_ring_mp_enqueue_bulk``
+    - ``rte_ring_sp_enqueue_bulk``
+    - ``rte_ring_enqueue_bulk``
+    - ``rte_ring_mc_dequeue_bulk``
+    - ``rte_ring_sc_dequeue_bulk``
+    - ``rte_ring_dequeue_bulk``
 
 ABI Changes
 -----------
diff --git a/doc/guides/sample_app_ug/server_node_efd.rst b/doc/guides/sample_app_ug/server_node_efd.rst
index 9b69cfe..e3a63c8 100644
--- a/doc/guides/sample_app_ug/server_node_efd.rst
+++ b/doc/guides/sample_app_ug/server_node_efd.rst
@@ -286,7 +286,7 @@ repeated infinitely.
 
         cl = &nodes[node];
         if (rte_ring_enqueue_bulk(cl->rx_q, (void **)cl_rx_buf[node].buffer,
-                cl_rx_buf[node].count) != 0){
+                cl_rx_buf[node].count) != cl_rx_buf[node].count){
             for (j = 0; j < cl_rx_buf[node].count; j++)
                 rte_pktmbuf_free(cl_rx_buf[node].buffer[j]);
             cl->stats.rx_drop += cl_rx_buf[node].count;
diff --git a/examples/load_balancer/runtime.c b/examples/load_balancer/runtime.c
index 6944325..82b10bc 100644
--- a/examples/load_balancer/runtime.c
+++ b/examples/load_balancer/runtime.c
@@ -146,7 +146,7 @@ app_lcore_io_rx_buffer_to_send (
 		(void **) lp->rx.mbuf_out[worker].array,
 		bsz);
 
-	if (unlikely(ret == -ENOBUFS)) {
+	if (unlikely(ret == 0)) {
 		uint32_t k;
 		for (k = 0; k < bsz; k ++) {
 			struct rte_mbuf *m = lp->rx.mbuf_out[worker].array[k];
@@ -312,7 +312,7 @@ app_lcore_io_rx_flush(struct app_lcore_params_io *lp, uint32_t n_workers)
 			(void **) lp->rx.mbuf_out[worker].array,
 			lp->rx.mbuf_out[worker].n_mbufs);
 
-		if (unlikely(ret < 0)) {
+		if (unlikely(ret == 0)) {
 			uint32_t k;
 			for (k = 0; k < lp->rx.mbuf_out[worker].n_mbufs; k ++) {
 				struct rte_mbuf *pkt_to_free = lp->rx.mbuf_out[worker].array[k];
@@ -349,9 +349,8 @@ app_lcore_io_tx(
 				(void **) &lp->tx.mbuf_out[port].array[n_mbufs],
 				bsz_rd);
 
-			if (unlikely(ret == -ENOENT)) {
+			if (unlikely(ret == 0))
 				continue;
-			}
 
 			n_mbufs += bsz_rd;
 
@@ -505,9 +504,8 @@ app_lcore_worker(
 			(void **) lp->mbuf_in.array,
 			bsz_rd);
 
-		if (unlikely(ret == -ENOENT)) {
+		if (unlikely(ret == 0))
 			continue;
-		}
 
 #if APP_WORKER_DROP_ALL_PACKETS
 		for (j = 0; j < bsz_rd; j ++) {
@@ -559,7 +557,7 @@ app_lcore_worker(
 
 #if APP_STATS
 			lp->rings_out_iters[port] ++;
-			if (ret == 0) {
+			if (ret > 0) {
 				lp->rings_out_count[port] += 1;
 			}
 			if (lp->rings_out_iters[port] == APP_STATS){
@@ -572,7 +570,7 @@ app_lcore_worker(
 			}
 #endif
 
-			if (unlikely(ret == -ENOBUFS)) {
+			if (unlikely(ret == 0)) {
 				uint32_t k;
 				for (k = 0; k < bsz_wr; k ++) {
 					struct rte_mbuf *pkt_to_free = lp->mbuf_out[port].array[k];
@@ -609,7 +607,7 @@ app_lcore_worker_flush(struct app_lcore_params_worker *lp)
 			(void **) lp->mbuf_out[port].array,
 			lp->mbuf_out[port].n_mbufs);
 
-		if (unlikely(ret < 0)) {
+		if (unlikely(ret == 0)) {
 			uint32_t k;
 			for (k = 0; k < lp->mbuf_out[port].n_mbufs; k ++) {
 				struct rte_mbuf *pkt_to_free = lp->mbuf_out[port].array[k];
diff --git a/examples/multi_process/client_server_mp/mp_client/client.c b/examples/multi_process/client_server_mp/mp_client/client.c
index d4f9ca3..dca9eb9 100644
--- a/examples/multi_process/client_server_mp/mp_client/client.c
+++ b/examples/multi_process/client_server_mp/mp_client/client.c
@@ -276,14 +276,10 @@ main(int argc, char *argv[])
 	printf("[Press Ctrl-C to quit ...]\n");
 
 	for (;;) {
-		uint16_t i, rx_pkts = PKT_READ_SIZE;
+		uint16_t i, rx_pkts;
 		uint8_t port;
 
-		/* try dequeuing max possible packets first, if that fails, get the
-		 * most we can. Loop body should only execute once, maximum */
-		while (rx_pkts > 0 &&
-				unlikely(rte_ring_dequeue_bulk(rx_ring, pkts, rx_pkts) != 0))
-			rx_pkts = (uint16_t)RTE_MIN(rte_ring_count(rx_ring), PKT_READ_SIZE);
+		rx_pkts = rte_ring_dequeue_burst(rx_ring, pkts, PKT_READ_SIZE);
 
 		if (unlikely(rx_pkts == 0)){
 			if (need_flush)
diff --git a/examples/multi_process/client_server_mp/mp_server/main.c b/examples/multi_process/client_server_mp/mp_server/main.c
index a6dc12d..19c95b2 100644
--- a/examples/multi_process/client_server_mp/mp_server/main.c
+++ b/examples/multi_process/client_server_mp/mp_server/main.c
@@ -227,7 +227,7 @@ flush_rx_queue(uint16_t client)
 
 	cl = &clients[client];
 	if (rte_ring_enqueue_bulk(cl->rx_q, (void **)cl_rx_buf[client].buffer,
-			cl_rx_buf[client].count) != 0){
+			cl_rx_buf[client].count) == 0){
 		for (j = 0; j < cl_rx_buf[client].count; j++)
 			rte_pktmbuf_free(cl_rx_buf[client].buffer[j]);
 		cl->stats.rx_drop += cl_rx_buf[client].count;
diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index 70fdcdb..dab4594 100644
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -107,7 +107,7 @@ app_rx_thread(struct thread_conf **confs)
 			}
 
 			if (unlikely(rte_ring_sp_enqueue_bulk(conf->rx_ring,
-								(void **)rx_mbufs, nb_rx) != 0)) {
+					(void **)rx_mbufs, nb_rx) == 0)) {
 				for(i = 0; i < nb_rx; i++) {
 					rte_pktmbuf_free(rx_mbufs[i]);
 
@@ -180,7 +180,7 @@ app_tx_thread(struct thread_conf **confs)
 	while ((conf = confs[conf_idx])) {
 		retval = rte_ring_sc_dequeue_bulk(conf->tx_ring, (void **)mbufs,
 					burst_conf.qos_dequeue);
-		if (likely(retval == 0)) {
+		if (likely(retval != 0)) {
 			app_send_packets(conf, mbufs, burst_conf.qos_dequeue);
 
 			conf->counter = 0; /* reset empty read loop counter */
@@ -230,7 +230,9 @@ app_worker_thread(struct thread_conf **confs)
 		nb_pkt = rte_sched_port_dequeue(conf->sched_port, mbufs,
 					burst_conf.qos_dequeue);
 		if (likely(nb_pkt > 0))
-			while (rte_ring_sp_enqueue_bulk(conf->tx_ring, (void **)mbufs, nb_pkt) != 0);
+			while (rte_ring_sp_enqueue_bulk(conf->tx_ring,
+					(void **)mbufs, nb_pkt) == 0)
+				; /* empty body */
 
 		conf_idx++;
 		if (confs[conf_idx] == NULL)
diff --git a/examples/server_node_efd/node/node.c b/examples/server_node_efd/node/node.c
index a6c0c70..9ec6a05 100644
--- a/examples/server_node_efd/node/node.c
+++ b/examples/server_node_efd/node/node.c
@@ -392,7 +392,7 @@ main(int argc, char *argv[])
 		 */
 		while (rx_pkts > 0 &&
 				unlikely(rte_ring_dequeue_bulk(rx_ring, pkts,
-					rx_pkts) != 0))
+					rx_pkts) == 0))
 			rx_pkts = (uint16_t)RTE_MIN(rte_ring_count(rx_ring),
 					PKT_READ_SIZE);
 
diff --git a/examples/server_node_efd/server/main.c b/examples/server_node_efd/server/main.c
index 1a54d1b..3eb7fac 100644
--- a/examples/server_node_efd/server/main.c
+++ b/examples/server_node_efd/server/main.c
@@ -247,7 +247,7 @@ flush_rx_queue(uint16_t node)
 
 	cl = &nodes[node];
 	if (rte_ring_enqueue_bulk(cl->rx_q, (void **)cl_rx_buf[node].buffer,
-			cl_rx_buf[node].count) != 0){
+			cl_rx_buf[node].count) != cl_rx_buf[node].count){
 		for (j = 0; j < cl_rx_buf[node].count; j++)
 			rte_pktmbuf_free(cl_rx_buf[node].buffer[j]);
 		cl->stats.rx_drop += cl_rx_buf[node].count;
diff --git a/lib/librte_mempool/rte_mempool_ring.c b/lib/librte_mempool/rte_mempool_ring.c
index b9aa64d..409b860 100644
--- a/lib/librte_mempool/rte_mempool_ring.c
+++ b/lib/librte_mempool/rte_mempool_ring.c
@@ -42,26 +42,30 @@ static int
 common_ring_mp_enqueue(struct rte_mempool *mp, void * const *obj_table,
 		unsigned n)
 {
-	return rte_ring_mp_enqueue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_mp_enqueue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_sp_enqueue(struct rte_mempool *mp, void * const *obj_table,
 		unsigned n)
 {
-	return rte_ring_sp_enqueue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_sp_enqueue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
-	return rte_ring_mc_dequeue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_mc_dequeue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_sc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
-	return rte_ring_sc_dequeue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_sc_dequeue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static unsigned
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 906e8ae..34b438c 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -349,14 +349,10 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
 *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects enqueue.
- *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects enqueued.
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 			 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -388,7 +384,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 		/* check that we have enough room in ring */
 		if (unlikely(n > free_entries)) {
 			if (behavior == RTE_RING_QUEUE_FIXED)
-				return -ENOBUFS;
+				return 0;
 			else {
 				/* No free entry available */
 				if (unlikely(free_entries == 0))
@@ -414,7 +410,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 		rte_pause();
 
 	r->prod.tail = prod_next;
-	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
+	return n;
 }
 
 /**
@@ -430,14 +426,10 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
  *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
 *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects enqueue.
- *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects enqueued.
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 			 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -457,7 +449,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	/* check that we have enough room in ring */
 	if (unlikely(n > free_entries)) {
 		if (behavior == RTE_RING_QUEUE_FIXED)
-			return -ENOBUFS;
+			return 0;
 		else {
 			/* No free entry available */
 			if (unlikely(free_entries == 0))
@@ -474,7 +466,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	r->prod.tail = prod_next;
-	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
+	return n;
 }
 
 /**
@@ -495,16 +487,11 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
  *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
 *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects dequeued.
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
 
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -536,7 +523,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		/* Set the actual entries for dequeue */
 		if (n > entries) {
 			if (behavior == RTE_RING_QUEUE_FIXED)
-				return -ENOENT;
+				return 0;
 			else {
 				if (unlikely(entries == 0))
 					return 0;
@@ -562,7 +549,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 
 	r->cons.tail = cons_next;
 
-	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
+	return n;
 }
 
 /**
@@ -580,15 +567,10 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
  *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
 *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects dequeued.
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 		 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -607,7 +589,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 
 	if (n > entries) {
 		if (behavior == RTE_RING_QUEUE_FIXED)
-			return -ENOENT;
+			return 0;
 		else {
 			if (unlikely(entries == 0))
 				return 0;
@@ -623,7 +605,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	rte_smp_rmb();
 
 	r->cons.tail = cons_next;
-	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
+	return n;
 }
 
 /**
@@ -639,10 +621,9 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
  * @param n
  *   The number of objects to add in the ring from the obj_table.
  * @return
- *   - 0: Success; objects enqueue.
- *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
+ *   The number of objects enqueued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_mp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 			 unsigned n)
 {
@@ -659,10 +640,9 @@ rte_ring_mp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  * @param n
  *   The number of objects to add in the ring from the obj_table.
  * @return
- *   - 0: Success; objects enqueued.
- *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
+ *   The number of objects enqueued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_sp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 			 unsigned n)
 {
@@ -683,10 +663,9 @@ rte_ring_sp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  * @param n
  *   The number of objects to add in the ring from the obj_table.
  * @return
- *   - 0: Success; objects enqueued.
- *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
+ *   The number of objects enqueued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 		      unsigned n)
 {
@@ -713,7 +692,7 @@ rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 static inline int __attribute__((always_inline))
 rte_ring_mp_enqueue(struct rte_ring *r, void *obj)
 {
-	return rte_ring_mp_enqueue_bulk(r, &obj, 1);
+	return rte_ring_mp_enqueue_bulk(r, &obj, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -730,7 +709,7 @@ rte_ring_mp_enqueue(struct rte_ring *r, void *obj)
 static inline int __attribute__((always_inline))
 rte_ring_sp_enqueue(struct rte_ring *r, void *obj)
 {
-	return rte_ring_sp_enqueue_bulk(r, &obj, 1);
+	return rte_ring_sp_enqueue_bulk(r, &obj, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -751,10 +730,7 @@ rte_ring_sp_enqueue(struct rte_ring *r, void *obj)
 static inline int __attribute__((always_inline))
 rte_ring_enqueue(struct rte_ring *r, void *obj)
 {
-	if (r->prod.single)
-		return rte_ring_sp_enqueue(r, obj);
-	else
-		return rte_ring_mp_enqueue(r, obj);
+	return rte_ring_enqueue_bulk(r, &obj, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -770,11 +746,9 @@ rte_ring_enqueue(struct rte_ring *r, void *obj)
  * @param n
  *   The number of objects to dequeue from the ring to the obj_table.
  * @return
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
+ *   The number of objects dequeued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 {
 	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
@@ -791,11 +765,9 @@ rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  *   The number of objects to dequeue from the ring to the obj_table,
  *   must be strictly positive.
  * @return
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
+ *   The number of objects dequeued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 {
 	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
@@ -815,11 +787,9 @@ rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  * @param n
  *   The number of objects to dequeue from the ring to the obj_table.
  * @return
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue, no object is
- *     dequeued.
+ *   The number of objects dequeued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 {
 	if (r->cons.single)
@@ -846,7 +816,7 @@ rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 static inline int __attribute__((always_inline))
 rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_mc_dequeue_bulk(r, obj_p, 1);
+	return rte_ring_mc_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -864,7 +834,7 @@ rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_sc_dequeue_bulk(r, obj_p, 1);
+	return rte_ring_sc_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -886,10 +856,7 @@ rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_dequeue(struct rte_ring *r, void **obj_p)
 {
-	if (r->cons.single)
-		return rte_ring_sc_dequeue(r, obj_p);
-	else
-		return rte_ring_mc_dequeue(r, obj_p);
+	return rte_ring_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOBUFS;
 }
 
 /**
diff --git a/test/test-pipeline/pipeline_hash.c b/test/test-pipeline/pipeline_hash.c
index 10d2869..1ac0aa8 100644
--- a/test/test-pipeline/pipeline_hash.c
+++ b/test/test-pipeline/pipeline_hash.c
@@ -547,6 +547,6 @@ app_main_loop_rx_metadata(void) {
 				app.rings_rx[i],
 				(void **) app.mbuf_rx.array,
 				n_mbufs);
-		} while (ret < 0);
+		} while (ret == 0);
 	}
 }
diff --git a/test/test-pipeline/runtime.c b/test/test-pipeline/runtime.c
index 42a6142..4e20669 100644
--- a/test/test-pipeline/runtime.c
+++ b/test/test-pipeline/runtime.c
@@ -98,7 +98,7 @@ app_main_loop_rx(void) {
 				app.rings_rx[i],
 				(void **) app.mbuf_rx.array,
 				n_mbufs);
-		} while (ret < 0);
+		} while (ret == 0);
 	}
 }
 
@@ -123,7 +123,7 @@ app_main_loop_worker(void) {
 			(void **) worker_mbuf->array,
 			app.burst_size_worker_read);
 
-		if (ret == -ENOENT)
+		if (ret == 0)
 			continue;
 
 		do {
@@ -131,7 +131,7 @@ app_main_loop_worker(void) {
 				app.rings_tx[i ^ 1],
 				(void **) worker_mbuf->array,
 				app.burst_size_worker_write);
-		} while (ret < 0);
+		} while (ret == 0);
 	}
 }
 
@@ -152,7 +152,7 @@ app_main_loop_tx(void) {
 			(void **) &app.mbuf_tx[i].array[n_mbufs],
 			app.burst_size_tx_read);
 
-		if (ret == -ENOENT)
+		if (ret == 0)
 			continue;
 
 		n_mbufs += app.burst_size_tx_read;
diff --git a/test/test/test_ring.c b/test/test/test_ring.c
index 666a451..112433b 100644
--- a/test/test/test_ring.c
+++ b/test/test/test_ring.c
@@ -117,20 +117,18 @@ test_ring_basic_full_empty(void * const src[], void *dst[])
 		rand = RTE_MAX(rte_rand() % RING_SIZE, 1UL);
 		printf("%s: iteration %u, random shift: %u;\n",
 		    __func__, i, rand);
-		TEST_RING_VERIFY(-ENOBUFS != rte_ring_enqueue_bulk(r, src,
-		    rand));
-		TEST_RING_VERIFY(0 == rte_ring_dequeue_bulk(r, dst, rand));
+		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rand) != 0);
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rand) == rand);
 
 		/* fill the ring */
-		TEST_RING_VERIFY(-ENOBUFS != rte_ring_enqueue_bulk(r, src,
-		    rsz));
+		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rsz) != 0);
 		TEST_RING_VERIFY(0 == rte_ring_free_count(r));
 		TEST_RING_VERIFY(rsz == rte_ring_count(r));
 		TEST_RING_VERIFY(rte_ring_full(r));
 		TEST_RING_VERIFY(0 == rte_ring_empty(r));
 
 		/* empty the ring */
-		TEST_RING_VERIFY(0 == rte_ring_dequeue_bulk(r, dst, rsz));
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rsz) == rsz);
 		TEST_RING_VERIFY(rsz == rte_ring_free_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_full(r));
@@ -171,37 +169,37 @@ test_ring_basic(void)
 	printf("enqueue 1 obj\n");
 	ret = rte_ring_sp_enqueue_bulk(r, cur_src, 1);
 	cur_src += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue 2 objs\n");
 	ret = rte_ring_sp_enqueue_bulk(r, cur_src, 2);
 	cur_src += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue MAX_BULK objs\n");
 	ret = rte_ring_sp_enqueue_bulk(r, cur_src, MAX_BULK);
 	cur_src += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
 	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 1);
 	cur_dst += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
 	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 2);
 	cur_dst += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
 	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, MAX_BULK);
 	cur_dst += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	/* check data */
@@ -217,37 +215,37 @@ test_ring_basic(void)
 	printf("enqueue 1 obj\n");
 	ret = rte_ring_mp_enqueue_bulk(r, cur_src, 1);
 	cur_src += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue 2 objs\n");
 	ret = rte_ring_mp_enqueue_bulk(r, cur_src, 2);
 	cur_src += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue MAX_BULK objs\n");
 	ret = rte_ring_mp_enqueue_bulk(r, cur_src, MAX_BULK);
 	cur_src += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
 	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 1);
 	cur_dst += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
 	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 2);
 	cur_dst += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
 	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
 	cur_dst += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	/* check data */
@@ -264,11 +262,11 @@ test_ring_basic(void)
 	for (i = 0; i<RING_SIZE/MAX_BULK; i++) {
 		ret = rte_ring_mp_enqueue_bulk(r, cur_src, MAX_BULK);
 		cur_src += MAX_BULK;
-		if (ret != 0)
+		if (ret == 0)
 			goto fail;
 		ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
 		cur_dst += MAX_BULK;
-		if (ret != 0)
+		if (ret == 0)
 			goto fail;
 	}
 
@@ -294,25 +292,25 @@ test_ring_basic(void)
 
 	ret = rte_ring_enqueue_bulk(r, cur_src, num_elems);
 	cur_src += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot enqueue\n");
 		goto fail;
 	}
 	ret = rte_ring_enqueue_bulk(r, cur_src, num_elems);
 	cur_src += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot enqueue\n");
 		goto fail;
 	}
 	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
 	cur_dst += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot dequeue\n");
 		goto fail;
 	}
 	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
 	cur_dst += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot dequeue2\n");
 		goto fail;
 	}
diff --git a/test/test/test_ring_perf.c b/test/test/test_ring_perf.c
index 320c20c..8ccbdef 100644
--- a/test/test/test_ring_perf.c
+++ b/test/test/test_ring_perf.c
@@ -195,13 +195,13 @@ enqueue_bulk(void *p)
 
 	const uint64_t sp_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_sp_enqueue_bulk(r, burst, size) != 0)
+		while (rte_ring_sp_enqueue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t sp_end = rte_rdtsc();
 
 	const uint64_t mp_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_mp_enqueue_bulk(r, burst, size) != 0)
+		while (rte_ring_mp_enqueue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t mp_end = rte_rdtsc();
 
@@ -230,13 +230,13 @@ dequeue_bulk(void *p)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_sc_dequeue_bulk(r, burst, size) != 0)
+		while (rte_ring_sc_dequeue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t mc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_mc_dequeue_bulk(r, burst, size) != 0)
+		while (rte_ring_mc_dequeue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t mc_end = rte_rdtsc();
 
-- 
2.9.3

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v3 06/14] ring: remove watermark support
                         ` (3 preceding siblings ...)
  2017-03-24 17:09  4%     ` [dpdk-dev] [PATCH v3 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
@ 2017-03-24 17:10  2%     ` Bruce Richardson
  2017-03-24 17:10  2%     ` [dpdk-dev] [PATCH v3 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
  2017-03-24 17:10  2%     ` [dpdk-dev] [PATCH v3 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
  6 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-24 17:10 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, jerin.jacob, thomas.monjalon, Bruce Richardson

Remove watermark support. A future commit will add support for having
the enqueue functions return the amount of free space in the ring,
which will allow applications to implement their own watermark checks
while also being more useful to the app.
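
As a hedged sketch of the replacement pattern (the free-space output
parameter is only added by a later patch in this series, and
"wm_threshold" and pause_traffic() are made-up application-side
names):

	unsigned int free_space;

	rte_ring_enqueue_burst(r, (void **)objs, n, &free_space);
	/* application-level equivalent of the old high water mark */
	if (free_space < wm_threshold)
		pause_traffic();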

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
V2: fix missed references to watermarks in v1
---
 doc/guides/prog_guide/ring_lib.rst     |   8 --
 doc/guides/rel_notes/release_17_05.rst |   2 +
 examples/Makefile                      |   2 +-
 lib/librte_ring/rte_ring.c             |  23 -----
 lib/librte_ring/rte_ring.h             |  58 +------------
 test/test/autotest_test_funcs.py       |   7 --
 test/test/commands.c                   |  52 ------------
 test/test/test_ring.c                  | 149 +--------------------------------
 8 files changed, 8 insertions(+), 293 deletions(-)

diff --git a/doc/guides/prog_guide/ring_lib.rst b/doc/guides/prog_guide/ring_lib.rst
index d4ab502..b31ab7a 100644
--- a/doc/guides/prog_guide/ring_lib.rst
+++ b/doc/guides/prog_guide/ring_lib.rst
@@ -102,14 +102,6 @@ Name
 A ring is identified by a unique name.
 It is not possible to create two rings with the same name (rte_ring_create() returns NULL if this is attempted).
 
-Water Marking
-~~~~~~~~~~~~~
-
-The ring can have a high water mark (threshold).
-Once an enqueue operation reaches the high water mark, the producer is notified, if the water mark is configured.
-
-This mechanism can be used, for example, to exert a back pressure on I/O to inform the LAN to PAUSE.
-
 Use Cases
 ---------
 
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 556869f..af907b8 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -128,6 +128,8 @@ API Changes
   * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
   * removed the build-time setting ``CONFIG_RTE_LIBRTE_RING_DEBUG``
   * removed the build-time setting ``CONFIG_RTE_RING_PAUSE_REP_COUNT``
+  * removed the function ``rte_ring_set_water_mark`` as part of a general
+    removal of watermarks support in the library.
 
 ABI Changes
 -----------
diff --git a/examples/Makefile b/examples/Makefile
index da2bfdd..19cd5ad 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -81,7 +81,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += packet_ordering
 DIRS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ptpclient
 DIRS-$(CONFIG_RTE_LIBRTE_METER) += qos_meter
 DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += qos_sched
-DIRS-y += quota_watermark
+#DIRS-y += quota_watermark
 DIRS-$(CONFIG_RTE_ETHDEV_RXTX_CALLBACKS) += rxtx_callbacks
 DIRS-y += skeleton
 ifeq ($(CONFIG_RTE_LIBRTE_HASH),y)
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index 934ce87..25f64f0 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -138,7 +138,6 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	if (ret < 0 || ret >= (int)sizeof(r->name))
 		return -ENAMETOOLONG;
 	r->flags = flags;
-	r->watermark = count;
 	r->prod.single = !!(flags & RING_F_SP_ENQ);
 	r->cons.single = !!(flags & RING_F_SC_DEQ);
 	r->size = count;
@@ -256,24 +255,6 @@ rte_ring_free(struct rte_ring *r)
 	rte_free(te);
 }
 
-/*
- * change the high water mark. If *count* is 0, water marking is
- * disabled
- */
-int
-rte_ring_set_water_mark(struct rte_ring *r, unsigned count)
-{
-	if (count >= r->size)
-		return -EINVAL;
-
-	/* if count is 0, disable the watermarking */
-	if (count == 0)
-		count = r->size;
-
-	r->watermark = count;
-	return 0;
-}
-
 /* dump the status of the ring on the console */
 void
 rte_ring_dump(FILE *f, const struct rte_ring *r)
@@ -287,10 +268,6 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 	fprintf(f, "  ph=%"PRIu32"\n", r->prod.head);
 	fprintf(f, "  used=%u\n", rte_ring_count(r));
 	fprintf(f, "  avail=%u\n", rte_ring_free_count(r));
-	if (r->watermark == r->size)
-		fprintf(f, "  watermark=0\n");
-	else
-		fprintf(f, "  watermark=%"PRIu32"\n", r->watermark);
 }
 
 /* dump the status of all rings on the console */
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index f8ac7f5..906e8ae 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -153,7 +153,6 @@ struct rte_ring {
 			/**< Memzone, if any, containing the rte_ring */
 	uint32_t size;           /**< Size of ring. */
 	uint32_t mask;           /**< Mask (size-1) of ring. */
-	uint32_t watermark;      /**< Max items before EDQUOT in producer. */
 
 	/** Ring producer status. */
 	struct rte_ring_headtail prod __rte_aligned(PROD_ALIGN);
@@ -168,7 +167,6 @@ struct rte_ring {
 
 #define RING_F_SP_ENQ 0x0001 /**< The default enqueue is "single-producer". */
 #define RING_F_SC_DEQ 0x0002 /**< The default dequeue is "single-consumer". */
-#define RTE_RING_QUOT_EXCEED (1 << 31)  /**< Quota exceed for burst ops */
 #define RTE_RING_SZ_MASK  (unsigned)(0x0fffffff) /**< Ring size mask */
 
 /**
@@ -274,26 +272,6 @@ struct rte_ring *rte_ring_create(const char *name, unsigned count,
 void rte_ring_free(struct rte_ring *r);
 
 /**
- * Change the high water mark.
- *
- * If *count* is 0, water marking is disabled. Otherwise, it is set to the
- * *count* value. The *count* value must be greater than 0 and less
- * than the ring size.
- *
- * This function can be called at any time (not necessarily at
- * initialization).
- *
- * @param r
- *   A pointer to the ring structure.
- * @param count
- *   The new water mark value.
- * @return
- *   - 0: Success; water mark changed.
- *   - -EINVAL: Invalid water mark value.
- */
-int rte_ring_set_water_mark(struct rte_ring *r, unsigned count);
-
-/**
  * Dump the status of the ring to a file.
  *
  * @param f
@@ -374,8 +352,6 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  *   Depend on the behavior value
  *   if behavior = RTE_RING_QUEUE_FIXED
  *   - 0: Success; objects enqueue.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
  *   if behavior = RTE_RING_QUEUE_VARIABLE
  *   - n: Actual number of objects enqueued.
@@ -390,7 +366,6 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	int success;
 	unsigned int i;
 	uint32_t mask = r->mask;
-	int ret;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
 	 * potentially harmful when n equals 0. */
@@ -431,13 +406,6 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	ENQUEUE_PTRS();
 	rte_smp_wmb();
 
-	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
-				(int)(n | RTE_RING_QUOT_EXCEED);
-	else
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-
 	/*
 	 * If there are other enqueues in progress that preceded us,
 	 * we need to wait for them to complete
@@ -446,7 +414,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 		rte_pause();
 
 	r->prod.tail = prod_next;
-	return ret;
+	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
 }
 
 /**
@@ -465,8 +433,6 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
  *   Depend on the behavior value
  *   if behavior = RTE_RING_QUEUE_FIXED
  *   - 0: Success; objects enqueue.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
  *   if behavior = RTE_RING_QUEUE_VARIABLE
  *   - n: Actual number of objects enqueued.
@@ -479,7 +445,6 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t prod_next, free_entries;
 	unsigned int i;
 	uint32_t mask = r->mask;
-	int ret;
 
 	prod_head = r->prod.head;
 	cons_tail = r->cons.tail;
@@ -508,15 +473,8 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	ENQUEUE_PTRS();
 	rte_smp_wmb();
 
-	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
-			(int)(n | RTE_RING_QUOT_EXCEED);
-	else
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-
 	r->prod.tail = prod_next;
-	return ret;
+	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
 }
 
 /**
@@ -682,8 +640,6 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
  *   The number of objects to add in the ring from the obj_table.
  * @return
  *   - 0: Success; objects enqueue.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -704,8 +660,6 @@ rte_ring_mp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  *   The number of objects to add in the ring from the obj_table.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -730,8 +684,6 @@ rte_ring_sp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  *   The number of objects to add in the ring from the obj_table.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -756,8 +708,6 @@ rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  *   A pointer to the object to be added.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -775,8 +725,6 @@ rte_ring_mp_enqueue(struct rte_ring *r, void *obj)
  *   A pointer to the object to be added.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -798,8 +746,6 @@ rte_ring_sp_enqueue(struct rte_ring *r, void *obj)
  *   A pointer to the object to be added.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
diff --git a/test/test/autotest_test_funcs.py b/test/test/autotest_test_funcs.py
index 1c5f390..8da8fcd 100644
--- a/test/test/autotest_test_funcs.py
+++ b/test/test/autotest_test_funcs.py
@@ -292,11 +292,4 @@ def ring_autotest(child, test_name):
     elif index == 2:
         return -1, "Fail [Timeout]"
 
-    child.sendline("set_watermark test 100")
-    child.sendline("dump_ring test")
-    index = child.expect(["  watermark=100",
-                          pexpect.TIMEOUT], timeout=1)
-    if index != 0:
-        return -1, "Fail [Bad watermark]"
-
     return 0, "Success"
diff --git a/test/test/commands.c b/test/test/commands.c
index 2df46b0..551c81d 100644
--- a/test/test/commands.c
+++ b/test/test/commands.c
@@ -228,57 +228,6 @@ cmdline_parse_inst_t cmd_dump_one = {
 
 /****************/
 
-struct cmd_set_ring_result {
-	cmdline_fixed_string_t set;
-	cmdline_fixed_string_t name;
-	uint32_t value;
-};
-
-static void cmd_set_ring_parsed(void *parsed_result, struct cmdline *cl,
-				__attribute__((unused)) void *data)
-{
-	struct cmd_set_ring_result *res = parsed_result;
-	struct rte_ring *r;
-	int ret;
-
-	r = rte_ring_lookup(res->name);
-	if (r == NULL) {
-		cmdline_printf(cl, "Cannot find ring\n");
-		return;
-	}
-
-	if (!strcmp(res->set, "set_watermark")) {
-		ret = rte_ring_set_water_mark(r, res->value);
-		if (ret != 0)
-			cmdline_printf(cl, "Cannot set water mark\n");
-	}
-}
-
-cmdline_parse_token_string_t cmd_set_ring_set =
-	TOKEN_STRING_INITIALIZER(struct cmd_set_ring_result, set,
-				 "set_watermark");
-
-cmdline_parse_token_string_t cmd_set_ring_name =
-	TOKEN_STRING_INITIALIZER(struct cmd_set_ring_result, name, NULL);
-
-cmdline_parse_token_num_t cmd_set_ring_value =
-	TOKEN_NUM_INITIALIZER(struct cmd_set_ring_result, value, UINT32);
-
-cmdline_parse_inst_t cmd_set_ring = {
-	.f = cmd_set_ring_parsed,  /* function to call */
-	.data = NULL,      /* 2nd arg of func */
-	.help_str = "set watermark: "
-			"set_watermark <ring_name> <value>",
-	.tokens = {        /* token list, NULL terminated */
-		(void *)&cmd_set_ring_set,
-		(void *)&cmd_set_ring_name,
-		(void *)&cmd_set_ring_value,
-		NULL,
-	},
-};
-
-/****************/
-
 struct cmd_quit_result {
 	cmdline_fixed_string_t quit;
 };
@@ -419,7 +368,6 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_autotest,
 	(cmdline_parse_inst_t *)&cmd_dump,
 	(cmdline_parse_inst_t *)&cmd_dump_one,
-	(cmdline_parse_inst_t *)&cmd_set_ring,
 	(cmdline_parse_inst_t *)&cmd_quit,
 	(cmdline_parse_inst_t *)&cmd_set_rxtx,
 	(cmdline_parse_inst_t *)&cmd_set_rxtx_anchor,
diff --git a/test/test/test_ring.c b/test/test/test_ring.c
index 3891f5d..666a451 100644
--- a/test/test/test_ring.c
+++ b/test/test/test_ring.c
@@ -78,21 +78,6 @@
  *      - Dequeue one object, two objects, MAX_BULK objects
  *      - Check that dequeued pointers are correct
  *
- *    - Test watermark and default bulk enqueue/dequeue:
- *
- *      - Set watermark
- *      - Set default bulk value
- *      - Enqueue objects, check that -EDQUOT is returned when
- *        watermark is exceeded
- *      - Check that dequeued pointers are correct
- *
- * #. Check live watermark change
- *
- *    - Start a loop on another lcore that will enqueue and dequeue
- *      objects in a ring. It will monitor the value of watermark.
- *    - At the same time, change the watermark on the master lcore.
- *    - The slave lcore will check that watermark changes from 16 to 32.
- *
  * #. Performance tests.
  *
  * Tests done in test_ring_perf.c
@@ -115,123 +100,6 @@ static struct rte_ring *r;
 
 #define	TEST_RING_FULL_EMTPY_ITER	8
 
-static int
-check_live_watermark_change(__attribute__((unused)) void *dummy)
-{
-	uint64_t hz = rte_get_timer_hz();
-	void *obj_table[MAX_BULK];
-	unsigned watermark, watermark_old = 16;
-	uint64_t cur_time, end_time;
-	int64_t diff = 0;
-	int i, ret;
-	unsigned count = 4;
-
-	/* init the object table */
-	memset(obj_table, 0, sizeof(obj_table));
-	end_time = rte_get_timer_cycles() + (hz / 4);
-
-	/* check that bulk and watermark are 4 and 32 (respectively) */
-	while (diff >= 0) {
-
-		/* add in ring until we reach watermark */
-		ret = 0;
-		for (i = 0; i < 16; i ++) {
-			if (ret != 0)
-				break;
-			ret = rte_ring_enqueue_bulk(r, obj_table, count);
-		}
-
-		if (ret != -EDQUOT) {
-			printf("Cannot enqueue objects, or watermark not "
-			       "reached (ret=%d)\n", ret);
-			return -1;
-		}
-
-		/* read watermark, the only change allowed is from 16 to 32 */
-		watermark = r->watermark;
-		if (watermark != watermark_old &&
-		    (watermark_old != 16 || watermark != 32)) {
-			printf("Bad watermark change %u -> %u\n", watermark_old,
-			       watermark);
-			return -1;
-		}
-		watermark_old = watermark;
-
-		/* dequeue objects from ring */
-		while (i--) {
-			ret = rte_ring_dequeue_bulk(r, obj_table, count);
-			if (ret != 0) {
-				printf("Cannot dequeue (ret=%d)\n", ret);
-				return -1;
-			}
-		}
-
-		cur_time = rte_get_timer_cycles();
-		diff = end_time - cur_time;
-	}
-
-	if (watermark_old != 32 ) {
-		printf(" watermark was not updated (wm=%u)\n",
-		       watermark_old);
-		return -1;
-	}
-
-	return 0;
-}
-
-static int
-test_live_watermark_change(void)
-{
-	unsigned lcore_id = rte_lcore_id();
-	unsigned lcore_id2 = rte_get_next_lcore(lcore_id, 0, 1);
-
-	printf("Test watermark live modification\n");
-	rte_ring_set_water_mark(r, 16);
-
-	/* launch a thread that will enqueue and dequeue, checking
-	 * watermark and quota */
-	rte_eal_remote_launch(check_live_watermark_change, NULL, lcore_id2);
-
-	rte_delay_ms(100);
-	rte_ring_set_water_mark(r, 32);
-	rte_delay_ms(100);
-
-	if (rte_eal_wait_lcore(lcore_id2) < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Test for catch on invalid watermark values */
-static int
-test_set_watermark( void ){
-	unsigned count;
-	int setwm;
-
-	struct rte_ring *r = rte_ring_lookup("test_ring_basic_ex");
-	if(r == NULL){
-		printf( " ring lookup failed\n" );
-		goto error;
-	}
-	count = r->size * 2;
-	setwm = rte_ring_set_water_mark(r, count);
-	if (setwm != -EINVAL){
-		printf("Test failed to detect invalid watermark count value\n");
-		goto error;
-	}
-
-	count = 0;
-	rte_ring_set_water_mark(r, count);
-	if (r->watermark != r->size) {
-		printf("Test failed to detect invalid watermark count value\n");
-		goto error;
-	}
-	return 0;
-
-error:
-	return -1;
-}
-
 /*
  * helper routine for test_ring_basic
  */
@@ -418,8 +286,7 @@ test_ring_basic(void)
 	cur_src = src;
 	cur_dst = dst;
 
-	printf("test watermark and default bulk enqueue / dequeue\n");
-	rte_ring_set_water_mark(r, 20);
+	printf("test default bulk enqueue / dequeue\n");
 	num_elems = 16;
 
 	cur_src = src;
@@ -433,8 +300,8 @@ test_ring_basic(void)
 	}
 	ret = rte_ring_enqueue_bulk(r, cur_src, num_elems);
 	cur_src += num_elems;
-	if (ret != -EDQUOT) {
-		printf("Watermark not exceeded\n");
+	if (ret != 0) {
+		printf("Cannot enqueue\n");
 		goto fail;
 	}
 	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
@@ -930,16 +797,6 @@ test_ring(void)
 		return -1;
 
 	/* basic operations */
-	if (test_live_watermark_change() < 0)
-		return -1;
-
-	if ( test_set_watermark() < 0){
-		printf ("Test failed to detect invalid parameter\n");
-		return -1;
-	}
-	else
-		printf ( "Test detected forced bad watermark values\n");
-
 	if ( test_create_count_odd() < 0){
 			printf ("Test failed to detect odd count\n");
 			return -1;
-- 
2.9.3
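
For applications that relied on the removed watermark, the same signal
can be rebuilt from the public fill-level API; a minimal sketch, where
the threshold variable and wrapper name are hypothetical, not part of
the ring API:

#include <rte_ring.h>

static unsigned int app_watermark = 32; /* application-chosen threshold */

static inline int
app_ring_enqueue(struct rte_ring *r, void *obj)
{
	int ret = rte_ring_enqueue(r, obj);

	/* report back-pressure once the fill level crosses the mark */
	if (ret == 0 && rte_ring_count(r) > app_watermark)
		return 1; /* enqueued, but above the watermark */
	return ret;
}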

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v3 05/14] ring: remove the yield when waiting for tail update
                         ` (2 preceding siblings ...)
  2017-03-24 17:09  2%     ` [dpdk-dev] [PATCH v3 04/14] ring: remove debug setting Bruce Richardson
@ 2017-03-24 17:09  4%     ` Bruce Richardson
  2017-03-24 17:10  2%     ` [dpdk-dev] [PATCH v3 06/14] ring: remove watermark support Bruce Richardson
                       ` (2 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-24 17:09 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, jerin.jacob, thomas.monjalon, Bruce Richardson

There was a compile-time setting to enable a ring to yield when
it entered a loop in mp or mc rings waiting for the tail pointer update.
Build-time settings are not recommended for enabling/disabling features,
and since this was off by default, remove it completely. If needed, a
runtime-enabled equivalent can be used.
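
Such a runtime-enabled equivalent could look like the sketch below; the
pause_rep_count variable and wait_for_tail() helper are hypothetical,
not part of the ring API:

#include <sched.h>
#include <rte_branch_prediction.h>
#include <rte_cycles.h>	/* rte_pause() lives here in this release */

static unsigned int pause_rep_count;	/* 0 = never yield, as before */

static inline void
wait_for_tail(volatile const uint32_t *tail, uint32_t expected)
{
	unsigned int rep = 0;

	while (unlikely(*tail != expected)) {
		rte_pause();
		/* give a pre-empted thread holding the ring a chance to run */
		if (pause_rep_count && ++rep == pause_rep_count) {
			rep = 0;
			sched_yield();
		}
	}
}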

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
 config/common_base                              |  1 -
 doc/guides/prog_guide/env_abstraction_layer.rst |  5 ----
 doc/guides/rel_notes/release_17_05.rst          |  1 +
 lib/librte_ring/rte_ring.h                      | 35 +++++--------------------
 4 files changed, 7 insertions(+), 35 deletions(-)

diff --git a/config/common_base b/config/common_base
index 69e91ae..2d54ddf 100644
--- a/config/common_base
+++ b/config/common_base
@@ -452,7 +452,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
-CONFIG_RTE_RING_PAUSE_REP_COUNT=0
 
 #
 # Compile librte_mempool
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 10a10a8..7c39cd2 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -352,11 +352,6 @@ Known Issues
 
   3. It MUST not be used by multi-producer/consumer pthreads, whose scheduling policies are SCHED_FIFO or SCHED_RR.
 
-  ``RTE_RING_PAUSE_REP_COUNT`` is defined for rte_ring to reduce contention. It's mainly for case 2, a yield is issued after number of times pause repeat.
-
-  It adds a sched_yield() syscall if the thread spins for too long while waiting on the other thread to finish its operations on the ring.
-  This gives the preempted thread a chance to proceed and finish with the ring enqueue/dequeue operation.
-
 + rte_timer
 
   Running  ``rte_timer_manager()`` on a non-EAL pthread is not allowed. However, resetting/stopping the timer from a non-EAL pthread is allowed.
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 742ad6c..556869f 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -127,6 +127,7 @@ API Changes
 
   * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
   * removed the build-time setting ``CONFIG_RTE_LIBRTE_RING_DEBUG``
+  * removed the build-time setting ``CONFIG_RTE_RING_PAUSE_REP_COUNT``
 
 ABI Changes
 -----------
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 2777b41..f8ac7f5 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -114,11 +114,6 @@ enum rte_ring_queue_behavior {
 #define RTE_RING_NAMESIZE (RTE_MEMZONE_NAMESIZE - \
 			   sizeof(RTE_RING_MZ_PREFIX) + 1)
 
-#ifndef RTE_RING_PAUSE_REP_COUNT
-#define RTE_RING_PAUSE_REP_COUNT 0 /**< Yield after pause num of times, no yield
-                                    *   if RTE_RING_PAUSE_REP not defined. */
-#endif
-
 struct rte_memzone; /* forward declaration, so as not to require memzone.h */
 
 #if RTE_CACHE_LINE_SIZE < 128
@@ -393,7 +388,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t cons_tail, free_entries;
 	const unsigned max = n;
 	int success;
-	unsigned i, rep = 0;
+	unsigned int i;
 	uint32_t mask = r->mask;
 	int ret;
 
@@ -447,18 +442,9 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	 * If there are other enqueues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->prod.tail != prod_head)) {
+	while (unlikely(r->prod.tail != prod_head))
 		rte_pause();
 
-		/* Set RTE_RING_PAUSE_REP_COUNT to avoid spin too long waiting
-		 * for other thread finish. It gives pre-empted thread a chance
-		 * to proceed and finish with ring dequeue operation. */
-		if (RTE_RING_PAUSE_REP_COUNT &&
-		    ++rep == RTE_RING_PAUSE_REP_COUNT) {
-			rep = 0;
-			sched_yield();
-		}
-	}
 	r->prod.tail = prod_next;
 	return ret;
 }
@@ -491,7 +477,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 {
 	uint32_t prod_head, cons_tail;
 	uint32_t prod_next, free_entries;
-	unsigned i;
+	unsigned int i;
 	uint32_t mask = r->mask;
 	int ret;
 
@@ -568,7 +554,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_next, entries;
 	const unsigned max = n;
 	int success;
-	unsigned i, rep = 0;
+	unsigned int i;
 	uint32_t mask = r->mask;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
@@ -613,18 +599,9 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	 * If there are other dequeues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->cons.tail != cons_head)) {
+	while (unlikely(r->cons.tail != cons_head))
 		rte_pause();
 
-		/* Set RTE_RING_PAUSE_REP_COUNT to avoid spin too long waiting
-		 * for other thread finish. It gives pre-empted thread a chance
-		 * to proceed and finish with ring dequeue operation. */
-		if (RTE_RING_PAUSE_REP_COUNT &&
-		    ++rep == RTE_RING_PAUSE_REP_COUNT) {
-			rep = 0;
-			sched_yield();
-		}
-	}
 	r->cons.tail = cons_next;
 
 	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
@@ -659,7 +636,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 {
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
-	unsigned i;
+	unsigned int i;
 	uint32_t mask = r->mask;
 
 	cons_head = r->cons.head;
-- 
2.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v3 04/14] ring: remove debug setting
    2017-03-24 17:09  5%     ` [dpdk-dev] [PATCH v3 01/14] ring: remove split cacheline build setting Bruce Richardson
  2017-03-24 17:09  3%     ` [dpdk-dev] [PATCH v3 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
@ 2017-03-24 17:09  2%     ` Bruce Richardson
  2017-03-24 17:09  4%     ` [dpdk-dev] [PATCH v3 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
                       ` (3 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-24 17:09 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, jerin.jacob, thomas.monjalon, Bruce Richardson

The debug option only provided statistics to the user, most of
which could be tracked by the application itself. Remove it, both as a
compile-time option and as a feature, simplifying the code.
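
For reference, most of these counters can be kept by the application
itself using only the public API; a minimal sketch, where the struct
and wrapper names are illustrative and any non-zero return is counted
as a failed bulk operation for brevity:

#include <rte_lcore.h>
#include <rte_ring.h>

struct app_ring_stats {
	uint64_t enq_success_bulk; /* successful bulk enqueues */
	uint64_t enq_success_objs; /* objects successfully enqueued */
	uint64_t enq_fail_bulk;    /* failed bulk enqueues */
	uint64_t enq_fail_objs;    /* objects that failed to be enqueued */
} __rte_cache_aligned;

static struct app_ring_stats app_stats[RTE_MAX_LCORE];

static inline int
app_enqueue_bulk(struct rte_ring *r, void * const *obj_table, unsigned int n)
{
	unsigned int lcore = rte_lcore_id();
	int ret = rte_ring_enqueue_bulk(r, obj_table, n);

	if (lcore < RTE_MAX_LCORE) { /* skip non-EAL threads */
		if (ret == 0) {
			app_stats[lcore].enq_success_bulk++;
			app_stats[lcore].enq_success_objs += n;
		} else {
			app_stats[lcore].enq_fail_bulk++;
			app_stats[lcore].enq_fail_objs += n;
		}
	}
	return ret;
}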

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
 config/common_base                     |   1 -
 doc/guides/prog_guide/ring_lib.rst     |   7 -
 doc/guides/rel_notes/release_17_05.rst |   1 +
 lib/librte_ring/rte_ring.c             |  41 ----
 lib/librte_ring/rte_ring.h             |  97 +-------
 test/test/test_ring.c                  | 410 ---------------------------------
 6 files changed, 13 insertions(+), 544 deletions(-)

diff --git a/config/common_base b/config/common_base
index c394651..69e91ae 100644
--- a/config/common_base
+++ b/config/common_base
@@ -452,7 +452,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
-CONFIG_RTE_LIBRTE_RING_DEBUG=n
 CONFIG_RTE_RING_PAUSE_REP_COUNT=0
 
 #
diff --git a/doc/guides/prog_guide/ring_lib.rst b/doc/guides/prog_guide/ring_lib.rst
index 9f69753..d4ab502 100644
--- a/doc/guides/prog_guide/ring_lib.rst
+++ b/doc/guides/prog_guide/ring_lib.rst
@@ -110,13 +110,6 @@ Once an enqueue operation reaches the high water mark, the producer is notified,
 
 This mechanism can be used, for example, to exert a back pressure on I/O to inform the LAN to PAUSE.
 
-Debug
-~~~~~
-
-When debug is enabled (CONFIG_RTE_LIBRTE_RING_DEBUG is set),
-the library stores some per-ring statistic counters about the number of enqueues/dequeues.
-These statistics are per-core to avoid concurrent accesses or atomic operations.
-
 Use Cases
 ---------
 
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 57ae8bf..742ad6c 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -126,6 +126,7 @@ API Changes
   have been made to it:
 
   * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
+  * removed the build-time setting ``CONFIG_RTE_LIBRTE_RING_DEBUG``
 
 ABI Changes
 -----------
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index 93485d4..934ce87 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -131,12 +131,6 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 			  RTE_CACHE_LINE_MASK) != 0);
 	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
 			  RTE_CACHE_LINE_MASK) != 0);
-#ifdef RTE_LIBRTE_RING_DEBUG
-	RTE_BUILD_BUG_ON((sizeof(struct rte_ring_debug_stats) &
-			  RTE_CACHE_LINE_MASK) != 0);
-	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, stats) &
-			  RTE_CACHE_LINE_MASK) != 0);
-#endif
 
 	/* init the ring structure */
 	memset(r, 0, sizeof(*r));
@@ -284,11 +278,6 @@ rte_ring_set_water_mark(struct rte_ring *r, unsigned count)
 void
 rte_ring_dump(FILE *f, const struct rte_ring *r)
 {
-#ifdef RTE_LIBRTE_RING_DEBUG
-	struct rte_ring_debug_stats sum;
-	unsigned lcore_id;
-#endif
-
 	fprintf(f, "ring <%s>@%p\n", r->name, r);
 	fprintf(f, "  flags=%x\n", r->flags);
 	fprintf(f, "  size=%"PRIu32"\n", r->size);
@@ -302,36 +291,6 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 		fprintf(f, "  watermark=0\n");
 	else
 		fprintf(f, "  watermark=%"PRIu32"\n", r->watermark);
-
-	/* sum and dump statistics */
-#ifdef RTE_LIBRTE_RING_DEBUG
-	memset(&sum, 0, sizeof(sum));
-	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
-		sum.enq_success_bulk += r->stats[lcore_id].enq_success_bulk;
-		sum.enq_success_objs += r->stats[lcore_id].enq_success_objs;
-		sum.enq_quota_bulk += r->stats[lcore_id].enq_quota_bulk;
-		sum.enq_quota_objs += r->stats[lcore_id].enq_quota_objs;
-		sum.enq_fail_bulk += r->stats[lcore_id].enq_fail_bulk;
-		sum.enq_fail_objs += r->stats[lcore_id].enq_fail_objs;
-		sum.deq_success_bulk += r->stats[lcore_id].deq_success_bulk;
-		sum.deq_success_objs += r->stats[lcore_id].deq_success_objs;
-		sum.deq_fail_bulk += r->stats[lcore_id].deq_fail_bulk;
-		sum.deq_fail_objs += r->stats[lcore_id].deq_fail_objs;
-	}
-	fprintf(f, "  size=%"PRIu32"\n", r->size);
-	fprintf(f, "  enq_success_bulk=%"PRIu64"\n", sum.enq_success_bulk);
-	fprintf(f, "  enq_success_objs=%"PRIu64"\n", sum.enq_success_objs);
-	fprintf(f, "  enq_quota_bulk=%"PRIu64"\n", sum.enq_quota_bulk);
-	fprintf(f, "  enq_quota_objs=%"PRIu64"\n", sum.enq_quota_objs);
-	fprintf(f, "  enq_fail_bulk=%"PRIu64"\n", sum.enq_fail_bulk);
-	fprintf(f, "  enq_fail_objs=%"PRIu64"\n", sum.enq_fail_objs);
-	fprintf(f, "  deq_success_bulk=%"PRIu64"\n", sum.deq_success_bulk);
-	fprintf(f, "  deq_success_objs=%"PRIu64"\n", sum.deq_success_objs);
-	fprintf(f, "  deq_fail_bulk=%"PRIu64"\n", sum.deq_fail_bulk);
-	fprintf(f, "  deq_fail_objs=%"PRIu64"\n", sum.deq_fail_objs);
-#else
-	fprintf(f, "  no statistics available\n");
-#endif
 }
 
 /* dump the status of all rings on the console */
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index d650215..2777b41 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -109,24 +109,6 @@ enum rte_ring_queue_behavior {
 	RTE_RING_QUEUE_VARIABLE   /* Enq/Deq as many items as possible from ring */
 };
 
-#ifdef RTE_LIBRTE_RING_DEBUG
-/**
- * A structure that stores the ring statistics (per-lcore).
- */
-struct rte_ring_debug_stats {
-	uint64_t enq_success_bulk; /**< Successful enqueues number. */
-	uint64_t enq_success_objs; /**< Objects successfully enqueued. */
-	uint64_t enq_quota_bulk;   /**< Successful enqueues above watermark. */
-	uint64_t enq_quota_objs;   /**< Objects enqueued above watermark. */
-	uint64_t enq_fail_bulk;    /**< Failed enqueues number. */
-	uint64_t enq_fail_objs;    /**< Objects that failed to be enqueued. */
-	uint64_t deq_success_bulk; /**< Successful dequeues number. */
-	uint64_t deq_success_objs; /**< Objects successfully dequeued. */
-	uint64_t deq_fail_bulk;    /**< Failed dequeues number. */
-	uint64_t deq_fail_objs;    /**< Objects that failed to be dequeued. */
-} __rte_cache_aligned;
-#endif
-
 #define RTE_RING_MZ_PREFIX "RG_"
 /**< The maximum length of a ring name. */
 #define RTE_RING_NAMESIZE (RTE_MEMZONE_NAMESIZE - \
@@ -184,10 +166,6 @@ struct rte_ring {
 	/** Ring consumer status. */
 	struct rte_ring_headtail cons __rte_aligned(CONS_ALIGN);
 
-#ifdef RTE_LIBRTE_RING_DEBUG
-	struct rte_ring_debug_stats stats[RTE_MAX_LCORE];
-#endif
-
 	void *ring[] __rte_cache_aligned;   /**< Memory space of ring starts here.
 	                                     * not volatile so need to be careful
 	                                     * about compiler re-ordering */
@@ -199,27 +177,6 @@ struct rte_ring {
 #define RTE_RING_SZ_MASK  (unsigned)(0x0fffffff) /**< Ring size mask */
 
 /**
- * @internal When debug is enabled, store ring statistics.
- * @param r
- *   A pointer to the ring.
- * @param name
- *   The name of the statistics field to increment in the ring.
- * @param n
- *   The number to add to the object-oriented statistics.
- */
-#ifdef RTE_LIBRTE_RING_DEBUG
-#define __RING_STAT_ADD(r, name, n) do {                        \
-		unsigned __lcore_id = rte_lcore_id();           \
-		if (__lcore_id < RTE_MAX_LCORE) {               \
-			r->stats[__lcore_id].name##_objs += n;  \
-			r->stats[__lcore_id].name##_bulk += 1;  \
-		}                                               \
-	} while(0)
-#else
-#define __RING_STAT_ADD(r, name, n) do {} while(0)
-#endif
-
-/**
  * Calculate the memory size needed for a ring
  *
  * This function returns the number of bytes needed for a ring, given
@@ -460,17 +417,12 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 
 		/* check that we have enough room in ring */
 		if (unlikely(n > free_entries)) {
-			if (behavior == RTE_RING_QUEUE_FIXED) {
-				__RING_STAT_ADD(r, enq_fail, n);
+			if (behavior == RTE_RING_QUEUE_FIXED)
 				return -ENOBUFS;
-			}
 			else {
 				/* No free entry available */
-				if (unlikely(free_entries == 0)) {
-					__RING_STAT_ADD(r, enq_fail, n);
+				if (unlikely(free_entries == 0))
 					return 0;
-				}
-
 				n = free_entries;
 			}
 		}
@@ -485,15 +437,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 				(int)(n | RTE_RING_QUOT_EXCEED);
-		__RING_STAT_ADD(r, enq_quota, n);
-	}
-	else {
+	else
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-		__RING_STAT_ADD(r, enq_success, n);
-	}
 
 	/*
 	 * If there are other enqueues in progress that preceded us,
@@ -557,17 +505,12 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 
 	/* check that we have enough room in ring */
 	if (unlikely(n > free_entries)) {
-		if (behavior == RTE_RING_QUEUE_FIXED) {
-			__RING_STAT_ADD(r, enq_fail, n);
+		if (behavior == RTE_RING_QUEUE_FIXED)
 			return -ENOBUFS;
-		}
 		else {
 			/* No free entry available */
-			if (unlikely(free_entries == 0)) {
-				__RING_STAT_ADD(r, enq_fail, n);
+			if (unlikely(free_entries == 0))
 				return 0;
-			}
-
 			n = free_entries;
 		}
 	}
@@ -580,15 +523,11 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 			(int)(n | RTE_RING_QUOT_EXCEED);
-		__RING_STAT_ADD(r, enq_quota, n);
-	}
-	else {
+	else
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-		__RING_STAT_ADD(r, enq_success, n);
-	}
 
 	r->prod.tail = prod_next;
 	return ret;
@@ -652,16 +591,11 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 
 		/* Set the actual entries for dequeue */
 		if (n > entries) {
-			if (behavior == RTE_RING_QUEUE_FIXED) {
-				__RING_STAT_ADD(r, deq_fail, n);
+			if (behavior == RTE_RING_QUEUE_FIXED)
 				return -ENOENT;
-			}
 			else {
-				if (unlikely(entries == 0)){
-					__RING_STAT_ADD(r, deq_fail, n);
+				if (unlikely(entries == 0))
 					return 0;
-				}
-
 				n = entries;
 			}
 		}
@@ -691,7 +625,6 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 			sched_yield();
 		}
 	}
-	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;
 
 	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
@@ -738,16 +671,11 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	entries = prod_tail - cons_head;
 
 	if (n > entries) {
-		if (behavior == RTE_RING_QUEUE_FIXED) {
-			__RING_STAT_ADD(r, deq_fail, n);
+		if (behavior == RTE_RING_QUEUE_FIXED)
 			return -ENOENT;
-		}
 		else {
-			if (unlikely(entries == 0)){
-				__RING_STAT_ADD(r, deq_fail, n);
+			if (unlikely(entries == 0))
 				return 0;
-			}
-
 			n = entries;
 		}
 	}
@@ -759,7 +687,6 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	DEQUEUE_PTRS();
 	rte_smp_rmb();
 
-	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;
 	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
 }
diff --git a/test/test/test_ring.c b/test/test/test_ring.c
index 5f09097..3891f5d 100644
--- a/test/test/test_ring.c
+++ b/test/test/test_ring.c
@@ -763,412 +763,6 @@ test_ring_burst_basic(void)
 	return -1;
 }
 
-static int
-test_ring_stats(void)
-{
-
-#ifndef RTE_LIBRTE_RING_DEBUG
-	printf("Enable RTE_LIBRTE_RING_DEBUG to test ring stats.\n");
-	return 0;
-#else
-	void **src = NULL, **cur_src = NULL, **dst = NULL, **cur_dst = NULL;
-	int ret;
-	unsigned i;
-	unsigned num_items            = 0;
-	unsigned failed_enqueue_ops   = 0;
-	unsigned failed_enqueue_items = 0;
-	unsigned failed_dequeue_ops   = 0;
-	unsigned failed_dequeue_items = 0;
-	unsigned last_enqueue_ops     = 0;
-	unsigned last_enqueue_items   = 0;
-	unsigned last_quota_ops       = 0;
-	unsigned last_quota_items     = 0;
-	unsigned lcore_id = rte_lcore_id();
-	struct rte_ring_debug_stats *ring_stats = &r->stats[lcore_id];
-
-	printf("Test the ring stats.\n");
-
-	/* Reset the watermark in case it was set in another test. */
-	rte_ring_set_water_mark(r, 0);
-
-	/* Reset the ring stats. */
-	memset(&r->stats[lcore_id], 0, sizeof(r->stats[lcore_id]));
-
-	/* Allocate some dummy object pointers. */
-	src = malloc(RING_SIZE*2*sizeof(void *));
-	if (src == NULL)
-		goto fail;
-
-	for (i = 0; i < RING_SIZE*2 ; i++) {
-		src[i] = (void *)(unsigned long)i;
-	}
-
-	/* Allocate some memory for copied objects. */
-	dst = malloc(RING_SIZE*2*sizeof(void *));
-	if (dst == NULL)
-		goto fail;
-
-	memset(dst, 0, RING_SIZE*2*sizeof(void *));
-
-	/* Set the head and tail pointers. */
-	cur_src = src;
-	cur_dst = dst;
-
-	/* Do Enqueue tests. */
-	printf("Test the dequeue stats.\n");
-
-	/* Fill the ring up to RING_SIZE -1. */
-	printf("Fill the ring.\n");
-	for (i = 0; i< (RING_SIZE/MAX_BULK); i++) {
-		rte_ring_sp_enqueue_burst(r, cur_src, MAX_BULK);
-		cur_src += MAX_BULK;
-	}
-
-	/* Adjust for final enqueue = MAX_BULK -1. */
-	cur_src--;
-
-	printf("Verify that the ring is full.\n");
-	if (rte_ring_full(r) != 1)
-		goto fail;
-
-
-	printf("Verify the enqueue success stats.\n");
-	/* Stats should match above enqueue operations to fill the ring. */
-	if (ring_stats->enq_success_bulk != (RING_SIZE/MAX_BULK))
-		goto fail;
-
-	/* Current max objects is RING_SIZE -1. */
-	if (ring_stats->enq_success_objs != RING_SIZE -1)
-		goto fail;
-
-	/* Shouldn't have any failures yet. */
-	if (ring_stats->enq_fail_bulk != 0)
-		goto fail;
-	if (ring_stats->enq_fail_objs != 0)
-		goto fail;
-
-
-	printf("Test stats for SP burst enqueue to a full ring.\n");
-	num_items = 2;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	printf("Test stats for SP bulk enqueue to a full ring.\n");
-	num_items = 4;
-	ret = rte_ring_sp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -ENOBUFS)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	printf("Test stats for MP burst enqueue to a full ring.\n");
-	num_items = 8;
-	ret = rte_ring_mp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	printf("Test stats for MP bulk enqueue to a full ring.\n");
-	num_items = 16;
-	ret = rte_ring_mp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -ENOBUFS)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	/* Do Dequeue tests. */
-	printf("Test the dequeue stats.\n");
-
-	printf("Empty the ring.\n");
-	for (i = 0; i<RING_SIZE/MAX_BULK; i++) {
-		rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
-		cur_dst += MAX_BULK;
-	}
-
-	/* There was only RING_SIZE -1 objects to dequeue. */
-	cur_dst++;
-
-	printf("Verify ring is empty.\n");
-	if (1 != rte_ring_empty(r))
-		goto fail;
-
-	printf("Verify the dequeue success stats.\n");
-	/* Stats should match above dequeue operations. */
-	if (ring_stats->deq_success_bulk != (RING_SIZE/MAX_BULK))
-		goto fail;
-
-	/* Objects dequeued is RING_SIZE -1. */
-	if (ring_stats->deq_success_objs != RING_SIZE -1)
-		goto fail;
-
-	/* Shouldn't have any dequeue failure stats yet. */
-	if (ring_stats->deq_fail_bulk != 0)
-		goto fail;
-
-	printf("Test stats for SC burst dequeue with an empty ring.\n");
-	num_items = 2;
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test stats for SC bulk dequeue with an empty ring.\n");
-	num_items = 4;
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, num_items);
-	if (ret != -ENOENT)
-		goto fail;
-
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test stats for MC burst dequeue with an empty ring.\n");
-	num_items = 8;
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test stats for MC bulk dequeue with an empty ring.\n");
-	num_items = 16;
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, num_items);
-	if (ret != -ENOENT)
-		goto fail;
-
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test total enqueue/dequeue stats.\n");
-	/* At this point the enqueue and dequeue stats should be the same. */
-	if (ring_stats->enq_success_bulk != ring_stats->deq_success_bulk)
-		goto fail;
-	if (ring_stats->enq_success_objs != ring_stats->deq_success_objs)
-		goto fail;
-	if (ring_stats->enq_fail_bulk    != ring_stats->deq_fail_bulk)
-		goto fail;
-	if (ring_stats->enq_fail_objs    != ring_stats->deq_fail_objs)
-		goto fail;
-
-
-	/* Watermark Tests. */
-	printf("Test the watermark/quota stats.\n");
-
-	printf("Verify the initial watermark stats.\n");
-	/* Watermark stats should be 0 since there is no watermark. */
-	if (ring_stats->enq_quota_bulk != 0)
-		goto fail;
-	if (ring_stats->enq_quota_objs != 0)
-		goto fail;
-
-	/* Set a watermark. */
-	rte_ring_set_water_mark(r, 16);
-
-	/* Reset pointers. */
-	cur_src = src;
-	cur_dst = dst;
-
-	last_enqueue_ops   = ring_stats->enq_success_bulk;
-	last_enqueue_items = ring_stats->enq_success_objs;
-
-
-	printf("Test stats for SP burst enqueue below watermark.\n");
-	num_items = 8;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should still be 0. */
-	if (ring_stats->enq_quota_bulk != 0)
-		goto fail;
-	if (ring_stats->enq_quota_objs != 0)
-		goto fail;
-
-	/* Success stats should have increased. */
-	if (ring_stats->enq_success_bulk != last_enqueue_ops + 1)
-		goto fail;
-	if (ring_stats->enq_success_objs != last_enqueue_items + num_items)
-		goto fail;
-
-	last_enqueue_ops   = ring_stats->enq_success_bulk;
-	last_enqueue_items = ring_stats->enq_success_objs;
-
-
-	printf("Test stats for SP burst enqueue at watermark.\n");
-	num_items = 8;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != 1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for SP burst enqueue above watermark.\n");
-	num_items = 1;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for MP burst enqueue above watermark.\n");
-	num_items = 2;
-	ret = rte_ring_mp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for SP bulk enqueue above watermark.\n");
-	num_items = 4;
-	ret = rte_ring_sp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -EDQUOT)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for MP bulk enqueue above watermark.\n");
-	num_items = 8;
-	ret = rte_ring_mp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -EDQUOT)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	printf("Test watermark success stats.\n");
-	/* Success stats should be same as last non-watermarked enqueue. */
-	if (ring_stats->enq_success_bulk != last_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_success_objs != last_enqueue_items)
-		goto fail;
-
-
-	/* Cleanup. */
-
-	/* Empty the ring. */
-	for (i = 0; i<RING_SIZE/MAX_BULK; i++) {
-		rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
-		cur_dst += MAX_BULK;
-	}
-
-	/* Reset the watermark. */
-	rte_ring_set_water_mark(r, 0);
-
-	/* Reset the ring stats. */
-	memset(&r->stats[lcore_id], 0, sizeof(r->stats[lcore_id]));
-
-	/* Free memory before test completed */
-	free(src);
-	free(dst);
-	return 0;
-
-fail:
-	free(src);
-	free(dst);
-	return -1;
-#endif
-}
-
 /*
  * it will always fail to create ring with a wrong ring size number in this function
  */
@@ -1335,10 +929,6 @@ test_ring(void)
 	if (test_ring_basic() < 0)
 		return -1;
 
-	/* ring stats */
-	if (test_ring_stats() < 0)
-		return -1;
-
 	/* basic operations */
 	if (test_live_watermark_change() < 0)
 		return -1;
-- 
2.9.3

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v3 03/14] ring: eliminate duplication of size and mask fields
    2017-03-24 17:09  5%     ` [dpdk-dev] [PATCH v3 01/14] ring: remove split cacheline build setting Bruce Richardson
@ 2017-03-24 17:09  3%     ` Bruce Richardson
    2017-03-24 17:09  2%     ` [dpdk-dev] [PATCH v3 04/14] ring: remove debug setting Bruce Richardson
                       ` (4 subsequent siblings)
  6 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2017-03-24 17:09 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, jerin.jacob, thomas.monjalon, Bruce Richardson

The size and mask fields are duplicated in both the producer and
consumer data structures. Move them out of those structures and into
the top-level structure so they are not duplicated.
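
For illustration, a single mask serves both sides because ring sizes
are powers of two, so wrapping a free-running index is a plain AND;
the helper name below is hypothetical:

/* slot for any free-running head/tail index; same as idx % r->size */
static inline uint32_t
ring_slot(const struct rte_ring *r, uint32_t idx)
{
	return idx & r->mask;
}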

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_ring/rte_ring.c | 20 ++++++++++----------
 lib/librte_ring/rte_ring.h | 32 ++++++++++++++++----------------
 test/test/test_ring.c      |  6 +++---
 3 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index 93a8692..93485d4 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -144,11 +144,11 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	if (ret < 0 || ret >= (int)sizeof(r->name))
 		return -ENAMETOOLONG;
 	r->flags = flags;
-	r->prod.watermark = count;
+	r->watermark = count;
 	r->prod.single = !!(flags & RING_F_SP_ENQ);
 	r->cons.single = !!(flags & RING_F_SC_DEQ);
-	r->prod.size = r->cons.size = count;
-	r->prod.mask = r->cons.mask = count-1;
+	r->size = count;
+	r->mask = count - 1;
 	r->prod.head = r->cons.head = 0;
 	r->prod.tail = r->cons.tail = 0;
 
@@ -269,14 +269,14 @@ rte_ring_free(struct rte_ring *r)
 int
 rte_ring_set_water_mark(struct rte_ring *r, unsigned count)
 {
-	if (count >= r->prod.size)
+	if (count >= r->size)
 		return -EINVAL;
 
 	/* if count is 0, disable the watermarking */
 	if (count == 0)
-		count = r->prod.size;
+		count = r->size;
 
-	r->prod.watermark = count;
+	r->watermark = count;
 	return 0;
 }
 
@@ -291,17 +291,17 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 
 	fprintf(f, "ring <%s>@%p\n", r->name, r);
 	fprintf(f, "  flags=%x\n", r->flags);
-	fprintf(f, "  size=%"PRIu32"\n", r->prod.size);
+	fprintf(f, "  size=%"PRIu32"\n", r->size);
 	fprintf(f, "  ct=%"PRIu32"\n", r->cons.tail);
 	fprintf(f, "  ch=%"PRIu32"\n", r->cons.head);
 	fprintf(f, "  pt=%"PRIu32"\n", r->prod.tail);
 	fprintf(f, "  ph=%"PRIu32"\n", r->prod.head);
 	fprintf(f, "  used=%u\n", rte_ring_count(r));
 	fprintf(f, "  avail=%u\n", rte_ring_free_count(r));
-	if (r->prod.watermark == r->prod.size)
+	if (r->watermark == r->size)
 		fprintf(f, "  watermark=0\n");
 	else
-		fprintf(f, "  watermark=%"PRIu32"\n", r->prod.watermark);
+		fprintf(f, "  watermark=%"PRIu32"\n", r->watermark);
 
 	/* sum and dump statistics */
 #ifdef RTE_LIBRTE_RING_DEBUG
@@ -318,7 +318,7 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 		sum.deq_fail_bulk += r->stats[lcore_id].deq_fail_bulk;
 		sum.deq_fail_objs += r->stats[lcore_id].deq_fail_objs;
 	}
-	fprintf(f, "  size=%"PRIu32"\n", r->prod.size);
+	fprintf(f, "  size=%"PRIu32"\n", r->size);
 	fprintf(f, "  enq_success_bulk=%"PRIu64"\n", sum.enq_success_bulk);
 	fprintf(f, "  enq_success_objs=%"PRIu64"\n", sum.enq_success_objs);
 	fprintf(f, "  enq_quota_bulk=%"PRIu64"\n", sum.enq_quota_bulk);
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 331c94f..d650215 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -151,10 +151,7 @@ struct rte_memzone; /* forward declaration, so as not to require memzone.h */
 struct rte_ring_headtail {
 	volatile uint32_t head;  /**< Prod/consumer head. */
 	volatile uint32_t tail;  /**< Prod/consumer tail. */
-	uint32_t size;           /**< Size of ring. */
-	uint32_t mask;           /**< Mask (size-1) of ring. */
 	uint32_t single;         /**< True if single prod/cons */
-	uint32_t watermark;      /**< Max items before EDQUOT in producer. */
 };
 
 /**
@@ -174,9 +171,12 @@ struct rte_ring {
 	 * next time the ABI changes
 	 */
 	char name[RTE_MEMZONE_NAMESIZE];    /**< Name of the ring. */
-	int flags;                       /**< Flags supplied at creation. */
+	int flags;               /**< Flags supplied at creation. */
 	const struct rte_memzone *memzone;
 			/**< Memzone, if any, containing the rte_ring */
+	uint32_t size;           /**< Size of ring. */
+	uint32_t mask;           /**< Mask (size-1) of ring. */
+	uint32_t watermark;      /**< Max items before EDQUOT in producer. */
 
 	/** Ring producer status. */
 	struct rte_ring_headtail prod __rte_aligned(PROD_ALIGN);
@@ -355,7 +355,7 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * Placed here since identical code needed in both
  * single and multi producer enqueue functions */
 #define ENQUEUE_PTRS() do { \
-	const uint32_t size = r->prod.size; \
+	const uint32_t size = r->size; \
 	uint32_t idx = prod_head & mask; \
 	if (likely(idx + n < size)) { \
 		for (i = 0; i < (n & ((~(unsigned)0x3))); i+=4, idx+=4) { \
@@ -382,7 +382,7 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * single and multi consumer dequeue functions */
 #define DEQUEUE_PTRS() do { \
 	uint32_t idx = cons_head & mask; \
-	const uint32_t size = r->cons.size; \
+	const uint32_t size = r->size; \
 	if (likely(idx + n < size)) { \
 		for (i = 0; i < (n & (~(unsigned)0x3)); i+=4, idx+=4) {\
 			obj_table[i] = r->ring[idx]; \
@@ -437,7 +437,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	const unsigned max = n;
 	int success;
 	unsigned i, rep = 0;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 	int ret;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
@@ -485,7 +485,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->prod.watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 				(int)(n | RTE_RING_QUOT_EXCEED);
 		__RING_STAT_ADD(r, enq_quota, n);
@@ -544,7 +544,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t prod_head, cons_tail;
 	uint32_t prod_next, free_entries;
 	unsigned i;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 	int ret;
 
 	prod_head = r->prod.head;
@@ -580,7 +580,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->prod.watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 			(int)(n | RTE_RING_QUOT_EXCEED);
 		__RING_STAT_ADD(r, enq_quota, n);
@@ -630,7 +630,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	const unsigned max = n;
 	int success;
 	unsigned i, rep = 0;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
 	 * potentially harmful when n equals 0. */
@@ -727,7 +727,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
 	unsigned i;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 
 	cons_head = r->cons.head;
 	prod_tail = r->prod.tail;
@@ -1056,7 +1056,7 @@ rte_ring_full(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return ((cons_tail - prod_tail - 1) & r->prod.mask) == 0;
+	return ((cons_tail - prod_tail - 1) & r->mask) == 0;
 }
 
 /**
@@ -1089,7 +1089,7 @@ rte_ring_count(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return (prod_tail - cons_tail) & r->prod.mask;
+	return (prod_tail - cons_tail) & r->mask;
 }
 
 /**
@@ -1105,7 +1105,7 @@ rte_ring_free_count(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return (cons_tail - prod_tail - 1) & r->prod.mask;
+	return (cons_tail - prod_tail - 1) & r->mask;
 }
 
 /**
@@ -1119,7 +1119,7 @@ rte_ring_free_count(const struct rte_ring *r)
 static inline unsigned int
 rte_ring_get_size(const struct rte_ring *r)
 {
-	return r->prod.size;
+	return r->size;
 }
 
 /**
diff --git a/test/test/test_ring.c b/test/test/test_ring.c
index ebcb896..5f09097 100644
--- a/test/test/test_ring.c
+++ b/test/test/test_ring.c
@@ -148,7 +148,7 @@ check_live_watermark_change(__attribute__((unused)) void *dummy)
 		}
 
 		/* read watermark, the only change allowed is from 16 to 32 */
-		watermark = r->prod.watermark;
+		watermark = r->watermark;
 		if (watermark != watermark_old &&
 		    (watermark_old != 16 || watermark != 32)) {
 			printf("Bad watermark change %u -> %u\n", watermark_old,
@@ -213,7 +213,7 @@ test_set_watermark( void ){
 		printf( " ring lookup failed\n" );
 		goto error;
 	}
-	count = r->prod.size*2;
+	count = r->size * 2;
 	setwm = rte_ring_set_water_mark(r, count);
 	if (setwm != -EINVAL){
 		printf("Test failed to detect invalid watermark count value\n");
@@ -222,7 +222,7 @@ test_set_watermark( void ){
 
 	count = 0;
 	rte_ring_set_water_mark(r, count);
-	if (r->prod.watermark != r->prod.size) {
+	if (r->watermark != r->size) {
 		printf("Test failed to detect invalid watermark count value\n");
 		goto error;
 	}
-- 
2.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v3 01/14] ring: remove split cacheline build setting
  @ 2017-03-24 17:09  5%     ` Bruce Richardson
  2017-03-24 17:09  3%     ` [dpdk-dev] [PATCH v3 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
                       ` (5 subsequent siblings)
  6 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-24 17:09 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, jerin.jacob, thomas.monjalon, Bruce Richardson

Users compiling DPDK should not need to know or care about the arrangement
of cachelines in the rte_ring structure.  Therefore just remove the build
option and set the structures to always be split. On platforms with 64B
cachelines, for improved performance use 128B rather than 64B alignment,
since it stops the producer and consumer data from being on adjacent
cachelines.
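
As a build-time sanity check, the new layout can be verified the same
way rte_ring_init() does it; a minimal sketch (the function name is
illustrative):

#include <stddef.h>
#include <rte_common.h>
#include <rte_ring.h>

static inline void
ring_layout_check(void)
{
	/* prod and cons must each start on their own alignment boundary */
	RTE_BUILD_BUG_ON(offsetof(struct rte_ring, prod) % PROD_ALIGN != 0);
	RTE_BUILD_BUG_ON(offsetof(struct rte_ring, cons) % CONS_ALIGN != 0);
}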

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
V2: Limit the cacheline * 2 alignment to platforms with < 128B line size
---
 config/common_base                     |  1 -
 doc/guides/rel_notes/release_17_05.rst |  7 +++++++
 lib/librte_ring/rte_ring.c             |  2 --
 lib/librte_ring/rte_ring.h             | 16 ++++++++++------
 4 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/config/common_base b/config/common_base
index 37aa1e1..c394651 100644
--- a/config/common_base
+++ b/config/common_base
@@ -453,7 +453,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
 #
 CONFIG_RTE_LIBRTE_RING=y
 CONFIG_RTE_LIBRTE_RING_DEBUG=n
-CONFIG_RTE_RING_SPLIT_PROD_CONS=n
 CONFIG_RTE_RING_PAUSE_REP_COUNT=0
 
 #
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 918f483..57ae8bf 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -120,6 +120,13 @@ API Changes
 * The LPM ``next_hop`` field is extended from 8 bits to 21 bits for IPv6
   while keeping ABI compatibility.
 
+* **Reworked rte_ring library**
+
+  The rte_ring library has been reworked and updated. The following changes
+  have been made to it:
+
+  * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
+
 ABI Changes
 -----------
 
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index ca0a108..4bc6da1 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -127,10 +127,8 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
 			  RTE_CACHE_LINE_MASK) != 0);
-#ifdef RTE_RING_SPLIT_PROD_CONS
 	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, cons) &
 			  RTE_CACHE_LINE_MASK) != 0);
-#endif
 	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
 			  RTE_CACHE_LINE_MASK) != 0);
 #ifdef RTE_LIBRTE_RING_DEBUG
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 72ccca5..399ae3b 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -139,6 +139,14 @@ struct rte_ring_debug_stats {
 
 struct rte_memzone; /* forward declaration, so as not to require memzone.h */
 
+#if RTE_CACHE_LINE_SIZE < 128
+#define PROD_ALIGN (RTE_CACHE_LINE_SIZE * 2)
+#define CONS_ALIGN (RTE_CACHE_LINE_SIZE * 2)
+#else
+#define PROD_ALIGN RTE_CACHE_LINE_SIZE
+#define CONS_ALIGN RTE_CACHE_LINE_SIZE
+#endif
+
 /**
  * An RTE ring structure.
  *
@@ -168,7 +176,7 @@ struct rte_ring {
 		uint32_t mask;           /**< Mask (size-1) of ring. */
 		volatile uint32_t head;  /**< Producer head. */
 		volatile uint32_t tail;  /**< Producer tail. */
-	} prod __rte_cache_aligned;
+	} prod __rte_aligned(PROD_ALIGN);
 
 	/** Ring consumer status. */
 	struct cons {
@@ -177,11 +185,7 @@ struct rte_ring {
 		uint32_t mask;           /**< Mask (size-1) of ring. */
 		volatile uint32_t head;  /**< Consumer head. */
 		volatile uint32_t tail;  /**< Consumer tail. */
-#ifdef RTE_RING_SPLIT_PROD_CONS
-	} cons __rte_cache_aligned;
-#else
-	} cons;
-#endif
+	} cons __rte_aligned(CONS_ALIGN);
 
 #ifdef RTE_LIBRTE_RING_DEBUG
 	struct rte_ring_debug_stats stats[RTE_MAX_LCORE];
-- 
2.9.3

^ permalink raw reply	[relevance 5%]

* [dpdk-dev] [PATCH v5 1/7] net/ark: PMD for Atomic Rules Arkville driver stub
  2017-03-23  1:03  3% [dpdk-dev] [PATCH v4 " Ed Czeck
@ 2017-03-23 22:59  3% ` Ed Czeck
  0 siblings, 0 replies; 200+ results
From: Ed Czeck @ 2017-03-23 22:59 UTC (permalink / raw)
  To: dev; +Cc: john.miller, shepard.siegel, ferruh.yigit, stephen, Ed Czeck

Enable Arkville on supported configurations
Add overview documentation
Minimum driver support for valid compile
Arkville PMD is not supported on ARM or PowerPC at this time

v5:
* Address comments from Ferruh Yigit <ferruh.yigit@intel.com>
* Added documentation on driver args
* Makefile fixes
* Safe argument processing
* vdev args to dev args

v4:
* Address issues report from review
* Add internal comments on driver arg
* provide a bare-bones dev init to avoid compiler warnings

v3:
* Split large patch into several smaller ones

Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Signed-off-by: John Miller <john.miller@atomicrules.com>
---
 MAINTAINERS                                 |   8 +
 config/common_base                          |  10 +
 config/defconfig_arm-armv7a-linuxapp-gcc    |   1 +
 config/defconfig_ppc_64-power8-linuxapp-gcc |   1 +
 doc/guides/nics/ark.rst                     | 310 ++++++++++++++++++++++++++++
 doc/guides/nics/index.rst                   |   1 +
 drivers/net/Makefile                        |   1 +
 drivers/net/ark/Makefile                    |  62 ++++++
 drivers/net/ark/ark_debug.h                 |  71 +++++++
 drivers/net/ark/ark_ethdev.c                | 294 ++++++++++++++++++++++++++
 drivers/net/ark/ark_ethdev.h                |  39 ++++
 drivers/net/ark/ark_global.h                | 116 +++++++++++
 drivers/net/ark/rte_pmd_ark_version.map     |   4 +
 mk/rte.app.mk                               |   1 +
 14 files changed, 919 insertions(+)
 create mode 100644 doc/guides/nics/ark.rst
 create mode 100644 drivers/net/ark/Makefile
 create mode 100644 drivers/net/ark/ark_debug.h
 create mode 100644 drivers/net/ark/ark_ethdev.c
 create mode 100644 drivers/net/ark/ark_ethdev.h
 create mode 100644 drivers/net/ark/ark_global.h
 create mode 100644 drivers/net/ark/rte_pmd_ark_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 0c78b58..19ee27f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -278,6 +278,14 @@ M: Evgeny Schemeilin <evgenys@amazon.com>
 F: drivers/net/ena/
 F: doc/guides/nics/ena.rst
 
+Atomic Rules ARK
+M: Shepard Siegel <shepard.siegel@atomicrules.com>
+M: Ed Czeck       <ed.czeck@atomicrules.com>
+M: John Miller    <john.miller@atomicrules.com>
+F: drivers/net/ark/
+F: doc/guides/nics/ark.rst
+F: doc/guides/nics/features/ark.ini
+
 Broadcom bnxt
 M: Stephen Hurd <stephen.hurd@broadcom.com>
 M: Ajit Khaparde <ajit.khaparde@broadcom.com>
diff --git a/config/common_base b/config/common_base
index 37aa1e1..4feb5e4 100644
--- a/config/common_base
+++ b/config/common_base
@@ -353,6 +353,16 @@ CONFIG_RTE_LIBRTE_QEDE_FW=""
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
 #
+# Compile ARK PMD
+#
+CONFIG_RTE_LIBRTE_ARK_PMD=y
+CONFIG_RTE_LIBRTE_ARK_PAD_TX=y
+CONFIG_RTE_LIBRTE_ARK_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE=n
+
+#
 # Compile the TAP PMD
 # It is enabled by default for Linux only.
 #
diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc
index d9bd2a8..6d2b5e0 100644
--- a/config/defconfig_arm-armv7a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7a-linuxapp-gcc
@@ -61,6 +61,7 @@ CONFIG_RTE_SCHED_VECTOR=n
 
 # cannot use those on ARM
 CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_ARK_PMD=n
 CONFIG_RTE_LIBRTE_EM_PMD=n
 CONFIG_RTE_LIBRTE_IGB_PMD=n
 CONFIG_RTE_LIBRTE_CXGBE_PMD=n
diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
index 35f7fb6..89bc396 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -48,6 +48,7 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
 
 # Note: Initially, all of the PMD drivers compilation are turned off on Power
 # Will turn on them only after the successful testing on Power
+CONFIG_RTE_LIBRTE_ARK_PMD=n
 CONFIG_RTE_LIBRTE_IXGBE_PMD=n
 CONFIG_RTE_LIBRTE_I40E_PMD=n
 CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
diff --git a/doc/guides/nics/ark.rst b/doc/guides/nics/ark.rst
new file mode 100644
index 0000000..7df07ce
--- /dev/null
+++ b/doc/guides/nics/ark.rst
@@ -0,0 +1,310 @@
+.. BSD LICENSE
+
+    Copyright (c) 2015-2017 Atomic Rules LLC
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Atomic Rules LLC nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ARK Poll Mode Driver
+====================
+
+The ARK PMD is a DPDK poll-mode driver for the Atomic Rules Arkville
+(ARK) family of devices.
+
+More information can be found at the `Atomic Rules website
+<http://atomicrules.com>`_.
+
+Overview
+--------
+
+The Atomic Rules Arkville product is a DPDK and AXI compliant product
+that marshals packets across a PCIe conduit between host DPDK mbufs and
+FPGA AXI streams.
+
+The approach of the ARK PMD, and the spirit of the overall Arkville
+product, has been to take the DPDK API/ABI as a fixed specification;
+then implement much of the business logic in FPGA RTL circuits.
+The approach of *working backwards* from the DPDK API/ABI and having
+the GPP host software *dictate*, while the FPGA hardware *copes*,
+results in significant performance gains over a naive implementation.
+
+While this document describes the ARK PMD software, it is helpful to
+understand what the FPGA hardware is and is not. The Arkville RTL
+component provides a single PCIe Physical Function (PF) supporting
+some number of RX/Ingress and TX/Egress Queues. The ARK PMD controls
+the Arkville core through a dedicated opaque Core BAR (CBAR).
+To allow users full freedom for their own FPGA application IP,
+an independent FPGA Application BAR (ABAR) is provided.
+
+One popular way to imagine Arkville's FPGA hardware aspect is as the
+FPGA PCIe-facing side of a so-called Smart NIC. The Arkville core does
+not contain any MACs, and is link-speed independent, as well as
+agnostic to the number of physical ports the application chooses to
+use. The ARK driver exposes the familiar PMD interface to allow packet
+movement to and from mbufs across multiple queues.
+
+However FPGA RTL applications could contain a universe of added
+functionality that an Arkville RTL core does not provide or can
+not anticipate. To allow for this expectation of user-defined
+innovation, the ARK PMD provides a dynamic mechanism of adding
+capabilities without having to modify the ARK PMD.
+
+The ARK PMD is intended to support all instances of the Arkville
+RTL Core, regardless of configuration, FPGA vendor, or target
+board. While specific capabilities such as the number of physical
+hardware queue-pairs are negotiated, the driver is designed to
+remain constant over a broad and extendable feature set.
+
+Intentionally, Arkville by itself DOES NOT provide common NIC
+capabilities such as offload or receive-side scaling (RSS).
+These capabilities would be viewed as a gate-level "tax" on
+green-box FPGA applications that do not require such functionality.
+Instead, they can be added as needed with essentially no
+overhead to the FPGA application.
+
+The ARK PMD also supports optional user extensions, through dynamic linking.
+The ARK PMD user extensions are a feature of Arkville’s DPDK
+net/ark poll mode driver, allowing users to add their
+own code to extend the net/ark functionality without
+having to make source code changes to the driver. One motivation for
+this capability is that while DPDK provides a rich set of functions
+to interact with NIC-like capabilities (e.g. MAC addresses and statistics),
+the Arkville RTL IP does not include a MAC.  Users can supply their
+own MAC or custom FPGA applications, which may require control from
+the PMD.  The user extension is the means of providing this control
+between the user's FPGA application and the existing DPDK features via
+the PMD.
+
+Device Parameters
+-------------------
+
+The ARK PMD supports a series of parameters that are used for packet routing
+and for internal packet generation and packet checking.  This section describes
+the supported parameters.  These features are primarily used for
+diagnostics, testing, and performance verification.  The nominal use
+of Arkville does not require any configuration using these parameters.
+
+"Pkt_dir"
+
+The Packet Director controls connectivity between the Packet Generator,
+Packet Checker, UDM, DDM, and external ingress and egress interfaces for
+diagnostic purposes. It includes an internal loopback path from the DDM to the UDM.
+
+NOTE: Packets from the packet generator to the UDM are all directed to UDM RX
+queue 0. Packets looped back from the DDM to the UDM are directed to the same
+queue number they originated from.
+
+bit 24: enRxChk (default 0)
+bit 20: enDDMChk (default 1)
+bit 16: enPGIngress (default 1)
+bit 12: enPGEgress (default 0)
+bit 8:  enExtIngress (default 1)
+bit 4:  enDDMEgress (default 1)
+bit 0:  enIntLpbk (default 0)
+
+Power-on state: 0x00110110
+
+These bits control which diagnostic paths are enabled. Refer to the PktDirector block
+diagram in the Arkville documentation.
+
+Format:
+Pkt_dir=0x00110110
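+
+As an illustrative worked example, enabling the internal loopback path
+(bit 0, enIntLpbk) on top of the power-on defaults means setting bit 0
+as well, giving Pkt_dir=0x00110111.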
+
+"Pkt_gen"
+
+The packet generator parameter takes a file as its argument.  The file contains
+configuration parameters that are used internally for regression testing and are
+not intended to be published at this level.
+
+Format:
+Pkt_gen=./config/pg.conf
+
+"Pkt_chkr"
+
+The packet checker parameter takes a file as its argument.  The file contains
+configuration parameters that are used internally for regression testing and are
+not intended to be published at this level.
+
+Format:
+Pkt_chkr=./config/pc.conf
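+
+These parameters are passed to the PMD as PCI device arguments.  As an
+illustrative example, assuming an ARK device at PCI address 0000:01:00.0
+and the standard EAL white-list syntax, testpmd could be started with:
+
+.. code-block:: console
+
+   ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 \
+      -w 0000:01:00.0,Pkt_dir=0x00110110,Pkt_gen=./config/pg.conf -- -i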
+
+
+Data Path Interface
+-------------------
+
+Ingress RX and Egress TX operation is via the nominal DPDK API.
+The driver supports single-port, multi-queue for both RX and TX.
+
+Refer to ``ark_ethdev.h`` for the list of supported methods to
+act upon RX and TX Queues.
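+
+As a minimal sketch (port and queue numbers are illustrative), moving
+packets on an ARK port uses the standard ethdev burst calls:
+
+.. code-block:: c
+
+   struct rte_mbuf *pkts[32];
+   uint16_t nb_rx, nb_tx;
+
+   /* Receive up to 32 packets from queue 0 of port 0. */
+   nb_rx = rte_eth_rx_burst(0, 0, pkts, 32);
+
+   /* Transmit the received packets on queue 0 of port 0. */
+   nb_tx = rte_eth_tx_burst(0, 0, pkts, nb_rx);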
+
+Configuration Information
+-------------------------
+
+**DPDK Configuration Parameters**
+
+  The following configuration options are available for the ARK PMD:
+
+   * **CONFIG_RTE_LIBRTE_ARK_PMD** (default y): Enables or disables inclusion
+     of the ARK PMD driver in the DPDK compilation.
+
+   * **CONFIG_RTE_LIBRTE_ARK_PAD_TX** (default y):  When enabled, TX
+     packets are padded to 60 bytes to support downstream MACs.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_RX** (default n): Enables or disables debug
+     logging and internal checking of RX ingress logic within the ARK PMD driver.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TX** (default n): Enables or disables debug
+     logging and internal checking of TX egress logic within the ARK PMD driver.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS** (default n): Enables or disables debug
+     logging of detailed packet and performance statistics gathered in
+     the PMD and FPGA.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE** (default n): Enables or disables debug
+     logging of detailed PMD events and status.
+
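+   These options live in the DPDK configuration templates.  As an
+   illustrative example, enabling the driver together with its trace
+   output could look like this in ``config/common_base``:
+
+   .. code-block:: console
+
+      CONFIG_RTE_LIBRTE_ARK_PMD=y
+      CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE=y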
+
+Building DPDK
+-------------
+
+See the :ref:`DPDK Getting Started Guide for Linux <linux_gsg>` for
+instructions on how to build DPDK.
+
+By default the ARK PMD library will be built into the DPDK library.
+
+For configuring and using the UIO and VFIO frameworks, please also refer to
+:ref:`the documentation that comes with the DPDK suite <linux_gsg>`.
+
+Supported ARK RTL PCIe Instances
+--------------------------------
+
+The ARK PMD supports the following Arkville RTL PCIe instances:
+
+* ``1d6c:100d`` - AR-ARKA-FX0 [Arkville 32B DPDK Data Mover]
+* ``1d6c:100e`` - AR-ARKA-FX1 [Arkville 64B DPDK Data Mover]
+
+Supported Operating Systems
+---------------------------
+
+Any Linux distribution fulfilling the conditions described in the ``System
+Requirements`` section of :ref:`the DPDK documentation <linux_gsg>` or in the
+*DPDK Release Notes*.  ARM and PowerPC architectures are not supported at this time.
+
+
+Supported Features
+------------------
+
+* Dynamic ARK PMD extensions
+* Multiple receive and transmit queues
+* Jumbo frames up to 9K
+* Hardware Statistics
+
+Unsupported Features
+--------------------
+
+Features that may be part of, or become part of, the Arkville RTL IP that are
+not currently supported or exposed by the ARK PMD include:
+
+* PCIe SR-IOV Virtual Functions (VFs)
+* Arkville's Packet Generator Control and Status
+* Arkville's Packet Director Control and Status
+* Arkville's Packet Checker Control and Status
+* Arkville's Timebase Management
+
+Pre-Requisites
+--------------
+
+#. Prepare the system as recommended by the DPDK suite.  This includes environment
+   variables, hugepages configuration, tool-chains and configuration.
+
+#. Insert the igb_uio kernel module using the command 'modprobe igb_uio'.
+
+#. Bind the intended ARK device to the igb_uio module.
+
+At this point the system should be ready to run DPDK applications. Once the
+application runs to completion, the ARK PMD can be detached from igb_uio if necessary.
+
+Usage Example
+-------------
+
+This section demonstrates how to launch **testpmd** with Atomic Rules ARK
+devices managed by librte_pmd_ark.
+
+#. Load the kernel modules:
+
+   .. code-block:: console
+
+      modprobe uio
+      insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
+
+   .. note::
+
+      The ARK PMD driver depends upon the igb_uio user space I/O kernel module.
+
+#. Mount and request huge pages:
+
+   .. code-block:: console
+
+      mount -t hugetlbfs nodev /mnt/huge
+      echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
+
+#. Bind UIO driver to ARK device at 0000:01:00.0 (using dpdk-devbind.py):
+
+   .. code-block:: console
+
+      ./usertools/dpdk-devbind.py --bind=igb_uio 0000:01:00.0
+
+   .. note::
+
+      The last argument to dpdk-devbind.py is the 4-tuple that identifies a specific PCIe
+      device. You can use lspci -d 1d6c: to identify all Atomic Rules devices in the system,
+      and thus determine the correct 4-tuple argument to dpdk-devbind.py.
+
+#. Start testpmd with basic parameters:
+
+   .. code-block:: console
+
+      ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 -- -i
+
+   Example output:
+
+   .. code-block:: console
+
+      [...]
+      EAL: PCI device 0000:01:00.0 on NUMA socket -1
+      EAL:   probe driver: 1d6c:100e rte_ark_pmd
+      EAL:   PCI memory mapped at 0x7f9b6c400000
+      PMD: eth_ark_dev_init(): Initializing 0:2:0.1
+      ARKP PMD CommitID: 378f3a67
+      Configuring Port 0 (socket 0)
+      Port 0: DC:3C:F6:00:00:01
+      Checking link statuses...
+      Port 0 Link Up - speed 100000 Mbps - full-duplex
+      Done
+      testpmd>
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 87f9334..381d82c 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     :numbered:
 
     overview
+    ark
     bnx2x
     bnxt
     cxgbe
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index a16f25e..ea9868b 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -32,6 +32,7 @@
 include $(RTE_SDK)/mk/rte.vars.mk
 
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += af_packet
+DIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark
 DIRS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += bonding
 DIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe
diff --git a/drivers/net/ark/Makefile b/drivers/net/ark/Makefile
new file mode 100644
index 0000000..afe69c4
--- /dev/null
+++ b/drivers/net/ark/Makefile
@@ -0,0 +1,62 @@
+# BSD LICENSE
+#
+# Copyright (c) 2015-2017 Atomic Rules LLC
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of copyright holder nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_ark.a
+
+CFLAGS += -O3 -I./
+CFLAGS += $(WERROR_FLAGS) -Werror
+
+EXPORT_MAP := rte_pmd_ark_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_ethdev.c
+
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_kvargs
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_mempool
+
+LDLIBS += -lpthread
+LDLIBS += -ldl
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/ark/ark_debug.h b/drivers/net/ark/ark_debug.h
new file mode 100644
index 0000000..62f7462
--- /dev/null
+++ b/drivers/net/ark/ark_debug.h
@@ -0,0 +1,71 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_DEBUG_H_
+#define _ARK_DEBUG_H_
+
+#include <inttypes.h>
+#include <rte_log.h>
+
+/* Format specifiers for string data pairs */
+#define ARK_SU32  "\n\t%-20s    %'20" PRIu32
+#define ARK_SU64  "\n\t%-20s    %'20" PRIu64
+#define ARK_SU64X "\n\t%-20s    %#20" PRIx64
+#define ARK_SPTR  "\n\t%-20s    %20p"
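+
+/*
+ * Illustrative usage (the stat names are hypothetical):
+ *
+ *   ARK_DEBUG_STATS("ARK UDM stats" ARK_SU64 ARK_SU64,
+ *                   "rx_packets", rx_packets,
+ *                   "rx_bytes", rx_bytes);
+ *
+ * Each pair prints as a left-justified name followed by its value.
+ */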
+
+#define ARK_TRACE_ON(fmt, ...) \
+	PMD_DRV_LOG(DEBUG, fmt, ##__VA_ARGS__)
+
+#define ARK_TRACE_OFF(fmt, ...) \
+	do {if (0) PMD_DRV_LOG(DEBUG, fmt, ##__VA_ARGS__); } while (0)
+
+/* Debug macro for reporting Packet stats */
+#ifdef RTE_LIBRTE_ARK_DEBUG_STATS
+#define ARK_DEBUG_STATS(fmt, ...) ARK_TRACE_ON(fmt, ##__VA_ARGS__)
+#else
+#define ARK_DEBUG_STATS(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
+#endif
+
+/* Debug macro for tracing full behavior */
+#ifdef RTE_LIBRTE_ARK_DEBUG_TRACE
+#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_ON(fmt, ##__VA_ARGS__)
+#else
+#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
+#endif
+
+/* tracing including the function name */
+#define PMD_DRV_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s(): " fmt, __func__, ## args)
+
+
+#endif
diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
new file mode 100644
index 0000000..0a47543
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev.c
@@ -0,0 +1,294 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <sys/stat.h>
+#include <dlfcn.h>
+
+#include <rte_kvargs.h>
+
+#include "ark_global.h"
+#include "ark_debug.h"
+#include "ark_ethdev.h"
+
+/*  Internal prototypes */
+static int eth_ark_check_args(struct ark_adapter *ark, const char *params);
+static int eth_ark_dev_init(struct rte_eth_dev *dev);
+static int eth_ark_dev_uninit(struct rte_eth_dev *eth_dev);
+static int eth_ark_dev_configure(struct rte_eth_dev *dev);
+static void eth_ark_dev_info_get(struct rte_eth_dev *dev,
+				 struct rte_eth_dev_info *dev_info);
+
+#define ARK_DEV_TO_PCI(eth_dev)			\
+	RTE_DEV_TO_PCI((eth_dev)->device)
+
+/*
+ * The packet generator is a functional block used to generate egress packet
+ * patterns.
+ */
+#define ARK_PKTGEN_ARG "Pkt_gen"
+
+/*
+ * The packet checker is a functional block used to test ingress packet
+ * patterns.
+ */
+#define ARK_PKTCHKR_ARG "Pkt_chkr"
+
+/*
+ * The packet director is used to select the internal ingress and egress
+ * packet paths.
+ */
+#define ARK_PKTDIR_ARG "Pkt_dir"
+
+/* Devinfo configurations */
+#define ARK_RX_MAX_QUEUE (4096 * 4)
+#define ARK_RX_MIN_QUEUE (512)
+#define ARK_RX_MAX_PKT_LEN ((16 * 1024) - 128)
+#define ARK_RX_MIN_BUFSIZE (1024)
+
+#define ARK_TX_MAX_QUEUE (4096 * 4)
+#define ARK_TX_MIN_QUEUE (256)
+
+static const char * const valid_arguments[] = {
+	ARK_PKTGEN_ARG,
+	ARK_PKTCHKR_ARG,
+	ARK_PKTDIR_ARG,
+	NULL
+};
+
+static const struct rte_pci_id pci_id_ark_map[] = {
+	{RTE_PCI_DEVICE(0x1d6c, 0x100d)},
+	{RTE_PCI_DEVICE(0x1d6c, 0x100e)},
+	{.vendor_id = 0, /* sentinel */ },
+};
+
+static struct eth_driver rte_ark_pmd = {
+	.pci_drv = {
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+		.id_table = pci_id_ark_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC
+	},
+	.eth_dev_init = eth_ark_dev_init,
+	.eth_dev_uninit = eth_ark_dev_uninit,
+	.dev_private_size = sizeof(struct ark_adapter),
+};
+
+static const struct eth_dev_ops ark_eth_dev_ops = {
+	.dev_configure = eth_ark_dev_configure,
+	.dev_infos_get = eth_ark_dev_info_get,
+};
+
+static int
+eth_ark_dev_init(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+	struct rte_pci_device *pci_dev;
+
+	ark->eth_dev = dev;
+
+	ARK_DEBUG_TRACE("eth_ark_dev_init(struct rte_eth_dev *dev)\n");
+
+	pci_dev = ARK_DEV_TO_PCI(dev);
+	rte_eth_copy_pci_info(dev, pci_dev);
+
+	ark->bar0 = (uint8_t *)pci_dev->mem_resource[0].addr;
+	ark->a_bar = (uint8_t *)pci_dev->mem_resource[2].addr;
+
+	dev->dev_ops = &ark_eth_dev_ops;
+
+	/*  We process our args last as they require everything to be setup */
+	if (pci_dev->device.devargs)
+		eth_ark_check_args(ark, pci_dev->device.devargs->args);
+	else
+		PMD_DRV_LOG(INFO, "No Device args found\n");
+
+	return 0;
+}
+
+static int
+eth_ark_dev_uninit(struct rte_eth_dev *dev)
+{
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	dev->dev_ops = NULL;
+	dev->rx_pkt_burst = NULL;
+	dev->tx_pkt_burst = NULL;
+	return 0;
+}
+
+static int
+eth_ark_dev_configure(struct rte_eth_dev *dev __rte_unused)
+{
+	ARK_DEBUG_TRACE("ARKP: In %s\n", __func__);
+	return 0;
+}
+
+static void
+eth_ark_dev_info_get(struct rte_eth_dev *dev,
+		     struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_rx_pktlen = ARK_RX_MAX_PKT_LEN;
+	dev_info->min_rx_bufsize = ARK_RX_MIN_BUFSIZE;
+
+	dev_info->rx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = ARK_RX_MAX_QUEUE,
+		.nb_min = ARK_RX_MIN_QUEUE,
+		.nb_align = ARK_RX_MIN_QUEUE}; /* power of 2 */
+
+	dev_info->tx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = ARK_TX_MAX_QUEUE,
+		.nb_min = ARK_TX_MIN_QUEUE,
+		.nb_align = ARK_TX_MIN_QUEUE}; /* power of 2 */
+
+	/* ARK PMD supports all line rates, how do we indicate that here ?? */
+	dev_info->speed_capa = (ETH_LINK_SPEED_1G |
+				ETH_LINK_SPEED_10G |
+				ETH_LINK_SPEED_25G |
+				ETH_LINK_SPEED_40G |
+				ETH_LINK_SPEED_50G |
+				ETH_LINK_SPEED_100G);
+	dev_info->pci_dev = ARK_DEV_TO_PCI(dev);
+}
+
+static inline int
+process_pktdir_arg(const char *key, const char *value,
+		   void *extra_args)
+{
+	ARK_DEBUG_TRACE("In process_pktdir_arg, key = %s, value = %s\n",
+			key, value);
+	struct ark_adapter *ark =
+		(struct ark_adapter *)extra_args;
+
+	ark->pkt_dir_v = strtol(value, NULL, 16);
+	ARK_DEBUG_TRACE("pkt_dir_v = 0x%x\n", ark->pkt_dir_v);
+	return 0;
+}
+
+static inline int
+process_file_args(const char *key, const char *value, void *extra_args)
+{
+	ARK_DEBUG_TRACE("**** IN process_pktgen_arg, key = %s, value = %s\n",
+			key, value);
+	char *args = (char *)extra_args;
+
+	/* Open the configuration file */
+	FILE *file = fopen(value, "r");
+	char line[ARK_MAX_ARG_LEN];
+	int  size = 0;
+	int first = 1;
+
+	if (file == NULL) {
+		PMD_DRV_LOG(ERR, "Unable to open config file %s\n", value);
+		return -1;
+	}
+
+	while (fgets(line, sizeof(line), file)) {
+		size += strlen(line);
+		if (size >= ARK_MAX_ARG_LEN) {
+			PMD_DRV_LOG(ERR, "Unable to parse file %s args, "
+				    "parameter list is too long\n", value);
+			fclose(file);
+			return -1;
+		}
+		if (first) {
+			strncpy(args, line, ARK_MAX_ARG_LEN);
+			first = 0;
+		} else {
+			strncat(args, line, ARK_MAX_ARG_LEN);
+		}
+	}
+	ARK_DEBUG_TRACE("file = %s\n", args);
+	fclose(file);
+	return 0;
+}
+
+static int
+eth_ark_check_args(struct ark_adapter *ark, const char *params)
+{
+	struct rte_kvargs *kvlist;
+	unsigned int k_idx;
+	struct rte_kvargs_pair *pair = NULL;
+
+	kvlist = rte_kvargs_parse(params, valid_arguments);
+	if (kvlist == NULL)
+		return 0;
+
+	ark->pkt_gen_args[0] = 0;
+	ark->pkt_chkr_args[0] = 0;
+
+	for (k_idx = 0; k_idx < kvlist->count; k_idx++) {
+		pair = &kvlist->pairs[k_idx];
+		ARK_DEBUG_TRACE("**** Arg passed to PMD = %s:%s\n", pair->key,
+				pair->value);
+	}
+
+	if (rte_kvargs_process(kvlist,
+			       ARK_PKTDIR_ARG,
+			       &process_pktdir_arg,
+			       ark) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTDIR_ARG);
+	}
+
+	if (rte_kvargs_process(kvlist,
+			       ARK_PKTGEN_ARG,
+			       &process_file_args,
+			       ark->pkt_gen_args) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTGEN_ARG);
+	}
+
+	if (rte_kvargs_process(kvlist,
+			       ARK_PKTCHKR_ARG,
+			       &process_file_args,
+			       ark->pkt_chkr_args) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTCHKR_ARG);
+	}
+
+	ARK_DEBUG_TRACE("INFO: packet director set to 0x%x\n", ark->pkt_dir_v);
+
+	return 1;
+}
+
+RTE_PMD_REGISTER_PCI(net_ark, rte_ark_pmd.pci_drv);
+RTE_PMD_REGISTER_KMOD_DEP(net_ark, "* igb_uio | uio_pci_generic");
+RTE_PMD_REGISTER_PCI_TABLE(net_ark, pci_id_ark_map);
+RTE_PMD_REGISTER_PARAM_STRING(net_ark,
+			      ARK_PKTGEN_ARG "=<filename> "
+			      ARK_PKTCHKR_ARG "=<filename> "
+			      ARK_PKTDIR_ARG "=<bitmap>");
diff --git a/drivers/net/ark/ark_ethdev.h b/drivers/net/ark/ark_ethdev.h
new file mode 100644
index 0000000..08d7fb1
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev.h
@@ -0,0 +1,39 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_ETHDEV_H_
+#define _ARK_ETHDEV_H_
+
+/* STUB */
+
+#endif
diff --git a/drivers/net/ark/ark_global.h b/drivers/net/ark/ark_global.h
new file mode 100644
index 0000000..033ac87
--- /dev/null
+++ b/drivers/net/ark/ark_global.h
@@ -0,0 +1,116 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_GLOBAL_H_
+#define _ARK_GLOBAL_H_
+
+#include <time.h>
+#include <assert.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_malloc.h>
+#include <rte_memcpy.h>
+#include <rte_string_fns.h>
+#include <rte_cycles.h>
+#include <rte_kvargs.h>
+#include <rte_dev.h>
+#include <rte_version.h>
+
+#define ETH_ARK_ARG_MAXLEN	64
+#define ARK_SYSCTRL_BASE  0x0
+#define ARK_PKTGEN_BASE   0x10000
+#define ARK_MPU_RX_BASE   0x20000
+#define ARK_UDM_BASE      0x30000
+#define ARK_MPU_TX_BASE   0x40000
+#define ARK_DDM_BASE      0x60000
+#define ARK_CMAC_BASE     0x80000
+#define ARK_PKTDIR_BASE   0xa0000
+#define ARK_PKTCHKR_BASE  0x90000
+#define ARK_RCPACING_BASE 0xb0000
+#define ARK_EXTERNAL_BASE 0x100000
+#define ARK_MPU_QOFFSET   0x00100
+#define ARK_MAX_PORTS     8
+
+#define offset8(n)     n
+#define offset16(n)   ((n) / 2)
+#define offset32(n)   ((n) / 4)
+#define offset64(n)   ((n) / 8)
+
+/* Maximum length of arg list in bytes */
+#define ARK_MAX_ARG_LEN 256
+
+/*
+ * Structure to store private data for each PF/VF instance.
+ */
+#define def_ptr(type, name) \
+	union type {		   \
+		uint64_t *t64;	   \
+		uint32_t *t32;	   \
+		uint16_t *t16;	   \
+		uint8_t  *t8;	   \
+		void     *v;	   \
+	} name
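+
+/*
+ * Illustrative use (the field name is hypothetical):
+ *
+ *   def_ptr(addr_t, ptr);
+ *
+ * declares a field "ptr" of union type "addr_t" whose single address can
+ * be viewed as 64-, 32-, 16- or 8-bit quantities.
+ */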
+
+struct ark_port {
+	struct rte_eth_dev *eth_dev;
+	int id;
+};
+
+struct ark_adapter {
+	/* User extension private data */
+	void *user_data;
+
+	struct ark_port port[ARK_MAX_PORTS];
+	int num_ports;
+
+	/* Packet generator/checker args */
+	char pkt_gen_args[ARK_MAX_ARG_LEN];
+	char pkt_chkr_args[ARK_MAX_ARG_LEN];
+	uint32_t pkt_dir_v;
+
+	/* eth device */
+	struct rte_eth_dev *eth_dev;
+
+	void *d_handle;
+
+	/* Our Bar 0 */
+	uint8_t *bar0;
+
+	/* Application Bar */
+	uint8_t *a_bar;
+};
+
+typedef uint32_t *ark_t;
+
+#endif
diff --git a/drivers/net/ark/rte_pmd_ark_version.map b/drivers/net/ark/rte_pmd_ark_version.map
new file mode 100644
index 0000000..1062e04
--- /dev/null
+++ b/drivers/net/ark/rte_pmd_ark_version.map
@@ -0,0 +1,4 @@
+DPDK_17.05 {
+	 local: *;
+
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 0e0b600..da23898 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -104,6 +104,7 @@ ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
 # plugins (link only if static libraries)
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET)  += -lrte_pmd_af_packet
+_LDLIBS-$(CONFIG_RTE_LIBRTE_ARK_PMD)        += -lrte_pmd_ark
 _LDLIBS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD)      += -lrte_pmd_bnx2x -lz
 _LDLIBS-$(CONFIG_RTE_LIBRTE_BNXT_PMD)       += -lrte_pmd_bnxt
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)       += -lrte_pmd_bond
-- 
1.9.1

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2 19/22] vhost: rename header file
  2017-03-23  7:10  4% ` [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API Yuanhan Liu
                     ` (6 preceding siblings ...)
  2017-03-23  7:10  3%   ` [dpdk-dev] [PATCH v2 18/22] vhost: introduce API to start a specific driver Yuanhan Liu
@ 2017-03-23  7:10  5%   ` Yuanhan Liu
  7 siblings, 0 replies; 200+ results
From: Yuanhan Liu @ 2017-03-23  7:10 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Harris James R, Liu Changpeng, Yuanhan Liu

Rename "rte_virtio_net.h" to "rte_vhost.h", to not let it be virtio
net specific.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 doc/guides/rel_notes/deprecation.rst   |   9 -
 doc/guides/rel_notes/release_17_05.rst |   3 +
 drivers/net/vhost/rte_eth_vhost.c      |   2 +-
 drivers/net/vhost/rte_eth_vhost.h      |   2 +-
 examples/tep_termination/main.c        |   2 +-
 examples/tep_termination/vxlan_setup.c |   2 +-
 examples/vhost/main.c                  |   2 +-
 lib/librte_vhost/Makefile              |   2 +-
 lib/librte_vhost/rte_vhost.h           | 421 +++++++++++++++++++++++++++++++++
 lib/librte_vhost/rte_virtio_net.h      | 421 ---------------------------------
 lib/librte_vhost/vhost.c               |   2 +-
 lib/librte_vhost/vhost.h               |   2 +-
 lib/librte_vhost/vhost_user.h          |   2 +-
 lib/librte_vhost/virtio_net.c          |   2 +-
 14 files changed, 434 insertions(+), 440 deletions(-)
 create mode 100644 lib/librte_vhost/rte_vhost.h
 delete mode 100644 lib/librte_vhost/rte_virtio_net.h

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d6544ed..9708b39 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -95,15 +95,6 @@ Deprecation Notices
   Target release for removal of the legacy API will be defined once most
   PMDs have switched to rte_flow.
 
-* vhost: API/ABI changes are planned for 17.05, for making DPDK vhost library
-  generic enough so that applications can build different vhost-user drivers
-  (instead of vhost-user net only) on top of that.
-  Specifically, ``virtio_net_device_ops`` will be renamed to ``vhost_device_ops``.
-  Correspondingly, some API's parameter need be changed. Few more functions also
-  need be reworked to let it be device aware. For example, different virtio device
-  has different feature set, meaning functions like ``rte_vhost_feature_disable``
-  need be changed. Last, file rte_virtio_net.h will be renamed to rte_vhost.h.
-
 * ABI changes are planned for 17.05 in the ``rte_cryptodev_ops`` structure.
   A pointer to a rte_cryptodev_config structure will be added to the
   function prototype ``cryptodev_configure_t``, as a new parameter.
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 8f06fc4..c053fff 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -165,6 +165,9 @@ API Changes
    * The vhost API ``rte_vhost_driver_session_start`` is removed. Instead,
      ``rte_vhost_driver_start`` should be used.
 
+   * The vhost public header file ``rte_virtio_net.h`` is renamed to
+     ``rte_vhost.h``.
+
 
 ABI Changes
 -----------
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index e6c0758..32e774b 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -40,7 +40,7 @@
 #include <rte_memcpy.h>
 #include <rte_vdev.h>
 #include <rte_kvargs.h>
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 #include <rte_spinlock.h>
 
 #include "rte_eth_vhost.h"
diff --git a/drivers/net/vhost/rte_eth_vhost.h b/drivers/net/vhost/rte_eth_vhost.h
index ea4bce4..39ca771 100644
--- a/drivers/net/vhost/rte_eth_vhost.h
+++ b/drivers/net/vhost/rte_eth_vhost.h
@@ -41,7 +41,7 @@
 #include <stdint.h>
 #include <stdbool.h>
 
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 
 /*
  * Event description.
diff --git a/examples/tep_termination/main.c b/examples/tep_termination/main.c
index 24c62cd..cd6e3f1 100644
--- a/examples/tep_termination/main.c
+++ b/examples/tep_termination/main.c
@@ -49,7 +49,7 @@
 #include <rte_log.h>
 #include <rte_string_fns.h>
 #include <rte_malloc.h>
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 
 #include "main.h"
 #include "vxlan.h"
diff --git a/examples/tep_termination/vxlan_setup.c b/examples/tep_termination/vxlan_setup.c
index 8f1f15b..87de74d 100644
--- a/examples/tep_termination/vxlan_setup.c
+++ b/examples/tep_termination/vxlan_setup.c
@@ -49,7 +49,7 @@
 #include <rte_tcp.h>
 
 #include "main.h"
-#include "rte_virtio_net.h"
+#include "rte_vhost.h"
 #include "vxlan.h"
 #include "vxlan_setup.h"
 
diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 64b3eea..08b82f6 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -49,7 +49,7 @@
 #include <rte_log.h>
 #include <rte_string_fns.h>
 #include <rte_malloc.h>
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
 
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 5cf4e93..4847069 100644
--- a/lib/librte_vhost/Makefile
+++ b/lib/librte_vhost/Makefile
@@ -51,7 +51,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c socket.c vhost.c vhost_user.c \
 				   virtio_net.c
 
 # install includes
-SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h
 
 # dependencies
 DEPDIRS-$(CONFIG_RTE_LIBRTE_VHOST) += lib/librte_eal
diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
new file mode 100644
index 0000000..d4ee210
--- /dev/null
+++ b/lib/librte_vhost/rte_vhost.h
@@ -0,0 +1,421 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_VHOST_H_
+#define _RTE_VHOST_H_
+
+/**
+ * @file
+ * Interface to vhost-user
+ */
+
+#include <stdint.h>
+#include <linux/vhost.h>
+#include <linux/virtio_ring.h>
+#include <sys/eventfd.h>
+
+#include <rte_memory.h>
+#include <rte_mempool.h>
+
+#define RTE_VHOST_USER_CLIENT		(1ULL << 0)
+#define RTE_VHOST_USER_NO_RECONNECT	(1ULL << 1)
+#define RTE_VHOST_USER_DEQUEUE_ZERO_COPY	(1ULL << 2)
+
+/**
+ * Information relating to memory regions including offsets to
+ * addresses in QEMUs memory file.
+ */
+struct rte_vhost_mem_region {
+	uint64_t guest_phys_addr;
+	uint64_t guest_user_addr;
+	uint64_t host_user_addr;
+	uint64_t size;
+	void	 *mmap_addr;
+	uint64_t mmap_size;
+	int fd;
+};
+
+/**
+ * Memory structure includes region and mapping information.
+ */
+struct rte_vhost_memory {
+	uint32_t nregions;
+	struct rte_vhost_mem_region regions[0];
+};
+
+struct rte_vhost_vring {
+	struct vring_desc	*desc;
+	struct vring_avail	*avail;
+	struct vring_used	*used;
+	uint64_t		log_guest_addr;
+
+	int			callfd;
+	int			kickfd;
+	uint16_t		size;
+};
+
+/**
+ * Device and vring operations.
+ */
+struct vhost_device_ops {
+	int (*new_device)(int vid);		/**< Add device. */
+	void (*destroy_device)(int vid);	/**< Remove device. */
+
+	int (*vring_state_changed)(int vid, uint16_t queue_id, int enable);	/**< triggered when a vring is enabled or disabled */
+
+	/**
+	 * Features could be changed after the feature negotiation.
+	 * For example, VHOST_F_LOG_ALL will be set/cleared at the
+	 * start/end of live migration, respectively. This callback
+	 * is used to inform the application on such change.
+	 */
+	int (*features_changed)(int vid, uint64_t features);
+
+	void *reserved[4]; /**< Reserved for future extension */
+};
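+
+/*
+ * A minimal sketch (callback names are hypothetical) of supplying these
+ * ops to the vhost-user driver, using the APIs declared later in this
+ * file:
+ *
+ *     static const struct vhost_device_ops ops = {
+ *         .new_device = my_new_device,
+ *         .destroy_device = my_destroy_device,
+ *     };
+ *
+ *     rte_vhost_driver_register("/tmp/sock0", 0);
+ *     rte_vhost_driver_callback_register("/tmp/sock0", &ops);
+ *     rte_vhost_driver_start("/tmp/sock0");
+ */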
+
+/**
+ * Convert guest physical address to host virtual address
+ *
+ * @param mem
+ *  the guest memory regions
+ * @param gpa
+ *  the guest physical address for querying
+ * @return
+ *  the host virtual address on success, 0 on failure
+ */
+static inline uint64_t __attribute__((always_inline))
+rte_vhost_gpa_to_vva(struct rte_vhost_memory *mem, uint64_t gpa)
+{
+	struct rte_vhost_mem_region *reg;
+	uint32_t i;
+
+	for (i = 0; i < mem->nregions; i++) {
+		reg = &mem->regions[i];
+		if (gpa >= reg->guest_phys_addr &&
+		    gpa <  reg->guest_phys_addr + reg->size) {
+			return gpa - reg->guest_phys_addr +
+			       reg->host_user_addr;
+		}
+	}
+
+	return 0;
+}
+
+#define RTE_VHOST_NEED_LOG(features)	((features) & (1ULL << VHOST_F_LOG_ALL))
+
+/**
+ * Log the memory write start with given address.
+ *
+ * This function only needs to be invoked when live migration starts.
+ * Therefore, it does not need to be called at all most of the time. To
+ * minimize the performance impact, it is suggested to do a check before
+ * calling it:
+ *
+ *        if (unlikely(RTE_VHOST_NEED_LOG(features)))
+ *                rte_vhost_log_write(vid, addr, len);
+ *
+ * @param vid
+ *  vhost device ID
+ * @param addr
+ *  the starting address for write
+ * @param len
+ *  the length to write
+ */
+void rte_vhost_log_write(int vid, uint64_t addr, uint64_t len);
+
+/**
+ * Log the used ring update start at given offset.
+ *
+ * Same as rte_vhost_log_write, it's suggested to do a check before
+ * calling it:
+ *
+ *        if (unlikely(RTE_VHOST_NEED_LOG(features)))
+ *                rte_vhost_log_used_vring(vid, vring_idx, offset, len);
+ *
+ * @param vid
+ *  vhost device ID
+ * @param vring_idx
+ *  the vring index
+ * @param offset
+ *  the offset inside the used ring
+ * @param len
+ *  the length to write
+ */
+void rte_vhost_log_used_vring(int vid, uint16_t vring_idx,
+			      uint64_t offset, uint64_t len);
+
+int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable);
+
+/**
+ * Register vhost driver. path could be different for multiple
+ * instance support.
+ */
+int rte_vhost_driver_register(const char *path, uint64_t flags);
+
+/* Unregister vhost driver. This is only meaningful to vhost user. */
+int rte_vhost_driver_unregister(const char *path);
+
+/**
+ * Set the feature bits the vhost-user driver supports.
+ *
+ * @param path
+ *  The vhost-user socket file path
+ * @return
+ *  0 on success, -1 on failure
+ */
+int rte_vhost_driver_set_features(const char *path, uint64_t features);
+
+/**
+ * Enable vhost-user driver features.
+ *
+ * Note that
+ * - the param @features should be a subset of the feature bits provided
+ *   by rte_vhost_driver_set_features().
+ * - it must be invoked before vhost-user negotiation starts.
+ *
+ * @param path
+ *  The vhost-user socket file path
+ * @param features
+ *  Features to enable
+ * @return
+ *  0 on success, -1 on failure
+ */
+int rte_vhost_driver_enable_features(const char *path, uint64_t features);
+
+/**
+ * Disable vhost-user driver features.
+ *
+ * The two notes at rte_vhost_driver_enable_features() also apply here.
+ *
+ * @param path
+ *  The vhost-user socket file path
+ * @param features
+ *  Features to disable
+ * @return
+ *  0 on success, -1 on failure
+ */
+int rte_vhost_driver_disable_features(const char *path, uint64_t features);
+
+/**
+ * Get the final feature bits for feature negotiation.
+ *
+ * @param path
+ *  The vhost-user socket file path
+ * @return
+ *  Feature bits on success, 0 on failure
+ */
+uint64_t rte_vhost_driver_get_features(const char *path);
+
+/**
+ * Get the feature bits after negotiation
+ *
+ * @param vid
+ *  Vhost device ID
+ * @return
+ *  Negotiated feature bits on success, 0 on failure
+ */
+uint64_t rte_vhost_get_negotiated_features(int vid);
+
+/* Register callbacks. */
+int rte_vhost_driver_callback_register(const char *path,
+	struct vhost_device_ops const * const ops);
+
+/**
+ *
+ * Start the vhost-user driver.
+ *
+ * This function triggers the vhost-user negotiation.
+ *
+ * @param path
+ *  The vhost-user socket file path
+ * @return
+ *  0 on success, -1 on failure
+ */
+int rte_vhost_driver_start(const char *path);
+
+/**
+ * Get the MTU value of the device if set in QEMU.
+ *
+ * @param vid
+ *  virtio-net device ID
+ * @param mtu
+ *  The variable to store the MTU value
+ *
+ * @return
+ *  0: success
+ *  -EAGAIN: device not yet started
+ *  -ENOTSUP: device does not support MTU feature
+ */
+int rte_vhost_get_mtu(int vid, uint16_t *mtu);
+
+/**
+ * Get the numa node from which the virtio net device's memory
+ * is allocated.
+ *
+ * @param vid
+ *  vhost device ID
+ *
+ * @return
+ *  The numa node, -1 on failure
+ */
+int rte_vhost_get_numa_node(int vid);
+
+/**
+ * @deprecated
+ * Get the number of queues the device supports.
+ *
+ * Note this function is deprecated, as it returns a queue pair number,
+ * which is vhost specific. Instead, rte_vhost_get_vring_num should
+ * be used.
+ *
+ * @param vid
+ *  vhost device ID
+ *
+ * @return
+ *  The number of queues, 0 on failure
+ */
+__rte_deprecated
+uint32_t rte_vhost_get_queue_num(int vid);
+
+/**
+ * Get the number of vrings the device supports.
+ *
+ * @param vid
+ *  vhost device ID
+ *
+ * @return
+ *  The number of vrings, 0 on failure
+ */
+uint16_t rte_vhost_get_vring_num(int vid);
+
+/**
+ * Get the virtio net device's ifname, which is the vhost-user socket
+ * file path.
+ *
+ * @param vid
+ *  vhost device ID
+ * @param buf
+ *  The buffer in which to store the queried ifname
+ * @param len
+ *  The length of buf
+ *
+ * @return
+ *  0 on success, -1 on failure
+ */
+int rte_vhost_get_ifname(int vid, char *buf, size_t len);
+
+/**
+ * Get how many avail entries are left in the queue
+ *
+ * @param vid
+ *  vhost device ID
+ * @param queue_id
+ *  virtio queue index
+ *
+ * @return
+ *  number of available entries left
+ */
+uint16_t rte_vhost_avail_entries(int vid, uint16_t queue_id);
+
+/**
+ * This function adds buffers to the virtio device's RX virtqueue. Buffers can
+ * be received from the physical port or from another virtual device. A packet
+ * count is returned to indicate the number of packets that were successfully
+ * added to the RX queue.
+ * @param vid
+ *  vhost device ID
+ * @param queue_id
+ *  virtio queue index in mq case
+ * @param pkts
+ *  array to contain packets to be enqueued
+ * @param count
+ *  number of packets to be enqueued
+ * @return
+ *  number of packets enqueued
+ */
+uint16_t rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
+	struct rte_mbuf **pkts, uint16_t count);
+
+/**
+ * This function gets guest buffers from the virtio device TX virtqueue,
+ * constructs host mbufs, copies guest buffer content to the mbufs and
+ * stores them in pkts to be processed.
+ * @param vid
+ *  vhost device ID
+ * @param queue_id
+ *  virtio queue index in mq case
+ * @param mbuf_pool
+ *  mbuf_pool where host mbuf is allocated.
+ * @param pkts
+ *  array to contain packets to be dequeued
+ * @param count
+ *  number of packets to be dequeued
+ * @return
+ *  number of packets dequeued
+ */
+uint16_t rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
+	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
+
+/**
+ * Get guest mem table: a list of memory regions.
+ *
+ * An rte_vhost_memory object will be allocated internally, to hold the
+ * guest memory regions. The application should free it at the
+ * destroy_device() callback.
+ *
+ * @param vid
+ *  vhost device ID
+ * @param mem
+ *  To store the returned mem regions
+ * @return
+ *  0 on success, -1 on failure
+ */
+int rte_vhost_get_mem_table(int vid, struct rte_vhost_memory **mem);
+
+/**
+ * Get guest vring info, including the vring address, vring size, etc.
+ *
+ * @param vid
+ *  vhost device ID
+ * @param vring_idx
+ *  vring index
+ * @param vring
+ *  the structure to hold the requested vring info
+ * @return
+ *  0 on success, -1 on failure
+ */
+int rte_vhost_get_vhost_vring(int vid, uint16_t vring_idx,
+			      struct rte_vhost_vring *vring);
+
+#endif /* _RTE_VHOST_H_ */
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
deleted file mode 100644
index 627708d..0000000
--- a/lib/librte_vhost/rte_virtio_net.h
+++ /dev/null
@@ -1,421 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _VIRTIO_NET_H_
-#define _VIRTIO_NET_H_
-
-/**
- * @file
- * Interface to vhost net
- */
-
-#include <stdint.h>
-#include <linux/vhost.h>
-#include <linux/virtio_ring.h>
-#include <sys/eventfd.h>
-
-#include <rte_memory.h>
-#include <rte_mempool.h>
-
-#define RTE_VHOST_USER_CLIENT		(1ULL << 0)
-#define RTE_VHOST_USER_NO_RECONNECT	(1ULL << 1)
-#define RTE_VHOST_USER_DEQUEUE_ZERO_COPY	(1ULL << 2)
-
-/**
- * Information relating to memory regions including offsets to
- * addresses in QEMUs memory file.
- */
-struct rte_vhost_mem_region {
-	uint64_t guest_phys_addr;
-	uint64_t guest_user_addr;
-	uint64_t host_user_addr;
-	uint64_t size;
-	void	 *mmap_addr;
-	uint64_t mmap_size;
-	int fd;
-};
-
-/**
- * Memory structure includes region and mapping information.
- */
-struct rte_vhost_memory {
-	uint32_t nregions;
-	struct rte_vhost_mem_region regions[0];
-};
-
-struct rte_vhost_vring {
-	struct vring_desc	*desc;
-	struct vring_avail	*avail;
-	struct vring_used	*used;
-	uint64_t		log_guest_addr;
-
-	int			callfd;
-	int			kickfd;
-	uint16_t		size;
-};
-
-/**
- * Device and vring operations.
- */
-struct vhost_device_ops {
-	int (*new_device)(int vid);		/**< Add device. */
-	void (*destroy_device)(int vid);	/**< Remove device. */
-
-	int (*vring_state_changed)(int vid, uint16_t queue_id, int enable);	/**< triggered when a vring is enabled or disabled */
-
-	/**
-	 * Features could be changed after the feature negotiation.
-	 * For example, VHOST_F_LOG_ALL will be set/cleared at the
-	 * start/end of live migration, respectively. This callback
-	 * is used to inform the application on such change.
-	 */
-	int (*features_changed)(int vid, uint64_t features);
-
-	void *reserved[4]; /**< Reserved for future extension */
-};
-
-/**
- * Convert guest physical address to host virtual address
- *
- * @param mem
- *  the guest memory regions
- * @param gpa
- *  the guest physical address for querying
- * @return
- *  the host virtual address on success, 0 on failure
- */
-static inline uint64_t __attribute__((always_inline))
-rte_vhost_gpa_to_vva(struct rte_vhost_memory *mem, uint64_t gpa)
-{
-	struct rte_vhost_mem_region *reg;
-	uint32_t i;
-
-	for (i = 0; i < mem->nregions; i++) {
-		reg = &mem->regions[i];
-		if (gpa >= reg->guest_phys_addr &&
-		    gpa <  reg->guest_phys_addr + reg->size) {
-			return gpa - reg->guest_phys_addr +
-			       reg->host_user_addr;
-		}
-	}
-
-	return 0;
-}
-
-#define RTE_VHOST_NEED_LOG(features)	((features) & (1ULL << VHOST_F_LOG_ALL))
-
-/**
- * Log the memory write start with given address.
- *
- * This function only need be invoked when the live migration starts.
- * Therefore, we won't need call it at all in the most of time. For
- * making the performance impact be minimum, it's suggested to do a
- * check before calling it:
- *
- *        if (unlikely(RTE_VHOST_NEED_LOG(features)))
- *                rte_vhost_log_write(vid, addr, len);
- *
- * @param vid
- *  vhost device ID
- * @param addr
- *  the starting address for write
- * @param len
- *  the length to write
- */
-void rte_vhost_log_write(int vid, uint64_t addr, uint64_t len);
-
-/**
- * Log the used ring update start at given offset.
- *
- * Same as rte_vhost_log_write, it's suggested to do a check before
- * calling it:
- *
- *        if (unlikely(RTE_VHOST_NEED_LOG(features)))
- *                rte_vhost_log_used_vring(vid, vring_idx, offset, len);
- *
- * @param vid
- *  vhost device ID
- * @param vring_idx
- *  the vring index
- * @param offset
- *  the offset inside the used ring
- * @param len
- *  the length to write
- */
-void rte_vhost_log_used_vring(int vid, uint16_t vring_idx,
-			      uint64_t offset, uint64_t len);
-
-int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable);
-
-/**
- * Register vhost driver. path could be different for multiple
- * instance support.
- */
-int rte_vhost_driver_register(const char *path, uint64_t flags);
-
-/* Unregister vhost driver. This is only meaningful to vhost user. */
-int rte_vhost_driver_unregister(const char *path);
-
-/**
- * Set the feature bits the vhost-user driver supports.
- *
- * @param path
- *  The vhost-user socket file path
- * @return
- *  0 on success, -1 on failure
- */
-int rte_vhost_driver_set_features(const char *path, uint64_t features);
-
-/**
- * Enable vhost-user driver features.
- *
- * Note that
- * - the param @features should be a subset of the feature bits provided
- *   by rte_vhost_driver_set_features().
- * - it must be invoked before vhost-user negotiation starts.
- *
- * @param path
- *  The vhost-user socket file path
- * @param features
- *  Features to enable
- * @return
- *  0 on success, -1 on failure
- */
-int rte_vhost_driver_enable_features(const char *path, uint64_t features);
-
-/**
- * Disable vhost-user driver features.
- *
- * The two notes at rte_vhost_driver_enable_features() also apply here.
- *
- * @param path
- *  The vhost-user socket file path
- * @param features
- *  Features to disable
- * @return
- *  0 on success, -1 on failure
- */
-int rte_vhost_driver_disable_features(const char *path, uint64_t features);
-
-/**
- * Get the final feature bits for feature negotiation.
- *
- * @param path
- *  The vhost-user socket file path
- * @return
- *  Feature bits on success, 0 on failure
- */
-uint64_t rte_vhost_driver_get_features(const char *path);
-
-/**
- * Get the feature bits after negotiation
- *
- * @param vid
- *  Vhost device ID
- * @return
- *  Negotiated feature bits on success, 0 on failure
- */
-uint64_t rte_vhost_get_negotiated_features(int vid);
-
-/* Register callbacks. */
-int rte_vhost_driver_callback_register(const char *path,
-	struct vhost_device_ops const * const ops);
-
-/**
- *
- * Start the vhost-user driver.
- *
- * This function triggers the vhost-user negotiation.
- *
- * @param path
- *  The vhost-user socket file path
- * @return
- *  0 on success, -1 on failure
- */
-int rte_vhost_driver_start(const char *path);
-
-/**
- * Get the MTU value of the device if set in QEMU.
- *
- * @param vid
- *  virtio-net device ID
- * @param mtu
- *  The variable to store the MTU value
- *
- * @return
- *  0: success
- *  -EAGAIN: device not yet started
- *  -ENOTSUP: device does not support MTU feature
- */
-int rte_vhost_get_mtu(int vid, uint16_t *mtu);
-
-/**
- * Get the numa node from which the virtio net device's memory
- * is allocated.
- *
- * @param vid
- *  vhost device ID
- *
- * @return
- *  The numa node, -1 on failure
- */
-int rte_vhost_get_numa_node(int vid);
-
-/**
- * @deprecated
- * Get the number of queues the device supports.
- *
- * Note this function is deprecated, as it returns a queue pair number,
- * which is vhost specific. Instead, rte_vhost_get_vring_num should
- * be used.
- *
- * @param vid
- *  vhost device ID
- *
- * @return
- *  The number of queues, 0 on failure
- */
-__rte_deprecated
-uint32_t rte_vhost_get_queue_num(int vid);
-
-/**
- * Get the number of vrings the device supports.
- *
- * @param vid
- *  vhost device ID
- *
- * @return
- *  The number of vrings, 0 on failure
- */
-uint16_t rte_vhost_get_vring_num(int vid);
-
-/**
- * Get the virtio net device's ifname, which is the vhost-user socket
- * file path.
- *
- * @param vid
- *  vhost device ID
- * @param buf
- *  The buffer to stored the queried ifname
- * @param len
- *  The length of buf
- *
- * @return
- *  0 on success, -1 on failure
- */
-int rte_vhost_get_ifname(int vid, char *buf, size_t len);
-
-/**
- * Get how many avail entries are left in the queue
- *
- * @param vid
- *  vhost device ID
- * @param queue_id
- *  virtio queue index
- *
- * @return
- *  num of avail entires left
- */
-uint16_t rte_vhost_avail_entries(int vid, uint16_t queue_id);
-
-/**
- * This function adds buffers to the virtio devices RX virtqueue. Buffers can
- * be received from the physical port or from another virtual device. A packet
- * count is returned to indicate the number of packets that were succesfully
- * added to the RX queue.
- * @param vid
- *  vhost device ID
- * @param queue_id
- *  virtio queue index in mq case
- * @param pkts
- *  array to contain packets to be enqueued
- * @param count
- *  packets num to be enqueued
- * @return
- *  num of packets enqueued
- */
-uint16_t rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
-	struct rte_mbuf **pkts, uint16_t count);
-
-/**
- * This function gets guest buffers from the virtio device TX virtqueue,
- * construct host mbufs, copies guest buffer content to host mbufs and
- * store them in pkts to be processed.
- * @param vid
- *  vhost device ID
- * @param queue_id
- *  virtio queue index in mq case
- * @param mbuf_pool
- *  mbuf_pool where host mbuf is allocated.
- * @param pkts
- *  array to contain packets to be dequeued
- * @param count
- *  packets num to be dequeued
- * @return
- *  num of packets dequeued
- */
-uint16_t rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
-	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
-
-/**
- * Get guest mem table: a list of memory regions.
- *
- * An rte_vhost_vhost_memory object will be allocated internaly, to hold the
- * guest memory regions. Application should free it at destroy_device()
- * callback.
- *
- * @param vid
- *  vhost device ID
- * @param mem
- *  To store the returned mem regions
- * @return
- *  0 on success, -1 on failure
- */
-int rte_vhost_get_mem_table(int vid, struct rte_vhost_memory **mem);
-
-/**
- * Get guest vring info, including the vring address, vring size, etc.
- *
- * @param vid
- *  vhost device ID
- * @param vring_idx
- *  vring index
- * @param vring
- *  the structure to hold the requested vring info
- * @return
- *  0 on success, -1 on failure
- */
-int rte_vhost_get_vhost_vring(int vid, uint16_t vring_idx,
-			      struct rte_vhost_vring *vring);
-
-#endif /* _VIRTIO_NET_H_ */
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 8be5b6a..3105a47 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -45,7 +45,7 @@
 #include <rte_string_fns.h>
 #include <rte_memory.h>
 #include <rte_malloc.h>
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 
 #include "vhost.h"
 
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index a199ee6..ddd8a9c 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -46,7 +46,7 @@
 #include <rte_log.h>
 #include <rte_ether.h>
 
-#include "rte_virtio_net.h"
+#include "rte_vhost.h"
 
 /* Used to indicate that the device is running on a data core */
 #define VIRTIO_DEV_RUNNING 1
diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 838dec8..2ba22db 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -37,7 +37,7 @@
 #include <stdint.h>
 #include <linux/vhost.h>
 
-#include "rte_virtio_net.h"
+#include "rte_vhost.h"
 
 /* refer to hw/virtio/vhost-user.c */
 
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 7ae7904..1004ae6 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -39,7 +39,7 @@
 #include <rte_memcpy.h>
 #include <rte_ether.h>
 #include <rte_ip.h>
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_sctp.h>
-- 
1.9.0

^ permalink raw reply	[relevance 5%]

* [dpdk-dev] [PATCH v2 18/22] vhost: introduce API to start a specific driver
  2017-03-23  7:10  4% ` [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API Yuanhan Liu
                     ` (5 preceding siblings ...)
  2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 14/22] vhost: rename device ops struct Yuanhan Liu
@ 2017-03-23  7:10  3%   ` Yuanhan Liu
  2017-03-23  7:10  5%   ` [dpdk-dev] [PATCH v2 19/22] vhost: rename header file Yuanhan Liu
  7 siblings, 0 replies; 200+ results
From: Yuanhan Liu @ 2017-03-23  7:10 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Harris James R, Liu Changpeng, Yuanhan Liu

We used to use rte_vhost_driver_session_start() to trigger the vhost-user
session. It takes no argument, thus it's a global trigger, and that could
be problematic.

The issue is that, currently, rte_vhost_driver_register(path, flags) actually
tries to put the socket into the session loop (by fdset_add). However, it
takes a set of APIs to set up a vhost-user driver properly:
  * rte_vhost_driver_register(path, flags);
  * rte_vhost_driver_set_features(path, features);
  * rte_vhost_driver_callback_register(path, vhost_device_ops);

If a new vhost-user driver is registered after the trigger (think of OVS-DPDK,
which could add a port dynamically from the cmdline), the current code will
effectively start the session for the new driver just after the first
API, rte_vhost_driver_register(), is invoked, leaving the later calls with
no effect at all.

To handle the case properly, this patch introduces a new API,
rte_vhost_driver_start(path), to trigger a specific vhost-user driver.
To do that, rte_vhost_driver_register(path, flags) is simplified to
only create the socket, and rte_vhost_driver_start(path) then actually
puts it into the session loop.

Meanwhile, rte_vhost_driver_session_start is removed: we can hide the
session thread internally (creating the thread if it has not been
created yet). This also simplifies the application.

NOTE: the API order in the prog guide is slightly adjusted to show the
correct invocation order.
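
For illustration, the resulting per-socket setup sequence looks like this
(a sketch; the socket path and "ops" variable are placeholders, and error
checks are omitted):

    rte_vhost_driver_register("/tmp/sock0", 0);
    rte_vhost_driver_set_features("/tmp/sock0", features);
    rte_vhost_driver_callback_register("/tmp/sock0", &ops);
    /* only now is the socket put into the session loop */
    rte_vhost_driver_start("/tmp/sock0");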

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
---
 doc/guides/prog_guide/vhost_lib.rst    | 24 +++++------
 doc/guides/rel_notes/release_17_05.rst |  8 ++++
 drivers/net/vhost/rte_eth_vhost.c      | 50 ++-------------------
 examples/tep_termination/main.c        |  8 +++-
 examples/vhost/main.c                  |  9 +++-
 lib/librte_vhost/fd_man.c              |  9 ++--
 lib/librte_vhost/fd_man.h              |  2 +-
 lib/librte_vhost/rte_vhost_version.map |  2 +-
 lib/librte_vhost/rte_virtio_net.h      | 15 ++++++-
 lib/librte_vhost/socket.c              | 79 +++++++++++++++++++---------------
 10 files changed, 104 insertions(+), 102 deletions(-)

diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index a4fb1f1..5979290 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -116,12 +116,6 @@ The following is an overview of some key Vhost API functions:
   vhost-user driver could be vhost-user net, yet it could be something else,
   say, vhost-user SCSI.
 
-* ``rte_vhost_driver_session_start()``
-
-  This function starts the vhost session loop to handle vhost messages. It
-  starts an infinite loop, therefore it should be called in a dedicated
-  thread.
-
 * ``rte_vhost_driver_callback_register(path, vhost_device_ops)``
 
   This function registers a set of callbacks, to let DPDK applications take
@@ -149,6 +143,17 @@ The following is an overview of some key Vhost API functions:
     ``VHOST_F_LOG_ALL`` will be set/cleared at the start/end of live
     migration, respectively.
 
+* ``rte_vhost_driver_disable/enable_features(path, features))``
+
+  This function disables/enables some features. For example, it can be used to
+  disable mergeable buffers and TSO features, which both are enabled by
+  default.
+
+* ``rte_vhost_driver_start(path)``
+
+  This function triggers the vhost-user negotiation. It should be invoked at
+  the end of initializing a vhost-user driver.
+
 * ``rte_vhost_enqueue_burst(vid, queue_id, pkts, count)``
 
   Transmits (enqueues) ``count`` packets from host to guest.
@@ -157,13 +162,6 @@ The following is an overview of some key Vhost API functions:
 
   Receives (dequeues) ``count`` packets from guest, and stored them at ``pkts``.
 
-* ``rte_vhost_driver_disable/enable_features(path, features))``
-
-  This function disables/enables some features. For example, it can be used to
-  disable mergeable buffers and TSO features, which both are enabled by
-  default.
-
-
 Vhost-user Implementations
 --------------------------
 
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 2efe292..8f06fc4 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -57,6 +57,11 @@ New Features
   * Enable Vhost PMD's MTU get feature.
   * Get max MTU value from host in Virtio PMD
 
+* **Made the vhost lib be a generic vhost-user lib.**
+
+  Now it could be used to implement any other vhost-user drivers, such
+  as, vhost-user SCSI.
+
 
 Resolved Issues
 ---------------
@@ -157,6 +162,9 @@ API Changes
    * The vhost struct ``virtio_net_device_ops`` is renamed to
      ``vhost_device_ops``
 
+   * The vhost API ``rte_vhost_driver_session_start`` is removed. Instead,
+     ``rte_vhost_driver_start`` should be used.
+
 
 ABI Changes
 -----------
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index 97a765f..e6c0758 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -127,9 +127,6 @@ struct internal_list {
 
 static pthread_mutex_t internal_list_lock = PTHREAD_MUTEX_INITIALIZER;
 
-static rte_atomic16_t nb_started_ports;
-static pthread_t session_th;
-
 static struct rte_eth_link pmd_link = {
 		.link_speed = 10000,
 		.link_duplex = ETH_LINK_FULL_DUPLEX,
@@ -743,42 +740,6 @@ struct vhost_xstats_name_off {
 	return vid;
 }
 
-static void *
-vhost_driver_session(void *param __rte_unused)
-{
-	/* start event handling */
-	rte_vhost_driver_session_start();
-
-	return NULL;
-}
-
-static int
-vhost_driver_session_start(void)
-{
-	int ret;
-
-	ret = pthread_create(&session_th,
-			NULL, vhost_driver_session, NULL);
-	if (ret)
-		RTE_LOG(ERR, PMD, "Can't create a thread\n");
-
-	return ret;
-}
-
-static void
-vhost_driver_session_stop(void)
-{
-	int ret;
-
-	ret = pthread_cancel(session_th);
-	if (ret)
-		RTE_LOG(ERR, PMD, "Can't cancel the thread\n");
-
-	ret = pthread_join(session_th, NULL);
-	if (ret)
-		RTE_LOG(ERR, PMD, "Can't join the thread\n");
-}
-
 static int
 eth_dev_start(struct rte_eth_dev *dev)
 {
@@ -1083,10 +1044,10 @@ struct vhost_xstats_name_off {
 		goto error;
 	}
 
-	/* We need only one message handling thread */
-	if (rte_atomic16_add_return(&nb_started_ports, 1) == 1) {
-		if (vhost_driver_session_start())
-			goto error;
+	if (rte_vhost_driver_start(iface_name) < 0) {
+		RTE_LOG(ERR, PMD, "Failed to start driver for %s\n",
+			iface_name);
+		goto error;
 	}
 
 	return data->port_id;
@@ -1213,9 +1174,6 @@ struct vhost_xstats_name_off {
 
 	eth_dev_close(eth_dev);
 
-	if (rte_atomic16_sub_return(&nb_started_ports, 1) == 0)
-		vhost_driver_session_stop();
-
 	rte_free(vring_states[eth_dev->data->port_id]);
 	vring_states[eth_dev->data->port_id] = NULL;
 
diff --git a/examples/tep_termination/main.c b/examples/tep_termination/main.c
index 738f2d2..24c62cd 100644
--- a/examples/tep_termination/main.c
+++ b/examples/tep_termination/main.c
@@ -1263,7 +1263,13 @@ static inline void __attribute__((always_inline))
 			"failed to register vhost driver callbacks.\n");
 	}
 
-	rte_vhost_driver_session_start();
+	if (rte_vhost_driver_start(dev_basename) < 0) {
+		rte_exit(EXIT_FAILURE,
+			"failed to start vhost driver.\n");
+	}
+
+	RTE_LCORE_FOREACH_SLAVE(lcore_id)
+		rte_eal_wait_lcore(lcore_id);
 
 	return 0;
 }
diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 4395306..64b3eea 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1545,9 +1545,16 @@ static inline void __attribute__((always_inline))
 			rte_exit(EXIT_FAILURE,
 				"failed to register vhost driver callbacks.\n");
 		}
+
+		if (rte_vhost_driver_start(file) < 0) {
+			rte_exit(EXIT_FAILURE,
+				"failed to start vhost driver.\n");
+		}
 	}
 
-	rte_vhost_driver_session_start();
+	RTE_LCORE_FOREACH_SLAVE(lcore_id)
+		rte_eal_wait_lcore(lcore_id);
+
 	return 0;
 
 }
diff --git a/lib/librte_vhost/fd_man.c b/lib/librte_vhost/fd_man.c
index c7a4490..2ceacc9 100644
--- a/lib/librte_vhost/fd_man.c
+++ b/lib/librte_vhost/fd_man.c
@@ -210,8 +210,8 @@
  * will wait until the flag is reset to zero(which indicates the callback is
  * finished), then it could free the context after fdset_del.
  */
-void
-fdset_event_dispatch(struct fdset *pfdset)
+void *
+fdset_event_dispatch(void *arg)
 {
 	int i;
 	struct pollfd *pfd;
@@ -221,9 +221,10 @@
 	int fd, numfds;
 	int remove1, remove2;
 	int need_shrink;
+	struct fdset *pfdset = arg;
 
 	if (pfdset == NULL)
-		return;
+		return NULL;
 
 	while (1) {
 
@@ -294,4 +295,6 @@
 		if (need_shrink)
 			fdset_shrink(pfdset);
 	}
+
+	return NULL;
 }
diff --git a/lib/librte_vhost/fd_man.h b/lib/librte_vhost/fd_man.h
index d319cac..90d34db 100644
--- a/lib/librte_vhost/fd_man.h
+++ b/lib/librte_vhost/fd_man.h
@@ -64,6 +64,6 @@ int fdset_add(struct fdset *pfdset, int fd,
 
 void *fdset_del(struct fdset *pfdset, int fd);
 
-void fdset_event_dispatch(struct fdset *pfdset);
+void *fdset_event_dispatch(void *arg);
 
 #endif
diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map
index 70c28f7..4395fa5 100644
--- a/lib/librte_vhost/rte_vhost_version.map
+++ b/lib/librte_vhost/rte_vhost_version.map
@@ -4,7 +4,6 @@ DPDK_2.0 {
 	rte_vhost_dequeue_burst;
 	rte_vhost_driver_callback_register;
 	rte_vhost_driver_register;
-	rte_vhost_driver_session_start;
 	rte_vhost_enable_guest_notification;
 	rte_vhost_enqueue_burst;
 
@@ -35,6 +34,7 @@ DPDK_17.05 {
 	rte_vhost_driver_enable_features;
 	rte_vhost_driver_get_features;
 	rte_vhost_driver_set_features;
+	rte_vhost_driver_start;
 	rte_vhost_get_mem_table;
 	rte_vhost_get_mtu;
 	rte_vhost_get_negotiated_features
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 11b204d..627708d 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -250,8 +250,19 @@ void rte_vhost_log_used_vring(int vid, uint16_t vring_idx,
 /* Register callbacks. */
 int rte_vhost_driver_callback_register(const char *path,
 	struct vhost_device_ops const * const ops);
-/* Start vhost driver session blocking loop. */
-int rte_vhost_driver_session_start(void);
+
+/**
+ *
+ * Start the vhost-user driver.
+ *
+ * This function triggers the vhost-user negotiation.
+ *
+ * @param path
+ *  The vhost-user socket file path
+ * @return
+ *  0 on success, -1 on failure
+ */
+int rte_vhost_driver_start(const char *path);
 
 /**
  * Get the MTU value of the device if set in QEMU.
diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 31b868d..b056a17 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -58,8 +58,9 @@
  */
 struct vhost_user_socket {
 	char *path;
-	int listenfd;
 	int connfd;
+	struct sockaddr_un un;
+	int socket_fd;
 	bool is_server;
 	bool reconnect;
 	bool dequeue_zero_copy;
@@ -94,7 +95,7 @@ struct vhost_user {
 
 static void vhost_user_server_new_connection(int fd, void *data, int *remove);
 static void vhost_user_read_cb(int fd, void *dat, int *remove);
-static int vhost_user_create_client(struct vhost_user_socket *vsocket);
+static int vhost_user_start_client(struct vhost_user_socket *vsocket);
 
 static struct vhost_user vhost_user = {
 	.fdset = {
@@ -266,22 +267,23 @@ struct vhost_user {
 		free(conn);
 
 		if (vsocket->reconnect)
-			vhost_user_create_client(vsocket);
+			vhost_user_start_client(vsocket);
 	}
 }
 
 static int
-create_unix_socket(const char *path, struct sockaddr_un *un, bool is_server)
+create_unix_socket(struct vhost_user_socket *vsocket)
 {
 	int fd;
+	struct sockaddr_un *un = &vsocket->un;
 
 	fd = socket(AF_UNIX, SOCK_STREAM, 0);
 	if (fd < 0)
 		return -1;
 	RTE_LOG(INFO, VHOST_CONFIG, "vhost-user %s: socket created, fd: %d\n",
-		is_server ? "server" : "client", fd);
+		vsocket->is_server ? "server" : "client", fd);
 
-	if (!is_server && fcntl(fd, F_SETFL, O_NONBLOCK)) {
+	if (!vsocket->is_server && fcntl(fd, F_SETFL, O_NONBLOCK)) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"vhost-user: can't set nonblocking mode for socket, fd: "
 			"%d (%s)\n", fd, strerror(errno));
@@ -291,25 +293,21 @@ struct vhost_user {
 
 	memset(un, 0, sizeof(*un));
 	un->sun_family = AF_UNIX;
-	strncpy(un->sun_path, path, sizeof(un->sun_path));
+	strncpy(un->sun_path, vsocket->path, sizeof(un->sun_path));
 	un->sun_path[sizeof(un->sun_path) - 1] = '\0';
 
-	return fd;
+	vsocket->socket_fd = fd;
+	return 0;
 }
 
 static int
-vhost_user_create_server(struct vhost_user_socket *vsocket)
+vhost_user_start_server(struct vhost_user_socket *vsocket)
 {
-	int fd;
 	int ret;
-	struct sockaddr_un un;
+	int fd = vsocket->socket_fd;
 	const char *path = vsocket->path;
 
-	fd = create_unix_socket(path, &un, vsocket->is_server);
-	if (fd < 0)
-		return -1;
-
-	ret = bind(fd, (struct sockaddr *)&un, sizeof(un));
+	ret = bind(fd, (struct sockaddr *)&vsocket->un, sizeof(vsocket->un));
 	if (ret < 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"failed to bind to %s: %s; remove it and try again\n",
@@ -322,7 +320,6 @@ struct vhost_user {
 	if (ret < 0)
 		goto err;
 
-	vsocket->listenfd = fd;
 	ret = fdset_add(&vhost_user.fdset, fd, vhost_user_server_new_connection,
 		  NULL, vsocket);
 	if (ret < 0) {
@@ -441,20 +438,15 @@ struct vhost_user_reconnect_list {
 }
 
 static int
-vhost_user_create_client(struct vhost_user_socket *vsocket)
+vhost_user_start_client(struct vhost_user_socket *vsocket)
 {
-	int fd;
 	int ret;
-	struct sockaddr_un un;
+	int fd = vsocket->socket_fd;
 	const char *path = vsocket->path;
 	struct vhost_user_reconnect *reconn;
 
-	fd = create_unix_socket(path, &un, vsocket->is_server);
-	if (fd < 0)
-		return -1;
-
-	ret = vhost_user_connect_nonblock(fd, (struct sockaddr *)&un,
-					  sizeof(un));
+	ret = vhost_user_connect_nonblock(fd, (struct sockaddr *)&vsocket->un,
+					  sizeof(vsocket->un));
 	if (ret == 0) {
 		vhost_user_add_connection(fd, vsocket);
 		return 0;
@@ -477,7 +469,7 @@ struct vhost_user_reconnect_list {
 		close(fd);
 		return -1;
 	}
-	reconn->un = un;
+	reconn->un = vsocket->un;
 	reconn->fd = fd;
 	reconn->vsocket = vsocket;
 	pthread_mutex_lock(&reconn_list.mutex);
@@ -627,11 +619,10 @@ struct vhost_user_reconnect_list {
 				goto out;
 			}
 		}
-		ret = vhost_user_create_client(vsocket);
 	} else {
 		vsocket->is_server = true;
-		ret = vhost_user_create_server(vsocket);
 	}
+	ret = create_unix_socket(vsocket);
 	if (ret < 0) {
 		free(vsocket->path);
 		free(vsocket);
@@ -687,8 +678,8 @@ struct vhost_user_reconnect_list {
 
 		if (!strcmp(vsocket->path, path)) {
 			if (vsocket->is_server) {
-				fdset_del(&vhost_user.fdset, vsocket->listenfd);
-				close(vsocket->listenfd);
+				fdset_del(&vhost_user.fdset, vsocket->socket_fd);
+				close(vsocket->socket_fd);
 				unlink(path);
 			} else if (vsocket->reconnect) {
 				vhost_user_remove_reconnect(vsocket);
@@ -751,8 +742,28 @@ struct vhost_device_ops const *
 }
 
 int
-rte_vhost_driver_session_start(void)
+rte_vhost_driver_start(const char *path)
 {
-	fdset_event_dispatch(&vhost_user.fdset);
-	return 0;
+	struct vhost_user_socket *vsocket;
+	static pthread_t fdset_tid;
+
+	pthread_mutex_lock(&vhost_user.mutex);
+	vsocket = find_vhost_user_socket(path);
+	pthread_mutex_unlock(&vhost_user.mutex);
+
+	if (!vsocket)
+		return -1;
+
+	if (fdset_tid == 0) {
+		int ret = pthread_create(&fdset_tid, NULL, fdset_event_dispatch,
+				     &vhost_user.fdset);
+		if (ret < 0)
+			RTE_LOG(ERR, VHOST_CONFIG,
+				"failed to create fdset handling thread");
+	}
+
+	if (vsocket->is_server)
+		return vhost_user_start_server(vsocket);
+	else
+		return vhost_user_start_client(vsocket);
 }
-- 
1.9.0

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2 14/22] vhost: rename device ops struct
  2017-03-23  7:10  4% ` [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API Yuanhan Liu
                     ` (4 preceding siblings ...)
  2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 13/22] vhost: do not include net specific headers Yuanhan Liu
@ 2017-03-23  7:10  4%   ` Yuanhan Liu
  2017-03-23  7:10  3%   ` [dpdk-dev] [PATCH v2 18/22] vhost: introduce API to start a specific driver Yuanhan Liu
  2017-03-23  7:10  5%   ` [dpdk-dev] [PATCH v2 19/22] vhost: rename header file Yuanhan Liu
  7 siblings, 0 replies; 200+ results
From: Yuanhan Liu @ 2017-03-23  7:10 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Harris James R, Liu Changpeng, Yuanhan Liu

rename "virtio_net_device_ops" to "vhost_device_ops", to not let it
be virtio-net specific.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 doc/guides/prog_guide/vhost_lib.rst    | 2 +-
 doc/guides/rel_notes/release_17_05.rst | 3 +++
 drivers/net/vhost/rte_eth_vhost.c      | 2 +-
 examples/tep_termination/main.c        | 2 +-
 examples/vhost/main.c                  | 2 +-
 lib/librte_vhost/Makefile              | 2 +-
 lib/librte_vhost/rte_virtio_net.h      | 4 ++--
 lib/librte_vhost/socket.c              | 6 +++---
 lib/librte_vhost/vhost.h               | 4 ++--
 9 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 40f3b3b..e6e34f3 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -122,7 +122,7 @@ The following is an overview of some key Vhost API functions:
   starts an infinite loop, therefore it should be called in a dedicated
   thread.
 
-* ``rte_vhost_driver_callback_register(path, virtio_net_device_ops)``
+* ``rte_vhost_driver_callback_register(path, vhost_device_ops)``
 
   This function registers a set of callbacks, to let DPDK applications take
   the appropriate action when some events happen. The following events are
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 2b56e80..2efe292 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -154,6 +154,9 @@ API Changes
      * ``linux/if.h``
      * ``rte_ether.h``
 
+   * The vhost struct ``virtio_net_device_ops`` is renamed to
+     ``vhost_device_ops``
+
 
 ABI Changes
 -----------
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index 891ee70..97a765f 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -671,7 +671,7 @@ struct vhost_xstats_name_off {
 	return 0;
 }
 
-static struct virtio_net_device_ops vhost_ops = {
+static struct vhost_device_ops vhost_ops = {
 	.new_device          = new_device,
 	.destroy_device      = destroy_device,
 	.vring_state_changed = vring_state_changed,
diff --git a/examples/tep_termination/main.c b/examples/tep_termination/main.c
index 18b977e..738f2d2 100644
--- a/examples/tep_termination/main.c
+++ b/examples/tep_termination/main.c
@@ -1081,7 +1081,7 @@ static inline void __attribute__((always_inline))
  * These callback allow devices to be added to the data core when configuration
  * has been fully complete.
  */
-static const struct virtio_net_device_ops virtio_net_device_ops = {
+static const struct vhost_device_ops virtio_net_device_ops = {
 	.new_device =  new_device,
 	.destroy_device = destroy_device,
 };
diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 72a9d69..4395306 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1270,7 +1270,7 @@ static inline void __attribute__((always_inline))
  * These callback allow devices to be added to the data core when configuration
  * has been fully complete.
  */
-static const struct virtio_net_device_ops virtio_net_device_ops =
+static const struct vhost_device_ops virtio_net_device_ops =
 {
 	.new_device =  new_device,
 	.destroy_device = destroy_device,
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 415ffc6..5cf4e93 100644
--- a/lib/librte_vhost/Makefile
+++ b/lib/librte_vhost/Makefile
@@ -36,7 +36,7 @@ LIB = librte_vhost.a
 
 EXPORT_MAP := rte_vhost_version.map
 
-LIBABIVER := 3
+LIBABIVER := 4
 
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -D_FILE_OFFSET_BITS=64
 CFLAGS += -I vhost_user
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 0063949..26ac35f 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -87,7 +87,7 @@ struct rte_vhost_vring {
 /**
  * Device and vring operations.
  */
-struct virtio_net_device_ops {
+struct vhost_device_ops {
 	int (*new_device)(int vid);		/**< Add device. */
 	void (*destroy_device)(int vid);	/**< Remove device. */
 
@@ -198,7 +198,7 @@ static inline uint64_t __attribute__((always_inline))
 
 /* Register callbacks. */
 int rte_vhost_driver_callback_register(const char *path,
-	struct virtio_net_device_ops const * const ops);
+	struct vhost_device_ops const * const ops);
 /* Start vhost driver session blocking loop. */
 int rte_vhost_driver_session_start(void);
 
diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 8431511..31b868d 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -74,7 +74,7 @@ struct vhost_user_socket {
 	uint64_t supported_features;
 	uint64_t features;
 
-	struct virtio_net_device_ops const *notify_ops;
+	struct vhost_device_ops const *notify_ops;
 };
 
 struct vhost_user_connection {
@@ -725,7 +725,7 @@ struct vhost_user_reconnect_list {
  */
 int
 rte_vhost_driver_callback_register(const char *path,
-	struct virtio_net_device_ops const * const ops)
+	struct vhost_device_ops const * const ops)
 {
 	struct vhost_user_socket *vsocket;
 
@@ -738,7 +738,7 @@ struct vhost_user_reconnect_list {
 	return vsocket ? 0 : -1;
 }
 
-struct virtio_net_device_ops const *
+struct vhost_device_ops const *
 vhost_driver_callback_get(const char *path)
 {
 	struct vhost_user_socket *vsocket;
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 672098b..225ff2e 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -191,7 +191,7 @@ struct virtio_net {
 	struct ether_addr	mac;
 	uint16_t		mtu;
 
-	struct virtio_net_device_ops const *notify_ops;
+	struct vhost_device_ops const *notify_ops;
 
 	uint32_t		nr_guest_pages;
 	uint32_t		max_guest_pages;
@@ -265,7 +265,7 @@ static inline phys_addr_t __attribute__((always_inline))
 void vhost_set_ifname(int, const char *if_name, unsigned int if_len);
 void vhost_enable_dequeue_zero_copy(int vid);
 
-struct virtio_net_device_ops const *vhost_driver_callback_get(const char *path);
+struct vhost_device_ops const *vhost_driver_callback_get(const char *path);
 
 /*
  * Backend-specific cleanup.
-- 
1.9.0

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 13/22] vhost: do not include net specific headers
  2017-03-23  7:10  4% ` [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API Yuanhan Liu
                     ` (3 preceding siblings ...)
  2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 12/22] vhost: drop the Rx and Tx queue macro Yuanhan Liu
@ 2017-03-23  7:10  4%   ` Yuanhan Liu
  2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 14/22] vhost: rename device ops struct Yuanhan Liu
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Yuanhan Liu @ 2017-03-23  7:10 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Harris James R, Liu Changpeng, Yuanhan Liu

Include them internally, in vhost.h.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---

v2: - update release note
---
 doc/guides/rel_notes/release_17_05.rst | 7 +++++++
 examples/vhost/main.h                  | 2 ++
 lib/librte_vhost/rte_virtio_net.h      | 4 ----
 lib/librte_vhost/vhost.h               | 4 ++++
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 55bf136..2b56e80 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -147,6 +147,13 @@ API Changes
      * ``VIRTIO_TXQ``
      * ``VIRTIO_QNUM``
 
+   * Few net specific header files are removed in ``rte_virtio_net.h``
+
+     * ``linux/virtio_net.h``
+     * ``sys/socket.h``
+     * ``linux/if.h``
+     * ``rte_ether.h``
+
 
 ABI Changes
 -----------
diff --git a/examples/vhost/main.h b/examples/vhost/main.h
index 7a3d251..ddcd858 100644
--- a/examples/vhost/main.h
+++ b/examples/vhost/main.h
@@ -36,6 +36,8 @@
 
 #include <sys/queue.h>
 
+#include <rte_ether.h>
+
 /* Macros for printing using RTE_LOG */
 #define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1
 #define RTE_LOGTYPE_VHOST_DATA   RTE_LOGTYPE_USER2
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 1ae1920..0063949 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -42,14 +42,10 @@
 #include <stdint.h>
 #include <linux/vhost.h>
 #include <linux/virtio_ring.h>
-#include <linux/virtio_net.h>
 #include <sys/eventfd.h>
-#include <sys/socket.h>
-#include <linux/if.h>
 
 #include <rte_memory.h>
 #include <rte_mempool.h>
-#include <rte_ether.h>
 
 #define RTE_VHOST_USER_CLIENT		(1ULL << 0)
 #define RTE_VHOST_USER_NO_RECONNECT	(1ULL << 1)
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 84e379a..672098b 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -39,8 +39,12 @@
 #include <sys/queue.h>
 #include <unistd.h>
 #include <linux/vhost.h>
+#include <linux/virtio_net.h>
+#include <sys/socket.h>
+#include <linux/if.h>
 
 #include <rte_log.h>
+#include <rte_ether.h>
 
 #include "rte_virtio_net.h"
 
-- 
1.9.0

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 12/22] vhost: drop the Rx and Tx queue macro
  2017-03-23  7:10  4% ` [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API Yuanhan Liu
                     ` (2 preceding siblings ...)
  2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 10/22] vhost: export the number of vrings Yuanhan Liu
@ 2017-03-23  7:10  4%   ` Yuanhan Liu
  2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 13/22] vhost: do not include net specific headers Yuanhan Liu
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Yuanhan Liu @ 2017-03-23  7:10 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Harris James R, Liu Changpeng, Yuanhan Liu

They are virtio-net specific and should be defined inside the virtio-net
driver.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---

v2: - update release note
---
 doc/guides/rel_notes/release_17_05.rst | 6 ++++++
 drivers/net/vhost/rte_eth_vhost.c      | 2 ++
 examples/tep_termination/main.h        | 2 ++
 examples/vhost/main.h                  | 2 ++
 lib/librte_vhost/rte_virtio_net.h      | 3 ---
 5 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index eca9451..55bf136 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -141,6 +141,12 @@ API Changes
    * The vhost API ``rte_vhost_get_queue_num`` is deprecated, instead,
      ``rte_vhost_get_vring_num`` should be used.
 
+   * Few macros are removed in ``rte_virtio_net.h``
+
+     * ``VIRTIO_RXQ``
+     * ``VIRTIO_TXQ``
+     * ``VIRTIO_QNUM``
+
 
 ABI Changes
 -----------
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index dc583e4..891ee70 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -45,6 +45,8 @@
 
 #include "rte_eth_vhost.h"
 
+enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
+
 #define ETH_VHOST_IFACE_ARG		"iface"
 #define ETH_VHOST_QUEUES_ARG		"queues"
 #define ETH_VHOST_CLIENT_ARG		"client"
diff --git a/examples/tep_termination/main.h b/examples/tep_termination/main.h
index c0ea766..8ed817d 100644
--- a/examples/tep_termination/main.h
+++ b/examples/tep_termination/main.h
@@ -54,6 +54,8 @@
 /* Max number of devices. Limited by the application. */
 #define MAX_DEVICES 64
 
+enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
+
 /* Per-device statistics struct */
 struct device_statistics {
 	uint64_t tx_total;
diff --git a/examples/vhost/main.h b/examples/vhost/main.h
index 6bb42e8..7a3d251 100644
--- a/examples/vhost/main.h
+++ b/examples/vhost/main.h
@@ -41,6 +41,8 @@
 #define RTE_LOGTYPE_VHOST_DATA   RTE_LOGTYPE_USER2
 #define RTE_LOGTYPE_VHOST_PORT   RTE_LOGTYPE_USER3
 
+enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
+
 struct device_statistics {
 	uint64_t	tx;
 	uint64_t	tx_total;
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index f700d2f..1ae1920 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -55,9 +55,6 @@
 #define RTE_VHOST_USER_NO_RECONNECT	(1ULL << 1)
 #define RTE_VHOST_USER_DEQUEUE_ZERO_COPY	(1ULL << 2)
 
-/* Enum for virtqueue management. */
-enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
-
 /**
  * Information relating to memory regions including offsets to
  * addresses in QEMUs memory file.
-- 
1.9.0

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 10/22] vhost: export the number of vrings
  2017-03-23  7:10  4% ` [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API Yuanhan Liu
  2017-03-23  7:10  5%   ` [dpdk-dev] [PATCH v2 02/22] net/vhost: remove feature related APIs Yuanhan Liu
  2017-03-23  7:10  3%   ` [dpdk-dev] [PATCH v2 04/22] vhost: make notify ops per vhost driver Yuanhan Liu
@ 2017-03-23  7:10  4%   ` Yuanhan Liu
  2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 12/22] vhost: drop the Rx and Tx queue macro Yuanhan Liu
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Yuanhan Liu @ 2017-03-23  7:10 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Harris James R, Liu Changpeng, Yuanhan Liu

We used to use rte_vhost_get_queue_num() to tell how many vrings there
are. However, the return value is the number of "queue pairs", which is
very virtio-net specific. To make it generic, we should return the
number of vrings instead, and let the driver do the proper translation.
Say, the virtio-net driver could turn it into the number of queue pairs
by dividing by 2.

Meanwhile, mark rte_vhost_get_queue_num as deprecated.
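
For example, a virtio-net driver built on top of this would do the
translation itself, along these lines (a sketch; the variable names are
mine):

    uint16_t nr_vrings = rte_vhost_get_vring_num(vid);
    uint32_t nr_queue_pairs = nr_vrings / 2; /* one Rx + one Tx vring per pair */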

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
v2: - update release note
---
 doc/guides/rel_notes/release_17_05.rst |  3 +++
 drivers/net/vhost/rte_eth_vhost.c      |  2 +-
 lib/librte_vhost/rte_vhost_version.map |  1 +
 lib/librte_vhost/rte_virtio_net.h      | 17 +++++++++++++++++
 lib/librte_vhost/vhost.c               | 11 +++++++++++
 5 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index dfa636d..eca9451 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -138,6 +138,9 @@ API Changes
    * The vhost API ``rte_vhost_driver_callback_register(ops)`` takes one
      more argument: ``rte_vhost_driver_callback_register(path, ops)``.
 
+   * The vhost API ``rte_vhost_get_queue_num`` is deprecated, instead,
+     ``rte_vhost_get_vring_num`` should be used.
+
 
 ABI Changes
 -----------
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index f6ad616..dc583e4 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -569,7 +569,7 @@ struct vhost_xstats_name_off {
 		vq->port = eth_dev->data->port_id;
 	}
 
-	for (i = 0; i < rte_vhost_get_queue_num(vid) * VIRTIO_QNUM; i++)
+	for (i = 0; i < rte_vhost_get_vring_num(vid); i++)
 		rte_vhost_enable_guest_notification(vid, i, 0);
 
 	rte_vhost_get_mtu(vid, &eth_dev->data->mtu);
diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map
index 7df7af6..ff62c39 100644
--- a/lib/librte_vhost/rte_vhost_version.map
+++ b/lib/librte_vhost/rte_vhost_version.map
@@ -40,6 +40,7 @@ DPDK_17.05 {
 	rte_vhost_get_negotiated_features
 	rte_vhost_get_vhost_memory;
 	rte_vhost_get_vhost_vring;
+	rte_vhost_get_vring_num;
 	rte_vhost_gpa_to_vva;
 
 } DPDK_16.07;
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 36674bb..f700d2f 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -237,17 +237,34 @@ int rte_vhost_driver_callback_register(const char *path,
 int rte_vhost_get_numa_node(int vid);
 
 /**
+ * @deprecated
  * Get the number of queues the device supports.
  *
+ * Note this function is deprecated, as it returns a queue pair number,
+ * which is virtio-net specific. Instead, rte_vhost_get_vring_num should
+ * be used.
+ *
  * @param vid
  *  virtio-net device ID
  *
  * @return
  *  The number of queues, 0 on failure
  */
+__rte_deprecated
 uint32_t rte_vhost_get_queue_num(int vid);
 
 /**
+ * Get the number of vrings the device supports.
+ *
+ * @param vid
+ *  vhost device ID
+ *
+ * @return
+ *  The number of vrings, 0 on failure
+ */
+uint16_t rte_vhost_get_vring_num(int vid);
+
+/**
  * Get the virtio net device's ifname, which is the vhost-user socket
  * file path.
  *
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 70477c6..74ae3b2 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -317,6 +317,17 @@ struct virtio_net *
 	return dev->nr_vring / 2;
 }
 
+uint16_t
+rte_vhost_get_vring_num(int vid)
+{
+	struct virtio_net *dev = get_device(vid);
+
+	if (dev == NULL)
+		return 0;
+
+	return dev->nr_vring;
+}
+
 int
 rte_vhost_get_ifname(int vid, char *buf, size_t len)
 {
-- 
1.9.0

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 04/22] vhost: make notify ops per vhost driver
  2017-03-23  7:10  4% ` [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API Yuanhan Liu
  2017-03-23  7:10  5%   ` [dpdk-dev] [PATCH v2 02/22] net/vhost: remove feature related APIs Yuanhan Liu
@ 2017-03-23  7:10  3%   ` Yuanhan Liu
  2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 10/22] vhost: export the number of vrings Yuanhan Liu
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Yuanhan Liu @ 2017-03-23  7:10 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Harris James R, Liu Changpeng, Yuanhan Liu

Assume there is an application that supports both vhost-user net and
vhost-user SCSI; the callbacks should then be different. Making the
notify ops per vhost driver allows the application to define a different
set of callbacks for each driver.
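
For illustration, an application serving both could then register a
different ops set per socket (the paths and ops names below are
hypothetical):

    rte_vhost_driver_callback_register("/tmp/vhost-net.sock", &net_ops);
    rte_vhost_driver_callback_register("/tmp/vhost-scsi.sock", &scsi_ops);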

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
---

v2: - check the return value of callback_register and callback_get
    - update release note
---
 doc/guides/prog_guide/vhost_lib.rst    |  2 +-
 doc/guides/rel_notes/release_17_05.rst |  3 +++
 drivers/net/vhost/rte_eth_vhost.c      | 20 +++++++++++---------
 examples/tep_termination/main.c        |  7 ++++++-
 examples/vhost/main.c                  |  9 +++++++--
 lib/librte_vhost/rte_virtio_net.h      |  3 ++-
 lib/librte_vhost/socket.c              | 32 ++++++++++++++++++++++++++++++++
 lib/librte_vhost/vhost.c               | 16 +---------------
 lib/librte_vhost/vhost.h               |  5 ++++-
 lib/librte_vhost/vhost_user.c          | 22 ++++++++++++++++------
 10 files changed, 83 insertions(+), 36 deletions(-)

diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 6a4d206..40f3b3b 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -122,7 +122,7 @@ The following is an overview of some key Vhost API functions:
   starts an infinite loop, therefore it should be called in a dedicated
   thread.
 
-* ``rte_vhost_driver_callback_register(virtio_net_device_ops)``
+* ``rte_vhost_driver_callback_register(path, virtio_net_device_ops)``
 
   This function registers a set of callbacks, to let DPDK applications take
   the appropriate action when some events happen. The following events are
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 4e405b1..dfa636d 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -135,6 +135,9 @@ API Changes
      * ``rte_eth_vhost_feature_enable``
      * ``rte_eth_vhost_feature_get``
 
+   * The vhost API ``rte_vhost_driver_callback_register(ops)`` takes one
+     more argument: ``rte_vhost_driver_callback_register(path, ops)``.
+
 
 ABI Changes
 -----------
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index 83063c2..f6ad616 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -669,6 +669,12 @@ struct vhost_xstats_name_off {
 	return 0;
 }
 
+static struct virtio_net_device_ops vhost_ops = {
+	.new_device          = new_device,
+	.destroy_device      = destroy_device,
+	.vring_state_changed = vring_state_changed,
+};
+
 int
 rte_eth_vhost_get_queue_event(uint8_t port_id,
 		struct rte_eth_vhost_queue_event *event)
@@ -738,15 +744,6 @@ struct vhost_xstats_name_off {
 static void *
 vhost_driver_session(void *param __rte_unused)
 {
-	static struct virtio_net_device_ops vhost_ops;
-
-	/* set vhost arguments */
-	vhost_ops.new_device = new_device;
-	vhost_ops.destroy_device = destroy_device;
-	vhost_ops.vring_state_changed = vring_state_changed;
-	if (rte_vhost_driver_callback_register(&vhost_ops) < 0)
-		RTE_LOG(ERR, PMD, "Can't register callbacks\n");
-
 	/* start event handling */
 	rte_vhost_driver_session_start();
 
@@ -1079,6 +1076,11 @@ struct vhost_xstats_name_off {
 	if (rte_vhost_driver_register(iface_name, flags))
 		goto error;
 
+	if (rte_vhost_driver_callback_register(iface_name, &vhost_ops) < 0) {
+		RTE_LOG(ERR, PMD, "Can't register callbacks\n");
+		goto error;
+	}
+
 	/* We need only one message handling thread */
 	if (rte_atomic16_add_return(&nb_started_ports, 1) == 1) {
 		if (vhost_driver_session_start())
diff --git a/examples/tep_termination/main.c b/examples/tep_termination/main.c
index 8097dcd..18b977e 100644
--- a/examples/tep_termination/main.c
+++ b/examples/tep_termination/main.c
@@ -1256,7 +1256,12 @@ static inline void __attribute__((always_inline))
 	rte_vhost_driver_disable_features(dev_basename,
 		1ULL << VIRTIO_NET_F_MRG_RXBUF);
 
-	rte_vhost_driver_callback_register(&virtio_net_device_ops);
+	ret = rte_vhost_driver_callback_register(dev_basename,
+		&virtio_net_device_ops);
+	if (ret != 0) {
+		rte_exit(EXIT_FAILURE,
+			"failed to register vhost driver callbacks.\n");
+	}
 
 	rte_vhost_driver_session_start();
 
diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 972a6a8..72a9d69 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1538,9 +1538,14 @@ static inline void __attribute__((always_inline))
 			rte_vhost_driver_enable_features(file,
 				1ULL << VIRTIO_NET_F_CTRL_RX);
 		}
-	}
 
-	rte_vhost_driver_callback_register(&virtio_net_device_ops);
+		ret = rte_vhost_driver_callback_register(file,
+			&virtio_net_device_ops);
+		if (ret != 0) {
+			rte_exit(EXIT_FAILURE,
+				"failed to register vhost driver callbacks.\n");
+		}
+	}
 
 	rte_vhost_driver_session_start();
 	return 0;
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 5dadd3d..67bd125 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -133,7 +133,8 @@ struct virtio_net_device_ops {
 uint64_t rte_vhost_driver_get_features(const char *path);
 
 /* Register callbacks. */
-int rte_vhost_driver_callback_register(struct virtio_net_device_ops const * const);
+int rte_vhost_driver_callback_register(const char *path,
+	struct virtio_net_device_ops const * const ops);
 /* Start vhost driver session blocking loop. */
 int rte_vhost_driver_session_start(void);
 
diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index bbb4112..8431511 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -73,6 +73,8 @@ struct vhost_user_socket {
 	 */
 	uint64_t supported_features;
 	uint64_t features;
+
+	struct virtio_net_device_ops const *notify_ops;
 };
 
 struct vhost_user_connection {
@@ -718,6 +720,36 @@ struct vhost_user_reconnect_list {
 	return -1;
 }
 
+/*
+ * Register ops so that we can add/remove device to data core.
+ */
+int
+rte_vhost_driver_callback_register(const char *path,
+	struct virtio_net_device_ops const * const ops)
+{
+	struct vhost_user_socket *vsocket;
+
+	pthread_mutex_lock(&vhost_user.mutex);
+	vsocket = find_vhost_user_socket(path);
+	if (vsocket)
+		vsocket->notify_ops = ops;
+	pthread_mutex_unlock(&vhost_user.mutex);
+
+	return vsocket ? 0 : -1;
+}
+
+struct virtio_net_device_ops const *
+vhost_driver_callback_get(const char *path)
+{
+	struct vhost_user_socket *vsocket;
+
+	pthread_mutex_lock(&vhost_user.mutex);
+	vsocket = find_vhost_user_socket(path);
+	pthread_mutex_unlock(&vhost_user.mutex);
+
+	return vsocket ? vsocket->notify_ops : NULL;
+}
+
 int
 rte_vhost_driver_session_start(void)
 {
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 7b40a92..7d7bb3c 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -51,9 +51,6 @@
 
 struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
 
-/* device ops to add/remove device to/from data core. */
-struct virtio_net_device_ops const *notify_ops;
-
 struct virtio_net *
 get_device(int vid)
 {
@@ -253,7 +250,7 @@ struct virtio_net *
 
 	if (dev->flags & VIRTIO_DEV_RUNNING) {
 		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(vid);
+		dev->notify_ops->destroy_device(vid);
 	}
 
 	cleanup_device(dev, 1);
@@ -396,14 +393,3 @@ struct virtio_net *
 	dev->virtqueue[queue_id]->used->flags = VRING_USED_F_NO_NOTIFY;
 	return 0;
 }
-
-/*
- * Register ops so that we can add/remove device to data core.
- */
-int
-rte_vhost_driver_callback_register(struct virtio_net_device_ops const * const ops)
-{
-	notify_ops = ops;
-
-	return 0;
-}
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 692691b..6186216 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -185,6 +185,8 @@ struct virtio_net {
 	struct ether_addr	mac;
 	uint16_t		mtu;
 
+	struct virtio_net_device_ops const *notify_ops;
+
 	uint32_t		nr_guest_pages;
 	uint32_t		max_guest_pages;
 	struct guest_page       *guest_pages;
@@ -288,7 +290,6 @@ static inline phys_addr_t __attribute__((always_inline))
 	return 0;
 }
 
-struct virtio_net_device_ops const *notify_ops;
 struct virtio_net *get_device(int vid);
 
 int vhost_new_device(void);
@@ -301,6 +302,8 @@ static inline phys_addr_t __attribute__((always_inline))
 void vhost_set_ifname(int, const char *if_name, unsigned int if_len);
 void vhost_enable_dequeue_zero_copy(int vid);
 
+struct virtio_net_device_ops const *vhost_driver_callback_get(const char *path);
+
 /*
  * Backend-specific cleanup.
  *
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index d630098..0cadd79 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -135,7 +135,7 @@
 {
 	if (dev->flags & VIRTIO_DEV_RUNNING) {
 		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(dev->vid);
+		dev->notify_ops->destroy_device(dev->vid);
 	}
 
 	cleanup_device(dev, 0);
@@ -503,7 +503,7 @@
 	/* Remove from the data plane. */
 	if (dev->flags & VIRTIO_DEV_RUNNING) {
 		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(dev->vid);
+		dev->notify_ops->destroy_device(dev->vid);
 	}
 
 	if (dev->mem) {
@@ -687,7 +687,7 @@
 						"dequeue zero copy is enabled\n");
 			}
 
-			if (notify_ops->new_device(dev->vid) == 0)
+			if (dev->notify_ops->new_device(dev->vid) == 0)
 				dev->flags |= VIRTIO_DEV_RUNNING;
 		}
 	}
@@ -721,7 +721,7 @@
 	/* We have to stop the queue (virtio) if it is running. */
 	if (dev->flags & VIRTIO_DEV_RUNNING) {
 		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(dev->vid);
+		dev->notify_ops->destroy_device(dev->vid);
 	}
 
 	dev->flags &= ~VIRTIO_DEV_READY;
@@ -763,8 +763,8 @@
 		"set queue enable: %d to qp idx: %d\n",
 		enable, state->index);
 
-	if (notify_ops->vring_state_changed)
-		notify_ops->vring_state_changed(dev->vid, state->index, enable);
+	if (dev->notify_ops->vring_state_changed)
+		dev->notify_ops->vring_state_changed(dev->vid, state->index, enable);
 
 	dev->virtqueue[state->index]->enabled = enable;
 
@@ -978,6 +978,16 @@
 	if (dev == NULL)
 		return -1;
 
+	if (!dev->notify_ops) {
+		dev->notify_ops = vhost_driver_callback_get(dev->ifname);
+		if (!dev->notify_ops) {
+			RTE_LOG(ERR, VHOST_CONFIG,
+				"failed to get callback ops for driver %s\n",
+				dev->ifname);
+			return -1;
+		}
+	}
+
 	ret = read_vhost_message(fd, &msg);
 	if (ret <= 0 || msg.request >= VHOST_USER_MAX) {
 		if (ret < 0)
-- 
1.9.0

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2 02/22] net/vhost: remove feature related APIs
  2017-03-23  7:10  4% ` [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API Yuanhan Liu
@ 2017-03-23  7:10  5%   ` Yuanhan Liu
  2017-03-23  7:10  3%   ` [dpdk-dev] [PATCH v2 04/22] vhost: make notify ops per vhost driver Yuanhan Liu
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Yuanhan Liu @ 2017-03-23  7:10 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Harris James R, Liu Changpeng, Yuanhan Liu

The rte_eth_vhost_feature_disable/enable/get APIs are just wrappers of
rte_vhost_feature_disable/enable/get. However, the latter are going to
be refactored; they are going to take an extra parameter (the socket_file
path), to make them per-device.

Instead of changing those vhost-pmd APIs to adapt to the new vhost APIs,
we could simply remove them, and let vdev options serve this purpose.
After all, vdev options are better for disabling/enabling some features.
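
As a purely hypothetical illustration (only the iface, queues and client
options exist at this point; the "tso" knob below is made up), feature
tuning via vdev options could look like:

    --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1,tso=0'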

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---

v2: - write more informative commit log on why they are removed.
    - update release note
---
 doc/guides/rel_notes/release_17_05.rst      |  7 +++++++
 drivers/net/vhost/rte_eth_vhost.c           | 25 ------------------------
 drivers/net/vhost/rte_eth_vhost.h           | 30 -----------------------------
 drivers/net/vhost/rte_pmd_vhost_version.map |  3 ---
 4 files changed, 7 insertions(+), 58 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index bb64428..4e405b1 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -129,6 +129,13 @@ API Changes
 * The LPM ``next_hop`` field is extended from 8 bits to 21 bits for IPv6
   while keeping ABI compatibility.
 
+   * The following vhost-pmd APIs are removed
+
+     * ``rte_eth_vhost_feature_disable``
+     * ``rte_eth_vhost_feature_enable``
+     * ``rte_eth_vhost_feature_get``
+
+
 ABI Changes
 -----------
 
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index a4435da..83063c2 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -965,31 +965,6 @@ struct vhost_xstats_name_off {
 	return 0;
 }
 
-/**
- * Disable features in feature_mask. Returns 0 on success.
- */
-int
-rte_eth_vhost_feature_disable(uint64_t feature_mask)
-{
-	return rte_vhost_feature_disable(feature_mask);
-}
-
-/**
- * Enable features in feature_mask. Returns 0 on success.
- */
-int
-rte_eth_vhost_feature_enable(uint64_t feature_mask)
-{
-	return rte_vhost_feature_enable(feature_mask);
-}
-
-/* Returns currently supported vhost features */
-uint64_t
-rte_eth_vhost_feature_get(void)
-{
-	return rte_vhost_feature_get();
-}
-
 static const struct eth_dev_ops ops = {
 	.dev_start = eth_dev_start,
 	.dev_stop = eth_dev_stop,
diff --git a/drivers/net/vhost/rte_eth_vhost.h b/drivers/net/vhost/rte_eth_vhost.h
index 7c98b1a..ea4bce4 100644
--- a/drivers/net/vhost/rte_eth_vhost.h
+++ b/drivers/net/vhost/rte_eth_vhost.h
@@ -43,36 +43,6 @@
 
 #include <rte_virtio_net.h>
 
-/**
- * Disable features in feature_mask.
- *
- * @param feature_mask
- *  Vhost features defined in "linux/virtio_net.h".
- * @return
- *  - On success, zero.
- *  - On failure, a negative value.
- */
-int rte_eth_vhost_feature_disable(uint64_t feature_mask);
-
-/**
- * Enable features in feature_mask.
- *
- * @param feature_mask
- *  Vhost features defined in "linux/virtio_net.h".
- * @return
- *  - On success, zero.
- *  - On failure, a negative value.
- */
-int rte_eth_vhost_feature_enable(uint64_t feature_mask);
-
-/**
- * Returns currently supported vhost features.
- *
- * @return
- *  Vhost features defined in "linux/virtio_net.h".
- */
-uint64_t rte_eth_vhost_feature_get(void);
-
 /*
  * Event description.
  */
diff --git a/drivers/net/vhost/rte_pmd_vhost_version.map b/drivers/net/vhost/rte_pmd_vhost_version.map
index 3d44083..695db85 100644
--- a/drivers/net/vhost/rte_pmd_vhost_version.map
+++ b/drivers/net/vhost/rte_pmd_vhost_version.map
@@ -1,9 +1,6 @@
 DPDK_16.04 {
 	global:
 
-	rte_eth_vhost_feature_disable;
-	rte_eth_vhost_feature_enable;
-	rte_eth_vhost_feature_get;
 	rte_eth_vhost_get_queue_event;
 
 	local: *;
-- 
1.9.0

^ permalink raw reply	[relevance 5%]

* [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API
  2017-03-03  9:51  4% [dpdk-dev] [PATCH 00/17] vhost: generic vhost API Yuanhan Liu
  2017-03-03  9:51  3% ` [dpdk-dev] [PATCH 16/17] vhost: rename header file Yuanhan Liu
@ 2017-03-23  7:10  4% ` Yuanhan Liu
  2017-03-23  7:10  5%   ` [dpdk-dev] [PATCH v2 02/22] net/vhost: remove feature related APIs Yuanhan Liu
                     ` (7 more replies)
  1 sibling, 8 replies; 200+ results
From: Yuanhan Liu @ 2017-03-23  7:10 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Harris James R, Liu Changpeng, Yuanhan Liu

This patchset makes the DPDK vhost library generic enough that we could
build other vhost-user drivers on top of it. For example, SPDK (Storage
Performance Development Kit) is trying to enable vhost-user SCSI.

The basic idea is to let DPDK vhost be a vhost-user agent. It stores all
the info about the virtio device (i.e. vring addresses, negotiated
features, etc.) and lets the specific vhost-user driver fetch it (via the
APIs provided by the DPDK vhost lib). With that info provided, the
vhost-user driver can then get/put vring entries and thus exchange data
between the guest and host.
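
For illustration, here is a minimal sketch (mine, not part of the series;
error handling simplified) of a driver fetching that info from its
new_device() callback:

    static int
    new_device(int vid)
    {
        struct rte_vhost_memory *mem;
        struct rte_vhost_vring vring;
        uint16_t i;

        if (rte_vhost_get_mem_table(vid, &mem) < 0)
            return -1;
        /* keep "mem" around; free it at destroy_device() time */

        for (i = 0; i < rte_vhost_get_vring_num(vid); i++) {
            if (rte_vhost_get_vhost_vring(vid, i, &vring) < 0)
                return -1;
            /* vring.desc/avail/used can now be accessed directly */
        }
        return 0;
    }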

The last patch demonstrates how to use these new APIs to implement a
very simple vhost-user net driver, without any fancy features enabled.


Major API/ABI Changes summary
=============================

- some renames
  * "struct virtio_net_device_ops" ==> "struct vhost_device_ops"
  * "rte_virtio_net.h"  ==> "rte_vhost.h"

- driver related APIs are bound to the socket file
  * rte_vhost_driver_set_features(socket_file, features);
  * rte_vhost_driver_get_features(socket_file, features);
  * rte_vhost_driver_enable_features(socket_file, features)
  * rte_vhost_driver_disable_features(socket_file, features)
  * rte_vhost_driver_callback_register(socket_file, notify_ops);
  * rte_vhost_driver_start(socket_file);
    This function replaces rte_vhost_driver_session_start(). Check patch
    18 for more information.

- new APIs to fetch guest and vring info
  * rte_vhost_get_mem_table(vid, mem);
  * rte_vhost_get_negotiated_features(vid);
  * rte_vhost_get_vhost_vring(vid, vring_idx, vring);

- new exported structures 
  * struct rte_vhost_vring
  * struct rte_vhost_mem_region
  * struct rte_vhost_memory

- a new device ops callback: features_changed().


Change log
==========

v2: - rebase
    - updated release note
    - updated API comments
    - renamed rte_vhost_get_vhost_memory to rte_vhost_get_mem_table

    - added a new device callback: features_changed(), basically for live
      migration support
    - introduced rte_vhost_driver_start() to start a specific driver
    - misc fixes


Some design choices
===================

While making this patchset, I met quite a few design choices, and here are
two of them, with the issue and the reason for each choice provided.
Please let me know if you have any comments (or better ideas).

Export public structures or not
-------------------------------

I made an ABI refactor last time (v16.07): moving all the structures
internally and letting applications use a "vid" to reference the internal
struct. With that, I hoped we would never have to worry about the annoying
ABI issues again.

It has worked great (and as expected) since then, as long as we only
support virtio-net and as long as we can handle all the descs inside the
vhost lib. It becomes problematic when a user wants to implement a
vhost-user driver elsewhere. For example, it needs to do the GPA to VVA
translation. Without any structs exported, functions like gpa_to_vva()
can't be inlined. Calling such a function would be costly, especially as
it's one we have to invoke for processing each vring desc.

For that reason, the guest memory regions are exported. With that,
gpa_to_vva() can be inlined.
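
A sketch of what that inlined translation looks like over the exported
regions (assuming the region fields exported by this series:
guest_phys_addr, host_user_addr and size):

    static inline uint64_t
    gpa_to_vva(struct rte_vhost_memory *mem, uint64_t gpa)
    {
        uint32_t i;

        for (i = 0; i < mem->nregions; i++) {
            struct rte_vhost_mem_region *reg = &mem->regions[i];

            if (gpa >= reg->guest_phys_addr &&
                gpa <  reg->guest_phys_addr + reg->size)
                return gpa - reg->guest_phys_addr +
                       reg->host_user_addr;
        }
        return 0; /* no mapping found */
    }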

  
Add helper functions to fetch/update descs or not
-------------------------------------------------

I intended to do it this way: introduce one function to get @count
descs from a specific vring and another one to update the used descs.
It's something like:
    rte_vhost_vring_get_descs(vid, vring_idx, count, offset, iov, descs);
    rte_vhost_vring_update_used_descs(vid, vring_idx, count, offset, descs);

With that, the vhost-user driver programmer's task would be easier, as
he/she wouldn't have to parse the descs any more (e.g., to handle an
indirect desc).

But judging that virtio 1.1 has just emerged and proposes a completely
new ring layout, and, most importantly, that the vring desc structure is
also changed, I'd like to hold off introducing these two functions.
Otherwise, it's very likely both will become invalid when virtio 1.1 is
out. Though I think it could be addressed with a careful design,
something like making the IOV generic enough:

	struct rte_vhost_iov {
		uint64_t	gpa;
		uint64_t	vva;
		uint64_t	len;
	};

Instead, I went the other way: introduce a few APIs to export all the
vring info (vring size, vring addr, callfd, etc), and let the vhost-user
driver read and update the descs. That info could be passed to the
vhost-user driver by introducing one API for each field, but to save a
few APIs and reduce the number of calls the programmer has to make, I
packed a few key fields into a new structure, so that it can be fetched
with one call:
        struct rte_vhost_vring {
                struct vring_desc       *desc;
                struct vring_avail      *avail;
                struct vring_used       *used;
                uint64_t                log_guest_addr;
       
                int                     callfd;
                int                     kickfd;
                uint16_t                size;
        };
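
For reference, a vhost-user driver would then fetch and use a vring
roughly like below (a minimal sketch; vid, vring_idx and the
driver-local last_avail_idx are assumed context, not part of this
series):

    struct rte_vhost_vring vring;
    uint16_t avail, count;

    if (rte_vhost_get_vhost_vring(vid, vring_idx, &vring) != 0)
        return;

    avail = vring.avail->idx;        /* index written by the guest */
    count = avail - last_avail_idx;  /* descs ready to be processed */

    /* walk vring.desc for those entries, fill vring.used with the
     * processed descs, then kick the guest through vring.callfd */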

When virtio 1.1 comes out, likely a simple change like the following
would just work:
        struct rte_vhost_vring {
		union {
			struct {
                		struct vring_desc       *desc;
                		struct vring_avail      *avail;
                		struct vring_used       *used;
                		uint64_t                log_guest_addr;
			};
			struct desc	*desc_1_1;	/* vring addr for virtio 1.1 */
		};
       
                int                     callfd;
                int                     kickfd;
                uint16_t                size;
        };

AFAIK, it's not an ABI breakage. Even if it were, we could introduce a
new API to get the virtio 1.1 ring address.

Those fields are the minimum set I came up with for a specific vring,
chosen so that future extensions bring the minimum chance of breaking
the ABI. If we need more info, we can introduce a new API.

OTOH, for getting the best performance, the two functions also have to
be inlined (with the "vid + vring_idx" combo replaced by "vring"):
    rte_vhost_vring_get_descs(vring, count, offset, iov, descs);
    rte_vhost_vring_update_used_descs(vring, count, offset, descs);

That said, one way or another, we have to export the rte_vhost_vring
struct. For this reason, I didn't rush into introducing the two APIs.

	--yliu

---
Yuanhan Liu (22):
  vhost: introduce driver features related APIs
  net/vhost: remove feature related APIs
  vhost: use new APIs to handle features
  vhost: make notify ops per vhost driver
  vhost: export guest memory regions
  vhost: introduce API to fetch negotiated features
  vhost: export vhost vring info
  vhost: export API to translate gpa to vva
  vhost: turn queue pair to vring
  vhost: export the number of vrings
  vhost: move the device ready check at proper place
  vhost: drop the Rx and Tx queue macro
  vhost: do not include net specific headers
  vhost: rename device ops struct
  vhost: rename virtio-net to vhost
  vhost: add features changed callback
  vhost: export APIs for live migration support
  vhost: introduce API to start a specific driver
  vhost: rename header file
  vhost: workaround the build dependency on mbuf header
  vhost: do not destroy device on repeat mem table message
  examples/vhost: demonstrate the new generic vhost APIs

 doc/guides/prog_guide/vhost_lib.rst         |  42 +--
 doc/guides/rel_notes/deprecation.rst        |   9 -
 doc/guides/rel_notes/release_17_05.rst      |  40 +++
 drivers/net/vhost/rte_eth_vhost.c           | 101 ++-----
 drivers/net/vhost/rte_eth_vhost.h           |  32 +--
 drivers/net/vhost/rte_pmd_vhost_version.map |   3 -
 examples/tep_termination/main.c             |  23 +-
 examples/tep_termination/main.h             |   2 +
 examples/tep_termination/vxlan_setup.c      |   2 +-
 examples/vhost/Makefile                     |   2 +-
 examples/vhost/main.c                       | 100 +++++--
 examples/vhost/main.h                       |  33 ++-
 examples/vhost/virtio_net.c                 | 405 ++++++++++++++++++++++++++
 lib/librte_vhost/Makefile                   |   4 +-
 lib/librte_vhost/fd_man.c                   |   9 +-
 lib/librte_vhost/fd_man.h                   |   2 +-
 lib/librte_vhost/rte_vhost.h                | 423 ++++++++++++++++++++++++++++
 lib/librte_vhost/rte_vhost_version.map      |  17 +-
 lib/librte_vhost/rte_virtio_net.h           | 208 --------------
 lib/librte_vhost/socket.c                   | 222 ++++++++++++---
 lib/librte_vhost/vhost.c                    | 229 ++++++++-------
 lib/librte_vhost/vhost.h                    | 113 +++++---
 lib/librte_vhost/vhost_user.c               | 115 ++++----
 lib/librte_vhost/vhost_user.h               |   2 +-
 lib/librte_vhost/virtio_net.c               |  71 ++---
 25 files changed, 1523 insertions(+), 686 deletions(-)
 create mode 100644 examples/vhost/virtio_net.c
 create mode 100644 lib/librte_vhost/rte_vhost.h
 delete mode 100644 lib/librte_vhost/rte_virtio_net.h

-- 
1.9.0

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v4 1/7] net/ark: PMD for Atomic Rules Arkville driver stub
@ 2017-03-23  1:03  3% Ed Czeck
  2017-03-23 22:59  3% ` [dpdk-dev] [PATCH v5 " Ed Czeck
  0 siblings, 1 reply; 200+ results
From: Ed Czeck @ 2017-03-23  1:03 UTC (permalink / raw)
  To: dev; +Cc: Ed Czeck

Enable Arkville on supported configurations
Add overview documentation
Minimum driver support for valid compile
Arkville PMD is not supported on ARM or PowerPC at this time

v4:
* Address issues reported in review
* Add internal comments on driver arg
* provide a bare-bones dev init to avoid compiler warnings

v3:
* Split large patch into several smaller ones

Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
---
 MAINTAINERS                                 |   8 +
 config/common_base                          |  10 +
 config/defconfig_arm-armv7a-linuxapp-gcc    |   1 +
 config/defconfig_ppc_64-power8-linuxapp-gcc |   1 +
 doc/guides/nics/ark.rst                     | 242 +++++++++++++++++++++
 doc/guides/nics/features/ark.ini            |  15 ++
 doc/guides/nics/index.rst                   |   1 +
 drivers/net/Makefile                        |   1 +
 drivers/net/ark/Makefile                    |  62 ++++++
 drivers/net/ark/ark_debug.h                 |  71 +++++++
 drivers/net/ark/ark_ethdev.c                | 316 ++++++++++++++++++++++++++++
 drivers/net/ark/ark_ethdev.h                |  39 ++++
 drivers/net/ark/ark_global.h                | 108 ++++++++++
 drivers/net/ark/rte_pmd_ark_version.map     |   4 +
 mk/rte.app.mk                               |   1 +
 15 files changed, 880 insertions(+)
 create mode 100644 doc/guides/nics/ark.rst
 create mode 100644 doc/guides/nics/features/ark.ini
 create mode 100644 drivers/net/ark/Makefile
 create mode 100644 drivers/net/ark/ark_debug.h
 create mode 100644 drivers/net/ark/ark_ethdev.c
 create mode 100644 drivers/net/ark/ark_ethdev.h
 create mode 100644 drivers/net/ark/ark_global.h
 create mode 100644 drivers/net/ark/rte_pmd_ark_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 0c78b58..19ee27f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -278,6 +278,14 @@ M: Evgeny Schemeilin <evgenys@amazon.com>
 F: drivers/net/ena/
 F: doc/guides/nics/ena.rst
 
+Atomic Rules ARK
+M: Shepard Siegel <shepard.siegel@atomicrules.com>
+M: Ed Czeck       <ed.czeck@atomicrules.com>
+M: John Miller    <john.miller@atomicrules.com>
+F: drivers/net/ark/
+F: doc/guides/nics/ark.rst
+F: doc/guides/nics/features/ark.ini
+
 Broadcom bnxt
 M: Stephen Hurd <stephen.hurd@broadcom.com>
 M: Ajit Khaparde <ajit.khaparde@broadcom.com>
diff --git a/config/common_base b/config/common_base
index 37aa1e1..4feb5e4 100644
--- a/config/common_base
+++ b/config/common_base
@@ -353,6 +353,16 @@ CONFIG_RTE_LIBRTE_QEDE_FW=""
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
 #
+# Compile ARK PMD
+#
+CONFIG_RTE_LIBRTE_ARK_PMD=y
+CONFIG_RTE_LIBRTE_ARK_PAD_TX=y
+CONFIG_RTE_LIBRTE_ARK_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE=n
+
+#
 # Compile the TAP PMD
 # It is enabled by default for Linux only.
 #
diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc
index d9bd2a8..6d2b5e0 100644
--- a/config/defconfig_arm-armv7a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7a-linuxapp-gcc
@@ -61,6 +61,7 @@ CONFIG_RTE_SCHED_VECTOR=n
 
 # cannot use those on ARM
 CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_ARK_PMD=n
 CONFIG_RTE_LIBRTE_EM_PMD=n
 CONFIG_RTE_LIBRTE_IGB_PMD=n
 CONFIG_RTE_LIBRTE_CXGBE_PMD=n
diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
index 35f7fb6..89bc396 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -48,6 +48,7 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
 
 # Note: Initially, all of the PMD drivers compilation are turned off on Power
 # Will turn on them only after the successful testing on Power
+CONFIG_RTE_LIBRTE_ARK_PMD=n
 CONFIG_RTE_LIBRTE_IXGBE_PMD=n
 CONFIG_RTE_LIBRTE_I40E_PMD=n
 CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
diff --git a/doc/guides/nics/ark.rst b/doc/guides/nics/ark.rst
new file mode 100644
index 0000000..ff3090a
--- /dev/null
+++ b/doc/guides/nics/ark.rst
@@ -0,0 +1,242 @@
+.. BSD LICENSE
+
+    Copyright (c) 2015-2017 Atomic Rules LLC
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Atomic Rules LLC nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ARK Poll Mode Driver
+====================
+
+The ARK PMD is a DPDK poll-mode driver for the Atomic Rules Arkville
+(ARK) family of devices.
+
+More information can be found at the `Atomic Rules website
+<http://atomicrules.com>`_.
+
+Overview
+--------
+
+The Atomic Rules Arkville product is DPDK and AXI compliant product
+that marshals packets across a PCIe conduit between host DPDK mbufs and
+FPGA AXI streams.
+
+The ARK PMD, and the spirit of the overall Arkville product,
+has been to take the DPDK API/ABI as a fixed specification;
+then implement much of the business logic in FPGA RTL circuits.
+The approach of *working backwards* from the DPDK API/ABI and having
+the GPP host software *dictate*, while the FPGA hardware *copes*,
+results in significant performance gains over a naive implementation.
+
+While this document describes the ARK PMD software, it is helpful to
+understand what the FPGA hardware is and is not. The Arkville RTL
+component provides a single PCIe Physical Function (PF) supporting
+some number of RX/Ingress and TX/Egress Queues. The ARK PMD controls
+the Arkville core through a dedicated opaque Core BAR (CBAR).
+To allow users full freedom for their own FPGA application IP,
+an independent FPGA Application BAR (ABAR) is provided.
+
+One popular way to imagine Arkville's FPGA hardware aspect is as the
+FPGA PCIe-facing side of a so-called Smart NIC. The Arkville core does
+not contain any MACs, and is link-speed independent, as well as
+agnostic to the number of physical ports the application chooses to
+use. The ARK driver exposes the familiar PMD interface to allow packet
+movement to and from mbufs across multiple queues.
+
+However, FPGA RTL applications could contain a universe of added
+functionality that an Arkville RTL core does not provide or can
+not anticipate. To allow for this expectation of user-defined
+innovation, the ARK PMD provides a dynamic mechanism of adding
+capabilities without having to modify the ARK PMD.
+
+The ARK PMD is intended to support all instances of the Arkville
+RTL Core, regardless of configuration, FPGA vendor, or target
+board. While specific capabilities such as the number of physical
+hardware queue-pairs are negotiated, the driver is designed to
+remain constant over a broad and extendable feature set.
+
+Intentionally, Arkville by itself DOES NOT provide common NIC
+capabilities such as offload or receive-side scaling (RSS).
+These capabilities would be viewed as a gate-level "tax" on
+Green-box FPGA applications that do not require such function.
+Instead, they can be added as needed with essentially no
+overhead to the FPGA Application.
+
+Data Path Interface
+-------------------
+
+Ingress RX and Egress TX operation is via the nominal DPDK API.
+The driver supports single-port, multi-queue for both RX and TX.
+
+Refer to ``ark_ethdev.h`` for the list of supported methods to
+act upon RX and TX Queues.
+
+Configuration Information
+-------------------------
+
+**DPDK Configuration Parameters**
+
+  The following configuration options are available for the ARK PMD:
+
+   * **CONFIG_RTE_LIBRTE_ARK_PMD** (default y): Enables or disables inclusion
+     of the ARK PMD driver in the DPDK compilation.
+
+   * **CONFIG_RTE_LIBRTE_ARK_PAD_TX** (default y):  When enabled TX
+     packets are padded to 60 bytes to support downstream MACS.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_RX** (default n): Enables or disables debug
+     logging and internal checking of RX ingress logic within the ARK PMD driver.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TX** (default n): Enables or disables debug
+     logging and internal checking of TX egress logic within the ARK PMD driver.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS** (default n): Enables or disables debug
+     logging of detailed packet and performance statistics gathered in
+     the PMD and FPGA.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE** (default n): Enables or disables debug
+     logging of detailed PMD events and status.
+
+
+Building DPDK
+-------------
+
+See the :ref:`DPDK Getting Started Guide for Linux <linux_gsg>` for
+instructions on how to build DPDK.
+
+By default the ARK PMD library will be built into the DPDK library.
+
+For configuring and using UIO and VFIO frameworks, please also refer :ref:`the
+documentation that comes with DPDK suite <linux_gsg>`.
+
+Supported ARK RTL PCIe Instances
+--------------------------------
+
+ARK PMD supports the following Arkville RTL PCIe instances including:
+
+* ``1d6c:100d`` - AR-ARKA-FX0 [Arkville 32B DPDK Data Mover]
+* ``1d6c:100e`` - AR-ARKA-FX1 [Arkville 64B DPDK Data Mover]
+
+Supported Operating Systems
+---------------------------
+
+Any Linux distribution fulfilling the conditions described in ``System Requirements``
+section of :ref:`the DPDK documentation <linux_gsg>` or refer to *DPDK
+Release Notes*.  ARM and PowerPC architectures are not supported at this time.
+
+
+Supported Features
+------------------
+
+* Dynamic ARK PMD extensions
+* Multiple receive and transmit queues
+* Jumbo frames up to 9K
+* Hardware Statistics
+
+Unsupported Features
+--------------------
+
+Features that may be part of, or become part of, the Arkville RTL IP that are
+not currently supported or exposed by the ARK PMD include:
+
+* PCIe SR-IOV Virtual Functions (VFs)
+* Arkville's Packet Generator Control and Status
+* Arkville's Packet Director Control and Status
+* Arkville's Packet Checker Control and Status
+* Arkville's Timebase Management
+
+Pre-Requisites
+--------------
+
+#. Prepare the system as recommended by DPDK suite.  This includes environment
+   variables, hugepage configuration, tool-chains and configuration.
+
+#. Insert igb_uio kernel module using the command 'modprobe igb_uio'
+
+#. Bind the intended ARK device to igb_uio module
+
+At this point the system should be ready to run DPDK applications. Once the
+application runs to completion, the ARK PMD can be detached from igb_uio if necessary.
+
+Usage Example
+-------------
+
+This section demonstrates how to launch **testpmd** with Atomic Rules ARK
+devices managed by librte_pmd_ark.
+
+#. Load the kernel modules:
+
+   .. code-block:: console
+
+      modprobe uio
+      insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
+
+   .. note::
+
+      The ARK PMD driver depends upon the igb_uio user space I/O kernel module
+
+#. Mount and request huge pages:
+
+   .. code-block:: console
+
+      mount -t hugetlbfs nodev /mnt/huge
+      echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
+
+#. Bind UIO driver to ARK device at 0000:01:00.0 (using dpdk-devbind.py):
+
+   .. code-block:: console
+
+      ./usertools/dpdk-devbind.py --bind=igb_uio 0000:01:00.0
+
+   .. note::
+
+      The last argument to dpdk-devbind.py is the 4-tuple that identifies a specific PCIe
+      device. You can use lspci -d 1d6c: to identify all Atomic Rules devices in the system,
+      and thus determine the correct 4-tuple argument to dpdk-devbind.py
+
+#. Start testpmd with basic parameters:
+
+   .. code-block:: console
+
+      ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 -- -i
+
+   Example output:
+
+   .. code-block:: console
+
+      [...]
+      EAL: PCI device 0000:01:00.0 on NUMA socket -1
+      EAL:   probe driver: 1d6c:100e rte_ark_pmd
+      EAL:   PCI memory mapped at 0x7f9b6c400000
+      PMD: eth_ark_dev_init(): Initializing 0:2:0.1
+      ARKP PMD CommitID: 378f3a67
+      Configuring Port 0 (socket 0)
+      Port 0: DC:3C:F6:00:00:01
+      Checking link statuses...
+      Port 0 Link Up - speed 100000 Mbps - full-duplex
+      Done
+      testpmd>
diff --git a/doc/guides/nics/features/ark.ini b/doc/guides/nics/features/ark.ini
new file mode 100644
index 0000000..dc8a0e2
--- /dev/null
+++ b/doc/guides/nics/features/ark.ini
@@ -0,0 +1,15 @@
+;
+; Supported features of the 'ark' poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Queue start/stop     = Y
+Jumbo frame          = Y
+Scattered Rx         = Y
+Basic stats          = Y
+Stats per queue      = Y
+FW version           = Y
+Linux UIO            = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 87f9334..381d82c 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     :numbered:
 
     overview
+    ark
     bnx2x
     bnxt
     cxgbe
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index a16f25e..ea9868b 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -32,6 +32,7 @@
 include $(RTE_SDK)/mk/rte.vars.mk
 
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += af_packet
+DIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark
 DIRS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += bonding
 DIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe
diff --git a/drivers/net/ark/Makefile b/drivers/net/ark/Makefile
new file mode 100644
index 0000000..cf5d618
--- /dev/null
+++ b/drivers/net/ark/Makefile
@@ -0,0 +1,62 @@
+# BSD LICENSE
+#
+# Copyright (c) 2015-2017 Atomic Rules LLC
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of copyright holder nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_ark.a
+
+CFLAGS += -O3 -I./
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_ark_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-y += ark_ethdev.c
+
+
+# this lib depends upon:
+DEPDIRS-y += lib/librte_mbuf
+DEPDIRS-y += lib/librte_ether
+DEPDIRS-y += lib/librte_kvargs
+DEPDIRS-y += lib/librte_eal
+DEPDIRS-y += lib/librte_mempool
+
+LDLIBS += -lpthread
+LDLIBS += -ldl
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/ark/ark_debug.h b/drivers/net/ark/ark_debug.h
new file mode 100644
index 0000000..52b08a1
--- /dev/null
+++ b/drivers/net/ark/ark_debug.h
@@ -0,0 +1,71 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_DEBUG_H_
+#define _ARK_DEBUG_H_
+
+#include <inttypes.h>
+#include <rte_log.h>
+
+/* Format specifiers for string data pairs */
+#define ARK_SU32  "\n\t%-20s    %'20" PRIu32
+#define ARK_SU64  "\n\t%-20s    %'20" PRIu64
+#define ARK_SU64X "\n\t%-20s    %#20" PRIx64
+#define ARK_SPTR  "\n\t%-20s    %20p"
+
+#define ARK_TRACE_ON(fmt, ...) \
+	PMD_DRV_LOG(ERR, fmt, ##__VA_ARGS__)
+
+#define ARK_TRACE_OFF(fmt, ...) \
+	do {if (0) PMD_DRV_LOG(ERR, fmt, ##__VA_ARGS__); } while (0)
+
+/* Debug macro for reporting Packet stats */
+#ifdef RTE_LIBRTE_ARK_DEBUG_STATS
+#define ARK_DEBUG_STATS(fmt, ...) ARK_TRACE_ON(fmt, ##__VA_ARGS__)
+#else
+#define ARK_DEBUG_STATS(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
+#endif
+
+/* Debug macro for tracing full behavior*/
+#ifdef RTE_LIBRTE_ARK_DEBUG_TRACE
+#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_ON(fmt, ##__VA_ARGS__)
+#else
+#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
+#endif
+
+/* tracing including the function name */
+#define PMD_DRV_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s(): " fmt, __func__, ## args)
+
+
+#endif
diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
new file mode 100644
index 0000000..6ae5ffc
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev.c
@@ -0,0 +1,316 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <sys/stat.h>
+#include <dlfcn.h>
+
+#include <rte_kvargs.h>
+#include <rte_vdev.h>
+
+#include "ark_global.h"
+#include "ark_debug.h"
+#include "ark_ethdev.h"
+
+/*  Internal prototypes */
+static int eth_ark_check_args(const char *params);
+static int eth_ark_dev_init(struct rte_eth_dev *dev);
+static int eth_ark_dev_uninit(struct rte_eth_dev *eth_dev);
+static int eth_ark_dev_configure(struct rte_eth_dev *dev);
+static void eth_ark_dev_info_get(struct rte_eth_dev *dev,
+				 struct rte_eth_dev_info *dev_info);
+
+#define ARK_DEV_TO_PCI(eth_dev)			\
+	RTE_DEV_TO_PCI((eth_dev)->device)
+
+#define ARK_MAX_ARG_LEN 256
+static uint32_t pkt_dir_v;
+static char pkt_gen_args[ARK_MAX_ARG_LEN];
+static char pkt_chkr_args[ARK_MAX_ARG_LEN];
+
+/*
+ * The packet generator is a functional block used to generate egress packet
+ * patterns.
+ */
+#define ARK_PKTGEN_ARG "Pkt_gen"
+
+/*
+ * The packet checker is a functional block used to test ingress packet
+ * patterns.
+ */
+#define ARK_PKTCHKR_ARG "Pkt_chkr"
+
+/*
+ * The packet director is used to select the internal ingress and egress
+ * packet paths.
+ */
+#define ARK_PKTDIR_ARG "Pkt_dir"
+
+#define ARK_RX_MAX_QUEUE (4096 * 4)
+#define ARK_RX_MIN_QUEUE (512)
+#define ARK_TX_MAX_QUEUE (4096 * 4)
+#define ARK_TX_MIN_QUEUE (256)
+
+static const char * const valid_arguments[] = {
+	ARK_PKTGEN_ARG,
+	ARK_PKTCHKR_ARG,
+	ARK_PKTDIR_ARG,
+	NULL
+};
+
+#define MAX_ARK_PHYS 16
+struct ark_adapter *gark[MAX_ARK_PHYS];
+
+static const struct rte_pci_id pci_id_ark_map[] = {
+	{RTE_PCI_DEVICE(0x1d6c, 0x100d)},
+	{RTE_PCI_DEVICE(0x1d6c, 0x100e)},
+	{.vendor_id = 0, /* sentinel */ },
+};
+
+static struct eth_driver rte_ark_pmd = {
+	.pci_drv = {
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+		.id_table = pci_id_ark_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC
+	},
+	.eth_dev_init = eth_ark_dev_init,
+	.eth_dev_uninit = eth_ark_dev_uninit,
+	.dev_private_size = sizeof(struct ark_adapter),
+};
+
+static const struct eth_dev_ops ark_eth_dev_ops = {
+	.dev_configure = eth_ark_dev_configure,
+	.dev_infos_get = eth_ark_dev_info_get,
+};
+
+static int
+eth_ark_dev_init(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+	struct rte_pci_device *pci_dev;
+	int ret = -1;
+
+	ark->eth_dev = dev;
+
+	ARK_DEBUG_TRACE("eth_ark_dev_init(struct rte_eth_dev *dev)\n");
+	gark[0] = ark;
+
+	pci_dev = ARK_DEV_TO_PCI(dev);
+	rte_eth_copy_pci_info(dev, pci_dev);
+
+	if (pci_dev->device.devargs)
+		eth_ark_check_args(pci_dev->device.devargs->args);
+	else
+		PMD_DRV_LOG(INFO, "No Device args found\n");
+
+
+	ark->bar0 = (uint8_t *)pci_dev->mem_resource[0].addr;
+	ark->a_bar = (uint8_t *)pci_dev->mem_resource[2].addr;
+
+	dev->dev_ops = &ark_eth_dev_ops;
+
+	return ret;
+}
+
+static int
+eth_ark_dev_uninit(struct rte_eth_dev *dev)
+{
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	dev->dev_ops = NULL;
+	dev->rx_pkt_burst = NULL;
+	dev->tx_pkt_burst = NULL;
+	return 0;
+}
+
+static int
+eth_ark_dev_configure(struct rte_eth_dev *dev __rte_unused)
+{
+	ARK_DEBUG_TRACE("ARKP: In %s\n", __func__);
+	return 0;
+}
+
+static void
+eth_ark_dev_info_get(struct rte_eth_dev *dev,
+		     struct rte_eth_dev_info *dev_info)
+{
+	/* device specific configuration */
+	memset(dev_info, 0, sizeof(*dev_info));
+
+	dev_info->max_rx_pktlen = (16 * 1024) - 128;
+	dev_info->min_rx_bufsize = 1024;
+
+	dev_info->rx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = ARK_RX_MAX_QUEUE,
+		.nb_min = ARK_RX_MIN_QUEUE,
+		.nb_align = ARK_RX_MIN_QUEUE}; /* power of 2 */
+
+	dev_info->tx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = ARK_TX_MAX_QUEUE,
+		.nb_min = ARK_TX_MIN_QUEUE,
+		.nb_align = ARK_TX_MIN_QUEUE}; /* power of 2 */
+
+	/* ARK PMD supports all line rates, how do we indicate that here ?? */
+	dev_info->speed_capa = (ETH_LINK_SPEED_1G |
+				ETH_LINK_SPEED_10G |
+				ETH_LINK_SPEED_25G |
+				ETH_LINK_SPEED_40G |
+				ETH_LINK_SPEED_50G |
+				ETH_LINK_SPEED_100G);
+	dev_info->pci_dev = ARK_DEV_TO_PCI(dev);
+}
+
+static inline int
+process_pktdir_arg(const char *key, const char *value,
+		   void *extra_args __rte_unused)
+{
+	ARK_DEBUG_TRACE("In process_pktdir_arg, key = %s, value = %s\n",
+			key, value);
+	pkt_dir_v = strtol(value, NULL, 16);
+	ARK_DEBUG_TRACE("pkt_dir_v = 0x%x\n", pkt_dir_v);
+	return 0;
+}
+
+static inline int
+process_file_args(const char *key, const char *value, void *extra_args)
+{
+	ARK_DEBUG_TRACE("**** IN process_pktgen_arg, key = %s, value = %s\n",
+			key, value);
+	char *args = (char *)extra_args;
+
+	/* Open the configuration file */
+	FILE *file = fopen(value, "r");
+	char line[ARK_MAX_ARG_LEN];
+	int first = 1;
+
+	if (file == NULL) {
+		PMD_DRV_LOG(ERR, "Unable to open config file %s\n", value);
+		return -1;
+	}
+
+	while (fgets(line, sizeof(line), file)) {
+		if (first) {
+			strncpy(args, line, ARK_MAX_ARG_LEN);
+			first = 0;
+		} else {
+			strncat(args, line,
+				ARK_MAX_ARG_LEN - strlen(args) - 1);
+		}
+	}
+	ARK_DEBUG_TRACE("file = %s\n", args);
+	fclose(file);
+	return 0;
+}
+
+static int
+eth_ark_check_args(const char *params)
+{
+	struct rte_kvargs *kvlist;
+	unsigned int k_idx;
+	struct rte_kvargs_pair *pair = NULL;
+
+	kvlist = rte_kvargs_parse(params, valid_arguments);
+	if (kvlist == NULL)
+		return 0;
+
+	pkt_gen_args[0] = 0;
+	pkt_chkr_args[0] = 0;
+
+	for (k_idx = 0; k_idx < kvlist->count; k_idx++) {
+		pair = &kvlist->pairs[k_idx];
+		ARK_DEBUG_TRACE("**** Arg passed to PMD = %s:%s\n", pair->key,
+				pair->value);
+	}
+
+	if (rte_kvargs_process(kvlist,
+			       ARK_PKTDIR_ARG,
+			       &process_pktdir_arg,
+			       NULL) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTDIR_ARG);
+	}
+
+	if (rte_kvargs_process(kvlist,
+			       ARK_PKTGEN_ARG,
+			       &process_file_args,
+			       pkt_gen_args) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTGEN_ARG);
+	}
+
+	if (rte_kvargs_process(kvlist,
+			       ARK_PKTCHKR_ARG,
+			       &process_file_args,
+			       pkt_chkr_args) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTCHKR_ARG);
+	}
+
+	ARK_DEBUG_TRACE("INFO: packet director set to 0x%x\n", pkt_dir_v);
+
+	return 1;
+}
+
+static int
+pmd_ark_probe(const char *name, const char *params)
+{
+	RTE_LOG(INFO, PMD, "Initializing pmd_ark for %s params = %s\n", name,
+		params);
+	eth_ark_check_args(params);
+	return 0;
+}
+
+static int
+pmd_ark_remove(const char *name)
+{
+	RTE_LOG(INFO, PMD, "Closing ark %s ethdev on numa socket %u\n", name,
+		rte_socket_id());
+	return 1;
+}
+
+/*
+ * Although Arkville is a physical device we take advantage of the virtual
+ * device initialization as a per test runtime initialization for
+ * regression testing.  Parameters are passed into the virtual device to
+ * configure the packet generator, packet director and packet checker.
+ */
+static struct rte_vdev_driver pmd_ark_drv = {
+	.probe = pmd_ark_probe,
+	.remove = pmd_ark_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_ark, pmd_ark_drv);
+RTE_PMD_REGISTER_ALIAS(net_ark, eth_ark);
+RTE_PMD_REGISTER_PCI(eth_ark, rte_ark_pmd.pci_drv);
+RTE_PMD_REGISTER_KMOD_DEP(net_ark, "* igb_uio | uio_pci_generic ");
+RTE_PMD_REGISTER_PCI_TABLE(eth_ark, pci_id_ark_map);
+RTE_PMD_REGISTER_PARAM_STRING(eth_ark,
+			      ARK_PKTGEN_ARG "=<filename> "
+			      ARK_PKTCHKR_ARG "=<filename> "
+			      ARK_PKTDIR_ARG "=<bitmap>");
+
+
diff --git a/drivers/net/ark/ark_ethdev.h b/drivers/net/ark/ark_ethdev.h
new file mode 100644
index 0000000..08d7fb1
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev.h
@@ -0,0 +1,39 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_ETHDEV_H_
+#define _ARK_ETHDEV_H_
+
+/* STUB */
+
+#endif
diff --git a/drivers/net/ark/ark_global.h b/drivers/net/ark/ark_global.h
new file mode 100644
index 0000000..7cd62d5
--- /dev/null
+++ b/drivers/net/ark/ark_global.h
@@ -0,0 +1,108 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_GLOBAL_H_
+#define _ARK_GLOBAL_H_
+
+#include <time.h>
+#include <assert.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_malloc.h>
+#include <rte_memcpy.h>
+#include <rte_string_fns.h>
+#include <rte_cycles.h>
+#include <rte_kvargs.h>
+#include <rte_dev.h>
+#include <rte_version.h>
+
+#define ETH_ARK_ARG_MAXLEN	64
+#define ARK_SYSCTRL_BASE  0x0
+#define ARK_PKTGEN_BASE   0x10000
+#define ARK_MPU_RX_BASE   0x20000
+#define ARK_UDM_BASE      0x30000
+#define ARK_MPU_TX_BASE   0x40000
+#define ARK_DDM_BASE      0x60000
+#define ARK_CMAC_BASE     0x80000
+#define ARK_PKTDIR_BASE   0xa0000
+#define ARK_PKTCHKR_BASE  0x90000
+#define ARK_RCPACING_BASE 0xb0000
+#define ARK_EXTERNAL_BASE 0x100000
+#define ARK_MPU_QOFFSET   0x00100
+#define ARK_MAX_PORTS     8
+
+#define offset8(n)     n
+#define offset16(n)   ((n) / 2)
+#define offset32(n)   ((n) / 4)
+#define offset64(n)   ((n) / 8)
+
+/*
+ * Structure to store private data for each PF/VF instance.
+ */
+#define def_ptr(type, name) \
+	union type {		   \
+		uint64_t *t64;	   \
+		uint32_t *t32;	   \
+		uint16_t *t16;	   \
+		uint8_t  *t8;	   \
+		void     *v;	   \
+	} name
+
+struct ark_port {
+	struct rte_eth_dev *eth_dev;
+	int id;
+};
+
+struct ark_adapter {
+	/* User extension private data */
+	void *user_data;
+
+	struct ark_port port[ARK_MAX_PORTS];
+	int num_ports;
+
+	/* Common for both PF and VF */
+	struct rte_eth_dev *eth_dev;
+
+	void *d_handle;
+
+	/* Our Bar 0 */
+	uint8_t *bar0;
+
+	/* Application Bar */
+	uint8_t *a_bar;
+};
+
+typedef uint32_t *ark_t;
+
+#endif
diff --git a/drivers/net/ark/rte_pmd_ark_version.map b/drivers/net/ark/rte_pmd_ark_version.map
new file mode 100644
index 0000000..7f84780
--- /dev/null
+++ b/drivers/net/ark/rte_pmd_ark_version.map
@@ -0,0 +1,4 @@
+DPDK_2.0 {
+	 local: *;
+
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 0e0b600..da23898 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -104,6 +104,7 @@ ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
 # plugins (link only if static libraries)
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET)  += -lrte_pmd_af_packet
+_LDLIBS-$(CONFIG_RTE_LIBRTE_ARK_PMD)        += -lrte_pmd_ark
 _LDLIBS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD)      += -lrte_pmd_bnx2x -lz
 _LDLIBS-$(CONFIG_RTE_LIBRTE_BNXT_PMD)       += -lrte_pmd_bnxt
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)       += -lrte_pmd_bond
-- 
1.9.1

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v3 1/7] net/ark: PMD for Atomic Rules Arkville driver stub
  2017-03-21 21:43  3% [dpdk-dev] [PATCH v3 1/7] net/ark: PMD for Atomic Rules Arkville driver stub Ed Czeck
@ 2017-03-22 18:16  0% ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2017-03-22 18:16 UTC (permalink / raw)
  To: Ed Czeck, dev

On 3/21/2017 9:43 PM, Ed Czeck wrote:
> Enable Arkville on supported configurations
> Add overview documentation
> Minimum driver support for valid compile
> 
> 
> Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
> ---
>  MAINTAINERS                                 |   8 +
>  config/common_base                          |  11 ++
>  config/defconfig_arm-armv7a-linuxapp-gcc    |   1 +
>  config/defconfig_ppc_64-power8-linuxapp-gcc |   1 +
>  doc/guides/nics/ark.rst                     | 237 +++++++++++++++++++++++
>  doc/guides/nics/features/ark.ini            |  15 ++
>  doc/guides/nics/index.rst                   |   1 +
>  drivers/net/Makefile                        |   1 +
>  drivers/net/ark/Makefile                    |  63 +++++++
>  drivers/net/ark/ark_debug.h                 |  74 ++++++++
>  drivers/net/ark/ark_ethdev.c                | 281 ++++++++++++++++++++++++++++
>  drivers/net/ark/ark_ethdev.h                |  39 ++++
>  drivers/net/ark/ark_global.h                | 108 +++++++++++
>  drivers/net/ark/rte_pmd_ark_version.map     |   4 +
>  mk/rte.app.mk                               |   1 +
>  15 files changed, 845 insertions(+)
>  create mode 100644 doc/guides/nics/ark.rst
>  create mode 100644 doc/guides/nics/features/ark.ini
>  create mode 100644 drivers/net/ark/Makefile
>  create mode 100644 drivers/net/ark/ark_debug.h
>  create mode 100644 drivers/net/ark/ark_ethdev.c
>  create mode 100644 drivers/net/ark/ark_ethdev.h
>  create mode 100644 drivers/net/ark/ark_global.h
>  create mode 100644 drivers/net/ark/rte_pmd_ark_version.map
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0c78b58..8043d75 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -278,6 +278,14 @@ M: Evgeny Schemeilin <evgenys@amazon.com>
>  F: drivers/net/ena/
>  F: doc/guides/nics/ena.rst
>  
> +Atomic Rules ark

Should prefer uppercase "ARK" here?

> +M: Shepard Siegel <shepard.siegel@atomicrules.com>
> +M: Ed Czeck       <ed.czeck@atomicrules.com>
> +M: John Miller    <john.miller@atomicrules.com>
> +F: /drivers/net/ark/

Can you please drop the leading "/"? There is a script,
"check-maintainers.sh", which breaks with that.

> +F: doc/guides/nics/ark.rst
> +F: doc/guides/nics/features/ark.ini
> +
>  Broadcom bnxt
>  M: Stephen Hurd <stephen.hurd@broadcom.com>
>  M: Ajit Khaparde <ajit.khaparde@broadcom.com>
> diff --git a/config/common_base b/config/common_base
> index 37aa1e1..0916c44 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -353,6 +353,17 @@ CONFIG_RTE_LIBRTE_QEDE_FW=""
>  CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
>  
>  #
> +# Compile ARK PMD
> +#
> +CONFIG_RTE_LIBRTE_ARK_PMD=y
> +CONFIG_RTE_LIBRTE_ARK_PAD_TX=y
> +CONFIG_RTE_LIBRTE_ARK_DEBUG_RX=n
> +CONFIG_RTE_LIBRTE_ARK_DEBUG_TX=n
> +CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS=n
> +CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE=n
> +
> +

Extra line

> +#
>  # Compile the TAP PMD
>  # It is enabled by default for Linux only.
>  #
> diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc
> index d9bd2a8..6d2b5e0 100644
> --- a/config/defconfig_arm-armv7a-linuxapp-gcc
> +++ b/config/defconfig_arm-armv7a-linuxapp-gcc
> @@ -61,6 +61,7 @@ CONFIG_RTE_SCHED_VECTOR=n
>  
>  # cannot use those on ARM
>  CONFIG_RTE_KNI_KMOD=n
> +CONFIG_RTE_LIBRTE_ARK_PMD=n
>  CONFIG_RTE_LIBRTE_EM_PMD=n
>  CONFIG_RTE_LIBRTE_IGB_PMD=n
>  CONFIG_RTE_LIBRTE_CXGBE_PMD=n
> diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
> index 35f7fb6..89bc396 100644
> --- a/config/defconfig_ppc_64-power8-linuxapp-gcc
> +++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
> @@ -48,6 +48,7 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
>  
>  # Note: Initially, all of the PMD drivers compilation are turned off on Power
>  # Will turn on them only after the successful testing on Power
> +CONFIG_RTE_LIBRTE_ARK_PMD=n

Is it not tested or known that it is not supported?

>  CONFIG_RTE_LIBRTE_IXGBE_PMD=n
>  CONFIG_RTE_LIBRTE_I40E_PMD=n
>  CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
> diff --git a/doc/guides/nics/ark.rst b/doc/guides/nics/ark.rst
> new file mode 100644
> index 0000000..72fb8d6
> --- /dev/null
> +++ b/doc/guides/nics/ark.rst
> @@ -0,0 +1,237 @@
> +.. BSD LICENSE
> +
> +    Copyright (c) 2015-2017 Atomic Rules LLC
> +    All rights reserved.
> +
> +    Redistribution and use in source and binary forms, with or without
> +    modification, are permitted provided that the following conditions
> +    are met:
> +
> +    * Redistributions of source code must retain the above copyright
> +    notice, this list of conditions and the following disclaimer.
> +    * Redistributions in binary form must reproduce the above copyright
> +    notice, this list of conditions and the following disclaimer in
> +    the documentation and/or other materials provided with the
> +    distribution.
> +    * Neither the name of Atomic Rules LLC nor the names of its
> +    contributors may be used to endorse or promote products derived
> +    from this software without specific prior written permission.
> +
> +    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +ARK Poll Mode Driver
> +====================
> +
> +The ARK PMD is a DPDK poll-mode driver for the Atomic Rules Arkville
> +(ARK) family of devices.
> +
> +More information can be found at the `Atomic Rules website
> +<http://atomicrules.com>`_.
> +
> +Overview
> +--------
> +
> +The Atomic Rules Arkville product is DPDK and AXI compliant product
> +that marshals packets across a PCIe conduit between host DPDK mbufs and
> +FPGA AXI streams.
> +
> +The ARK PMD, and the spirit of the overall Arkville product,
> +has been to take the DPDK API/ABI as a fixed specification;
> +then implement much of the business logic in FPGA RTL circuits.
> +The approach of *working backwards* from the DPDK API/ABI and having
> +the GPP host software *dictate*, while the FPGA hardware *copes*,
> +results in significant performance gains over a naive implementation.
> +
> +While this document describes the ARK PMD software, it is helpful to
> +understand what the FPGA hardware is and is not. The Arkville RTL
> +component provides a single PCIe Physical Function (PF) supporting
> +some number of RX/Ingress and TX/Egress Queues. The ARK PMD controls
> +the Arkville core through a dedicated opaque Core BAR (CBAR).
> +To allow users full freedom for their own FPGA application IP,
> +an independent FPGA Application BAR (ABAR) is provided.
> +
> +One popular way to imagine Arkville's FPGA hardware aspect is as the
> +FPGA PCIe-facing side of a so-called Smart NIC. The Arkville core does
> +not contain any MACs, and is link-speed independent, as well as
> +agnostic to the number of physical ports the application chooses to
> +use. The ARK driver exposes the familiar PMD interface to allow packet
> +movement to and from mbufs across multiple queues.
> +
> +However FPGA RTL applications could contain a universe of added
> +functionality that an Arkville RTL core does not provide or can
> +not anticipate. To allow for this expectation of user-defined
> +innovation, the ARK PMD provides a dynamic mechanism of adding
> +capabilities without having to modify the ARK PMD.
> +
> +The ARK PMD is intended to support all instances of the Arkville
> +RTL Core, regardless of configuration, FPGA vendor, or target
> +board. While specific capabilities such as number of physical
> +hardware queue-pairs are negotiated; the driver is designed to
> +remain constant over a broad and extendable feature set.
> +
> +Intentionally, Arkville by itself DOES NOT provide common NIC
> +capabilities such as offload or receive-side scaling (RSS).
> +These capabilities would be viewed as a gate-level "tax" on
> +Green-box FPGA applications that do not require such function.
> +Instead, they can be added as needed with essentially no
> +overhead to the FPGA Application.
> +
> +Data Path Interface
> +-------------------
> +
> +Ingress RX and Egress TX operation is by the nominal DPDK API .
> +The driver supports single-port, multi-queue for both RX and TX.
> +
> +Refer to ``ark_ethdev.h`` for the list of supported methods to
> +act upon RX and TX Queues.
> +
> +Configuration Information
> +-------------------------
> +
> +**DPDK Configuration Parameters**
> +
> +  The following configuration options are available for the ARK PMD:
> +
> +   * **CONFIG_RTE_LIBRTE_ARK_PMD** (default y): Enables or disables inclusion
> +     of the ARK PMD driver in the DPDK compilation.

Missing "CONFIG_RTE_LIBRTE_ARK_PAD_TX"

> +
> +   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_RX** (default n): Enables or disables debug
> +     logging and internal checking of RX ingress logic within the ARK PMD driver.
> +
> +   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TX** (default n): Enables or disables debug
> +     logging and internal checking of TX egress logic within the ARK PMD driver.
> +
> +   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS** (default n): Enables or disables debug
> +     logging of detailed packet and performance statistics gathered in
> +     the PMD and FPGA.
> +
> +   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE** (default n): Enables or disables debug
> +     logging of detailed PMD events and status.
> +
> +

Can you also please document the device arguments in this file?

> +Building DPDK
> +-------------
> +
> +See the :ref:`DPDK Getting Started Guide for Linux <linux_gsg>` for
> +instructions on how to build DPDK.
> +
> +By default the ARK PMD library will be built into the DPDK library.
> +
> +For configuring and using UIO and VFIO frameworks, please also refer :ref:`the
> +documentation that comes with DPDK suite <linux_gsg>`.
> +
> +Supported ARK RTL PCIe Instances
> +--------------------------------
> +
> +ARK PMD supports the following Arkville RTL PCIe instances including:
> +
> +* ``1d6c:100d`` - AR-ARKA-FX0 [Arkville 32B DPDK Data Mover]
> +* ``1d6c:100e`` - AR-ARKA-FX1 [Arkville 64B DPDK Data Mover]
> +
> +Supported Operating Systems
> +---------------------------
> +
> +Any Linux distribution fulfilling the conditions described in ``System Requirements``
> +section of :ref:`the DPDK documentation <linux_gsg>` or refer to *DPDK Release Notes*.
> +
> +Supported Features
> +------------------
> +
> +* Dynamic ARK PMD extensions
> +* Multiple receive and transmit queues
> +* Jumbo frames up to 9K
> +* Hardware Statistics
> +
> +Unsupported Features
> +--------------------
> +
> +Features that may be part of, or become part of, the Arkville RTL IP that are
> +not currently supported or exposed by the ARK PMD include:
> +
> +* PCIe SR-IOV Virtual Functions (VFs)
> +* Arkville's Packet Generator Control and Status
> +* Arkville's Packet Director Control and Status
> +* Arkville's Packet Checker Control and Status
> +* Arkville's Timebase Management
> +
> +Pre-Requisites
> +--------------
> +
> +#. Prepare the system as recommended by DPDK suite.  This includes environment
> +   variables, hugepages configuration, tool-chains and configuration
> +
> +#. Insert igb_uio kernel module using the command 'modprobe igb_uio'
> +
> +#. Bind the intended ARK device to igb_uio module
> +
> +At this point the system should be ready to run DPDK applications. Once the
> +application runs to completion, the ARK PMD can be detached from igb_uio if necessary.
> +
> +Usage Example
> +-------------
> +
> +This section demonstrates how to launch **testpmd** with Atomic Rules ARK
> +devices managed by librte_pmd_ark.
> +
> +#. Load the kernel modules:
> +
> +   .. code-block:: console
> +
> +      modprobe uio
> +      insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
> +
> +   .. note::
> +
> +      The ARK PMD driver depends upon the igb_uio user space I/O kernel module
> +
> +#. Mount and request huge pages:
> +
> +   .. code-block:: console
> +
> +      mount -t hugetlbfs nodev /mnt/huge
> +      echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
> +
> +#. Bind UIO driver to ARK device at 0000:01:00.0 (using dpdk-devbind.py):
> +
> +   .. code-block:: console
> +
> +      ./usertools/dpdk-devbind.py --bind=igb_uio 0000:01:00.0
> +
> +   .. note::
> +
> +      The last argument to dpdk-devbind.py is the 4-tuple that indentifies a specific PCIe
> +      device. You can use lspci -d 1d6c: to indentify all Atomic Rules devices in the system,
> +      and thus determine the correct 4-tuple argument to dpdk-devbind.py
> +
> +#. Start testpmd with basic parameters:
> +
> +   .. code-block:: console
> +
> +      ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 -- -i
> +
> +   Example output:
> +
> +   .. code-block:: console
> +
> +      [...]
> +      EAL: PCI device 0000:01:00.0 on NUMA socket -1
> +      EAL:   probe driver: 1d6c:100e rte_ark_pmd
> +      EAL:   PCI memory mapped at 0x7f9b6c400000
> +      PMD: eth_ark_dev_init(): Initializing 0:2:0.1
> +      ARKP PMD CommitID: 378f3a67
> +      Configuring Port 0 (socket 0)
> +      Port 0: DC:3C:F6:00:00:01
> +      Checking link statuses...
> +      Port 0 Link Up - speed 100000 Mbps - full-duplex
> +      Done
> +      testpmd>
> diff --git a/doc/guides/nics/features/ark.ini b/doc/guides/nics/features/ark.ini
> new file mode 100644
> index 0000000..dc8a0e2
> --- /dev/null
> +++ b/doc/guides/nics/features/ark.ini
> @@ -0,0 +1,15 @@
> +;
> +; Supported features of the 'ark' poll mode driver.
> +;
> +; Refer to default.ini for the full list of available PMD features.
> +;
> +[Features]
> +Queue start/stop     = Y
> +Jumbo frame          = Y
> +Scattered Rx         = Y
> +Basic stats          = Y
> +Stats per queue      = Y
> +FW version           = Y

Features can be added with the patch that adds the functionality. I
believe the above features are not supported with the current patch.

> +Linux UIO            = Y
> +x86-64               = Y
> +Usage doc            = Y
> diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
> index 87f9334..381d82c 100644
> --- a/doc/guides/nics/index.rst
> +++ b/doc/guides/nics/index.rst
> @@ -36,6 +36,7 @@ Network Interface Controller Drivers
>      :numbered:
>  
>      overview
> +    ark
>      bnx2x
>      bnxt
>      cxgbe
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index a16f25e..ea9868b 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -32,6 +32,7 @@
>  include $(RTE_SDK)/mk/rte.vars.mk
>  
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += af_packet
> +DIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark
>  DIRS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += bonding
>  DIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe
> diff --git a/drivers/net/ark/Makefile b/drivers/net/ark/Makefile
> new file mode 100644
> index 0000000..217bd34
> --- /dev/null
> +++ b/drivers/net/ark/Makefile
> @@ -0,0 +1,63 @@
> +# BSD LICENSE
> +#
> +# Copyright (c) 2015-2017 Atomic Rules LLC
> +# All rights reserved.
> +#
> +# Redistribution and use in source and binary forms, with or without
> +# modification, are permitted provided that the following conditions
> +# are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of copyright holder nor the names of its
> +#   contributors may be used to endorse or promote products derived
> +#   from this software without specific prior written permission.
> +#
> +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +#
> +# library name
> +#
> +LIB = librte_pmd_ark.a
> +
> +CFLAGS += -O3 -I./
> +CFLAGS += $(WERROR_FLAGS)
> +
> +EXPORT_MAP := rte_pmd_ark_version.map
> +
> +LIBABIVER := 1
> +
> +#
> +# all source are stored in SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD)

No need to put the config option in the comment; SRCS-y looks more appropriate.

> +#
> +
> +SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_ethdev.c
> +
> +
> +# this lib depends upon:
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_mbuf
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_ether
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_kvargs
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_eal
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_mempool

> +DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/libpthread
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/libdl

DEPDIRS is for internal library dependencies. Please use LDLIBS for
external dependencies, like:

LDLIBS += -lpthread
LDLIBS += -ldl

> +
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/net/ark/ark_debug.h b/drivers/net/ark/ark_debug.h
> new file mode 100644
> index 0000000..a108c28
> --- /dev/null
> +++ b/drivers/net/ark/ark_debug.h
> @@ -0,0 +1,74 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright (c) 2015-2017 Atomic Rules LLC
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in
> + * the documentation and/or other materials provided with the
> + * distribution.
> + * * Neither the name of copyright holder nor the names of its
> + * contributors may be used to endorse or promote products derived
> + * from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _ARK_DEBUG_H_
> +#define _ARK_DEBUG_H_
> +
> +#include <inttypes.h>
> +#include <rte_log.h>
> +
> +/* Format specifiers for string data pairs */
> +#define ARK_SU32  "\n\t%-20s    %'20" PRIu32
> +#define ARK_SU64  "\n\t%-20s    %'20" PRIu64
> +#define ARK_SU64X "\n\t%-20s    %#20" PRIx64
> +#define ARK_SPTR  "\n\t%-20s    %20p"
> +
> +#define ARK_TRACE_ON(fmt, ...) \
> +	PMD_DRV_LOG(ERR, fmt, ##__VA_ARGS__)
> +
> +#define ARK_TRACE_OFF(fmt, ...) \
> +	do {if (0) PMD_DRV_LOG(ERR, fmt, ##__VA_ARGS__); } while (0)
> +
> +/* Debug macro for reporting Packet stats */
> +#ifdef RTE_LIBRTE_ARK_DEBUG_STATS
> +#define ARK_DEBUG_STATS(fmt, ...) ARK_TRACE_ON(fmt, ##__VA_ARGS__)
> +#else
> +#define ARK_DEBUG_STATS(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
> +#endif
> +
> +/* Debug macro for tracing full behavior*/
> +#ifdef RTE_LIBRTE_ARK_DEBUG_TRACE
> +#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_ON(fmt, ##__VA_ARGS__)
> +#else
> +#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
> +#endif
> +
> +#ifdef ARK_STD_LOG

How is this define passed? Should it be something like an
RTE_LIBRTE_ARK_DEBUG_DRIVER config option?
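
If so, a sketch of the usual wiring (the option name above is just a
suggestion): add

CONFIG_RTE_LIBRTE_ARK_DEBUG_DRIVER=n

to config/common_base next to the other ARK debug options, and guard the
fprintf variant on the generated RTE_LIBRTE_ARK_DEBUG_DRIVER define
instead of ARK_STD_LOG.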

> +#define PMD_DRV_LOG(level, fmt, args...) \
> +	fprintf(stderr, fmt, args)

It is possible to use the rte_log functions instead of fprintf to stderr.
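
For example, keeping the raw output format but routing it through the
logging framework (untested sketch):

#define PMD_DRV_LOG(level, fmt, args...) \
	RTE_LOG(level, PMD, fmt, ## args)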

> +#else
> +#define PMD_DRV_LOG(level, fmt, args...) \
> +	RTE_LOG(level, PMD, "%s(): " fmt, __func__, ## args)
> +#endif
> +
> +#endif

CONFIG_RTE_LIBRTE_ARK_DEBUG_RX and CONFIG_RTE_LIBRTE_ARK_DEBUG_TX are not
used; if so, they can be removed.

> diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
> new file mode 100644
> index 0000000..124b73c
> --- /dev/null
> +++ b/drivers/net/ark/ark_ethdev.c
> @@ -0,0 +1,281 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright (c) 2015-2017 Atomic Rules LLC
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in
> + * the documentation and/or other materials provided with the
> + * distribution.
> + * * Neither the name of copyright holder nor the names of its
> + * contributors may be used to endorse or promote products derived
> + * from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <unistd.h>
> +#include <sys/stat.h>
> +#include <dlfcn.h>
> +
> +#include <rte_kvargs.h>
> +#include <rte_vdev.h>
> +
> +#include "ark_global.h"
> +#include "ark_debug.h"
> +#include "ark_ethdev.h"
> +
> +/*  Internal prototypes */
> +static int eth_ark_check_args(const char *params);
> +static int eth_ark_dev_init(struct rte_eth_dev *dev);
> +static int eth_ark_dev_uninit(struct rte_eth_dev *eth_dev);
> +static int eth_ark_dev_configure(struct rte_eth_dev *dev);
> +static void eth_ark_dev_info_get(struct rte_eth_dev *dev,
> +				 struct rte_eth_dev_info *dev_info);
> +
> +
> +#define ARK_DEV_TO_PCI(eth_dev)			\
> +	RTE_DEV_TO_PCI((eth_dev)->device)
> +
> +#define ARK_MAX_ARG_LEN 256
> +static uint32_t pkt_dir_v;
> +static char pkt_gen_args[ARK_MAX_ARG_LEN];
> +static char pkt_chkr_args[ARK_MAX_ARG_LEN];
> +
> +#define ARK_PKTGEN_ARG "Pkt_gen"
> +#define ARK_PKTCHKR_ARG "Pkt_chkr"
> +#define ARK_PKTDIR_ARG "Pkt_dir"

Is it possible to add one-line comments to the device arguments? For example,
what "Pkt_dir" (packet director) is for?

> +
> +static const char * const valid_arguments[] = {
> +	ARK_PKTGEN_ARG,
> +	ARK_PKTCHKR_ARG,
> +	ARK_PKTDIR_ARG,
> +	"iface",

Why not make this one a define too?
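
e.g. (the name is just a suggestion):

#define ARK_IFACE_ARG "iface"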

> +	NULL
> +};
> +
> +#define MAX_ARK_PHYS 16
> +struct ark_adapter *gark[MAX_ARK_PHYS];
> +
> +static const struct rte_pci_id pci_id_ark_map[] = {
> +	{RTE_PCI_DEVICE(0x1d6c, 0x100d)},
> +	{RTE_PCI_DEVICE(0x1d6c, 0x100e)},
> +	{.vendor_id = 0, /* sentinel */ },
> +};
> +
> +static struct eth_driver rte_ark_pmd = {
> +	.pci_drv = {
> +		.probe = rte_eth_dev_pci_probe,
> +		.remove = rte_eth_dev_pci_remove,
> +		.id_table = pci_id_ark_map,
> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC
> +	},
> +	.eth_dev_init = eth_ark_dev_init,
> +	.eth_dev_uninit = eth_ark_dev_uninit,
> +	.dev_private_size = sizeof(struct ark_adapter),
> +};
> +
> +static const struct eth_dev_ops ark_eth_dev_ops = {
> +	.dev_configure = eth_ark_dev_configure,
> +	.dev_infos_get = eth_ark_dev_info_get,
> +

Extra line.

> +};
> +
> +

Extra line.

> +static int
> +eth_ark_dev_init(struct rte_eth_dev *dev __rte_unused)
> +{
> +	return -1;					/* STUB */

You may want to set ark_eth_dev_ops here, since the ops are already implemented.

And for a proper .dev_infos_get implementation, you may want to have [1] here:

[1]
rte_eth_copy_pci_info(eth_dev, pci_dev);
eth_dev->data->dev_flags |= RTE_ETH_DEV_DETACHABLE;

Also, you may want to parse the device arguments at this stage.
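
Putting those together, a minimal sketch of what the init could start
with (using the ARK_DEV_TO_PCI macro already defined in this file):

	struct rte_pci_device *pci_dev = ARK_DEV_TO_PCI(dev);

	dev->dev_ops = &ark_eth_dev_ops;
	rte_eth_copy_pci_info(dev, pci_dev);
	dev->data->dev_flags |= RTE_ETH_DEV_DETACHABLE;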

> +}
> +
> +
> +static int
> +eth_ark_dev_uninit(struct rte_eth_dev *dev)
> +{
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> +		return 0;
> +
> +	dev->dev_ops = NULL;
> +	dev->rx_pkt_burst = NULL;
> +	dev->tx_pkt_burst = NULL;
> +	return 0;
> +}
> +
> +static int
> +eth_ark_dev_configure(struct rte_eth_dev *dev __rte_unused)
> +{
> +	ARK_DEBUG_TRACE("ARKP: In %s\n", __func__);
> +	return 0;
> +}
> +
> +static void
> +eth_ark_dev_info_get(struct rte_eth_dev *dev,
> +		     struct rte_eth_dev_info *dev_info)
> +{
> +	/* device specific configuration */
> +	memset(dev_info, 0, sizeof(*dev_info));

The memset is not required, since it is already done by the ethdev
abstraction layer; in particular, the desc_lim values are overwritten
below anyway.

> +
> +	dev_info->max_rx_pktlen = (16 * 1024) - 128;
> +	dev_info->min_rx_bufsize = 1024;

Using macros instead of hardcoded values makes their meaning easier to understand.
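
e.g. (hypothetical names):

#define ARK_RX_MAX_PKT_LEN ((16 * 1024) - 128)
#define ARK_RX_MIN_BUFSIZE (1024)
...
	dev_info->max_rx_pktlen = ARK_RX_MAX_PKT_LEN;
	dev_info->min_rx_bufsize = ARK_RX_MIN_BUFSIZE;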

> +	dev_info->rx_offload_capa = 0;
> +	dev_info->tx_offload_capa = 0;
> +
> +	dev_info->rx_desc_lim = (struct rte_eth_desc_lim) {
> +		.nb_max = 4096 * 4,
> +		.nb_min = 512,	/* HW Q size for RX */
> +		.nb_align = 2,};
> +
> +	dev_info->tx_desc_lim = (struct rte_eth_desc_lim) {
> +		.nb_max = 4096 * 4,
> +		.nb_min = 256,	/* HW Q size for TX */
> +		.nb_align = 2,};
> +
> +	dev_info->rx_offload_capa = 0;
> +	dev_info->tx_offload_capa = 0;

Duplication; please check ~10 lines above.
Also, setting these to 0 is not required at all because of the memset.

> +
> +	/* ARK PMD supports all line rates, how do we indicate that here ?? */
> +	dev_info->speed_capa = (ETH_LINK_SPEED_1G |
> +				ETH_LINK_SPEED_10G |
> +				ETH_LINK_SPEED_25G |
> +				ETH_LINK_SPEED_40G |
> +				ETH_LINK_SPEED_50G |
> +				ETH_LINK_SPEED_100G);
> +	dev_info->pci_dev = ARK_DEV_TO_PCI(dev);
> +	dev_info->driver_name = dev->data->drv_name;

Setting driver_name is not required; the ethdev layer will overwrite this value.

And to get driver_name correct, rte_eth_copy_pci_info() should be
called; please check [1] above.

> +}
> +
> +
> +static inline int
> +process_pktdir_arg(const char *key, const char *value,
> +		   void *extra_args __rte_unused)
> +{
> +	ARK_DEBUG_TRACE("In process_pktdir_arg, key = %s, value = %s\n",
> +			key, value);

The general usage of DEBUG_TRACE is to provide a backtrace log, i.e.
function entrance/exit information. I guess that is why it is
controlled by a separate config option.
What you need here looks like regular debug logging, a PMD_DRV_LOG /
RTE_LOG variant.

> +	pkt_dir_v = strtol(value, NULL, 16);
> +	ARK_DEBUG_TRACE("pkt_dir_v = 0x%x\n", pkt_dir_v);
> +	return 0;
> +}
> +
> +static inline int
> +process_file_args(const char *key, const char *value, void *extra_args)
> +{
> +	ARK_DEBUG_TRACE("**** IN process_pktgen_arg, key = %s, value = %s\n",
> +			key, value);
> +	char *args = (char *)extra_args;
> +
> +	/* Open the configuration file */
> +	FILE *file = fopen(value, "r");
> +	char line[256];
> +	int first = 1;
> +
> +	while (fgets(line, sizeof(line), file)) {
> +		/* ARK_DEBUG_TRACE("%s\n", line); */

Please remove dead code.

> +		if (first) {
> +			strncpy(args, line, ARK_MAX_ARG_LEN);

Can this overflow the args variable? Is there any way to prevent a possible crash?
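
A bounded sketch that also guarantees NUL termination:

	if (first) {
		strncpy(args, line, ARK_MAX_ARG_LEN - 1);
		args[ARK_MAX_ARG_LEN - 1] = '\0';
		first = 0;
	} else {
		strncat(args, line,
			ARK_MAX_ARG_LEN - strlen(args) - 1);
	}

Note that fopen() above can also return NULL, which would crash the
fgets() loop; a NULL check with an error return seems needed as well.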

> +			first = 0;
> +		} else {
> +			strncat(args, line, ARK_MAX_ARG_LEN);
> +		}
> +	}
> +	ARK_DEBUG_TRACE("file = %s\n", args);
> +	fclose(file);
> +	return 0;
> +}
> +
> +static int
> +eth_ark_check_args(const char *params)
> +{
> +	struct rte_kvargs *kvlist;
> +	unsigned int k_idx;
> +	struct rte_kvargs_pair *pair = NULL;
> +
> +	kvlist = rte_kvargs_parse(params, valid_arguments);
> +	if (kvlist == NULL)
> +		return 0;
> +
> +	pkt_gen_args[0] = 0;
> +	pkt_chkr_args[0] = 0;
> +
> +	for (k_idx = 0; k_idx < kvlist->count; k_idx++) {
> +		pair = &kvlist->pairs[k_idx];
> +		ARK_DEBUG_TRACE("**** Arg passed to PMD = %s:%s\n", pair->key,
> +				pair->value);
> +	}
> +
> +	if (rte_kvargs_process(kvlist,
> +			       ARK_PKTDIR_ARG,
> +			       &process_pktdir_arg,
> +			       NULL) != 0) {
> +		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTDIR_ARG);
> +	}
> +
> +	if (rte_kvargs_process(kvlist,
> +			       ARK_PKTGEN_ARG,
> +			       &process_file_args,
> +			       pkt_gen_args) != 0) {
> +		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTGEN_ARG);
> +	}
> +
> +	if (rte_kvargs_process(kvlist,
> +			       ARK_PKTCHKR_ARG,
> +			       &process_file_args,
> +			       pkt_chkr_args) != 0) {
> +		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTCHKR_ARG);
> +	}

Not processing the "iface" device argument?
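
If it is meant to be handled, something like the following
(process_iface_arg is a hypothetical handler):

	if (rte_kvargs_process(kvlist,
			       "iface",
			       &process_iface_arg,
			       NULL) != 0) {
		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", "iface");
	}

Otherwise it could be dropped from valid_arguments.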

> +
> +	ARK_DEBUG_TRACE("INFO: packet director set to 0x%x\n", pkt_dir_v);
> +
> +	return 1;
> +}
> +
> +
> +/*
> + * PCIE

Can you elaborate on this comment?

> + */
> +static int
> +pmd_ark_probe(const char *name, const char *params)
> +{
> +	RTE_LOG(INFO, PMD, "Initializing pmd_ark for %s params = %s\n", name,
> +		params);
> +
> +	/* Parse off the v index */
> +
> +	eth_ark_check_args(params);
> +	return 0;
> +}
> +
> +static int
> +pmd_ark_remove(const char *name)
> +{
> +	RTE_LOG(INFO, PMD, "Closing ark %s ethdev on numa socket %u\n", name,
> +		rte_socket_id());
> +	return 1;
> +}
> +
> +static struct rte_vdev_driver pmd_ark_drv = {
> +	.probe = pmd_ark_probe,
> +	.remove = pmd_ark_remove,
> +};

Sorry, I am confused here.
Why do both virtual and physical initialization routines exist together?
This PMD is for a physical PCI device, right?
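
If the vdev path is not needed, a sketch keeping only the PCI
registration (and making the registered name consistently net_ark):

RTE_PMD_REGISTER_PCI(net_ark, rte_ark_pmd.pci_drv);
RTE_PMD_REGISTER_PCI_TABLE(net_ark, pci_id_ark_map);
RTE_PMD_REGISTER_KMOD_DEP(net_ark, "* igb_uio | uio_pci_generic");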

> +
> +RTE_PMD_REGISTER_VDEV(net_ark, pmd_ark_drv);
> +RTE_PMD_REGISTER_ALIAS(net_ark, eth_ark);
> +RTE_PMD_REGISTER_PCI(eth_ark, rte_ark_pmd.pci_drv);
> +RTE_PMD_REGISTER_KMOD_DEP(net_ark, "* igb_uio | uio_pci_generic ");
> +RTE_PMD_REGISTER_PCI_TABLE(eth_ark, pci_id_ark_map);

You can add the RTE_PMD_REGISTER_PARAM_STRING macro.
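
A sketch (the value descriptions are assumptions):

RTE_PMD_REGISTER_PARAM_STRING(net_ark,
			      ARK_PKTGEN_ARG "=<config file> "
			      ARK_PKTCHKR_ARG "=<config file> "
			      ARK_PKTDIR_ARG "=<bitmap>");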

> diff --git a/drivers/net/ark/ark_ethdev.h b/drivers/net/ark/ark_ethdev.h
> new file mode 100644
> index 0000000..08d7fb1
> --- /dev/null
> +++ b/drivers/net/ark/ark_ethdev.h
> @@ -0,0 +1,39 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright (c) 2015-2017 Atomic Rules LLC
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in
> + * the documentation and/or other materials provided with the
> + * distribution.
> + * * Neither the name of copyright holder nor the names of its
> + * contributors may be used to endorse or promote products derived
> + * from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _ARK_ETHDEV_H_
> +#define _ARK_ETHDEV_H_
> +
> +/* STUB */
> +
> +#endif
> diff --git a/drivers/net/ark/ark_global.h b/drivers/net/ark/ark_global.h
> new file mode 100644
> index 0000000..7cd62d5
> --- /dev/null
> +++ b/drivers/net/ark/ark_global.h
> @@ -0,0 +1,108 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright (c) 2015-2017 Atomic Rules LLC
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in
> + * the documentation and/or other materials provided with the
> + * distribution.
> + * * Neither the name of copyright holder nor the names of its
> + * contributors may be used to endorse or promote products derived
> + * from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _ARK_GLOBAL_H_
> +#define _ARK_GLOBAL_H_
> +
> +#include <time.h>
> +#include <assert.h>
> +
> +#include <rte_mbuf.h>
> +#include <rte_ethdev.h>
> +#include <rte_malloc.h>
> +#include <rte_memcpy.h>
> +#include <rte_string_fns.h>
> +#include <rte_cycles.h>
> +#include <rte_kvargs.h>
> +#include <rte_dev.h>
> +#include <rte_version.h>
> +
> +#define ETH_ARK_ARG_MAXLEN	64
> +#define ARK_SYSCTRL_BASE  0x0
> +#define ARK_PKTGEN_BASE   0x10000
> +#define ARK_MPU_RX_BASE   0x20000
> +#define ARK_UDM_BASE      0x30000
> +#define ARK_MPU_TX_BASE   0x40000
> +#define ARK_DDM_BASE      0x60000
> +#define ARK_CMAC_BASE     0x80000
> +#define ARK_PKTDIR_BASE   0xa0000
> +#define ARK_PKTCHKR_BASE  0x90000
> +#define ARK_RCPACING_BASE 0xb0000
> +#define ARK_EXTERNAL_BASE 0x100000
> +#define ARK_MPU_QOFFSET   0x00100
> +#define ARK_MAX_PORTS     8
> +
> +#define offset8(n)     n
> +#define offset16(n)   ((n) / 2)
> +#define offset32(n)   ((n) / 4)
> +#define offset64(n)   ((n) / 8)
> +
> +/*
> + * Structure to store private data for each PF/VF instance.
> + */
> +#define def_ptr(type, name) \
> +	union type {		   \
> +		uint64_t *t64;	   \
> +		uint32_t *t32;	   \
> +		uint16_t *t16;	   \
> +		uint8_t  *t8;	   \
> +		void     *v;	   \
> +	} name
> +
> +struct ark_port {
> +	struct rte_eth_dev *eth_dev;
> +	int id;
> +};
> +
> +struct ark_adapter {
> +	/* User extension private data */
> +	void *user_data;
> +
> +	struct ark_port port[ARK_MAX_PORTS];
> +	int num_ports;
> +
> +	/* Common for both PF and VF */
> +	struct rte_eth_dev *eth_dev;
> +
> +	void *d_handle;
> +
> +	/* Our Bar 0 */
> +	uint8_t *bar0;
> +
> +	/* Application Bar */
> +	uint8_t *a_bar;
> +};
> +
> +typedef uint32_t *ark_t;
> +
> +#endif
> diff --git a/drivers/net/ark/rte_pmd_ark_version.map b/drivers/net/ark/rte_pmd_ark_version.map
> new file mode 100644
> index 0000000..7f84780
> --- /dev/null
> +++ b/drivers/net/ark/rte_pmd_ark_version.map
> @@ -0,0 +1,4 @@
> +DPDK_2.0 {

This should be the release version, DPDK_17.05, i.e.:
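
DPDK_17.05 {
	local: *;
};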

> +	 local: *;
> +
> +};
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index 0e0b600..da23898 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -104,6 +104,7 @@ ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
>  # plugins (link only if static libraries)
>  
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET)  += -lrte_pmd_af_packet
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_ARK_PMD)        += -lrte_pmd_ark
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD)      += -lrte_pmd_bnx2x -lz
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_BNXT_PMD)       += -lrte_pmd_bnxt
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)       += -lrte_pmd_bond
> 

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v3 1/7] net/ark: PMD for Atomic Rules Arkville driver stub
@ 2017-03-21 21:43  3% Ed Czeck
  2017-03-22 18:16  0% ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Ed Czeck @ 2017-03-21 21:43 UTC (permalink / raw)
  To: dev; +Cc: Ed Czeck

Enable Arkville on supported configurations
Add overview documentation
Minimum driver support for valid compile


Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
---
 MAINTAINERS                                 |   8 +
 config/common_base                          |  11 ++
 config/defconfig_arm-armv7a-linuxapp-gcc    |   1 +
 config/defconfig_ppc_64-power8-linuxapp-gcc |   1 +
 doc/guides/nics/ark.rst                     | 237 +++++++++++++++++++++++
 doc/guides/nics/features/ark.ini            |  15 ++
 doc/guides/nics/index.rst                   |   1 +
 drivers/net/Makefile                        |   1 +
 drivers/net/ark/Makefile                    |  63 +++++++
 drivers/net/ark/ark_debug.h                 |  74 ++++++++
 drivers/net/ark/ark_ethdev.c                | 281 ++++++++++++++++++++++++++++
 drivers/net/ark/ark_ethdev.h                |  39 ++++
 drivers/net/ark/ark_global.h                | 108 +++++++++++
 drivers/net/ark/rte_pmd_ark_version.map     |   4 +
 mk/rte.app.mk                               |   1 +
 15 files changed, 845 insertions(+)
 create mode 100644 doc/guides/nics/ark.rst
 create mode 100644 doc/guides/nics/features/ark.ini
 create mode 100644 drivers/net/ark/Makefile
 create mode 100644 drivers/net/ark/ark_debug.h
 create mode 100644 drivers/net/ark/ark_ethdev.c
 create mode 100644 drivers/net/ark/ark_ethdev.h
 create mode 100644 drivers/net/ark/ark_global.h
 create mode 100644 drivers/net/ark/rte_pmd_ark_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 0c78b58..8043d75 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -278,6 +278,14 @@ M: Evgeny Schemeilin <evgenys@amazon.com>
 F: drivers/net/ena/
 F: doc/guides/nics/ena.rst
 
+Atomic Rules ark
+M: Shepard Siegel <shepard.siegel@atomicrules.com>
+M: Ed Czeck       <ed.czeck@atomicrules.com>
+M: John Miller    <john.miller@atomicrules.com>
+F: /drivers/net/ark/
+F: doc/guides/nics/ark.rst
+F: doc/guides/nics/features/ark.ini
+
 Broadcom bnxt
 M: Stephen Hurd <stephen.hurd@broadcom.com>
 M: Ajit Khaparde <ajit.khaparde@broadcom.com>
diff --git a/config/common_base b/config/common_base
index 37aa1e1..0916c44 100644
--- a/config/common_base
+++ b/config/common_base
@@ -353,6 +353,17 @@ CONFIG_RTE_LIBRTE_QEDE_FW=""
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
 #
+# Compile ARK PMD
+#
+CONFIG_RTE_LIBRTE_ARK_PMD=y
+CONFIG_RTE_LIBRTE_ARK_PAD_TX=y
+CONFIG_RTE_LIBRTE_ARK_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE=n
+
+
+#
 # Compile the TAP PMD
 # It is enabled by default for Linux only.
 #
diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc
index d9bd2a8..6d2b5e0 100644
--- a/config/defconfig_arm-armv7a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7a-linuxapp-gcc
@@ -61,6 +61,7 @@ CONFIG_RTE_SCHED_VECTOR=n
 
 # cannot use those on ARM
 CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_ARK_PMD=n
 CONFIG_RTE_LIBRTE_EM_PMD=n
 CONFIG_RTE_LIBRTE_IGB_PMD=n
 CONFIG_RTE_LIBRTE_CXGBE_PMD=n
diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
index 35f7fb6..89bc396 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -48,6 +48,7 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
 
 # Note: Initially, all of the PMD drivers compilation are turned off on Power
 # Will turn on them only after the successful testing on Power
+CONFIG_RTE_LIBRTE_ARK_PMD=n
 CONFIG_RTE_LIBRTE_IXGBE_PMD=n
 CONFIG_RTE_LIBRTE_I40E_PMD=n
 CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
diff --git a/doc/guides/nics/ark.rst b/doc/guides/nics/ark.rst
new file mode 100644
index 0000000..72fb8d6
--- /dev/null
+++ b/doc/guides/nics/ark.rst
@@ -0,0 +1,237 @@
+.. BSD LICENSE
+
+    Copyright (c) 2015-2017 Atomic Rules LLC
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Atomic Rules LLC nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ARK Poll Mode Driver
+====================
+
+The ARK PMD is a DPDK poll-mode driver for the Atomic Rules Arkville
+(ARK) family of devices.
+
+More information can be found at the `Atomic Rules website
+<http://atomicrules.com>`_.
+
+Overview
+--------
+
+The Atomic Rules Arkville product is a DPDK and AXI compliant product
+that marshals packets across a PCIe conduit between host DPDK mbufs and
+FPGA AXI streams.
+
+The ARK PMD, and the spirit of the overall Arkville product,
+has been to take the DPDK API/ABI as a fixed specification;
+then implement much of the business logic in FPGA RTL circuits.
+The approach of *working backwards* from the DPDK API/ABI and having
+the GPP host software *dictate*, while the FPGA hardware *copes*,
+results in significant performance gains over a naive implementation.
+
+While this document describes the ARK PMD software, it is helpful to
+understand what the FPGA hardware is and is not. The Arkville RTL
+component provides a single PCIe Physical Function (PF) supporting
+some number of RX/Ingress and TX/Egress Queues. The ARK PMD controls
+the Arkville core through a dedicated opaque Core BAR (CBAR).
+To allow users full freedom for their own FPGA application IP,
+an independent FPGA Application BAR (ABAR) is provided.
+
+One popular way to imagine Arkville's FPGA hardware aspect is as the
+FPGA PCIe-facing side of a so-called Smart NIC. The Arkville core does
+not contain any MACs, and is link-speed independent, as well as
+agnostic to the number of physical ports the application chooses to
+use. The ARK driver exposes the familiar PMD interface to allow packet
+movement to and from mbufs across multiple queues.
+
+However FPGA RTL applications could contain a universe of added
+functionality that an Arkville RTL core does not provide or can
+not anticipate. To allow for this expectation of user-defined
+innovation, the ARK PMD provides a dynamic mechanism of adding
+capabilities without having to modify the ARK PMD.
+
+The ARK PMD is intended to support all instances of the Arkville
+RTL Core, regardless of configuration, FPGA vendor, or target
+board. While specific capabilities such as the number of physical
+hardware queue-pairs are negotiated, the driver is designed to
+remain constant over a broad and extendable feature set.
+
+Intentionally, Arkville by itself DOES NOT provide common NIC
+capabilities such as offload or receive-side scaling (RSS).
+These capabilities would be viewed as a gate-level "tax" on
+Green-box FPGA applications that do not require such function.
+Instead, they can be added as needed with essentially no
+overhead to the FPGA Application.
+
+Data Path Interface
+-------------------
+
+Ingress RX and Egress TX operation is via the nominal DPDK API.
+The driver supports single-port, multi-queue for both RX and TX.
+
+Refer to ``ark_ethdev.h`` for the list of supported methods to
+act upon RX and TX Queues.
+
+Configuration Information
+-------------------------
+
+**DPDK Configuration Parameters**
+
+  The following configuration options are available for the ARK PMD:
+
+   * **CONFIG_RTE_LIBRTE_ARK_PMD** (default y): Enables or disables inclusion
+     of the ARK PMD driver in the DPDK compilation.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_RX** (default n): Enables or disables debug
+     logging and internal checking of RX ingress logic within the ARK PMD driver.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TX** (default n): Enables or disables debug
+     logging and internal checking of TX egress logic within the ARK PMD driver.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS** (default n): Enables or disables debug
+     logging of detailed packet and performance statistics gathered in
+     the PMD and FPGA.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE** (default n): Enables or disables debug
+     logging of detailed PMD events and status.
+
+
+Building DPDK
+-------------
+
+See the :ref:`DPDK Getting Started Guide for Linux <linux_gsg>` for
+instructions on how to build DPDK.
+
+By default the ARK PMD library will be built into the DPDK library.
+
+For configuring and using UIO and VFIO frameworks, please also refer to
+:ref:`the documentation that comes with the DPDK suite <linux_gsg>`.
+
+Supported ARK RTL PCIe Instances
+--------------------------------
+
+ARK PMD supports the following Arkville RTL PCIe instances including:
+
+* ``1d6c:100d`` - AR-ARKA-FX0 [Arkville 32B DPDK Data Mover]
+* ``1d6c:100e`` - AR-ARKA-FX1 [Arkville 64B DPDK Data Mover]
+
+Supported Operating Systems
+---------------------------
+
+Any Linux distribution fulfilling the conditions described in the ``System Requirements``
+section of :ref:`the DPDK documentation <linux_gsg>`; also refer to the *DPDK Release Notes*.
+
+Supported Features
+------------------
+
+* Dynamic ARK PMD extensions
+* Multiple receive and transmit queues
+* Jumbo frames up to 9K
+* Hardware Statistics
+
+Unsupported Features
+--------------------
+
+Features that may be part of, or become part of, the Arkville RTL IP that are
+not currently supported or exposed by the ARK PMD include:
+
+* PCIe SR-IOV Virtual Functions (VFs)
+* Arkville's Packet Generator Control and Status
+* Arkville's Packet Director Control and Status
+* Arkville's Packet Checker Control and Status
+* Arkville's Timebase Management
+
+Pre-Requisites
+--------------
+
+#. Prepare the system as recommended by the DPDK suite.  This includes environment
+   variables, hugepages configuration, tool-chains and configuration
+
+#. Insert igb_uio kernel module using the command 'modprobe igb_uio'
+
+#. Bind the intended ARK device to igb_uio module
+
+At this point the system should be ready to run DPDK applications. Once the
+application runs to completion, the ARK PMD can be detached from igb_uio if necessary.
+
+Usage Example
+-------------
+
+This section demonstrates how to launch **testpmd** with Atomic Rules ARK
+devices managed by librte_pmd_ark.
+
+#. Load the kernel modules:
+
+   .. code-block:: console
+
+      modprobe uio
+      insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
+
+   .. note::
+
+      The ARK PMD driver depends upon the igb_uio user space I/O kernel module
+
+#. Mount and request huge pages:
+
+   .. code-block:: console
+
+      mount -t hugetlbfs nodev /mnt/huge
+      echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
+
+#. Bind UIO driver to ARK device at 0000:01:00.0 (using dpdk-devbind.py):
+
+   .. code-block:: console
+
+      ./usertools/dpdk-devbind.py --bind=igb_uio 0000:01:00.0
+
+   .. note::
+
+      The last argument to dpdk-devbind.py is the 4-tuple that identifies a specific PCIe
+      device. You can use lspci -d 1d6c: to identify all Atomic Rules devices in the system,
+      and thus determine the correct 4-tuple argument to dpdk-devbind.py
+
+#. Start testpmd with basic parameters:
+
+   .. code-block:: console
+
+      ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 -- -i
+
+   Example output:
+
+   .. code-block:: console
+
+      [...]
+      EAL: PCI device 0000:01:00.0 on NUMA socket -1
+      EAL:   probe driver: 1d6c:100e rte_ark_pmd
+      EAL:   PCI memory mapped at 0x7f9b6c400000
+      PMD: eth_ark_dev_init(): Initializing 0:2:0.1
+      ARKP PMD CommitID: 378f3a67
+      Configuring Port 0 (socket 0)
+      Port 0: DC:3C:F6:00:00:01
+      Checking link statuses...
+      Port 0 Link Up - speed 100000 Mbps - full-duplex
+      Done
+      testpmd>
diff --git a/doc/guides/nics/features/ark.ini b/doc/guides/nics/features/ark.ini
new file mode 100644
index 0000000..dc8a0e2
--- /dev/null
+++ b/doc/guides/nics/features/ark.ini
@@ -0,0 +1,15 @@
+;
+; Supported features of the 'ark' poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Queue start/stop     = Y
+Jumbo frame          = Y
+Scattered Rx         = Y
+Basic stats          = Y
+Stats per queue      = Y
+FW version           = Y
+Linux UIO            = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 87f9334..381d82c 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     :numbered:
 
     overview
+    ark
     bnx2x
     bnxt
     cxgbe
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index a16f25e..ea9868b 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -32,6 +32,7 @@
 include $(RTE_SDK)/mk/rte.vars.mk
 
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += af_packet
+DIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark
 DIRS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += bonding
 DIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe
diff --git a/drivers/net/ark/Makefile b/drivers/net/ark/Makefile
new file mode 100644
index 0000000..217bd34
--- /dev/null
+++ b/drivers/net/ark/Makefile
@@ -0,0 +1,63 @@
+# BSD LICENSE
+#
+# Copyright (c) 2015-2017 Atomic Rules LLC
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of copyright holder nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_ark.a
+
+CFLAGS += -O3 -I./
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_ark_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD)
+#
+
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_ethdev.c
+
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_kvargs
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_mempool
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/libpthread
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/libdl
+
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/ark/ark_debug.h b/drivers/net/ark/ark_debug.h
new file mode 100644
index 0000000..a108c28
--- /dev/null
+++ b/drivers/net/ark/ark_debug.h
@@ -0,0 +1,74 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_DEBUG_H_
+#define _ARK_DEBUG_H_
+
+#include <inttypes.h>
+#include <rte_log.h>
+
+/* Format specifiers for string data pairs */
+#define ARK_SU32  "\n\t%-20s    %'20" PRIu32
+#define ARK_SU64  "\n\t%-20s    %'20" PRIu64
+#define ARK_SU64X "\n\t%-20s    %#20" PRIx64
+#define ARK_SPTR  "\n\t%-20s    %20p"
+
+#define ARK_TRACE_ON(fmt, ...) \
+	PMD_DRV_LOG(ERR, fmt, ##__VA_ARGS__)
+
+#define ARK_TRACE_OFF(fmt, ...) \
+	do {if (0) PMD_DRV_LOG(ERR, fmt, ##__VA_ARGS__); } while (0)
+
+/* Debug macro for reporting Packet stats */
+#ifdef RTE_LIBRTE_ARK_DEBUG_STATS
+#define ARK_DEBUG_STATS(fmt, ...) ARK_TRACE_ON(fmt, ##__VA_ARGS__)
+#else
+#define ARK_DEBUG_STATS(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
+#endif
+
+/* Debug macro for tracing full behavior*/
+#ifdef RTE_LIBRTE_ARK_DEBUG_TRACE
+#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_ON(fmt, ##__VA_ARGS__)
+#else
+#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
+#endif
+
+#ifdef ARK_STD_LOG
+#define PMD_DRV_LOG(level, fmt, args...) \
+	fprintf(stderr, fmt, args)
+#else
+#define PMD_DRV_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s(): " fmt, __func__, ## args)
+#endif
+
+#endif
diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
new file mode 100644
index 0000000..124b73c
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev.c
@@ -0,0 +1,281 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <sys/stat.h>
+#include <dlfcn.h>
+
+#include <rte_kvargs.h>
+#include <rte_vdev.h>
+
+#include "ark_global.h"
+#include "ark_debug.h"
+#include "ark_ethdev.h"
+
+/*  Internal prototypes */
+static int eth_ark_check_args(const char *params);
+static int eth_ark_dev_init(struct rte_eth_dev *dev);
+static int eth_ark_dev_uninit(struct rte_eth_dev *eth_dev);
+static int eth_ark_dev_configure(struct rte_eth_dev *dev);
+static void eth_ark_dev_info_get(struct rte_eth_dev *dev,
+				 struct rte_eth_dev_info *dev_info);
+
+
+#define ARK_DEV_TO_PCI(eth_dev)			\
+	RTE_DEV_TO_PCI((eth_dev)->device)
+
+#define ARK_MAX_ARG_LEN 256
+static uint32_t pkt_dir_v;
+static char pkt_gen_args[ARK_MAX_ARG_LEN];
+static char pkt_chkr_args[ARK_MAX_ARG_LEN];
+
+#define ARK_PKTGEN_ARG "Pkt_gen"
+#define ARK_PKTCHKR_ARG "Pkt_chkr"
+#define ARK_PKTDIR_ARG "Pkt_dir"
+
+static const char * const valid_arguments[] = {
+	ARK_PKTGEN_ARG,
+	ARK_PKTCHKR_ARG,
+	ARK_PKTDIR_ARG,
+	"iface",
+	NULL
+};
+
+#define MAX_ARK_PHYS 16
+struct ark_adapter *gark[MAX_ARK_PHYS];
+
+static const struct rte_pci_id pci_id_ark_map[] = {
+	{RTE_PCI_DEVICE(0x1d6c, 0x100d)},
+	{RTE_PCI_DEVICE(0x1d6c, 0x100e)},
+	{.vendor_id = 0, /* sentinel */ },
+};
+
+static struct eth_driver rte_ark_pmd = {
+	.pci_drv = {
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+		.id_table = pci_id_ark_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC
+	},
+	.eth_dev_init = eth_ark_dev_init,
+	.eth_dev_uninit = eth_ark_dev_uninit,
+	.dev_private_size = sizeof(struct ark_adapter),
+};
+
+static const struct eth_dev_ops ark_eth_dev_ops = {
+	.dev_configure = eth_ark_dev_configure,
+	.dev_infos_get = eth_ark_dev_info_get,
+
+};
+
+
+static int
+eth_ark_dev_init(struct rte_eth_dev *dev __rte_unused)
+{
+	return -1;					/* STUB */
+}
+
+
+static int
+eth_ark_dev_uninit(struct rte_eth_dev *dev)
+{
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	dev->dev_ops = NULL;
+	dev->rx_pkt_burst = NULL;
+	dev->tx_pkt_burst = NULL;
+	return 0;
+}
+
+static int
+eth_ark_dev_configure(struct rte_eth_dev *dev __rte_unused)
+{
+	ARK_DEBUG_TRACE("ARKP: In %s\n", __func__);
+	return 0;
+}
+
+static void
+eth_ark_dev_info_get(struct rte_eth_dev *dev,
+		     struct rte_eth_dev_info *dev_info)
+{
+	/* device specific configuration */
+	memset(dev_info, 0, sizeof(*dev_info));
+
+	dev_info->max_rx_pktlen = (16 * 1024) - 128;
+	dev_info->min_rx_bufsize = 1024;
+	dev_info->rx_offload_capa = 0;
+	dev_info->tx_offload_capa = 0;
+
+	dev_info->rx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = 4096 * 4,
+		.nb_min = 512,	/* HW Q size for RX */
+		.nb_align = 2,};
+
+	dev_info->tx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = 4096 * 4,
+		.nb_min = 256,	/* HW Q size for TX */
+		.nb_align = 2,};
+
+	dev_info->rx_offload_capa = 0;
+	dev_info->tx_offload_capa = 0;
+
+	/* ARK PMD supports all line rates, how do we indicate that here ?? */
+	dev_info->speed_capa = (ETH_LINK_SPEED_1G |
+				ETH_LINK_SPEED_10G |
+				ETH_LINK_SPEED_25G |
+				ETH_LINK_SPEED_40G |
+				ETH_LINK_SPEED_50G |
+				ETH_LINK_SPEED_100G);
+	dev_info->pci_dev = ARK_DEV_TO_PCI(dev);
+	dev_info->driver_name = dev->data->drv_name;
+}
+
+
+static inline int
+process_pktdir_arg(const char *key, const char *value,
+		   void *extra_args __rte_unused)
+{
+	ARK_DEBUG_TRACE("In process_pktdir_arg, key = %s, value = %s\n",
+			key, value);
+	pkt_dir_v = strtol(value, NULL, 16);
+	ARK_DEBUG_TRACE("pkt_dir_v = 0x%x\n", pkt_dir_v);
+	return 0;
+}
+
+static inline int
+process_file_args(const char *key, const char *value, void *extra_args)
+{
+	ARK_DEBUG_TRACE("**** IN process_pktgen_arg, key = %s, value = %s\n",
+			key, value);
+	char *args = (char *)extra_args;
+
+	/* Open the configuration file */
+	FILE *file = fopen(value, "r");
+	char line[256];
+	int first = 1;
+
+	while (fgets(line, sizeof(line), file)) {
+		/* ARK_DEBUG_TRACE("%s\n", line); */
+		if (first) {
+			strncpy(args, line, ARK_MAX_ARG_LEN);
+			first = 0;
+		} else {
+			strncat(args, line, ARK_MAX_ARG_LEN);
+		}
+	}
+	ARK_DEBUG_TRACE("file = %s\n", args);
+	fclose(file);
+	return 0;
+}
+
+static int
+eth_ark_check_args(const char *params)
+{
+	struct rte_kvargs *kvlist;
+	unsigned int k_idx;
+	struct rte_kvargs_pair *pair = NULL;
+
+	kvlist = rte_kvargs_parse(params, valid_arguments);
+	if (kvlist == NULL)
+		return 0;
+
+	pkt_gen_args[0] = 0;
+	pkt_chkr_args[0] = 0;
+
+	for (k_idx = 0; k_idx < kvlist->count; k_idx++) {
+		pair = &kvlist->pairs[k_idx];
+		ARK_DEBUG_TRACE("**** Arg passed to PMD = %s:%s\n", pair->key,
+				pair->value);
+	}
+
+	if (rte_kvargs_process(kvlist,
+			       ARK_PKTDIR_ARG,
+			       &process_pktdir_arg,
+			       NULL) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTDIR_ARG);
+	}
+
+	if (rte_kvargs_process(kvlist,
+			       ARK_PKTGEN_ARG,
+			       &process_file_args,
+			       pkt_gen_args) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTGEN_ARG);
+	}
+
+	if (rte_kvargs_process(kvlist,
+			       ARK_PKTCHKR_ARG,
+			       &process_file_args,
+			       pkt_chkr_args) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTCHKR_ARG);
+	}
+
+	ARK_DEBUG_TRACE("INFO: packet director set to 0x%x\n", pkt_dir_v);
+
+	return 1;
+}
+
+
+/*
+ * PCIE
+ */
+static int
+pmd_ark_probe(const char *name, const char *params)
+{
+	RTE_LOG(INFO, PMD, "Initializing pmd_ark for %s params = %s\n", name,
+		params);
+
+	/* Parse off the v index */
+
+	eth_ark_check_args(params);
+	return 0;
+}
+
+static int
+pmd_ark_remove(const char *name)
+{
+	RTE_LOG(INFO, PMD, "Closing ark %s ethdev on numa socket %u\n", name,
+		rte_socket_id());
+	return 1;
+}
+
+static struct rte_vdev_driver pmd_ark_drv = {
+	.probe = pmd_ark_probe,
+	.remove = pmd_ark_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_ark, pmd_ark_drv);
+RTE_PMD_REGISTER_ALIAS(net_ark, eth_ark);
+RTE_PMD_REGISTER_PCI(eth_ark, rte_ark_pmd.pci_drv);
+RTE_PMD_REGISTER_KMOD_DEP(net_ark, "* igb_uio | uio_pci_generic ");
+RTE_PMD_REGISTER_PCI_TABLE(eth_ark, pci_id_ark_map);
diff --git a/drivers/net/ark/ark_ethdev.h b/drivers/net/ark/ark_ethdev.h
new file mode 100644
index 0000000..08d7fb1
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev.h
@@ -0,0 +1,39 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_ETHDEV_H_
+#define _ARK_ETHDEV_H_
+
+/* STUB */
+
+#endif
diff --git a/drivers/net/ark/ark_global.h b/drivers/net/ark/ark_global.h
new file mode 100644
index 0000000..7cd62d5
--- /dev/null
+++ b/drivers/net/ark/ark_global.h
@@ -0,0 +1,108 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_GLOBAL_H_
+#define _ARK_GLOBAL_H_
+
+#include <time.h>
+#include <assert.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_malloc.h>
+#include <rte_memcpy.h>
+#include <rte_string_fns.h>
+#include <rte_cycles.h>
+#include <rte_kvargs.h>
+#include <rte_dev.h>
+#include <rte_version.h>
+
+#define ETH_ARK_ARG_MAXLEN	64
+#define ARK_SYSCTRL_BASE  0x0
+#define ARK_PKTGEN_BASE   0x10000
+#define ARK_MPU_RX_BASE   0x20000
+#define ARK_UDM_BASE      0x30000
+#define ARK_MPU_TX_BASE   0x40000
+#define ARK_DDM_BASE      0x60000
+#define ARK_CMAC_BASE     0x80000
+#define ARK_PKTDIR_BASE   0xa0000
+#define ARK_PKTCHKR_BASE  0x90000
+#define ARK_RCPACING_BASE 0xb0000
+#define ARK_EXTERNAL_BASE 0x100000
+#define ARK_MPU_QOFFSET   0x00100
+#define ARK_MAX_PORTS     8
+
+#define offset8(n)     n
+#define offset16(n)   ((n) / 2)
+#define offset32(n)   ((n) / 4)
+#define offset64(n)   ((n) / 8)
+
+/*
+ * Structure to store private data for each PF/VF instance.
+ */
+#define def_ptr(type, name) \
+	union type {		   \
+		uint64_t *t64;	   \
+		uint32_t *t32;	   \
+		uint16_t *t16;	   \
+		uint8_t  *t8;	   \
+		void     *v;	   \
+	} name
+
+struct ark_port {
+	struct rte_eth_dev *eth_dev;
+	int id;
+};
+
+struct ark_adapter {
+	/* User extension private data */
+	void *user_data;
+
+	struct ark_port port[ARK_MAX_PORTS];
+	int num_ports;
+
+	/* Common for both PF and VF */
+	struct rte_eth_dev *eth_dev;
+
+	void *d_handle;
+
+	/* Our Bar 0 */
+	uint8_t *bar0;
+
+	/* Application Bar */
+	uint8_t *a_bar;
+};
+
+typedef uint32_t *ark_t;
+
+#endif
diff --git a/drivers/net/ark/rte_pmd_ark_version.map b/drivers/net/ark/rte_pmd_ark_version.map
new file mode 100644
index 0000000..7f84780
--- /dev/null
+++ b/drivers/net/ark/rte_pmd_ark_version.map
@@ -0,0 +1,4 @@
+DPDK_2.0 {
+	 local: *;
+
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 0e0b600..da23898 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -104,6 +104,7 @@ ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
 # plugins (link only if static libraries)
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET)  += -lrte_pmd_af_packet
+_LDLIBS-$(CONFIG_RTE_LIBRTE_ARK_PMD)        += -lrte_pmd_ark
 _LDLIBS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD)      += -lrte_pmd_bnx2x -lz
 _LDLIBS-$(CONFIG_RTE_LIBRTE_BNXT_PMD)       += -lrte_pmd_bnxt
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)       += -lrte_pmd_bond
-- 
1.9.1

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2] net/ark: poll-mode driver for AtomicRules Arkville
@ 2017-03-20 21:14  1% Ed Czeck
  0 siblings, 0 replies; 200+ results
From: Ed Czeck @ 2017-03-20 21:14 UTC (permalink / raw)
  To: dev; +Cc: Ed Czeck, Shepard Siegel, John Miller

This is the PMD for Atomic Rules' Arkville ARK family of devices.
See doc/guides/nics/ark.rst for detailed description.

v2:
* Fixed all observed compiler errors
* Fixed all observed checkpatch messages except for the PRIu64 system macro

Signed-off-by: Shepard Siegel <shepard.siegel@atomicrules.com>
Signed-off-by: John Miller <john.miller@atomicrules.com>
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
---
 MAINTAINERS                                 |    8 +
 config/common_base                          |   10 +
 config/defconfig_arm-armv7a-linuxapp-gcc    |    1 +
 config/defconfig_ppc_64-power8-linuxapp-gcc |    1 +
 doc/guides/nics/ark.rst                     |  237 +++++++
 doc/guides/nics/features/ark.ini            |   15 +
 doc/guides/nics/index.rst                   |    1 +
 drivers/net/Makefile                        |    1 +
 drivers/net/ark/Makefile                    |   72 ++
 drivers/net/ark/ark_ddm.c                   |  150 ++++
 drivers/net/ark/ark_ddm.h                   |  154 ++++
 drivers/net/ark/ark_debug.h                 |   74 ++
 drivers/net/ark/ark_ethdev.c                | 1015 +++++++++++++++++++++++++++
 drivers/net/ark/ark_ethdev.h                |   75 ++
 drivers/net/ark/ark_ethdev_rx.c             |  667 ++++++++++++++++++
 drivers/net/ark/ark_ethdev_tx.c             |  492 +++++++++++++
 drivers/net/ark/ark_ext.h                   |   79 +++
 drivers/net/ark/ark_global.h                |  159 +++++
 drivers/net/ark/ark_mpu.c                   |  167 +++++
 drivers/net/ark/ark_mpu.h                   |  143 ++++
 drivers/net/ark/ark_pktchkr.c               |  460 ++++++++++++
 drivers/net/ark/ark_pktchkr.h               |  114 +++
 drivers/net/ark/ark_pktdir.c                |   79 +++
 drivers/net/ark/ark_pktdir.h                |   68 ++
 drivers/net/ark/ark_pktgen.c                |  482 +++++++++++++
 drivers/net/ark/ark_pktgen.h                |  106 +++
 drivers/net/ark/ark_rqp.c                   |   92 +++
 drivers/net/ark/ark_rqp.h                   |   75 ++
 drivers/net/ark/ark_udm.c                   |  221 ++++++
 drivers/net/ark/ark_udm.h                   |  175 +++++
 drivers/net/ark/rte_pmd_ark_version.map     |    4 +
 mk/rte.app.mk                               |    1 +
 32 files changed, 5398 insertions(+)
 create mode 100644 doc/guides/nics/ark.rst
 create mode 100644 doc/guides/nics/features/ark.ini
 create mode 100644 drivers/net/ark/Makefile
 create mode 100644 drivers/net/ark/ark_ddm.c
 create mode 100644 drivers/net/ark/ark_ddm.h
 create mode 100644 drivers/net/ark/ark_debug.h
 create mode 100644 drivers/net/ark/ark_ethdev.c
 create mode 100644 drivers/net/ark/ark_ethdev.h
 create mode 100644 drivers/net/ark/ark_ethdev_rx.c
 create mode 100644 drivers/net/ark/ark_ethdev_tx.c
 create mode 100644 drivers/net/ark/ark_ext.h
 create mode 100644 drivers/net/ark/ark_global.h
 create mode 100644 drivers/net/ark/ark_mpu.c
 create mode 100644 drivers/net/ark/ark_mpu.h
 create mode 100644 drivers/net/ark/ark_pktchkr.c
 create mode 100644 drivers/net/ark/ark_pktchkr.h
 create mode 100644 drivers/net/ark/ark_pktdir.c
 create mode 100644 drivers/net/ark/ark_pktdir.h
 create mode 100644 drivers/net/ark/ark_pktgen.c
 create mode 100644 drivers/net/ark/ark_pktgen.h
 create mode 100644 drivers/net/ark/ark_rqp.c
 create mode 100644 drivers/net/ark/ark_rqp.h
 create mode 100644 drivers/net/ark/ark_udm.c
 create mode 100644 drivers/net/ark/ark_udm.h
 create mode 100644 drivers/net/ark/rte_pmd_ark_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 39bc78e..6f6136a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -280,6 +280,14 @@ M: Evgeny Schemeilin <evgenys@amazon.com>
 F: drivers/net/ena/
 F: doc/guides/nics/ena.rst
 
+Atomic Rules ark
+M: Shepard Siegel <shepard.siegel@atomicrules.com>
+M: Ed Czeck       <ed.czeck@atomicrules.com>
+M: John Miller    <john.miller@atomicrules.com>
+F: /drivers/net/ark/
+F: doc/guides/nics/ark.rst
+F: doc/guides/nics/features/ark.ini
+
 Broadcom bnxt
 M: Stephen Hurd <stephen.hurd@broadcom.com>
 M: Ajit Khaparde <ajit.khaparde@broadcom.com>
diff --git a/config/common_base b/config/common_base
index aeee13e..e64cd83 100644
--- a/config/common_base
+++ b/config/common_base
@@ -348,6 +348,16 @@ CONFIG_RTE_LIBRTE_QEDE_FW=""
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
 #
+# Compile ARK PMD
+#
+CONFIG_RTE_LIBRTE_ARK_PMD=y
+CONFIG_RTE_LIBRTE_ARK_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE=n
+
+
+#
 # Compile the TAP PMD
 # It is enabled by default for Linux only.
 #
diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc
index d9bd2a8..6d2b5e0 100644
--- a/config/defconfig_arm-armv7a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7a-linuxapp-gcc
@@ -61,6 +61,7 @@ CONFIG_RTE_SCHED_VECTOR=n
 
 # cannot use those on ARM
 CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_ARK_PMD=n
 CONFIG_RTE_LIBRTE_EM_PMD=n
 CONFIG_RTE_LIBRTE_IGB_PMD=n
 CONFIG_RTE_LIBRTE_CXGBE_PMD=n
diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
index 35f7fb6..89bc396 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -48,6 +48,7 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
 
 # Note: Initially, all of the PMD drivers compilation are turned off on Power
 # Will turn on them only after the successful testing on Power
+CONFIG_RTE_LIBRTE_ARK_PMD=n
 CONFIG_RTE_LIBRTE_IXGBE_PMD=n
 CONFIG_RTE_LIBRTE_I40E_PMD=n
 CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
diff --git a/doc/guides/nics/ark.rst b/doc/guides/nics/ark.rst
new file mode 100644
index 0000000..72fb8d6
--- /dev/null
+++ b/doc/guides/nics/ark.rst
@@ -0,0 +1,237 @@
+.. BSD LICENSE
+
+    Copyright (c) 2015-2017 Atomic Rules LLC
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Atomic Rules LLC nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ARK Poll Mode Driver
+====================
+
+The ARK PMD is a DPDK poll-mode driver for the Atomic Rules Arkville
+(ARK) family of devices.
+
+More information can be found at the `Atomic Rules website
+<http://atomicrules.com>`_.
+
+Overview
+--------
+
+The Atomic Rules Arkville product is a DPDK- and AXI-compliant product
+that marshals packets across a PCIe conduit between host DPDK mbufs and
+FPGA AXI streams.
+
+The design philosophy of the ARK PMD, and of the overall Arkville
+product, has been to take the DPDK API/ABI as a fixed specification
+and then implement much of the business logic in FPGA RTL circuits.
+The approach of *working backwards* from the DPDK API/ABI and having
+the GPP host software *dictate*, while the FPGA hardware *copes*,
+results in significant performance gains over a naive implementation.
+
+While this document describes the ARK PMD software, it is helpful to
+understand what the FPGA hardware is and is not. The Arkville RTL
+component provides a single PCIe Physical Function (PF) supporting
+some number of RX/Ingress and TX/Egress Queues. The ARK PMD controls
+the Arkville core through a dedicated opaque Core BAR (CBAR).
+To allow users full freedom for their own FPGA application IP,
+an independent FPGA Application BAR (ABAR) is provided.
+
+One popular way to imagine Arkville's FPGA hardware aspect is as the
+FPGA PCIe-facing side of a so-called Smart NIC. The Arkville core does
+not contain any MACs, and is link-speed independent, as well as
+agnostic to the number of physical ports the application chooses to
+use. The ARK driver exposes the familiar PMD interface to allow packet
+movement to and from mbufs across multiple queues.
+
+However, FPGA RTL applications may contain a universe of added
+functionality that the Arkville RTL core does not provide or cannot
+anticipate. To allow for this expectation of user-defined
+innovation, the ARK PMD provides a dynamic mechanism for adding
+capabilities without having to modify the ARK PMD itself, as
+sketched below.
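+
+As a minimal sketch (the loading behavior is taken from
+``ark_ethdev.c``, which loads the shared object named by the
+``ARK_EXT_PATH`` environment variable with ``dlopen()`` and resolves
+optional entry points such as ``dev_init`` by name; the extension body
+itself is hypothetical -- see also ``ark_ext.h``), a user extension
+could look like:
+
+.. code-block:: c
+
+   /* Hypothetical user extension; only the dev_init hook is shown.
+    * Build this as a shared object and point ARK_EXT_PATH at it.
+    */
+   #include <stdlib.h>
+   #include <rte_ethdev.h>
+
+   void *
+   dev_init(struct rte_eth_dev *dev, void *a_bar, int port_id)
+   {
+           (void)dev;
+           (void)a_bar;
+           (void)port_id;
+           /* Return per-port private state; NULL disables the extension. */
+           return calloc(1, sizeof(uint64_t));
+   }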
+
+The ARK PMD is intended to support all instances of the Arkville
+RTL Core, regardless of configuration, FPGA vendor, or target
+board. While specific capabilities, such as the number of physical
+hardware queue-pairs, are negotiated, the driver is designed to
+remain constant over a broad and extendable feature set.
+
+Intentionally, Arkville by itself DOES NOT provide common NIC
+capabilities such as offload or receive-side scaling (RSS).
+These capabilities would be viewed as a gate-level "tax" on
+Green-box FPGA applications that do not require such functions.
+Instead, they can be added as needed with essentially no
+overhead to the FPGA Application.
+
+Data Path Interface
+-------------------
+
+Ingress RX and Egress TX operation uses the nominal DPDK API.
+The driver supports single-port, multi-queue operation for both RX and TX.
+
+Refer to ``ark_ethdev.h`` for the list of supported methods to
+act upon RX and TX Queues.
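+
+Since the data path uses the standard DPDK API, the usual burst calls
+apply unchanged. A minimal RX/TX sketch (``port_id`` and ``queue_id``
+are placeholders):
+
+.. code-block:: c
+
+   struct rte_mbuf *pkts[32];
+   uint16_t nb_rx, nb_tx;
+
+   /* Receive a burst from an ARK RX queue ... */
+   nb_rx = rte_eth_rx_burst(port_id, queue_id, pkts, 32);
+   /* ... and forward it back out of an ARK TX queue. */
+   nb_tx = rte_eth_tx_burst(port_id, queue_id, pkts, nb_rx);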
+
+Configuration Information
+-------------------------
+
+**DPDK Configuration Parameters**
+
+  The following configuration options are available for the ARK PMD:
+
+   * **CONFIG_RTE_LIBRTE_ARK_PMD** (default y): Enables or disables inclusion
+     of the ARK PMD driver in the DPDK compilation.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_RX** (default n): Enables or disables debug
+     logging and internal checking of RX ingress logic within the ARK PMD driver.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TX** (default n): Enables or disables debug
+     logging and internal checking of TX egress logic within the ARK PMD driver.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS** (default n): Enables or disables debug
+     logging of detailed packet and performance statistics gathered in
+     the PMD and FPGA.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE** (default n): Enables or disables debug
+     logging of detailed PMD events and status.
+
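+  For example, to enable detailed PMD tracing in a development build,
+  set the corresponding option in the build configuration:
+
+  .. code-block:: console
+
+     CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE=y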
+
+Building DPDK
+-------------
+
+See the :ref:`DPDK Getting Started Guide for Linux <linux_gsg>` for
+instructions on how to build DPDK.
+
+By default the ARK PMD library will be built into the DPDK library.
+
+For configuring and using UIO and VFIO frameworks, please also refer to
+:ref:`the documentation that comes with the DPDK suite <linux_gsg>`.
+
+Supported ARK RTL PCIe Instances
+--------------------------------
+
+The ARK PMD supports the following Arkville RTL PCIe instances:
+
+* ``1d6c:100d`` - AR-ARKA-FX0 [Arkville 32B DPDK Data Mover]
+* ``1d6c:100e`` - AR-ARKA-FX1 [Arkville 64B DPDK Data Mover]
+
+Supported Operating Systems
+---------------------------
+
+Any Linux distribution fulfilling the conditions described in the
+``System Requirements`` section of :ref:`the DPDK documentation
+<linux_gsg>`; also refer to the *DPDK Release Notes*.
+
+Supported Features
+------------------
+
+* Dynamic ARK PMD extensions
+* Multiple receive and transmit queues
+* Jumbo frames up to 9K
+* Hardware Statistics
+
+Unsupported Features
+--------------------
+
+Features that may be part of, or become part of, the Arkville RTL IP that are
+not currently supported or exposed by the ARK PMD include:
+
+* PCIe SR-IOV Virtual Functions (VFs)
+* Arkville's Packet Generator Control and Status
+* Arkville's Packet Director Control and Status
+* Arkville's Packet Checker Control and Status
+* Arkville's Timebase Management
+
+Pre-Requisites
+--------------
+
+#. Prepare the system as recommended by the DPDK suite.  This includes
+   environment variables, hugepages configuration, tool-chains and configuration.
+
+#. Insert the igb_uio kernel module using the command ``modprobe igb_uio``.
+
+#. Bind the intended ARK device to the igb_uio module.
+
+At this point the system should be ready to run DPDK applications. Once the
+application runs to completion, the ARK device can be detached from igb_uio
+if necessary.
+
+Usage Example
+-------------
+
+This section demonstrates how to launch **testpmd** with Atomic Rules ARK
+devices managed by librte_pmd_ark.
+
+#. Load the kernel modules:
+
+   .. code-block:: console
+
+      modprobe uio
+      insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
+
+   .. note::
+
+      The ARK PMD driver depends upon the igb_uio user space I/O kernel module.
+
+#. Mount and request huge pages:
+
+   .. code-block:: console
+
+      mount -t hugetlbfs nodev /mnt/huge
+      echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
+
+#. Bind UIO driver to ARK device at 0000:01:00.0 (using dpdk-devbind.py):
+
+   .. code-block:: console
+
+      ./usertools/dpdk-devbind.py --bind=igb_uio 0000:01:00.0
+
+   .. note::
+
+      The last argument to dpdk-devbind.py is the 4-tuple that identifies a specific PCIe
+      device. You can use ``lspci -d 1d6c:`` to identify all Atomic Rules devices in the
+      system, and thus determine the correct 4-tuple argument to dpdk-devbind.py.
+
+#. Start testpmd with basic parameters:
+
+   .. code-block:: console
+
+      ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 -- -i
+
+   Example output:
+
+   .. code-block:: console
+
+      [...]
+      EAL: PCI device 0000:01:00.0 on NUMA socket -1
+      EAL:   probe driver: 1d6c:100e rte_ark_pmd
+      EAL:   PCI memory mapped at 0x7f9b6c400000
+      PMD: eth_ark_dev_init(): Initializing 0:2:0.1
+      ARKP PMD CommitID: 378f3a67
+      Configuring Port 0 (socket 0)
+      Port 0: DC:3C:F6:00:00:01
+      Checking link statuses...
+      Port 0 Link Up - speed 100000 Mbps - full-duplex
+      Done
+      testpmd>
diff --git a/doc/guides/nics/features/ark.ini b/doc/guides/nics/features/ark.ini
new file mode 100644
index 0000000..dc8a0e2
--- /dev/null
+++ b/doc/guides/nics/features/ark.ini
@@ -0,0 +1,15 @@
+;
+; Supported features of the 'ark' poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Queue start/stop     = Y
+Jumbo frame          = Y
+Scattered Rx         = Y
+Basic stats          = Y
+Stats per queue      = Y
+FW version           = Y
+Linux UIO            = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 87f9334..381d82c 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     :numbered:
 
     overview
+    ark
     bnx2x
     bnxt
     cxgbe
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index a16f25e..ea9868b 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -32,6 +32,7 @@
 include $(RTE_SDK)/mk/rte.vars.mk
 
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += af_packet
+DIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark
 DIRS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += bonding
 DIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe
diff --git a/drivers/net/ark/Makefile b/drivers/net/ark/Makefile
new file mode 100644
index 0000000..615dfa2
--- /dev/null
+++ b/drivers/net/ark/Makefile
@@ -0,0 +1,72 @@
+# BSD LICENSE
+#
+# Copyright (c) 2015-2017 Atomic Rules LLC
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of copyright holder nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_ark.a
+
+CFLAGS += -O3 -I./
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_ark_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD)
+#
+
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_ethdev.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_ethdev_rx.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_ethdev_tx.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_pktgen.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_pktchkr.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_pktdir.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_mpu.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_ddm.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_udm.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_rqp.c
+
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_kvargs
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_mempool
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/libpthread
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/libdl
+
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/ark/ark_ddm.c b/drivers/net/ark/ark_ddm.c
new file mode 100644
index 0000000..86bb2b5
--- /dev/null
+++ b/drivers/net/ark/ark_ddm.c
@@ -0,0 +1,150 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_debug.h"
+#include "ark_ddm.h"
+
+/* ************************************************************************* */
+int
+ark_ddm_verify(struct ark_ddm_t *ddm)
+{
+	if (sizeof(struct ark_ddm_t) != ARK_DDM_EXPECTED_SIZE) {
+		fprintf(stderr, "  DDM structure looks incorrect %d vs %zd\n",
+			ARK_DDM_EXPECTED_SIZE, sizeof(struct ark_ddm_t));
+		return -1;
+	}
+
+	if (ddm->cfg.const0 != ARK_DDM_CONST) {
+		fprintf(stderr, "  DDM module not found as expected 0x%08x\n",
+			ddm->cfg.const0);
+		return -1;
+	}
+	return 0;
+}
+
+void
+ark_ddm_start(struct ark_ddm_t *ddm)
+{
+	ddm->cfg.command = 1;
+}
+
+int
+ark_ddm_stop(struct ark_ddm_t *ddm, const int wait)
+{
+	int cnt = 0;
+
+	ddm->cfg.command = 2;
+	while (wait && (ddm->cfg.stop_flushed & 0x01) == 0) {
+		if (cnt++ > 1000)
+			return 1;
+
+		usleep(10);
+	}
+	return 0;
+}
+
+void
+ark_ddm_reset(struct ark_ddm_t *ddm)
+{
+	int status;
+
+	/* reset only works if ddm has stopped properly. */
+	status = ark_ddm_stop(ddm, 1);
+
+	if (status != 0) {
+		ARK_DEBUG_TRACE("ARKP: %s  stop failed  doing forced reset\n",
+				__func__);
+		ddm->cfg.command = 4;
+		usleep(10);
+	}
+	ddm->cfg.command = 3;
+}
+
+void
+ark_ddm_setup(struct ark_ddm_t *ddm, phys_addr_t cons_addr, uint32_t interval)
+{
+	ddm->setup.cons_write_index_addr = cons_addr;
+	ddm->setup.write_index_interval = interval / 4;	/* 4 ns period */
+}
+
+void
+ark_ddm_stats_reset(struct ark_ddm_t *ddm)
+{
+	ddm->cfg.tlp_stats_clear = 1;
+}
+
+void
+ark_ddm_dump(struct ark_ddm_t *ddm, const char *msg)
+{
+	ARK_DEBUG_TRACE("ARKP DDM Dump: %s Stopped: %d\n", msg,
+	ark_ddm_is_stopped(ddm)
+	);
+}
+
+void
+ark_ddm_dump_stats(struct ark_ddm_t *ddm, const char *msg)
+{
+	struct ark_ddm_stats_t *stats = &ddm->stats;
+
+	ARK_DEBUG_STATS("ARKP DDM Stats: %s"
+					ARK_SU64 ARK_SU64 ARK_SU64
+					"\n", msg,
+	"Bytes:", stats->tx_byte_count,
+	"Packets:", stats->tx_pkt_count, "MBufs", stats->tx_mbuf_count);
+}
+
+int
+ark_ddm_is_stopped(struct ark_ddm_t *ddm)
+{
+	return (ddm->cfg.stop_flushed & 0x01) != 0;
+}
+
+uint64_t
+ark_ddm_queue_byte_count(struct ark_ddm_t *ddm)
+{
+	return ddm->queue_stats.byte_count;
+}
+
+uint64_t
+ark_ddm_queue_pkt_count(struct ark_ddm_t *ddm)
+{
+	return ddm->queue_stats.pkt_count;
+}
+
+void
+ark_ddm_queue_reset_stats(struct ark_ddm_t *ddm)
+{
+	ddm->queue_stats.byte_count = 1;
+}
diff --git a/drivers/net/ark/ark_ddm.h b/drivers/net/ark/ark_ddm.h
new file mode 100644
index 0000000..8208b12
--- /dev/null
+++ b/drivers/net/ark/ark_ddm.h
@@ -0,0 +1,154 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_DDM_H_
+#define _ARK_DDM_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+
+/* DDM core hardware structures */
+#define ARK_DDM_CFG 0x0000
+#define ARK_DDM_CONST 0xfacecafe
+struct ark_ddm_cfg_t {
+	uint32_t r0;
+	volatile uint32_t tlp_stats_clear;
+	uint32_t const0;
+	volatile uint32_t tag_max;
+	volatile uint32_t command;
+	volatile uint32_t stop_flushed;
+};
+
+#define ARK_DDM_STATS 0x0020
+struct ark_ddm_stats_t {
+	volatile uint64_t tx_byte_count;
+	volatile uint64_t tx_pkt_count;
+	volatile uint64_t tx_mbuf_count;
+};
+
+#define ARK_DDM_MRDQ 0x0040
+struct ark_ddm_mrdq_t {
+	volatile uint32_t mrd_q1;
+	volatile uint32_t mrd_q2;
+	volatile uint32_t mrd_q3;
+	volatile uint32_t mrd_q4;
+	volatile uint32_t mrd_full;
+};
+
+#define ARK_DDM_CPLDQ 0x0068
+struct ark_ddm_cpldq_t {
+	volatile uint32_t cpld_q1;
+	volatile uint32_t cpld_q2;
+	volatile uint32_t cpld_q3;
+	volatile uint32_t cpld_q4;
+	volatile uint32_t cpld_full;
+};
+
+#define ARK_DDM_MRD_PS 0x0090
+struct ark_ddm_mrd_ps_t {
+	volatile uint32_t mrd_ps_min;
+	volatile uint32_t mrd_ps_max;
+	volatile uint32_t mrd_full_ps_min;
+	volatile uint32_t mrd_full_ps_max;
+	volatile uint32_t mrd_dw_ps_min;
+	volatile uint32_t mrd_dw_ps_max;
+};
+
+#define ARK_DDM_QUEUE_STATS 0x00a8
+struct ark_ddm_qstats_t {
+	volatile uint64_t byte_count;
+	volatile uint64_t pkt_count;
+	volatile uint64_t mbuf_count;
+};
+
+#define ARK_DDM_CPLD_PS 0x00c0
+struct ark_ddm_cpld_ps_t {
+	volatile uint32_t cpld_ps_min;
+	volatile uint32_t cpld_ps_max;
+	volatile uint32_t cpld_full_ps_min;
+	volatile uint32_t cpld_full_ps_max;
+	volatile uint32_t cpld_dw_ps_min;
+	volatile uint32_t cpld_dw_ps_max;
+};
+
+#define ARK_DDM_SETUP  0x00e0
+struct ark_ddm_setup_t {
+	phys_addr_t cons_write_index_addr;
+	uint32_t write_index_interval;	/* 4ns each */
+	volatile uint32_t cons_index;
+};
+
+/* Consolidated register map; reserved arrays pad each block to its
+ * hardware byte offset.
+ */
+struct ark_ddm_t {
+	struct ark_ddm_cfg_t cfg;
+	uint8_t reserved0[(ARK_DDM_STATS - ARK_DDM_CFG) -
+					  sizeof(struct ark_ddm_cfg_t)];
+	struct ark_ddm_stats_t stats;
+	uint8_t reserved1[(ARK_DDM_MRDQ - ARK_DDM_STATS) -
+					  sizeof(struct ark_ddm_stats_t)];
+	struct ark_ddm_mrdq_t mrdq;
+	uint8_t reserved2[(ARK_DDM_CPLDQ - ARK_DDM_MRDQ) -
+					  sizeof(struct ark_ddm_mrdq_t)];
+	struct ark_ddm_cpldq_t cpldq;
+	uint8_t reserved3[(ARK_DDM_MRD_PS - ARK_DDM_CPLDQ) -
+					  sizeof(struct ark_ddm_cpldq_t)];
+	struct ark_ddm_mrd_ps_t mrd_ps;
+	struct ark_ddm_qstats_t queue_stats;
+	struct ark_ddm_cpld_ps_t cpld_ps;
+	uint8_t reserved5[(ARK_DDM_SETUP - ARK_DDM_CPLD_PS) -
+					  sizeof(struct ark_ddm_cpld_ps_t)];
+	struct ark_ddm_setup_t setup;
+	uint8_t reserved_p[(256 - ARK_DDM_SETUP)
+					  - sizeof(struct ark_ddm_setup_t)];
+};
+
+#define ARK_DDM_EXPECTED_SIZE 256
+#define ARK_DDM_QOFFSET ARK_DDM_EXPECTED_SIZE
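+/* Per-queue DDM register blocks are spaced ARK_DDM_QOFFSET bytes apart */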
+
+/* DDM function prototype */
+int ark_ddm_verify(struct ark_ddm_t *ddm);
+void ark_ddm_start(struct ark_ddm_t *ddm);
+int ark_ddm_stop(struct ark_ddm_t *ddm, const int wait);
+void ark_ddm_reset(struct ark_ddm_t *ddm);
+void ark_ddm_stats_reset(struct ark_ddm_t *ddm);
+void ark_ddm_setup(struct ark_ddm_t *ddm, phys_addr_t cons_addr,
+	uint32_t interval);
+void ark_ddm_dump_stats(struct ark_ddm_t *ddm, const char *msg);
+void ark_ddm_dump(struct ark_ddm_t *ddm, const char *msg);
+int ark_ddm_is_stopped(struct ark_ddm_t *ddm);
+uint64_t ark_ddm_queue_byte_count(struct ark_ddm_t *ddm);
+uint64_t ark_ddm_queue_pkt_count(struct ark_ddm_t *ddm);
+void ark_ddm_queue_reset_stats(struct ark_ddm_t *ddm);
+
+#endif
diff --git a/drivers/net/ark/ark_debug.h b/drivers/net/ark/ark_debug.h
new file mode 100644
index 0000000..8a7f83a
--- /dev/null
+++ b/drivers/net/ark/ark_debug.h
@@ -0,0 +1,74 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_DEBUG_H_
+#define _ARK_DEBUG_H_
+
+#include <inttypes.h>
+#include <rte_log.h>
+
+/* Format specifiers for string data pairs */
+#define ARK_SU32  "\n\t%-20s    %'20" PRIu32
+#define ARK_SU64  "\n\t%-20s    %'20" PRIu64
+#define ARK_SU64X "\n\t%-20s    %#20" PRIx64
+#define ARK_SPTR  "\n\t%-20s    %20p"
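+/* Example: ARK_DEBUG_STATS("Stats:" ARK_SU64 "\n", "Bytes:", byte_count); */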
+
+#define ARK_TRACE_ON(fmt, ...) \
+	fprintf(stderr, fmt, ##__VA_ARGS__)
+
+#define ARK_TRACE_OFF(fmt, ...) \
+	do {if (0) fprintf(stderr, fmt, ##__VA_ARGS__); } while (0)
+
+/* Debug macro for reporting Packet stats */
+#ifdef RTE_LIBRTE_ARK_DEBUG_STATS
+#define ARK_DEBUG_STATS(fmt, ...) ARK_TRACE_ON(fmt, ##__VA_ARGS__)
+#else
+#define ARK_DEBUG_STATS(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
+#endif
+
+/* Debug macro for tracing full behavior*/
+#ifdef RTE_LIBRTE_ARK_DEBUG_TRACE
+#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_ON(fmt, ##__VA_ARGS__)
+#else
+#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
+#endif
+
+#ifdef ARK_STD_LOG
+#define PMD_DRV_LOG(level, fmt, args...) \
+	fprintf(stderr, fmt, args)
+#else
+#define PMD_DRV_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s(): " fmt, __func__, ## args)
+#endif
+
+#endif
diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
new file mode 100644
index 0000000..6dfe2ce
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev.c
@@ -0,0 +1,1015 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <sys/stat.h>
+#include <dlfcn.h>
+
+#include <rte_kvargs.h>
+#include <rte_vdev.h>
+
+#include "ark_global.h"
+#include "ark_debug.h"
+#include "ark_ethdev.h"
+#include "ark_mpu.h"
+#include "ark_ddm.h"
+#include "ark_udm.h"
+#include "ark_rqp.h"
+#include "ark_pktdir.h"
+#include "ark_pktgen.h"
+#include "ark_pktchkr.h"
+
+/*  Internal prototypes */
+static int eth_ark_check_args(const char *params);
+static int eth_ark_dev_init(struct rte_eth_dev *dev);
+static int ark_config_device(struct rte_eth_dev *dev);
+static int eth_ark_dev_uninit(struct rte_eth_dev *eth_dev);
+static int eth_ark_dev_configure(struct rte_eth_dev *dev);
+static int eth_ark_dev_start(struct rte_eth_dev *dev);
+static void eth_ark_dev_stop(struct rte_eth_dev *dev);
+static void eth_ark_dev_close(struct rte_eth_dev *dev);
+static void eth_ark_dev_info_get(struct rte_eth_dev *dev,
+	struct rte_eth_dev_info *dev_info);
+static int eth_ark_dev_link_update(struct rte_eth_dev *dev,
+	int wait_to_complete);
+static int eth_ark_dev_set_link_up(struct rte_eth_dev *dev);
+static int eth_ark_dev_set_link_down(struct rte_eth_dev *dev);
+static void eth_ark_dev_stats_get(struct rte_eth_dev *dev,
+	struct rte_eth_stats *stats);
+static void eth_ark_dev_stats_reset(struct rte_eth_dev *dev);
+static void eth_ark_set_default_mac_addr(struct rte_eth_dev *dev,
+	struct ether_addr *mac_addr);
+static void eth_ark_macaddr_add(struct rte_eth_dev *dev,
+	struct ether_addr *mac_addr, uint32_t index, uint32_t pool);
+static void eth_ark_macaddr_remove(struct rte_eth_dev *dev,
+	uint32_t index);
+
+#define ARK_DEV_TO_PCI(eth_dev) \
+	RTE_DEV_TO_PCI((eth_dev)->device)
+
+#define ARK_MAX_ARG_LEN 256
+static uint32_t pkt_dir_v;
+static char pkt_gen_args[ARK_MAX_ARG_LEN];
+static char pkt_chkr_args[ARK_MAX_ARG_LEN];
+
+#define ARK_PKTGEN_ARG "Pkt_gen"
+#define ARK_PKTCHKR_ARG "Pkt_chkr"
+#define ARK_PKTDIR_ARG "Pkt_dir"
+
+static const char * const valid_arguments[] = {
+	ARK_PKTGEN_ARG,
+	ARK_PKTCHKR_ARG,
+	ARK_PKTDIR_ARG,
+	"iface",
+	NULL
+};
+
+#define MAX_ARK_PHYS 16
+struct ark_adapter *gark[MAX_ARK_PHYS];
+
+static const struct rte_pci_id pci_id_ark_map[] = {
+	{RTE_PCI_DEVICE(0x1d6c, 0x100d)},
+	{RTE_PCI_DEVICE(0x1d6c, 0x100e)},
+	{.vendor_id = 0, /* sentinel */ },
+};
+
+static struct eth_driver rte_ark_pmd = {
+	.pci_drv = {
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+		.id_table = pci_id_ark_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC},
+	.eth_dev_init = eth_ark_dev_init,
+	.eth_dev_uninit = eth_ark_dev_uninit,
+	.dev_private_size = sizeof(struct ark_adapter),
+};
+
+static const struct eth_dev_ops ark_eth_dev_ops = {
+	.dev_configure = eth_ark_dev_configure,
+	.dev_start = eth_ark_dev_start,
+	.dev_stop = eth_ark_dev_stop,
+	.dev_close = eth_ark_dev_close,
+
+	.dev_infos_get = eth_ark_dev_info_get,
+
+	.rx_queue_setup = eth_ark_dev_rx_queue_setup,
+	.rx_queue_count = eth_ark_dev_rx_queue_count,
+	.tx_queue_setup = eth_ark_tx_queue_setup,
+
+	.link_update = eth_ark_dev_link_update,
+	.dev_set_link_up = eth_ark_dev_set_link_up,
+	.dev_set_link_down = eth_ark_dev_set_link_down,
+
+	.rx_queue_start = eth_ark_rx_start_queue,
+	.rx_queue_stop = eth_ark_rx_stop_queue,
+
+	.tx_queue_start = eth_ark_tx_queue_start,
+	.tx_queue_stop = eth_ark_tx_queue_stop,
+
+	.stats_get = eth_ark_dev_stats_get,
+	.stats_reset = eth_ark_dev_stats_reset,
+
+	.mac_addr_add = eth_ark_macaddr_add,
+	.mac_addr_remove = eth_ark_macaddr_remove,
+	.mac_addr_set = eth_ark_set_default_mac_addr,
+
+};
+
+int
+ark_get_port_id(struct rte_eth_dev *dev, struct ark_adapter *ark)
+{
+	int n = ark->num_ports;
+	int i;
+
+	/* There has to be a smarter way to do this ... */
+	for (i = 0; i < n; i++) {
+		if (ark->port[i].eth_dev == dev)
+			return i;
+	}
+	ARK_DEBUG_TRACE("ARK: Device is NOT associated with a port !!");
+	return -1;
+}
+
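+/* Load the optional user extension shared object named by the
+ * ARK_EXT_PATH environment variable and resolve its entry points
+ * via dlsym(); missing symbols simply leave the hooks NULL.
+ */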
+static
+int
+check_for_ext(struct rte_eth_dev *dev __rte_unused,
+			  struct ark_adapter *ark __rte_unused)
+{
+	int found = 0;
+
+	/* Get the env */
+	const char *dllpath = getenv("ARK_EXT_PATH");
+
+	if (dllpath == NULL) {
+		ARK_DEBUG_TRACE("ARK EXT NO dll path specified\n");
+		return 0;
+	}
+	ARK_DEBUG_TRACE("ARK EXT found dll path at %s\n", dllpath);
+
+	/* Open and load the .so */
+	ark->d_handle = dlopen(dllpath, RTLD_LOCAL | RTLD_LAZY);
+	if (ark->d_handle == NULL)
+		PMD_DRV_LOG(ERR, "Could not load user extension %s\n", dllpath);
+	else
+		ARK_DEBUG_TRACE("SUCCESS: loaded user extension %s\n", dllpath);
+
+	/* Get the entry points */
+	ark->user_ext.dev_init =
+		(void *(*)(struct rte_eth_dev *, void *, int))
+		dlsym(ark->d_handle, "dev_init");
+	ARK_DEBUG_TRACE("device ext init pointer = %p\n",
+					ark->user_ext.dev_init);
+	ark->user_ext.dev_get_port_count =
+		(int (*)(struct rte_eth_dev *, void *))
+		dlsym(ark->d_handle,
+			  "dev_get_port_count");
+	ark->user_ext.dev_uninit =
+		(void (*)(struct rte_eth_dev *, void *))
+		dlsym(ark->d_handle,
+			  "dev_uninit");
+	ark->user_ext.dev_configure =
+		(int (*)(struct rte_eth_dev *, void *))
+		dlsym(ark->d_handle,
+			  "dev_configure");
+	ark->user_ext.dev_start =
+		(int (*)(struct rte_eth_dev *, void *))
+		dlsym(ark->d_handle,
+			  "dev_start");
+	ark->user_ext.dev_stop =
+		(void (*)(struct rte_eth_dev *, void *))
+		dlsym(ark->d_handle,
+			  "dev_stop");
+	ark->user_ext.dev_close =
+		(void (*)(struct rte_eth_dev *, void *))
+		dlsym(ark->d_handle,
+			  "dev_close");
+	ark->user_ext.link_update =
+		(int (*)(struct rte_eth_dev *, int, void *))
+		dlsym(ark->d_handle,
+			  "link_update");
+	ark->user_ext.dev_set_link_up =
+		(int (*)(struct rte_eth_dev *, void *))
+		dlsym(ark->d_handle,
+			  "dev_set_link_up");
+	ark->user_ext.dev_set_link_down =
+		(int (*)(struct rte_eth_dev *, void *))
+		dlsym(ark->d_handle,
+			  "dev_set_link_down");
+	ark->user_ext.stats_get =
+		(void (*)(struct rte_eth_dev *, struct rte_eth_stats *,
+				  void *)) dlsym(ark->d_handle, "stats_get");
+	ark->user_ext.stats_reset =
+		(void (*)(struct rte_eth_dev *, void *))
+		dlsym(ark->d_handle,
+			  "stats_reset");
+	ark->user_ext.mac_addr_add =
+		(void (*)(struct rte_eth_dev *, struct ether_addr *, uint32_t,
+				  uint32_t, void *)) dlsym(ark->d_handle, "mac_addr_add");
+	ark->user_ext.mac_addr_remove =
+		(void (*)(struct rte_eth_dev *, uint32_t,
+				  void *)) dlsym(ark->d_handle, "mac_addr_remove");
+	ark->user_ext.mac_addr_set =
+		(void (*)(struct rte_eth_dev *, struct ether_addr *,
+				  void *)) dlsym(ark->d_handle, "mac_addr_set");
+
+	return found;
+}
+
+static int
+eth_ark_dev_init(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+	struct rte_pci_device *pci_dev;
+	int ret;
+
+	ark->eth_dev = dev;
+
+	ARK_DEBUG_TRACE("eth_ark_dev_init(struct rte_eth_dev *dev)\n");
+	gark[0] = ark;
+
+	/* Check to see if there is an extension that we need to load */
+	check_for_ext(dev, ark);
+	pci_dev = ARK_DEV_TO_PCI(dev);
+	rte_eth_copy_pci_info(dev, pci_dev);
+
+	if (pci_dev->device.devargs)
+		eth_ark_check_args(pci_dev->device.devargs->args);
+	else
+		PMD_DRV_LOG(INFO, "No Device args found\n");
+
+	/* Use dummy function until setup */
+	dev->rx_pkt_burst = &eth_ark_recv_pkts_noop;
+	dev->tx_pkt_burst = &eth_ark_xmit_pkts_noop;
+
+	ark->bar0 = (uint8_t *)pci_dev->mem_resource[0].addr;
+	ark->a_bar = (uint8_t *)pci_dev->mem_resource[2].addr;
+
+	ark->sysctrl.v  = (void *)&ark->bar0[ARK_SYSCTRL_BASE];
+	ark->mpurx.v  = (void *)&ark->bar0[ARK_MPU_RX_BASE];
+	ark->udm.v  = (void *)&ark->bar0[ARK_UDM_BASE];
+	ark->mputx.v  = (void *)&ark->bar0[ARK_MPU_TX_BASE];
+	ark->ddm.v  = (void *)&ark->bar0[ARK_DDM_BASE];
+	ark->cmac.v  = (void *)&ark->bar0[ARK_CMAC_BASE];
+	ark->external.v  = (void *)&ark->bar0[ARK_EXTERNAL_BASE];
+	ark->pktdir.v  = (void *)&ark->bar0[ARK_PKTDIR_BASE];
+	ark->pktgen.v  = (void *)&ark->bar0[ARK_PKTGEN_BASE];
+	ark->pktchkr.v  = (void *)&ark->bar0[ARK_PKTCHKR_BASE];
+
+	ark->rqpacing =
+		(struct ark_rqpace_t *)(ark->bar0 + ARK_RCPACING_BASE);
+	ark->started = 0;
+
+	ARK_DEBUG_TRACE
+		("Sys Ctrl Const = 0x%x  DEV Commit_iD: %08x\n",
+		 ark->sysctrl.t32[4],
+		 rte_be_to_cpu_32(ark->sysctrl.t32[0x20 / 4]));
+	PMD_DRV_LOG(INFO, "ARKP PMD  Commit_iD: %08x\n",
+				rte_be_to_cpu_32(ark->sysctrl.t32[0x20 / 4]));
+
+	/* If HW sanity test fails, return an error */
+	if (ark->sysctrl.t32[4] != 0xcafef00d) {
+		PMD_DRV_LOG
+			(ERR,
+			 "HW Sanity test has failed, expected constant 0x%x, read 0x%x (%s)\n",
+			 0xcafef00d, ark->sysctrl.t32[4], __func__);
+		return -1;
+	}
+
+	PMD_DRV_LOG
+		(INFO,
+		 "HW Sanity test has PASSED, expected constant 0x%x, read 0x%x (%s)\n",
+		 0xcafef00d, ark->sysctrl.t32[4], __func__);
+
+	/* We are a single function multi-port device. */
+	const unsigned int numa_node = rte_socket_id();
+	struct ether_addr adr;
+
+	ret = ark_config_device(dev);
+	dev->dev_ops = &ark_eth_dev_ops;
+
+	dev->data->mac_addrs = rte_zmalloc("ark", ETHER_ADDR_LEN, 0);
+	if (!dev->data->mac_addrs) {
+		PMD_DRV_LOG(ERR,
+			    "Failed to allocate memory for storing MAC address");
+		return -1;
+	}
+	ether_addr_copy((struct ether_addr *)&adr, &dev->data->mac_addrs[0]);
+
+	if (ark->user_ext.dev_init) {
+		ark->user_data = ark->user_ext.dev_init(dev, ark->a_bar, 0);
+		if (!ark->user_data) {
+			PMD_DRV_LOG(INFO,
+						"Failed to initialize PMD extension !!, continuing without it\n");
+			memset(&ark->user_ext, 0, sizeof(struct ark_user_ext));
+			dlclose(ark->d_handle);
+		}
+	}
+
+	/*
+	 * We will create additional devices based on the number of requested
+	 * ports
+	 */
+	int pc = 1;
+	int p;
+
+	if (ark->user_ext.dev_get_port_count) {
+		pc = ark->user_ext.dev_get_port_count(dev, ark->user_data);
+		ark->num_ports = pc;
+	} else {
+		ark->num_ports = 1;
+	}
+	for (p = 0; p < pc; p++) {
+		struct ark_port *port;
+
+		port = &ark->port[p];
+		struct rte_eth_dev_data *data = NULL;
+
+		port->id = p;
+
+		char name[RTE_ETH_NAME_MAX_LEN];
+
+		snprintf(name, sizeof(name), "arketh%d",
+				 dev->data->port_id + p);
+
+		if (p == 0) {
+			/* First port is already allocated by DPDK */
+			port->eth_dev = ark->eth_dev;
+			continue;
+		}
+
+		/* reserve an ethdev entry */
+		port->eth_dev = rte_eth_dev_allocate(name);
+		if (!port->eth_dev) {
+			PMD_DRV_LOG(ERR, "Could not allocate eth_dev for port %d\n",
+						p);
+			goto error;
+		}
+
+		data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
+		if (!data) {
+			PMD_DRV_LOG(ERR, "Could not allocate eth_dev for port %d\n",
+						p);
+			goto error;
+		}
+		data->port_id = ark->eth_dev->data->port_id + p;
+		port->eth_dev->data = data;
+		port->eth_dev->device = &pci_dev->device;
+		port->eth_dev->data->dev_private = ark;
+		port->eth_dev->driver = ark->eth_dev->driver;
+		port->eth_dev->dev_ops = ark->eth_dev->dev_ops;
+		port->eth_dev->tx_pkt_burst = ark->eth_dev->tx_pkt_burst;
+		port->eth_dev->rx_pkt_burst = ark->eth_dev->rx_pkt_burst;
+
+		rte_eth_copy_pci_info(port->eth_dev, pci_dev);
+
+		port->eth_dev->data->mac_addrs =
+			rte_zmalloc(name, ETHER_ADDR_LEN, 0);
+		if (!port->eth_dev->data->mac_addrs) {
+			PMD_DRV_LOG(ERR,
+						"Memory allocation for MAC failed !, exiting\n");
+			goto error;
+		}
+		ether_addr_copy
+			((struct ether_addr *)&adr,
+			 &port->eth_dev->data->mac_addrs[0]);
+
+		if (ark->user_ext.dev_init)
+			ark->user_data =
+				ark->user_ext.dev_init(dev, ark->a_bar, p);
+	}
+
+	return ret;
+
+ error:
+	if (dev->data->mac_addrs)
+		rte_free(dev->data->mac_addrs);
+
+	/* Port 0 reuses the DPDK-allocated eth_dev; free only the
+	 * per-port allocations, and free mac_addrs before the data
+	 * structure it hangs off.
+	 */
+	for (p = 1; p < pc; p++) {
+		if (ark->port[p].eth_dev == NULL)
+			continue;
+		if (ark->port[p].eth_dev->data) {
+			rte_free(ark->port[p].eth_dev->data->mac_addrs);
+			rte_free(ark->port[p].eth_dev->data);
+		}
+	}
+
+	return -1;
+}
+
+/*
+ * Initial device configuration when the device is opened:
+ * set up the DDM and UDM.
+ * Called once per PCIe device.
+ */
+static int
+ark_config_device(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+	uint16_t num_q, i;
+	struct ark_mpu_t *mpu;
+
+	/*
+	 * Make sure that the packet director, generator and checker are in a
+	 * known state
+	 */
+	ark->start_pg = 0;
+	ark->pg = ark_pmd_pktgen_init(ark->pktgen.v, 0, 1);
+	ark_pmd_pktgen_reset(ark->pg);
+	ark->pc = ark_pmd_pktchkr_init(ark->pktchkr.v, 0, 1);
+	ark_pmd_pktchkr_stop(ark->pc);
+	ark->pd = ark_pmd_pktdir_init(ark->pktdir.v);
+
+	/* Verify HW */
+	if (ark_udm_verify(ark->udm.v))
+		return -1;
+	if (ark_ddm_verify(ark->ddm.v))
+		return -1;
+
+	/* UDM */
+	if (ark_udm_reset(ark->udm.v)) {
+		PMD_DRV_LOG(ERR, "Unable to stop and reset UDM\n");
+		return -1;
+	}
+	/* Keep in reset until the MPU are cleared */
+
+	/* MPU reset */
+	mpu = ark->mpurx.v;
+	num_q = ark_api_num_queues(mpu);
+	ark->rx_queues = num_q;
+	for (i = 0; i < num_q; i++) {
+		ark_mpu_reset(mpu);
+		mpu = RTE_PTR_ADD(mpu, ARK_MPU_QOFFSET);
+	}
+
+	ark_udm_stop(ark->udm.v, 0);
+	ark_udm_configure(ark->udm.v,
+					  RTE_PKTMBUF_HEADROOM,
+					  RTE_MBUF_DEFAULT_DATAROOM,
+					  ARK_RX_WRITE_TIME_NS);
+	ark_udm_stats_reset(ark->udm.v);
+	ark_udm_stop(ark->udm.v, 0);
+
+	/* TX -- DDM */
+	if (ark_ddm_stop(ark->ddm.v, 1))
+		PMD_DRV_LOG(ERR, "Unable to stop DDM\n");
+
+	mpu = ark->mputx.v;
+	num_q = ark_api_num_queues(mpu);
+	ark->tx_queues = num_q;
+	for (i = 0; i < num_q; i++) {
+		ark_mpu_reset(mpu);
+		mpu = RTE_PTR_ADD(mpu, ARK_MPU_QOFFSET);
+	}
+
+	ark_ddm_reset(ark->ddm.v);
+	ark_ddm_stats_reset(ark->ddm.v);
+	/* ark_ddm_dump(ark->ddm.v, "Config"); */
+	/* ark_ddm_dump_stats(ark->ddm.v, "Config"); */
+
+	/* MPU reset */
+	ark_ddm_stop(ark->ddm.v, 0);
+	ark_rqp_stats_reset(ark->rqpacing);
+
+	return 0;
+}
+
+static int
+eth_ark_dev_uninit(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	if (ark->user_ext.dev_uninit)
+		ark->user_ext.dev_uninit(dev, ark->user_data);
+
+	ark_pmd_pktgen_uninit(ark->pg);
+	ark_pmd_pktchkr_uninit(ark->pc);
+
+	dev->dev_ops = NULL;
+	dev->rx_pkt_burst = NULL;
+	dev->tx_pkt_burst = NULL;
+	if (dev->data->mac_addrs)
+		rte_free(dev->data->mac_addrs);
+	if (dev->data)
+		rte_free(dev->data);
+
+	return 0;
+}
+
+static int
+eth_ark_dev_configure(struct rte_eth_dev *dev __rte_unused)
+{
+	ARK_DEBUG_TRACE
+		("ARKP: In eth_ark_dev_configure(struct rte_eth_dev *dev)\n");
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+
+	eth_ark_dev_set_link_up(dev);
+	if (ark->user_ext.dev_configure)
+		return ark->user_ext.dev_configure(dev, ark->user_data);
+	return 0;
+}
+
+static void *
+delay_pg_start(void *arg)
+{
+	struct ark_adapter *ark = (struct ark_adapter *)arg;
+
+	/* This function is used exclusively for regression testing.  We
+	 * perform a blind sleep here to ensure that the external test
+	 * application has time to set up the test before we generate packets.
+	 */
+	usleep(100000);
+	ark_pmd_pktgen_run(ark->pg);
+	return NULL;
+}
+
+static int
+eth_ark_dev_start(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+	int i;
+
+	ARK_DEBUG_TRACE("ARKP: In eth_ark_dev_start\n");
+
+	/* RX Side */
+	/* start UDM */
+	ark_udm_start(ark->udm.v);
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++)
+		eth_ark_rx_start_queue(dev, i);
+
+	/* TX Side */
+	for (i = 0; i < dev->data->nb_tx_queues; i++)
+		eth_ark_tx_queue_start(dev, i);
+
+	/* start DDM */
+	ark_ddm_start(ark->ddm.v);
+
+	ark->started = 1;
+	/* set xmit and receive function */
+	dev->rx_pkt_burst = &eth_ark_recv_pkts;
+	dev->tx_pkt_burst = &eth_ark_xmit_pkts;
+
+	if (ark->start_pg)
+		ark_pmd_pktchkr_run(ark->pc);
+
+	if (ark->start_pg && (ark_get_port_id(dev, ark) == 0)) {
+		pthread_t thread;
+
+		/* Delay the packet generator start so the external test
+		 * application has time to set up; see delay_pg_start().
+		 */
+		pthread_create(&thread, NULL, delay_pg_start, ark);
+	}
+
+	if (ark->user_ext.dev_start)
+		ark->user_ext.dev_start(dev, ark->user_data);
+
+	return 0;
+}
+
+static void
+eth_ark_dev_stop(struct rte_eth_dev *dev)
+{
+	uint16_t i;
+	int status;
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+	struct ark_mpu_t *mpu;
+
+	ARK_DEBUG_TRACE("ARKP: In eth_ark_dev_stop\n");
+
+	if (ark->started == 0)
+		return;
+	ark->started = 0;
+
+	/* Stop the extension first */
+	if (ark->user_ext.dev_stop)
+		ark->user_ext.dev_stop(dev, ark->user_data);
+
+	/* Stop the packet generator */
+	if (ark->start_pg)
+		ark_pmd_pktgen_pause(ark->pg);
+
+	dev->rx_pkt_burst = &eth_ark_recv_pkts_noop;
+	dev->tx_pkt_burst = &eth_ark_xmit_pkts_noop;
+
+	/* STOP TX Side */
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+		status = eth_ark_tx_queue_stop(dev, i);
+		if (status != 0) {
+			uint8_t port = dev->data->port_id;
+			PMD_DRV_LOG(ERR,
+						"ARKP tx_queue stop anomaly port %u, queue %u\n",
+						port, i);
+		}
+	}
+
+	/* Stop DDM */
+	/* Wait up to 0.1 second; each stop attempt is up to 1000 * 10 microseconds */
+	for (i = 0; i < 10; i++) {
+		status = ark_ddm_stop(ark->ddm.v, 1);
+		if (status == 0)
+			break;
+	}
+	if (status || i != 0) {
+		PMD_DRV_LOG(ERR, "DDM stop anomaly. status: %d iter: %u. (%s)\n",
+					status, i, __func__);
+		ark_ddm_dump(ark->ddm.v, "Stop anomaly");
+
+		mpu = ark->mputx.v;
+		for (i = 0; i < ark->tx_queues; i++) {
+			ark_mpu_dump(mpu, "DDM failure dump", i);
+			mpu = RTE_PTR_ADD(mpu, ARK_MPU_QOFFSET);
+		}
+	}
+
+	/* STOP RX Side */
+	/* Stop UDM */
+	for (i = 0; i < 10; i++) {
+		status = ark_udm_stop(ark->udm.v, 1);
+		if (status == 0)
+			break;
+	}
+	if (status || i != 0) {
+		PMD_DRV_LOG(ERR, "UDM stop anomaly. status %d iter: %u. (%s)\n",
+					status, i, __func__);
+		ark_udm_dump(ark->udm.v, "Stop anomaly");
+
+		mpu = ark->mpurx.v;
+		for (i = 0; i < ark->rx_queues; i++) {
+			ark_mpu_dump(mpu, "UDM Stop anomaly", i);
+			mpu = RTE_PTR_ADD(mpu, ARK_MPU_QOFFSET);
+		}
+	}
+
+	ark_udm_dump_stats(ark->udm.v, "Post stop");
+	ark_udm_dump_perf(ark->udm.v, "Post stop");
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++)
+		eth_ark_rx_dump_queue(dev, i, __func__);
+
+	/* Stop the packet checker if it is running */
+	if (ark->start_pg) {
+		ark_pmd_pktchkr_dump_stats(ark->pc);
+		ark_pmd_pktchkr_stop(ark->pc);
+	}
+}
+
+static void
+eth_ark_dev_close(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+	uint16_t i;
+
+	if (ark->user_ext.dev_close)
+		ark->user_ext.dev_close(dev, ark->user_data);
+
+	eth_ark_dev_stop(dev);
+	eth_ark_udm_force_close(dev);
+
+	/*
+	 * TODO This should only be called once for the device during shutdown
+	 */
+	ark_rqp_dump(ark->rqpacing);
+
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+		eth_ark_tx_queue_release(dev->data->tx_queues[i]);
+		dev->data->tx_queues[i] = 0;
+	}
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		eth_ark_dev_rx_queue_release(dev->data->rx_queues[i]);
+		dev->data->rx_queues[i] = 0;
+	}
+}
+
+static void
+eth_ark_dev_info_get(struct rte_eth_dev *dev,
+					 struct rte_eth_dev_info *dev_info)
+{
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+	struct ark_mpu_t *tx_mpu = RTE_PTR_ADD(ark->bar0, ARK_MPU_TX_BASE);
+	struct ark_mpu_t *rx_mpu = RTE_PTR_ADD(ark->bar0, ARK_MPU_RX_BASE);
+
+	uint16_t ports = ark->num_ports;
+
+	/* device specific configuration */
+	memset(dev_info, 0, sizeof(*dev_info));
+
+	dev_info->max_rx_queues = ark_api_num_queues_per_port(rx_mpu, ports);
+	dev_info->max_tx_queues = ark_api_num_queues_per_port(tx_mpu, ports);
+	dev_info->max_mac_addrs = 0;
+	dev_info->if_index = 0;
+	dev_info->max_rx_pktlen = (16 * 1024) - 128;
+	dev_info->min_rx_bufsize = 1024;
+	dev_info->rx_offload_capa = 0;
+	dev_info->tx_offload_capa = 0;
+
+	dev_info->rx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = 4096 * 4,
+		.nb_min = 512,	/* HW Q size for RX */
+		.nb_align = 2,};
+
+	dev_info->tx_desc_lim = (struct rte_eth_desc_lim) {
+		.nb_max = 4096 * 4,
+		.nb_min = 256,	/* HW Q size for TX */
+		.nb_align = 2,};
+
+	dev_info->rx_offload_capa = 0;
+	dev_info->tx_offload_capa = 0;
+
+	/* ARK PMD supports all line rates, how do we indicate that here ?? */
+	dev_info->speed_capa =
+		ETH_LINK_SPEED_1G | ETH_LINK_SPEED_10G | ETH_LINK_SPEED_25G |
+		ETH_LINK_SPEED_40G | ETH_LINK_SPEED_50G | ETH_LINK_SPEED_100G;
+	dev_info->pci_dev = ARK_DEV_TO_PCI(dev);
+	dev_info->driver_name = dev->data->drv_name;
+}
+
+static int
+eth_ark_dev_link_update(struct rte_eth_dev *dev, int wait_to_complete)
+{
+	ARK_DEBUG_TRACE("ARKP: link status = %d\n",
+					dev->data->dev_link.link_status);
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+
+	if (ark->user_ext.link_update) {
+		return ark->user_ext.link_update
+			(dev, wait_to_complete,
+			 ark->user_data);
+	}
+	return 0;
+}
+
+static int
+eth_ark_dev_set_link_up(struct rte_eth_dev *dev)
+{
+	dev->data->dev_link.link_status = 1;
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+
+	if (ark->user_ext.dev_set_link_up)
+		return ark->user_ext.dev_set_link_up(dev, ark->user_data);
+	return 0;
+}
+
+static int
+eth_ark_dev_set_link_down(struct rte_eth_dev *dev)
+{
+	dev->data->dev_link.link_status = 0;
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+
+	if (ark->user_ext.dev_set_link_down)
+		return ark->user_ext.dev_set_link_down(dev, ark->user_data);
+	return 0;
+}
+
+static void
+eth_ark_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	uint16_t i;
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->imissed = 0;
+	stats->oerrors = 0;
+
+	for (i = 0; i < dev->data->nb_tx_queues; i++)
+		eth_tx_queue_stats_get(dev->data->tx_queues[i], stats);
+	for (i = 0; i < dev->data->nb_rx_queues; i++)
+		eth_rx_queue_stats_get(dev->data->rx_queues[i], stats);
+	if (ark->user_ext.stats_get)
+		ark->user_ext.stats_get(dev, stats, ark->user_data);
+}
+
+static void
+eth_ark_dev_stats_reset(struct rte_eth_dev *dev)
+{
+	uint16_t i;
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+
+	for (i = 0; i < dev->data->nb_tx_queues; i++)
+		eth_tx_queue_stats_reset(dev->data->tx_queues[i]);
+	for (i = 0; i < dev->data->nb_rx_queues; i++)
+		eth_rx_queue_stats_reset(dev->data->rx_queues[i]);
+	if (ark->user_ext.stats_reset)
+		ark->user_ext.stats_reset(dev, ark->user_data);
+}
+
+static void
+eth_ark_macaddr_add(struct rte_eth_dev *dev,
+					struct ether_addr *mac_addr,
+					uint32_t index,
+					uint32_t pool)
+{
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+
+	if (ark->user_ext.mac_addr_add)
+		ark->user_ext.mac_addr_add
+			(dev,
+			 mac_addr,
+			 index,
+			 pool,
+			 ark->user_data);
+}
+
+static void
+eth_ark_macaddr_remove(struct rte_eth_dev *dev, uint32_t index)
+{
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+
+	if (ark->user_ext.mac_addr_remove)
+		ark->user_ext.mac_addr_remove(dev, index, ark->user_data);
+}
+
+static void
+eth_ark_set_default_mac_addr(struct rte_eth_dev *dev,
+			 struct ether_addr *mac_addr)
+{
+	struct ark_adapter *ark =
+		(struct ark_adapter *)dev->data->dev_private;
+
+	if (ark->user_ext.mac_addr_set)
+		ark->user_ext.mac_addr_set(dev, mac_addr, ark->user_data);
+}
+
+static inline int
+process_pktdir_arg(const char *key, const char *value,
+				   void *extra_args __rte_unused)
+{
+	ARK_DEBUG_TRACE("**** IN process_pktdir_arg, key = %s, value = %s\n",
+					key, value);
+	pkt_dir_v = strtol(value, NULL, 16);
+	ARK_DEBUG_TRACE("pkt_dir_v = 0x%x\n", pkt_dir_v);
+	return 0;
+}
+
+static inline int
+process_file_args(const char *key, const char *value, void *extra_args)
+{
+	ARK_DEBUG_TRACE("**** IN process_pktgen_arg, key = %s, value = %s\n",
+					key, value);
+	char *args = (char *)extra_args;
+
+	/* Open the configuration file */
+	FILE *file = fopen(value, "r");
+	char line[256];
+	int first = 1;
+
+	if (file == NULL) {
+		PMD_DRV_LOG(ERR, "Unable to open config file %s\n", value);
+		return -1;
+	}
+
+	while (fgets(line, sizeof(line), file)) {
+		/* ARK_DEBUG_TRACE("%s\n", line); */
+		if (first) {
+			strncpy(args, line, ARK_MAX_ARG_LEN);
+			first = 0;
+		} else {
+			strncat(args, line, ARK_MAX_ARG_LEN);
+		}
+	}
+	ARK_DEBUG_TRACE("file = %s\n", args);
+	fclose(file);
+	return 0;
+}
+
+static int
+eth_ark_check_args(const char *params)
+{
+	struct rte_kvargs *kvlist;
+	unsigned int k_idx;
+	struct rte_kvargs_pair *pair = NULL;
+
+	/*
+	 * TODO: the index of gark[index] should be associated with phy dev
+	 * map
+	 */
+	struct ark_adapter *ark = gark[0];
+
+	kvlist = rte_kvargs_parse(params, valid_arguments);
+	if (kvlist == NULL)
+		return 0;
+
+	pkt_gen_args[0] = 0;
+	pkt_chkr_args[0] = 0;
+
+	for (k_idx = 0; k_idx < kvlist->count; k_idx++) {
+		pair = &kvlist->pairs[k_idx];
+		ARK_DEBUG_TRACE("**** Arg passed to PMD = %s:%s\n", pair->key,
+						pair->value);
+	}
+
+	if (rte_kvargs_process(kvlist,
+						   ARK_PKTDIR_ARG,
+						   &process_pktdir_arg,
+						   NULL) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTDIR_ARG);
+	}
+
+	if (rte_kvargs_process(kvlist,
+						   ARK_PKTGEN_ARG,
+						   &process_file_args,
+						   pkt_gen_args) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTGEN_ARG);
+	}
+
+	if (rte_kvargs_process(kvlist,
+						   ARK_PKTCHKR_ARG,
+						   &process_file_args,
+						   pkt_chkr_args) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTCHKR_ARG);
+	}
+
+	/* Setup the packet director */
+	ark_pmd_pktdir_setup(ark->pd, pkt_dir_v);
+	ARK_DEBUG_TRACE("INFO: packet director set to 0x%x\n", pkt_dir_v);
+
+	/* Setup the packet generator */
+	if (pkt_gen_args[0]) {
+		PMD_DRV_LOG(INFO, "Setting up the packet generator\n");
+		ark_pmd_pktgen_parse(pkt_gen_args);
+		ark_pmd_pktgen_reset(ark->pg);
+		ark_pmd_pktgen_setup(ark->pg);
+		ark->start_pg = 1;
+	}
+
+	/* Setup the packet checker */
+	if (pkt_chkr_args[0]) {
+		ark_pmd_pktchkr_parse(pkt_chkr_args);
+		ark_pmd_pktchkr_setup(ark->pc);
+	}
+
+	return 1;
+}
+
+static int
+pmd_ark_probe(const char *name, const char *params)
+{
+	RTE_LOG(INFO, PMD, "Initializing pmd_ark for %s params = %s\n", name,
+			params);
+
+	/* Parse off the v index */
+
+	eth_ark_check_args(params);
+	return 0;
+}
+
+static int
+pmd_ark_remove(const char *name)
+{
+	RTE_LOG(INFO, PMD, "Closing ark %s ethdev on numa socket %u\n", name,
+			rte_socket_id());
+	return 1;
+}
+
+static struct rte_vdev_driver pmd_ark_drv = {
+	.probe = pmd_ark_probe,
+	.remove = pmd_ark_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_ark, pmd_ark_drv);
+RTE_PMD_REGISTER_ALIAS(net_ark, eth_ark);
+RTE_PMD_REGISTER_PCI(eth_ark, rte_ark_pmd.pci_drv);
+RTE_PMD_REGISTER_KMOD_DEP(net_ark, "* igb_uio | uio_pci_generic ");
+RTE_PMD_REGISTER_PCI_TABLE(eth_ark, pci_id_ark_map);
diff --git a/drivers/net/ark/ark_ethdev.h b/drivers/net/ark/ark_ethdev.h
new file mode 100644
index 0000000..9167181
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev.h
@@ -0,0 +1,75 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_ETHDEV_H_
+#define _ARK_ETHDEV_H_
+
+int ark_get_port_id(struct rte_eth_dev *dev, struct ark_adapter *ark);
+
+/* RX functions */
+int eth_ark_dev_rx_queue_setup(struct rte_eth_dev *dev,
+	uint16_t queue_idx,
+	uint16_t nb_desc,
+	unsigned int socket_id,
+	const struct rte_eth_rxconf *rx_conf, struct rte_mempool *mp);
+uint32_t eth_ark_dev_rx_queue_count(struct rte_eth_dev *dev,
+	uint16_t rx_queue_id);
+int eth_ark_rx_stop_queue(struct rte_eth_dev *dev, uint16_t queue_id);
+int eth_ark_rx_start_queue(struct rte_eth_dev *dev, uint16_t queue_id);
+uint16_t eth_ark_recv_pkts_noop(void *rx_queue, struct rte_mbuf **rx_pkts,
+	uint16_t nb_pkts);
+uint16_t eth_ark_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+	uint16_t nb_pkts);
+void eth_ark_dev_rx_queue_release(void *rx_queue);
+void eth_rx_queue_stats_get(void *vqueue, struct rte_eth_stats *stats);
+void eth_rx_queue_stats_reset(void *vqueue);
+void eth_ark_rx_dump_queue(struct rte_eth_dev *dev, uint16_t queue_id,
+	const char *msg);
+
+void eth_ark_udm_force_close(struct rte_eth_dev *dev);
+
+/* TX functions */
+uint16_t eth_ark_xmit_pkts_noop(void *txq, struct rte_mbuf **tx_pkts,
+	uint16_t nb_pkts);
+uint16_t eth_ark_xmit_pkts(void *txq, struct rte_mbuf **tx_pkts,
+	uint16_t nb_pkts);
+int eth_ark_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+	uint16_t nb_desc, unsigned int socket_id,
+	const struct rte_eth_txconf *tx_conf);
+void eth_ark_tx_queue_release(void *tx_queue);
+int eth_ark_tx_queue_stop(struct rte_eth_dev *dev, uint16_t queue_id);
+int eth_ark_tx_queue_start(struct rte_eth_dev *dev, uint16_t queue_id);
+void eth_tx_queue_stats_get(void *vqueue, struct rte_eth_stats *stats);
+void eth_tx_queue_stats_reset(void *vqueue);
+
+#endif
diff --git a/drivers/net/ark/ark_ethdev_rx.c b/drivers/net/ark/ark_ethdev_rx.c
new file mode 100644
index 0000000..3a1c778
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev_rx.c
@@ -0,0 +1,667 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_global.h"
+#include "ark_debug.h"
+#include "ark_ethdev.h"
+#include "ark_mpu.h"
+#include "ark_udm.h"
+
+#define ARK_RX_META_SIZE 32
+#define ARK_RX_META_OFFSET (RTE_PKTMBUF_HEADROOM - ARK_RX_META_SIZE)
+#define ARK_RX_MAX_NOCHAIN (RTE_MBUF_DEFAULT_DATAROOM)
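+
+/*
+ * Layout note (derived from the offsets above, not in the original
+ * patch): the 32-byte RX meta block written by the hardware sits at the
+ * tail of the mbuf headroom, immediately before the packet data:
+ *
+ *   buf_addr .. [ARK_RX_META_OFFSET] [32B meta] [data at
+ *   RTE_PKTMBUF_HEADROOM]
+ */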
+
+#ifdef RTE_LIBRTE_ARK_DEBUG_RX
+#define ARK_RX_DEBUG 1
+#define ARK_FULL_DEBUG 1
+#else
+#define ARK_RX_DEBUG 0
+#define ARK_FULL_DEBUG 0
+#endif
+
+/* Forward declarations */
+struct ark_rx_queue;
+struct ark_rx_meta;
+
+static void dump_mbuf_data(struct rte_mbuf *mbuf, uint16_t lo, uint16_t hi);
+static void ark_ethdev_rx_dump(const char *name, struct ark_rx_queue *queue);
+static uint32_t eth_ark_rx_jumbo(struct ark_rx_queue *queue,
+	struct ark_rx_meta *meta, struct rte_mbuf *mbuf0, uint32_t cons_index);
+static inline int eth_ark_rx_seed_mbufs(struct ark_rx_queue *queue);
+
+/* ************************************************************************* */
+struct ark_rx_queue {
+	/* array of mbufs to populate */
+	struct rte_mbuf **reserve_q;
+	/* array of physical addresses of the mbuf data pointers */
+	/* (the array pointer itself is a virtual address) */
+	phys_addr_t *paddress_q;
+	struct rte_mempool *mb_pool;
+
+	struct ark_udm_t *udm;
+	struct ark_mpu_t *mpu;
+
+	uint32_t queue_size;
+	uint32_t queue_mask;
+
+	uint32_t seed_index;		/* 1 set with an empty mbuf */
+	uint32_t cons_index;		/* 3 consumed by the driver */
+
+	/* The queue Id is used to identify the HW Q */
+	uint16_t phys_qid;
+
+	/* The queue Index is used within the dpdk device structures */
+	uint16_t queue_index;
+
+	uint32_t pad1;
+
+	/* separate cache line */
+	/* second cache line - fields only used in slow path */
+	MARKER cacheline1 __rte_cache_min_aligned;
+
+	volatile uint32_t prod_index;	/* 2 filled by the HW */
+} __rte_cache_aligned;
+
+/* ************************************************************************* */
+
+/* MATCHES struct in UDMDefines.bsv */
+
+/* TODO move to ark_udm.h */
+struct ark_rx_meta {
+	uint64_t timestamp;
+	uint64_t user_data;
+	uint8_t port;
+	uint8_t dst_queue;
+	uint16_t pkt_len;
+};
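+
+/*
+ * Note (editor's arithmetic): the fields above total 8 + 8 + 1 + 1 + 2 =
+ * 20 bytes; the 32-byte ARK_RX_META_SIZE reservation in the headroom
+ * leaves the remainder reserved.
+ */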
+
+/* ************************************************************************* */
+
+/* TODO  pick a better function name */
+static int
+eth_ark_rx_queue_setup(struct rte_eth_dev *dev,
+	struct ark_rx_queue *queue,
+	uint16_t rx_queue_id __rte_unused, uint16_t rx_queue_idx)
+{
+	phys_addr_t queue_base;
+	phys_addr_t phys_addr_q_base;
+	phys_addr_t phys_addr_prod_index;
+
+	queue_base = rte_malloc_virt2phy(queue);
+	phys_addr_prod_index = queue_base +
+		offsetof(struct ark_rx_queue, prod_index);
+
+	phys_addr_q_base = rte_malloc_virt2phy(queue->paddress_q);
+
+	/* Verify HW */
+	if (ark_mpu_verify(queue->mpu, sizeof(phys_addr_t))) {
+		PMD_DRV_LOG(ERR, "ARKP: Illegal configuration rx queue\n");
+		return -1;
+	}
+
+	/* Stop and Reset and configure MPU */
+	ark_mpu_configure(queue->mpu, phys_addr_q_base, queue->queue_size, 0);
+
+	ark_udm_write_addr(queue->udm, phys_addr_prod_index);
+
+	/* advance the valid pointer, but don't start until the queue starts */
+	ark_mpu_reset_stats(queue->mpu);
+
+	/* The seed is the producer index for the HW */
+	ark_mpu_set_producer(queue->mpu, queue->seed_index);
+	dev->data->rx_queue_state[rx_queue_idx] = RTE_ETH_QUEUE_STATE_STOPPED;
+
+	return 0;
+}
+
+static inline void
+eth_ark_rx_update_cons_index(struct ark_rx_queue *queue, uint32_t cons_index)
+{
+	queue->cons_index = cons_index;
+	eth_ark_rx_seed_mbufs(queue);
+	ark_mpu_set_producer(queue->mpu, queue->seed_index);
+}
+
+/* ************************************************************************* */
+int
+eth_ark_dev_rx_queue_setup(struct rte_eth_dev *dev,
+	uint16_t queue_idx,
+	uint16_t nb_desc,
+	unsigned int socket_id,
+	const struct rte_eth_rxconf *rx_conf, struct rte_mempool *mb_pool)
+{
+	struct ark_adapter *ark = (struct ark_adapter *)dev->data->dev_private;
+	static int warning1;		/* = 0 */
+
+	struct ark_rx_queue *queue;
+	uint32_t i;
+	int status;
+
+	int port = ark_get_port_id(dev, ark);
+	int qidx = port + queue_idx;	/* TODO FIXME */
+
+	/* TODO: We may already be set up; check here if there is nothing to do */
+	/* Free memory prior to re-allocation if needed */
+	if (dev->data->rx_queues[queue_idx] != NULL) {
+		/* TODO: release any allocated queues */
+		dev->data->rx_queues[queue_idx] = NULL;
+	}
+
+	if (rx_conf != NULL && warning1 == 0) {
+		warning1 = 1;
+		PMD_DRV_LOG(INFO,
+			"ARKP: Arkville PMD ignores rte_eth_rxconf argument.\n");
+	}
+
+	if (RTE_PKTMBUF_HEADROOM < ARK_RX_META_SIZE) {
+		PMD_DRV_LOG(ERR,
+			"Error: DPDK Arkville requires head room > %d bytes (%s)\n",
+			ARK_RX_META_SIZE, __func__);
+		return -1;		/* ERROR CODE */
+	}
+
+	if (!rte_is_power_of_2(nb_desc)) {
+		PMD_DRV_LOG(ERR,
+			"DPDK Arkville configuration queue size must be power of two %u (%s)\n",
+			nb_desc, __func__);
+		return -1;		/* ERROR CODE */
+	}
+
+	/* Allocate queue struct */
+	queue = rte_zmalloc_socket("Ark_rXQueue", sizeof(struct ark_rx_queue),
+				   64, socket_id);
+	if (queue == 0) {
+		PMD_DRV_LOG(ERR, "Failed to allocate memory in %s\n",
+			    __func__);
+		return -ENOMEM;
+	}
+
+	/* NOTE zmalloc is used, no need to 0 indexes, etc. */
+	queue->mb_pool = mb_pool;
+	queue->phys_qid = qidx;
+	queue->queue_index = queue_idx;
+	queue->queue_size = nb_desc;
+	queue->queue_mask = nb_desc - 1;
+
+	queue->reserve_q =
+		rte_zmalloc_socket("Ark_rXQueue mbuf",
+				   nb_desc * sizeof(struct rte_mbuf *),
+				   64, socket_id);
+	queue->paddress_q =
+		rte_zmalloc_socket("Ark_rXQueue paddr",
+				   nb_desc * sizeof(phys_addr_t),
+				   64, socket_id);
+	if (queue->reserve_q == 0 || queue->paddress_q == 0) {
+		PMD_DRV_LOG(ERR, "Failed to allocate queue memory in %s\n",
+			    __func__);
+		rte_free(queue->reserve_q);
+		rte_free(queue->paddress_q);
+		rte_free(queue);
+		return -ENOMEM;
+	}
+
+	dev->data->rx_queues[queue_idx] = queue;
+	queue->udm = RTE_PTR_ADD(ark->udm.v, qidx * ARK_UDM_QOFFSET);
+	queue->mpu = RTE_PTR_ADD(ark->mpurx.v, qidx * ARK_MPU_QOFFSET);
+
+	/* populate mbuf reserve */
+	status = eth_ark_rx_seed_mbufs(queue);
+
+	/* MPU Setup */
+	if (status == 0)
+		status = eth_ark_rx_queue_setup(dev, queue, qidx, queue_idx);
+
+	if (unlikely(status != 0)) {
+		PMD_DRV_LOG(ERR, "ARKP Failed to initialize RX queue %d %s\n",
+			    qidx, __func__);
+		/* free the seeded mbufs; rte_pktmbuf_free(NULL) is a no-op */
+		for (i = 0; i < nb_desc; ++i)
+			rte_pktmbuf_free(queue->reserve_q[i]);
+		rte_free(queue->reserve_q);
+		rte_free(queue->paddress_q);
+		rte_free(queue);
+		return -1;		/* ERROR CODE */
+	}
+
+	return 0;
+}
+
+/* ************************************************************************* */
+uint16_t
+eth_ark_recv_pkts_noop(void *rx_queue __rte_unused,
+	struct rte_mbuf **rx_pkts __rte_unused, uint16_t nb_pkts __rte_unused)
+{
+	return 0;
+}
+
+/* ************************************************************************* */
+uint16_t
+eth_ark_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+{
+	struct ark_rx_queue *queue;
+	register uint32_t cons_index, prod_index;
+	uint16_t nb;
+	uint64_t rx_bytes = 0;
+	struct rte_mbuf *mbuf;
+	struct ark_rx_meta *meta;
+
+	queue = (struct ark_rx_queue *)rx_queue;
+	if (unlikely(queue == 0))
+		return 0;
+	if (unlikely(nb_pkts == 0))
+		return 0;
+	prod_index = queue->prod_index;
+	cons_index = queue->cons_index;
+	nb = 0;
+
+	while (prod_index != cons_index) {
+		mbuf = queue->reserve_q[cons_index & queue->queue_mask];
+		/* prefetch mbuf ? */
+		rte_mbuf_prefetch_part1(mbuf);
+		rte_mbuf_prefetch_part2(mbuf);
+
+		/* META DATA buried in buffer */
+		meta = RTE_PTR_ADD(mbuf->buf_addr, ARK_RX_META_OFFSET);
+
+		mbuf->port = meta->port;
+		mbuf->pkt_len = meta->pkt_len;
+		mbuf->data_len = meta->pkt_len;
+		mbuf->data_off = RTE_PKTMBUF_HEADROOM;
+		mbuf->udata64 = meta->user_data;
+		if (ARK_RX_DEBUG) {	/* debug use */
+			if ((meta->pkt_len > (1024 * 16)) ||
+				(meta->pkt_len == 0)) {
+				PMD_DRV_LOG(INFO,
+						"ARKP RX: Bad Meta Q: %u cons: %u prod: %u\n",
+						queue->phys_qid,
+						cons_index,
+						queue->prod_index);
+
+				PMD_DRV_LOG(INFO, "       :  cons: %u prod: %u seed_index %u\n",
+						cons_index,
+						queue->prod_index,
+						queue->seed_index);
+
+				PMD_DRV_LOG(INFO, "       :  UDM prod: %u  len: %u\n",
+						queue->udm->rt_cfg.prod_idx,
+						meta->pkt_len);
+				ark_mpu_dump(queue->mpu,
+							 "    ",
+							 queue->phys_qid);
+
+				dump_mbuf_data(mbuf, 0, 256);
+				/* it's FUBAR so fix it */
+				mbuf->pkt_len = 63;
+				meta->pkt_len = 63;
+			}
+			mbuf->seqn = cons_index;
+		}
+
+		rx_bytes += meta->pkt_len;	/* TEMP stats */
+
+		if (unlikely(meta->pkt_len > ARK_RX_MAX_NOCHAIN))
+			cons_index = eth_ark_rx_jumbo
+				(queue, meta, mbuf, cons_index + 1);
+		else
+			cons_index += 1;
+
+		rx_pkts[nb] = mbuf;
+		nb++;
+		if (nb >= nb_pkts)
+			break;
+	}
+
+	if (unlikely(nb != 0))
+		/* report next free to FPGA */
+		eth_ark_rx_update_cons_index(queue, cons_index);
+
+	return nb;
+}
+
+/* ************************************************************************* */
+static uint32_t
+eth_ark_rx_jumbo(struct ark_rx_queue *queue,
+	struct ark_rx_meta *meta, struct rte_mbuf *mbuf0, uint32_t cons_index)
+{
+	struct rte_mbuf *mbuf_prev;
+	struct rte_mbuf *mbuf;
+
+	uint16_t remaining;
+	uint16_t data_len;
+	uint8_t segments;
+
+	/* first buf populated by caller */
+	mbuf_prev = mbuf0;
+	segments = 1;
+	data_len = RTE_MIN(meta->pkt_len, RTE_MBUF_DEFAULT_DATAROOM);
+	remaining = meta->pkt_len - data_len;
+	mbuf0->data_len = data_len;
+
+	/* TODO check that the data does not exceed prod_index! */
+	while (remaining != 0) {
+		data_len =
+			RTE_MIN(remaining,
+					RTE_MBUF_DEFAULT_DATAROOM +
+					RTE_PKTMBUF_HEADROOM);
+
+		remaining -= data_len;
+		segments += 1;
+
+		mbuf = queue->reserve_q[cons_index & queue->queue_mask];
+		mbuf_prev->next = mbuf;
+		mbuf_prev = mbuf;
+		mbuf->data_len = data_len;
+		mbuf->data_off = 0;
+		if (ARK_RX_DEBUG)
+			mbuf->seqn = cons_index;	/* for debug only */
+
+		cons_index += 1;
+	}
+
+	mbuf0->nb_segs = segments;
+	return cons_index;
+}
+
+/* Drain the internal queue allowing hw to clear out. */
+static void
+eth_ark_rx_queue_drain(struct ark_rx_queue *queue)
+{
+	register uint32_t cons_index;
+	struct rte_mbuf *mbuf;
+
+	cons_index = queue->cons_index;
+
+	/* NOT performance optimized, since this is a one-shot call */
+	while ((cons_index ^ queue->prod_index) & queue->queue_mask) {
+		mbuf = queue->reserve_q[cons_index & queue->queue_mask];
+		rte_pktmbuf_free(mbuf);
+		cons_index++;
+		eth_ark_rx_update_cons_index(queue, cons_index);
+	}
+}
+
+uint32_t
+eth_ark_dev_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct ark_rx_queue *queue;
+
+	queue = dev->data->rx_queues[queue_id];
+	return (queue->prod_index - queue->cons_index);	/* mod arith */
+}
+
+/* ************************************************************************* */
+int
+eth_ark_rx_start_queue(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct ark_rx_queue *queue;
+
+	queue = dev->data->rx_queues[queue_id];
+	if (queue == 0)
+		return -1;
+
+	dev->data->rx_queue_state[queue_id] = RTE_ETH_QUEUE_STATE_STARTED;
+
+	ark_mpu_set_producer(queue->mpu, queue->seed_index);
+	ark_mpu_start(queue->mpu);
+
+	ark_udm_queue_enable(queue->udm, 1);
+
+	return 0;
+}
+
+/* ************************************************************************* */
+
+/* Queue can be restarted.   data remains
+ */
+int
+eth_ark_rx_stop_queue(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct ark_rx_queue *queue;
+
+	queue = dev->data->rx_queues[queue_id];
+	if (queue == 0)
+		return -1;
+
+	ark_udm_queue_enable(queue->udm, 0);
+
+	dev->data->rx_queue_state[queue_id] = RTE_ETH_QUEUE_STATE_STOPPED;
+
+	return 0;
+}
+
+/* ************************************************************************* */
+static inline int
+eth_ark_rx_seed_mbufs(struct ark_rx_queue *queue)
+{
+	uint32_t limit = queue->cons_index + queue->queue_size;
+	uint32_t seed_index = queue->seed_index;
+
+	uint32_t count = 0;
+	uint32_t seed_m = queue->seed_index & queue->queue_mask;
+
+	uint32_t nb = limit - seed_index;
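+	/* nb = number of free ring slots, i.e. cons_index + queue_size -
+	 * seed_index in modular arithmetic
+	 */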
+
+	/* Handle wrap around -- remainder is filled on the next call */
+	if (unlikely(seed_m + nb > queue->queue_size))
+		nb = queue->queue_size - seed_m;
+
+	struct rte_mbuf **mbufs = &queue->reserve_q[seed_m];
+	int status = rte_pktmbuf_alloc_bulk(queue->mb_pool, mbufs, nb);
+
+	if (unlikely(status != 0))
+		return -1;
+
+	if (ARK_RX_DEBUG) {		/* DEBUG */
+		while (count != nb) {
+			struct rte_mbuf *mbuf_init =
+				queue->reserve_q[seed_m + count];
+
+			memset(mbuf_init->buf_addr, -1, 512);
+			*((uint32_t *)mbuf_init->buf_addr) = seed_index + count;
+			*(uint16_t *)RTE_PTR_ADD(mbuf_init->buf_addr, 4) =
+				queue->phys_qid;
+			count++;
+		}
+		count = 0;
+	}
+	/* DEBUG */
+	queue->seed_index += nb;
+
+	/* Duff's device https://en.wikipedia.org/wiki/Duff's_device */
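+	/*
+	 * Copy nb mbuf physical addresses into paddress_q with 4x loop
+	 * unrolling; the switch on (nb % 4) jumps into the middle of the
+	 * unrolled while body so the leftover iterations run on the
+	 * first pass.
+	 */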
+	switch (nb % 4) {
+	case 0:
+	while (count != nb) {
+		queue->paddress_q[seed_m++] = (*mbufs++)->buf_physaddr;
+		count++;
+		/* FALLTHROUGH */
+	case 3:
+		queue->paddress_q[seed_m++] = (*mbufs++)->buf_physaddr;
+		count++;
+		/* FALLTHROUGH */
+	case 2:
+		queue->paddress_q[seed_m++] = (*mbufs++)->buf_physaddr;
+		count++;
+		/* FALLTHROUGH */
+	case 1:
+		queue->paddress_q[seed_m++] = (*mbufs++)->buf_physaddr;
+		count++;
+		/* FALLTHROUGH */
+
+	} /* while (count != nb) */
+	} /* switch */
+
+	return 0;
+}
+
+void
+eth_ark_rx_dump_queue(struct rte_eth_dev *dev, uint16_t queue_id,
+	const char *msg)
+{
+	struct ark_rx_queue *queue;
+
+	queue = dev->data->rx_queues[queue_id];
+
+	ark_ethdev_rx_dump(msg, queue);
+}
+
+/* ************************************************************************* */
+
+/* Call on device closed no user API, queue is stopped */
+void
+eth_ark_dev_rx_queue_release(void *vqueue)
+{
+	struct ark_rx_queue *queue;
+	uint32_t i;
+
+	queue = (struct ark_rx_queue *)vqueue;
+	if (queue == 0)
+		return;
+
+	ark_udm_queue_enable(queue->udm, 0);
+	/* Stop the MPU since pointer are going away */
+	ark_mpu_stop(queue->mpu);
+
+	/* Need to clear out mbufs here, dropping packets along the way */
+	eth_ark_rx_queue_drain(queue);
+
+	for (i = 0; i < queue->queue_size; ++i)
+		rte_pktmbuf_free(queue->reserve_q[i]);
+
+	rte_free(queue->reserve_q);
+	rte_free(queue->paddress_q);
+	rte_free(queue);
+}
+
+void
+eth_rx_queue_stats_get(void *vqueue, struct rte_eth_stats *stats)
+{
+	struct ark_rx_queue *queue;
+	struct ark_udm_t *udm;
+
+	queue = vqueue;
+	if (queue == 0)
+		return;
+	udm = queue->udm;
+
+	uint64_t ibytes = ark_udm_bytes(udm);
+	uint64_t ipackets = ark_udm_packets(udm);
+	uint64_t idropped = ark_udm_dropped(queue->udm);
+
+	stats->q_ipackets[queue->queue_index] = ipackets;
+	stats->q_ibytes[queue->queue_index] = ibytes;
+	stats->q_errors[queue->queue_index] = idropped;
+	stats->ipackets += ipackets;
+	stats->ibytes += ibytes;
+	stats->imissed += idropped;
+}
+
+void
+eth_rx_queue_stats_reset(void *vqueue)
+{
+	struct ark_rx_queue *queue;
+
+	queue = vqueue;
+	if (queue == 0)
+		return;
+
+	ark_mpu_reset_stats(queue->mpu);
+	ark_udm_queue_stats_reset(queue->udm);
+}
+
+void
+eth_ark_udm_force_close(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark = (struct ark_adapter *)dev->data->dev_private;
+	struct ark_rx_queue *queue;
+	uint32_t index;
+	uint16_t i;
+
+	if (!ark_udm_is_flushed(ark->udm.v)) {
+		/* restart the MPUs */
+		fprintf(stderr, "ARK: %s UDM not flushed\n", __func__);
+		for (i = 0; i < dev->data->nb_rx_queues; i++) {
+			queue = (struct ark_rx_queue *)dev->data->rx_queues[i];
+			if (queue == 0)
+				continue;
+
+			ark_mpu_start(queue->mpu);
+			/* Add some buffers */
+			index = 100000 + queue->seed_index;
+			ark_mpu_set_producer(queue->mpu, index);
+		}
+		/* Wait to allow data to pass */
+		usleep(100);
+
+		ARK_DEBUG_TRACE("UDM forced flush attempt, stopped = %d\n",
+				ark_udm_is_flushed(ark->udm.v));
+	}
+	ark_udm_reset(ark->udm.v);
+}
+
+static void
+ark_ethdev_rx_dump(const char *name, struct ark_rx_queue *queue)
+{
+	if (queue == NULL)
+		return;
+	ARK_DEBUG_TRACE("RX QUEUE %d -- %s", queue->phys_qid, name);
+	ARK_DEBUG_TRACE(ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32 "\n",
+			"queue_size", queue->queue_size,
+			"seed_index", queue->seed_index,
+			"prod_index", queue->prod_index,
+			"cons_index", queue->cons_index);
+
+	ark_mpu_dump(queue->mpu, name, queue->phys_qid);
+	ark_mpu_dump_setup(queue->mpu, queue->phys_qid);
+	ark_udm_dump(queue->udm, name);
+	ark_udm_dump_setup(queue->udm, queue->phys_qid);
+}
+
+static void
+dump_mbuf_data(struct rte_mbuf *mbuf, uint16_t lo, uint16_t hi)
+{
+	uint16_t i, j;
+
+	fprintf(stderr, " MBUF: %p len %d, off: %d, seq: %u\n", mbuf,
+		mbuf->pkt_len, mbuf->data_off, mbuf->seqn);
+	for (i = lo; i < hi; i += 16) {
+		uint8_t *dp = RTE_PTR_ADD(mbuf->buf_addr, i);
+
+		fprintf(stderr, "  %6d:  ", i);
+		for (j = 0; j < 16; j++)
+			fprintf(stderr, " %02x", dp[j]);
+
+		fprintf(stderr, "\n");
+	}
+}
diff --git a/drivers/net/ark/ark_ethdev_tx.c b/drivers/net/ark/ark_ethdev_tx.c
new file mode 100644
index 0000000..2b14feb
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev_tx.c
@@ -0,0 +1,492 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_global.h"
+#include "ark_mpu.h"
+#include "ark_ddm.h"
+#include "ark_ethdev.h"
+#include "ark_debug.h"
+
+#define ARK_TX_META_SIZE   32
+#define ARK_TX_META_OFFSET (RTE_PKTMBUF_HEADROOM - ARK_TX_META_SIZE)
+#define ARK_TX_MAX_NOCHAIN (RTE_MBUF_DEFAULT_DATAROOM)
+#define ARK_TX_PAD_TO_60   1
+
+#ifdef RTE_LIBRTE_ARK_DEBUG_TX
+#define ARK_TX_DEBUG       1
+#define ARK_TX_DEBUG_JUMBO 1
+#else
+#define ARK_TX_DEBUG       0
+#define ARK_TX_DEBUG_JUMBO 0
+#endif
+
+/* ************************************************************************* */
+
+/* struct fixed in FPGA -- 16 bytes */
+
+/* TODO move to ark_ddm.h */
+struct ark_tx_meta {
+	uint64_t physaddr;
+	uint32_t delta_ns;
+	uint16_t data_len;		/* of this MBUF */
+#define   ARK_DDM_EOP   0x01
+#define   ARK_DDM_SOP   0x02
+	uint8_t flags;		/* bit 0 indicates last mbuf in chain. */
+	uint8_t reserved[1];
+};
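+/* Note: 8 + 4 + 2 + 1 + 1 = 16 bytes, matching the fixed FPGA layout. */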
+
+/* ************************************************************************* */
+struct ark_tx_queue {
+	struct ark_tx_meta *meta_q;
+	struct rte_mbuf **bufs;
+
+	/* handles for hw objects */
+	struct ark_mpu_t *mpu;
+	struct ark_ddm_t *ddm;
+
+	/* Stats HW tracks bytes and packets, need to count send errors */
+	uint64_t tx_errors;
+
+	uint32_t queue_size;
+	uint32_t queue_mask;
+
+	/* 3 indexes into the paired data rings. */
+	uint32_t prod_index;		/* where to put the next one */
+	uint32_t free_index;		/* mbuf has been freed */
+
+	/* The queue Id is used to identify the HW Q */
+	uint16_t phys_qid;
+	/* The queue Index within the dpdk device structures */
+	uint16_t queue_index;
+
+	uint32_t pad[1];
+
+	/* second cache line - fields only used in slow path */
+	MARKER cacheline1 __rte_cache_min_aligned;
+	uint32_t cons_index;		/* hw is done, can be freed */
+} __rte_cache_aligned;
+
+/* Forward declarations */
+static int eth_ark_tx_jumbo(struct ark_tx_queue *queue,
+	struct rte_mbuf *mbuf);
+static int eth_ark_tx_hw_queue_config(struct ark_tx_queue *queue);
+static void free_completed_tx(struct ark_tx_queue *queue);
+
+static inline void
+ark_tx_hw_queue_stop(struct ark_tx_queue *queue)
+{
+	ark_mpu_stop(queue->mpu);
+}
+
+/* ************************************************************************* */
+static inline void
+eth_ark_tx_meta_from_mbuf(struct ark_tx_meta *meta,
+	const struct rte_mbuf *mbuf, uint8_t flags)
+{
+	meta->physaddr = rte_mbuf_data_dma_addr(mbuf);
+	meta->delta_ns = 0;
+	meta->data_len = rte_pktmbuf_data_len(mbuf);
+	meta->flags = flags;
+}
+
+/* ************************************************************************* */
+uint16_t
+eth_ark_xmit_pkts_noop(void *vtxq __rte_unused,
+	struct rte_mbuf **tx_pkts __rte_unused, uint16_t nb_pkts __rte_unused)
+{
+	return 0;
+}
+
+/* ************************************************************************* */
+uint16_t
+eth_ark_xmit_pkts(void *vtxq, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+	struct ark_tx_queue *queue;
+	struct rte_mbuf *mbuf;
+	struct ark_tx_meta *meta;
+
+	uint32_t idx;
+	uint32_t prod_index_limit;
+	int stat;
+	uint16_t nb;
+
+	queue = (struct ark_tx_queue *)vtxq;
+
+	/* free any packets after the HW is done with them */
+	free_completed_tx(queue);
+
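+	/* the ring is full once prod_index has advanced queue_size past
+	 * free_index, i.e. every slot still holds an un-freed mbuf
+	 */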
+	prod_index_limit = queue->queue_size + queue->free_index;
+
+	for (nb = 0;
+		 (nb < nb_pkts) && (queue->prod_index != prod_index_limit);
+		 ++nb) {
+		mbuf = tx_pkts[nb];
+
+		if (ARK_TX_PAD_TO_60) {
+			if (unlikely(rte_pktmbuf_pkt_len(mbuf) < 60)) {
+				/* this packet even if it is small can be split,
+				 * be sure to add to the end
+				 */
+				uint16_t to_add =
+					60 - rte_pktmbuf_pkt_len(mbuf);
+				char *appended =
+					rte_pktmbuf_append(mbuf, to_add);
+
+				if (appended == 0) {
+					/* This packet is in error,
+					 * we cannot send it so just
+					 * count it and delete it.
+					 */
+					queue->tx_errors += 1;
+					rte_pktmbuf_free(mbuf);
+					continue;
+				}
+				memset(appended, 0, to_add);
+			}
+		}
+
+		if (unlikely(mbuf->nb_segs != 1)) {
+			stat = eth_ark_tx_jumbo(queue, mbuf);
+			if (unlikely(stat != 0))
+				break;		/* Queue is full */
+		} else {
+			idx = queue->prod_index & queue->queue_mask;
+			queue->bufs[idx] = mbuf;
+			meta = &queue->meta_q[idx];
+			eth_ark_tx_meta_from_mbuf(meta,
+				  mbuf,
+				  ARK_DDM_SOP |
+				  ARK_DDM_EOP);
+			queue->prod_index++;
+		}
+	}
+
+	if (ARK_TX_DEBUG) {
+		if (nb != nb_pkts) {
+			PMD_DRV_LOG(ERR,
+				"ARKP TX: Failure to send: req: %u sent: %u prod: "
+				"%u cons: %u free: %u\n",
+				nb_pkts, nb, queue->prod_index,
+				queue->cons_index,
+				queue->free_index);
+			ark_mpu_dump(queue->mpu,
+						 "TX Failure MPU: ",
+						 queue->phys_qid);
+		}
+	}
+
+	/* let fpga know producer index.  */
+	if (likely(nb != 0))
+		ark_mpu_set_producer(queue->mpu, queue->prod_index);
+
+	return nb;
+}
+
+/* ************************************************************************* */
+static int
+eth_ark_tx_jumbo(struct ark_tx_queue *queue, struct rte_mbuf *mbuf)
+{
+	struct rte_mbuf *next;
+	struct ark_tx_meta *meta;
+	uint32_t free_queue_space;
+	uint32_t idx;
+	uint8_t flags = ARK_DDM_SOP;
+
+	free_queue_space = queue->queue_mask -
+		(queue->prod_index - queue->free_index);
+	if (unlikely(free_queue_space < mbuf->nb_segs))
+		return -1;
+
+	if (ARK_TX_DEBUG_JUMBO) {
+		PMD_DRV_LOG(ERR,
+			"ARKP JUMBO TX len: %u segs: %u prod: "
+			"%u cons: %u free: %u free_space: %u\n",
+			mbuf->pkt_len, mbuf->nb_segs,
+			queue->prod_index, queue->cons_index,
+			queue->free_index, free_queue_space);
+	}
+
+	while (mbuf != NULL) {
+		next = mbuf->next;
+
+		idx = queue->prod_index & queue->queue_mask;
+		queue->bufs[idx] = mbuf;
+		meta = &queue->meta_q[idx];
+
+		flags |= (next == NULL) ? ARK_DDM_EOP : 0;
+		eth_ark_tx_meta_from_mbuf(meta, mbuf, flags);
+		queue->prod_index++;
+
+		flags &= ~ARK_DDM_SOP;	/* drop SOP flags */
+		mbuf = next;
+	}
+
+	return 0;
+}
+
+/* ************************************************************************* */
+int
+eth_ark_tx_queue_setup(struct rte_eth_dev *dev,
+	uint16_t queue_idx,
+	uint16_t nb_desc,
+	unsigned int socket_id,
+	const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct ark_adapter *ark = (struct ark_adapter *)dev->data->dev_private;
+	struct ark_tx_queue *queue;
+	int status;
+
+	/* TODO: divide the Q's evenly with the Vports */
+	int port = ark_get_port_id(dev, ark);
+	int qidx = port + queue_idx;	/* FIXME for multi queue */
+
+	if (!rte_is_power_of_2(nb_desc)) {
+		PMD_DRV_LOG(ERR,
+					"DPDK Arkville configuration queue size must be power of two %u (%s)\n",
+					nb_desc, __func__);
+		return -1;
+	}
+
+	/* Allocate queue struct */
+	queue =
+		rte_zmalloc_socket("Ark_tXQueue",
+			   sizeof(struct ark_tx_queue),
+			   64,
+			   socket_id);
+	if (queue == 0) {
+		PMD_DRV_LOG(ERR, "ARKP Failed to allocate tx "
+				"queue memory in %s\n",
+				__func__);
+		return -ENOMEM;
+	}
+
+	/* we use zmalloc no need to initialize fields */
+	queue->queue_size = nb_desc;
+	queue->queue_mask = nb_desc - 1;
+	queue->phys_qid = qidx;
+	queue->queue_index = queue_idx;
+	dev->data->tx_queues[queue_idx] = queue;
+
+	queue->meta_q =
+		rte_zmalloc_socket("Ark_tXQueue meta",
+				   nb_desc * sizeof(struct ark_tx_meta),
+				   64, socket_id);
+	queue->bufs =
+		rte_zmalloc_socket("Ark_tXQueue bufs",
+				   nb_desc * sizeof(struct rte_mbuf *),
+				   64, socket_id);
+
+	if (queue->meta_q == 0 || queue->bufs == 0) {
+		PMD_DRV_LOG(ERR, "Failed to allocate queue memory in %s\n",
+			    __func__);
+		rte_free(queue->meta_q);
+		rte_free(queue->bufs);
+		rte_free(queue);
+		return -ENOMEM;
+	}
+
+	queue->ddm = RTE_PTR_ADD(ark->ddm.v, qidx * ARK_DDM_QOFFSET);
+	queue->mpu = RTE_PTR_ADD(ark->mputx.v, qidx * ARK_MPU_QOFFSET);
+
+	status = eth_ark_tx_hw_queue_config(queue);
+
+	if (unlikely(status != 0)) {
+		rte_free(queue->meta_q);
+		rte_free(queue->bufs);
+		rte_free(queue);
+		return -1;		/* ERROR CODE */
+	}
+
+	return 0;
+}
+
+/* ************************************************************************* */
+static int
+eth_ark_tx_hw_queue_config(struct ark_tx_queue *queue)
+{
+	phys_addr_t queue_base, ring_base, prod_index_addr;
+	uint32_t write_interval_ns;
+
+	/* Verify HW -- MPU */
+	if (ark_mpu_verify(queue->mpu, sizeof(struct ark_tx_meta)))
+		return -1;
+
+	queue_base = rte_malloc_virt2phy(queue);
+	ring_base = rte_malloc_virt2phy(queue->meta_q);
+	prod_index_addr =
+		queue_base + offsetof(struct ark_tx_queue, cons_index);
+
+	ark_mpu_stop(queue->mpu);
+	ark_mpu_reset(queue->mpu);
+
+	/* Stop and Reset and configure MPU */
+	ark_mpu_configure(queue->mpu, ring_base, queue->queue_size, 1);
+
+	/*
+	 * Adjust the write interval based on queue size --
+	 * increase pcie traffic
+	 * when low mbuf count
+	 */
+	switch (queue->queue_size) {
+	case 128:
+		write_interval_ns = 500;
+		break;
+	case 256:
+		write_interval_ns = 500;
+		break;
+	case 512:
+		write_interval_ns = 1000;
+		break;
+	default:
+		write_interval_ns = 2000;
+		break;
+	}
+
+	/* Completion address in UDM */
+	ark_ddm_setup(queue->ddm, prod_index_addr, write_interval_ns);
+
+	return 0;
+}
+
+/* ************************************************************************* */
+void
+eth_ark_tx_queue_release(void *vtx_queue)
+{
+	struct ark_tx_queue *queue;
+
+	queue = (struct ark_tx_queue *)vtx_queue;
+
+	ark_tx_hw_queue_stop(queue);
+
+	queue->cons_index = queue->prod_index;
+	free_completed_tx(queue);
+
+	rte_free(queue->meta_q);
+	rte_free(queue->bufs);
+	rte_free(queue);
+}
+
+/* ************************************************************************* */
+int
+eth_ark_tx_queue_stop(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct ark_tx_queue *queue;
+	int cnt = 0;
+
+	queue = dev->data->tx_queues[queue_id];
+
+	/* Wait for DDM to send out all packets. */
+	while (queue->cons_index != queue->prod_index) {
+		usleep(100);
+		if (cnt++ > 10000)
+			return -1;
+	}
+
+	ark_mpu_stop(queue->mpu);
+	free_completed_tx(queue);
+
+	dev->data->tx_queue_state[queue_id] = RTE_ETH_QUEUE_STATE_STOPPED;
+
+	return 0;
+}
+
+int
+eth_ark_tx_queue_start(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct ark_tx_queue *queue;
+
+	queue = dev->data->tx_queues[queue_id];
+	if (dev->data->tx_queue_state[queue_id] == RTE_ETH_QUEUE_STATE_STARTED)
+		return 0;
+
+	ark_mpu_start(queue->mpu);
+	dev->data->tx_queue_state[queue_id] = RTE_ETH_QUEUE_STATE_STARTED;
+
+	return 0;
+}
+
+/* ************************************************************************* */
+static void
+free_completed_tx(struct ark_tx_queue *queue)
+{
+	struct rte_mbuf *mbuf;
+	struct ark_tx_meta *meta;
+	uint32_t top_index;
+
+	top_index = queue->cons_index;	/* read once */
+	while (queue->free_index != top_index) {
+		meta = &queue->meta_q[queue->free_index & queue->queue_mask];
+		mbuf = queue->bufs[queue->free_index & queue->queue_mask];
+
+		if (likely((meta->flags & ARK_DDM_SOP) != 0)) {
+			/* ref count of the mbuf is checked in this call. */
+			rte_pktmbuf_free(mbuf);
+		}
+		queue->free_index++;
+	}
+}
+
+/* ************************************************************************* */
+void
+eth_tx_queue_stats_get(void *vqueue, struct rte_eth_stats *stats)
+{
+	struct ark_tx_queue *queue;
+	struct ark_ddm_t *ddm;
+	uint64_t bytes, pkts;
+
+	queue = vqueue;
+	ddm = queue->ddm;
+
+	bytes = ark_ddm_queue_byte_count(ddm);
+	pkts = ark_ddm_queue_pkt_count(ddm);
+
+	stats->q_opackets[queue->queue_index] = pkts;
+	stats->q_obytes[queue->queue_index] = bytes;
+	stats->opackets += pkts;
+	stats->obytes += bytes;
+	stats->oerrors += queue->tx_errors;
+}
+
+void
+eth_tx_queue_stats_reset(void *vqueue)
+{
+	struct ark_tx_queue *queue;
+	struct ark_ddm_t *ddm;
+
+	queue = vqueue;
+	ddm = queue->ddm;
+
+	ark_ddm_queue_reset_stats(ddm);
+	queue->tx_errors = 0;
+}
diff --git a/drivers/net/ark/ark_ext.h b/drivers/net/ark/ark_ext.h
new file mode 100644
index 0000000..0786d1f
--- /dev/null
+++ b/drivers/net/ark/ark_ext.h
@@ -0,0 +1,79 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_EXT_H_
+#define _ARK_EXT_H_
+
+/*
+ * Called post PMD init.
+ * The implementation returns its private data that gets passed into
+ * all other functions as user_data
+ * The ARK extension implementation MUST implement this function
+ */
+void *dev_init(struct rte_eth_dev *dev, void *a_bar, int port_id);
+
+/* Called during device shutdown */
+void dev_uninit(struct rte_eth_dev *dev, void *user_data);
+
+/* This call is optional and allows the
+ * extension to specify the number of supported ports.
+ */
+uint8_t dev_get_port_count(struct rte_eth_dev *dev, void *user_data);
+
+/*
+ * The following functions are optional and are directly mapped
+ * from the DPDK PMD ops structure.
+ * Each function if implemented is called after the ARK PMD
+ * implementation executes.
+ */
+int dev_configure(struct rte_eth_dev *dev, void *user_data);
+int dev_start(struct rte_eth_dev *dev, void *user_data);
+void dev_stop(struct rte_eth_dev *dev, void *user_data);
+void dev_close(struct rte_eth_dev *dev, void *user_data);
+int link_update(struct rte_eth_dev *dev, int wait_to_complete,
+	void *user_data);
+int dev_set_link_up(struct rte_eth_dev *dev, void *user_data);
+int dev_set_link_down(struct rte_eth_dev *dev, void *user_data);
+void stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats,
+	void *user_data);
+void stats_reset(struct rte_eth_dev *dev, void *user_data);
+void mac_addr_add(struct rte_eth_dev *dev,
+	struct ether_addr *macadr,
+				  uint32_t index,
+				  uint32_t pool,
+				  void *user_data);
+void mac_addr_remove(struct rte_eth_dev *dev, uint32_t index, void *user_data);
+void mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
+	void *user_data);
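+
+/*
+ * Illustrative sketch (hypothetical names, not part of this patch):
+ * a minimal extension only has to provide dev_init; its return value
+ * is handed back as user_data to every other hook, e.g.
+ *
+ *	struct my_ext_state { int ports; };
+ *
+ *	void *dev_init(struct rte_eth_dev *dev, void *a_bar, int port_id)
+ *	{
+ *		return rte_zmalloc("my_ext", sizeof(struct my_ext_state), 0);
+ *	}
+ */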
+
+#endif
diff --git a/drivers/net/ark/ark_global.h b/drivers/net/ark/ark_global.h
new file mode 100644
index 0000000..398f647
--- /dev/null
+++ b/drivers/net/ark/ark_global.h
@@ -0,0 +1,159 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_GLOBAL_H_
+#define _ARK_GLOBAL_H_
+
+#include <time.h>
+#include <assert.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_malloc.h>
+#include <rte_memcpy.h>
+#include <rte_string_fns.h>
+#include <rte_cycles.h>
+#include <rte_kvargs.h>
+#include <rte_dev.h>
+#include <rte_version.h>
+
+#include "ark_pktdir.h"
+#include "ark_pktgen.h"
+#include "ark_pktchkr.h"
+
+#define ETH_ARK_ARG_MAXLEN	64
+#define ARK_SYSCTRL_BASE  0x0
+#define ARK_PKTGEN_BASE   0x10000
+#define ARK_MPU_RX_BASE   0x20000
+#define ARK_UDM_BASE      0x30000
+#define ARK_MPU_TX_BASE   0x40000
+#define ARK_DDM_BASE      0x60000
+#define ARK_CMAC_BASE     0x80000
+#define ARK_PKTDIR_BASE   0xa0000
+#define ARK_PKTCHKR_BASE  0x90000
+#define ARK_RCPACING_BASE 0xb0000
+#define ARK_EXTERNAL_BASE 0x100000
+#define ARK_MPU_QOFFSET   0x00100
+#define ARK_MAX_PORTS     8
+
+#define offset8(n)     n
+#define offset16(n)   ((n) / 2)
+#define offset32(n)   ((n) / 4)
+#define offset64(n)   ((n) / 8)
+
+/*
+ * Structure to store private data for each PF/VF instance.
+ */
+#define def_ptr(type, name) \
+	union type {		   \
+		uint64_t *t64;	   \
+		uint32_t *t32;	   \
+		uint16_t *t16;	   \
+		uint8_t  *t8;	   \
+		void     *v;	   \
+	} name
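+
+/*
+ * Example: def_ptr(UDM, udm) declares a union member 'udm' so one BAR
+ * offset can be addressed as 8/16/32/64-bit registers or as a raw
+ * pointer, e.g. RTE_PTR_ADD(ark->udm.v, qidx * ARK_UDM_QOFFSET).
+ */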
+
+struct ark_port {
+	struct rte_eth_dev *eth_dev;
+	int id;
+};
+
+struct ark_user_ext {
+	void *(*dev_init)(struct rte_eth_dev *, void *abar, int port_id);
+	void (*dev_uninit)(struct rte_eth_dev *, void *);
+	uint8_t (*dev_get_port_count)(struct rte_eth_dev *, void *);
+	int (*dev_configure)(struct rte_eth_dev *, void *);
+	int (*dev_start)(struct rte_eth_dev *, void *);
+	void (*dev_stop)(struct rte_eth_dev *, void *);
+	void (*dev_close)(struct rte_eth_dev *, void *);
+	int (*link_update)(struct rte_eth_dev *, int wait_to_complete, void *);
+	int (*dev_set_link_up)(struct rte_eth_dev *, void *);
+	int (*dev_set_link_down)(struct rte_eth_dev *, void *);
+	void (*stats_get)(struct rte_eth_dev *, struct rte_eth_stats *, void *);
+	void (*stats_reset)(struct rte_eth_dev *, void *);
+	void (*mac_addr_add)(struct rte_eth_dev *,
+						  struct ether_addr *,
+						 uint32_t,
+						 uint32_t,
+						 void *);
+	void (*mac_addr_remove)(struct rte_eth_dev *, uint32_t, void *);
+	void (*mac_addr_set)(struct rte_eth_dev *, struct ether_addr *, void *);
+};
+
+struct ark_adapter {
+	/* User extension private data */
+	void *user_data;
+
+	/* Pointers to packet generator and checker */
+	int start_pg;
+	ark_pkt_gen_t pg;
+	ark_pkt_chkr_t pc;
+	ark_pkt_dir_t pd;
+
+	struct ark_port port[ARK_MAX_PORTS];
+	int num_ports;
+
+	/* Common for both PF and VF */
+	struct rte_eth_dev *eth_dev;
+
+	void *d_handle;
+	struct ark_user_ext user_ext;
+
+	/* Our Bar 0 */
+	uint8_t *bar0;
+
+	/* A Bar */
+	uint8_t *a_bar;
+
+	/* Arkville demo block offsets */
+	def_ptr(sys_ctrl, sysctrl);
+	def_ptr(pkt_gen, pktgen);
+	def_ptr(mpu_rx, mpurx);
+	def_ptr(UDM, udm);
+	def_ptr(mpu_tx, mputx);
+	def_ptr(DDM, ddm);
+	def_ptr(CMAC, cmac);
+	def_ptr(external, external);
+	def_ptr(pkt_dir, pktdir);
+	def_ptr(pkt_chkr, pktchkr);
+
+	int started;
+	uint16_t rx_queues;
+	uint16_t tx_queues;
+
+	struct ark_rqpace_t *rqpacing;
+};
+
+typedef uint32_t *ark_t;
+
+#endif
diff --git a/drivers/net/ark/ark_mpu.c b/drivers/net/ark/ark_mpu.c
new file mode 100644
index 0000000..7f60cbc
--- /dev/null
+++ b/drivers/net/ark/ark_mpu.c
@@ -0,0 +1,167 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_debug.h"
+#include "ark_mpu.h"
+
+uint16_t
+ark_api_num_queues(struct ark_mpu_t *mpu)
+{
+	return mpu->hw.num_queues;
+}
+
+uint16_t
+ark_api_num_queues_per_port(struct ark_mpu_t *mpu, uint16_t ark_ports)
+{
+	return mpu->hw.num_queues / ark_ports;
+}
+
+int
+ark_mpu_verify(struct ark_mpu_t *mpu, uint32_t obj_size)
+{
+	uint32_t version;
+
+	version = mpu->id.vernum & 0x0000ff00;
+	if ((mpu->id.idnum != 0x2055504d) || (mpu->hw.obj_size != obj_size) ||
+	    version != 0x00003100) {
+		fprintf(stderr,
+			"   MPU module not found as expected %08x \"%c%c%c%c"
+			"%c%c%c%c\"\n", mpu->id.idnum,
+			mpu->id.id[0], mpu->id.id[1],
+			mpu->id.id[2], mpu->id.id[3],
+			mpu->id.ver[0], mpu->id.ver[1],
+			mpu->id.ver[2], mpu->id.ver[3]);
+		fprintf(stderr,
+			"   MPU HW num_queues: %u hw_depth %u, obj_size: %u, "
+			"obj_per_mrr: %u Expected size %u\n",
+			mpu->hw.num_queues, mpu->hw.hw_depth,
+			mpu->hw.obj_size, mpu->hw.obj_per_mrr, obj_size);
+		return -1;
+	}
+	return 0;
+}
+
+void
+ark_mpu_stop(struct ark_mpu_t *mpu)
+{
+	mpu->cfg.command = MPU_CMD_STOP;
+}
+
+void
+ark_mpu_start(struct ark_mpu_t *mpu)
+{
+	mpu->cfg.command = MPU_CMD_RUN;	/* run state */
+}
+
+int
+ark_mpu_reset(struct ark_mpu_t *mpu)
+{
+	int cnt = 0;
+
+	mpu->cfg.command = MPU_CMD_RESET;	/* reset */
+
+	while (mpu->cfg.command != MPU_CMD_IDLE) {
+		if (cnt++ > 1000)
+			break;
+		usleep(10);
+	}
+	if (mpu->cfg.command != MPU_CMD_IDLE) {
+		mpu->cfg.command = MPU_CMD_FORCE_RESET;	/* forced reset */
+		usleep(10);
+	}
+	ark_mpu_reset_stats(mpu);
+	return mpu->cfg.command != MPU_CMD_IDLE;
+}
+
+void
+ark_mpu_reset_stats(struct ark_mpu_t *mpu)
+{
+	mpu->stats.pci_request = 1;	/* reset stats */
+}
+
+int
+ark_mpu_configure(struct ark_mpu_t *mpu, phys_addr_t ring, uint32_t ring_size,
+	int is_tx)
+{
+	ark_mpu_reset(mpu);
+
+	if (!rte_is_power_of_2(ring_size)) {
+		fprintf(stderr, "ARKP Invalid ring size for MPU %d\n",
+			ring_size);
+		return -1;
+	}
+
+	mpu->cfg.ring_base = ring;
+	mpu->cfg.ring_size = ring_size;
+	mpu->cfg.ring_mask = ring_size - 1;
+	mpu->cfg.min_host_move = is_tx ? 1 : mpu->hw.obj_per_mrr;
+	mpu->cfg.min_hw_move = mpu->hw.obj_per_mrr;
+	mpu->cfg.sw_prod_index = 0;
+	mpu->cfg.hw_cons_index = 0;
+	return 0;
+}
+
+void
+ark_mpu_dump(struct ark_mpu_t *mpu, const char *code, uint16_t qid)
+{
+	/* DUMP to see that we have started */
+	ARK_DEBUG_TRACE
+		("ARKP MPU: %s Q: %3u sw_prod %u, hw_cons: %u\n", code, qid,
+		 mpu->cfg.sw_prod_index, mpu->cfg.hw_cons_index);
+	ARK_DEBUG_TRACE
+		("ARKP MPU: %s state: %d count %d, reserved %d data 0x%08x_%08x 0x%08x_%08x\n",
+		 code, mpu->debug.state, mpu->debug.count, mpu->debug.reserved,
+		 mpu->debug.peek[1], mpu->debug.peek[0], mpu->debug.peek[3],
+		 mpu->debug.peek[2]
+		 );
+	ARK_DEBUG_STATS
+		("ARKP MPU: %s Q: %3u" ARK_SU64 ARK_SU64 ARK_SU64 ARK_SU64
+		 ARK_SU64 ARK_SU64 ARK_SU64 "\n", code, qid,
+		 "PCI Request:", mpu->stats.pci_request,
+		 "Queue_empty", mpu->stats.q_empty,
+		 "Queue_q1", mpu->stats.q_q1,
+		 "Queue_q2", mpu->stats.q_q2,
+		 "Queue_q3", mpu->stats.q_q3,
+		 "Queue_q4", mpu->stats.q_q4,
+		 "Queue_full", mpu->stats.q_full
+		 );
+}
+
+void
+ark_mpu_dump_setup(struct ark_mpu_t *mpu, uint16_t q_id)
+{
+	ARK_DEBUG_TRACE
+		("MPU Setup Q: %u"
+		 ARK_SU64X "\n", q_id,
+		 "ring_base", mpu->cfg.ring_base
+		 );
+}
diff --git a/drivers/net/ark/ark_mpu.h b/drivers/net/ark/ark_mpu.h
new file mode 100644
index 0000000..376c042
--- /dev/null
+++ b/drivers/net/ark/ark_mpu.h
@@ -0,0 +1,143 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_MPU_H_
+#define _ARK_MPU_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+
+/*
+ * MPU hardware structures
+ */
+
+#define ARK_MPU_ID 0x00
+struct ark_mpu_id_t {
+	union {
+		char id[4];
+		uint32_t idnum;
+	};
+	union {
+		char ver[4];
+		uint32_t vernum;
+	};
+	uint32_t phys_id;
+	uint32_t mrr_code;
+};
+
+#define ARK_MPU_HW 0x010
+struct ark_mpu_hw_t {
+	uint16_t num_queues;
+	uint16_t reserved;
+	uint32_t hw_depth;
+	uint32_t obj_size;
+	uint32_t obj_per_mrr;
+};
+
+#define ARK_MPU_CFG 0x040
+struct ark_mpu_cfg_t {
+	phys_addr_t ring_base;	/* phys_addr_t is a uint64_t */
+	uint32_t ring_size;
+	uint32_t ring_mask;
+	uint32_t min_host_move;
+	uint32_t min_hw_move;
+	volatile uint32_t sw_prod_index;
+	volatile uint32_t hw_cons_index;
+	volatile uint32_t command;
+};
+enum ARK_MPU_COMMAND {
+	MPU_CMD_IDLE = 1,
+	MPU_CMD_RUN = 2,
+	MPU_CMD_STOP = 4,
+	MPU_CMD_RESET = 8,
+	MPU_CMD_FORCE_RESET = 16,
+	MPU_COMMAND_LIMIT = 0xffffffff
+};
+
+#define ARK_MPU_STATS 0x080
+struct ark_mpu_stats_t {
+	volatile uint64_t pci_request;
+	volatile uint64_t q_empty;
+	volatile uint64_t q_q1;
+	volatile uint64_t q_q2;
+	volatile uint64_t q_q3;
+	volatile uint64_t q_q4;
+	volatile uint64_t q_full;
+};
+
+#define ARK_MPU_DEBUG 0x0C0
+struct ark_mpu_debug_t {
+	volatile uint32_t state;
+	uint32_t reserved;
+	volatile uint32_t count;
+	volatile uint32_t take;
+	volatile uint32_t peek[4];
+};
+
+/*  Consolidated structure */
+struct ark_mpu_t {
+	struct ark_mpu_id_t id;
+	uint8_t reserved0[(ARK_MPU_HW - ARK_MPU_ID)
+					  - sizeof(struct ark_mpu_id_t)];
+	struct ark_mpu_hw_t hw;
+	uint8_t reserved1[(ARK_MPU_CFG - ARK_MPU_HW) -
+					  sizeof(struct ark_mpu_hw_t)];
+	struct ark_mpu_cfg_t cfg;
+	uint8_t reserved2[(ARK_MPU_STATS - ARK_MPU_CFG) -
+					  sizeof(struct ark_mpu_cfg_t)];
+	struct ark_mpu_stats_t stats;
+	uint8_t reserved3[(ARK_MPU_DEBUG - ARK_MPU_STATS) -
+					  sizeof(struct ark_mpu_stats_t)];
+	struct ark_mpu_debug_t debug;
+};
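+
+/*
+ * Note (illustrative, not part of the original patch): the reserved
+ * arrays pad each block to its fixed register offset; a compile-time
+ * check such as
+ *	RTE_BUILD_BUG_ON(offsetof(struct ark_mpu_t, cfg) != ARK_MPU_CFG);
+ * would catch accidental layout drift.
+ */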
+
+uint16_t ark_api_num_queues(struct ark_mpu_t *mpu);
+uint16_t ark_api_num_queues_per_port(struct ark_mpu_t *mpu,
+	uint16_t ark_ports);
+int ark_mpu_verify(struct ark_mpu_t *mpu, uint32_t obj_size);
+void ark_mpu_stop(struct ark_mpu_t *mpu);
+void ark_mpu_start(struct ark_mpu_t *mpu);
+int ark_mpu_reset(struct ark_mpu_t *mpu);
+int ark_mpu_configure(struct ark_mpu_t *mpu, phys_addr_t ring,
+	uint32_t ring_size, int is_tx);
+
+void ark_mpu_dump(struct ark_mpu_t *mpu, const char *msg, uint16_t idx);
+void ark_mpu_dump_setup(struct ark_mpu_t *mpu, uint16_t qid);
+void ark_mpu_reset_stats(struct ark_mpu_t *mpu);
+
+static inline void
+ark_mpu_set_producer(struct ark_mpu_t *mpu, uint32_t idx)
+{
+	mpu->cfg.sw_prod_index = idx;
+}
+
+#endif
diff --git a/drivers/net/ark/ark_pktchkr.c b/drivers/net/ark/ark_pktchkr.c
new file mode 100644
index 0000000..1f390de
--- /dev/null
+++ b/drivers/net/ark/ark_pktchkr.c
@@ -0,0 +1,460 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <getopt.h>
+#include <sys/time.h>
+#include <locale.h>
+#include <unistd.h>
+
+#include "ark_pktchkr.h"
+#include "ark_debug.h"
+
+static int set_arg(char *arg, char *val);
+static int ark_pmd_pktchkr_is_gen_forever(ark_pkt_chkr_t handle);
+
+#define ARK_MAX_STR_LEN 64
+union OPTV {
+	int INT;
+	int BOOL;
+	uint64_t LONG;
+	char STR[ARK_MAX_STR_LEN];
+};
+
+enum OPTYPE {
+	OTINT,
+	OTLONG,
+	OTBOOL,
+	OTSTRING
+};
+
+struct OPTIONS {
+	char opt[ARK_MAX_STR_LEN];
+	enum OPTYPE t;
+	union OPTV v;
+};
+
+static struct OPTIONS toptions[] = {
+	{{"configure"}, OTBOOL, {1} },
+	{{"port"}, OTINT, {0} },
+	{{"mac-dump"}, OTBOOL, {0} },
+	{{"dg-mode"}, OTBOOL, {1} },
+	{{"run"}, OTBOOL, {0} },
+	{{"stop"}, OTBOOL, {0} },
+	{{"dump"}, OTBOOL, {0} },
+	{{"en_resync"}, OTBOOL, {0} },
+	{{"tuser_err_val"}, OTINT, {1} },
+	{{"gen_forever"}, OTBOOL, {0} },
+	{{"en_slaved_start"}, OTBOOL, {0} },
+	{{"vary_length"}, OTBOOL, {0} },
+	{{"incr_payload"}, OTINT, {0} },
+	{{"incr_first_byte"}, OTBOOL, {0} },
+	{{"ins_seq_num"}, OTBOOL, {0} },
+	{{"ins_time_stamp"}, OTBOOL, {1} },
+	{{"ins_udp_hdr"}, OTBOOL, {0} },
+	{{"num_pkts"}, OTLONG, .v.LONG = 10000000000000ULL},
+	{{"payload_byte"}, OTINT, {0x55} },
+	{{"pkt_spacing"}, OTINT, {60} },
+	{{"pkt_size_min"}, OTINT, {2005} },
+	{{"pkt_size_max"}, OTINT, {1514} },
+	{{"pkt_size_incr"}, OTINT, {1} },
+	{{"eth_type"}, OTINT, {0x0800} },
+	{{"src_mac_addr"}, OTLONG, .v.LONG = 0xdc3cf6425060ULL},
+	{{"dst_mac_addr"}, OTLONG, .v.LONG = 0x112233445566ULL},
+	{{"hdr_dW0"}, OTINT, {0x0016e319} },
+	{{"hdr_dW1"}, OTINT, {0x27150004} },
+	{{"hdr_dW2"}, OTINT, {0x76967bda} },
+	{{"hdr_dW3"}, OTINT, {0x08004500} },
+	{{"hdr_dW4"}, OTINT, {0x005276ed} },
+	{{"hdr_dW5"}, OTINT, {0x40004006} },
+	{{"hdr_dW6"}, OTINT, {0x56cfc0a8} },
+	{{"start_offset"}, OTINT, {0} },
+	{{"dst_ip"}, OTSTRING, .v.STR = "169.254.10.240"},
+	{{"dst_port"}, OTINT, {65536} },
+	{{"src_port"}, OTINT, {65536} },
+};
+
+ark_pkt_chkr_t
+ark_pmd_pktchkr_init(void *addr, int ord, int l2_mode)
+{
+	struct ark_pkt_chkr_inst *inst =
+		rte_malloc("ark_pkt_chkr_inst",
+		sizeof(struct ark_pkt_chkr_inst), 0);
+	inst->sregs = (struct ark_pkt_chkr_stat_regs *)addr;
+	inst->cregs =
+		(struct ark_pkt_chkr_ctl_regs *)(((uint8_t *)addr) + 0x100);
+	inst->ordinal = ord;
+	inst->l2_mode = l2_mode;
+	return inst;
+}
+
+void
+ark_pmd_pktchkr_uninit(ark_pkt_chkr_t handle)
+{
+	rte_free(handle);
+}
+
+void
+ark_pmd_pktchkr_run(ark_pkt_chkr_t handle)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	inst->sregs->pkt_start_stop = 0;
+	inst->sregs->pkt_start_stop = 0x1;
+}
+
+int
+ark_pmd_pktchkr_stopped(ark_pkt_chkr_t handle)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+	uint32_t r = inst->sregs->pkt_start_stop;
+
+	return (((r >> 16) & 1) == 1);
+}
+
+void
+ark_pmd_pktchkr_stop(ark_pkt_chkr_t handle)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+	int wait_cycle = 10;
+
+	inst->sregs->pkt_start_stop = 0;
+	while (!ark_pmd_pktchkr_stopped(handle) && (wait_cycle > 0)) {
+		usleep(1000);
+		wait_cycle--;
+		ARK_DEBUG_TRACE("Waiting for pktchk %d to stop...\n",
+				inst->ordinal);
+	}
+	ARK_DEBUG_TRACE("pktchk %d stopped.\n", inst->ordinal);
+}
+
+int
+ark_pmd_pktchkr_is_running(ark_pkt_chkr_t handle)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+	uint32_t r = inst->sregs->pkt_start_stop;
+
+	return ((r & 1) == 1);
+}
+
+static void
+ark_pmd_pktchkr_set_pkt_ctrl(ark_pkt_chkr_t handle, uint32_t gen_forever,
+	uint32_t vary_length, uint32_t incr_payload, uint32_t incr_first_byte,
+	uint32_t ins_seq_num, uint32_t ins_udp_hdr, uint32_t en_resync,
+	uint32_t tuser_err_val, uint32_t ins_time_stamp)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+	uint32_t r = (tuser_err_val << 16) | (en_resync << 0);
+
+	inst->sregs->pkt_ctrl = r;
+	if (!inst->l2_mode)
+		ins_udp_hdr = 0;
+	r = (gen_forever << 24) | (vary_length << 16) |
+	(incr_payload << 12) | (incr_first_byte << 8) |
+	(ins_time_stamp << 5) | (ins_seq_num << 4) | ins_udp_hdr;
+	inst->cregs->pkt_ctrl = r;
+}
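+
+/*
+ * Worked example (illustrative): with only tuser_err_val = 1 and
+ * ins_time_stamp = 1, the two register writes below come out as
+ *   sregs->pkt_ctrl = (1 << 16) = 0x00010000
+ *   cregs->pkt_ctrl = (1 << 5)  = 0x00000020
+ */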
+
+static int
+ark_pmd_pktchkr_is_gen_forever(ark_pkt_chkr_t handle)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+	uint32_t r = inst->cregs->pkt_ctrl;
+
+	return (((r >> 24) & 1) == 1);
+}
+
+int
+ark_pmd_pktchkr_wait_done(ark_pkt_chkr_t handle)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+	int wait_cycle = 10;
+
+	if (ark_pmd_pktchkr_is_gen_forever(handle)) {
+		ARK_DEBUG_TRACE
+			("Error: wait_done will not terminate because gen_forever=1\n");
+		return -1;
+	}
+
+	while (!ark_pmd_pktchkr_stopped(handle) && (wait_cycle > 0)) {
+		usleep(1000);
+		wait_cycle--;
+		ARK_DEBUG_TRACE
+			("Waiting for packet checker %d's internal pktgen to finish sending...\n",
+			 inst->ordinal);
+	}
+	ARK_DEBUG_TRACE("pktchk %d's pktgen done.\n", inst->ordinal);
+	return 0;
+}
+
+int
+ark_pmd_pktchkr_get_pkts_sent(ark_pkt_chkr_t handle)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	return inst->cregs->pkts_sent;
+}
+
+void
+ark_pmd_pktchkr_set_payload_byte(ark_pkt_chkr_t handle, uint32_t b)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	inst->cregs->pkt_payload = b;
+}
+
+void
+ark_pmd_pktchkr_set_pkt_size_min(ark_pkt_chkr_t handle, uint32_t x)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	inst->cregs->pkt_size_min = x;
+}
+
+void
+ark_pmd_pktchkr_set_pkt_size_max(ark_pkt_chkr_t handle, uint32_t x)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	inst->cregs->pkt_size_max = x;
+}
+
+void
+ark_pmd_pktchkr_set_pkt_size_incr(ark_pkt_chkr_t handle, uint32_t x)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	inst->cregs->pkt_size_incr = x;
+}
+
+void
+ark_pmd_pktchkr_set_num_pkts(ark_pkt_chkr_t handle, uint32_t x)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	inst->cregs->num_pkts = x;
+}
+
+void
+ark_pmd_pktchkr_set_src_mac_addr(ark_pkt_chkr_t handle, uint64_t mac_addr)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	inst->cregs->src_mac_addr_h = (mac_addr >> 32) & 0xffff;
+	inst->cregs->src_mac_addr_l = mac_addr & 0xffffffff;
+}
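+
+/*
+ * Example (illustrative): mac_addr = 0x112233445566 splits into
+ * src_mac_addr_h = 0x1122 and src_mac_addr_l = 0x33445566.
+ */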
+
+void
+ark_pmd_pktchkr_set_dst_mac_addr(ark_pkt_chkr_t handle, uint64_t mac_addr)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	inst->cregs->dst_mac_addr_h = (mac_addr >> 32) & 0xffff;
+	inst->cregs->dst_mac_addr_l = mac_addr & 0xffffffff;
+}
+
+void
+ark_pmd_pktchkr_set_eth_type(ark_pkt_chkr_t handle, uint32_t x)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	inst->cregs->eth_type = x;
+}
+
+void
+ark_pmd_pktchkr_set_hdr_dW(ark_pkt_chkr_t handle, uint32_t *hdr)
+{
+	uint32_t i;
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	for (i = 0; i < 7; i++)
+		inst->cregs->hdr_dw[i] = hdr[i];
+}
+
+void
+ark_pmd_pktchkr_dump_stats(ark_pkt_chkr_t handle)
+{
+	struct ark_pkt_chkr_inst *inst = (struct ark_pkt_chkr_inst *)handle;
+
+	fprintf(stderr, "pkts_rcvd      = (%'u)\n", inst->sregs->pkts_rcvd);
+	fprintf(stderr, "bytes_rcvd     = (%'" PRIu64 ")\n",
+			inst->sregs->bytes_rcvd);
+	fprintf(stderr, "pkts_ok        = (%'u)\n",
+			inst->sregs->pkts_ok);
+	fprintf(stderr, "pkts_mismatch  = (%'u)\n",
+			inst->sregs->pkts_mismatch);
+	fprintf(stderr, "pkts_err       = (%'u)\n",
+			inst->sregs->pkts_err);
+	fprintf(stderr, "first_mismatch = (%'u)\n",
+			inst->sregs->first_mismatch);
+	fprintf(stderr, "resync_events  = (%'u)\n",
+			inst->sregs->resync_events);
+	fprintf(stderr, "pkts_missing   = (%'u)\n",
+			inst->sregs->pkts_missing);
+	fprintf(stderr, "min_latency    = (%'u)\n",
+			inst->sregs->min_latency);
+	fprintf(stderr, "max_latency    = (%'u)\n",
+			inst->sregs->max_latency);
+}
+
+static struct OPTIONS *
+options(const char *id)
+{
+	unsigned int i;
+
+	for (i = 0; i < sizeof(toptions) / sizeof(struct OPTIONS); i++) {
+		if (strcmp(id, toptions[i].opt) == 0)
+			return &toptions[i];
+	}
+	PMD_DRV_LOG(ERR,
+		    "pktchkr: Could not find requested option !!, option = %s\n",
+		    id);
+	return NULL;
+}
+
+static int
+set_arg(char *arg, char *val)
+{
+	struct OPTIONS *o = options(arg);
+
+	if (o) {
+		switch (o->t) {
+		case OTINT:
+		case OTBOOL:
+			o->v.INT = atoi(val);
+			break;
+		case OTLONG:
+			o->v.LONG = atoll(val);
+			break;
+		case OTSTRING:
+			strncpy(o->v.STR, val, ARK_MAX_STR_LEN - 1);
+			o->v.STR[ARK_MAX_STR_LEN - 1] = '\0';
+			break;
+		}
+		return 1;
+	}
+	return 0;
+}
+
+/******
+ * Arg format = "opt0=v opt_n=v ..." (tokens are split on '=' and
+ * whitespace)
+ ******/
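+/*
+ * Example (hypothetical option values; the buffer must be writable
+ * because the parser uses strtok()):
+ *   char args[] = "run=1 num_pkts=1000 pkt_size_min=64";
+ *   ark_pmd_pktchkr_parse(args);
+ */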
+void
+ark_pmd_pktchkr_parse(char *args)
+{
+	char *argv, *v;
+	const char toks[] = "=\n\t\v\f \r";
+
+	argv = strtok(args, toks);
+	v = strtok(NULL, toks);
+	while (argv && v) {
+		set_arg(argv, v);
+		argv = strtok(NULL, toks);
+		v = strtok(NULL, toks);
+	}
+}
+
+static int32_t parse_ipv4_string(char const *ip_address);
+static int32_t
+parse_ipv4_string(char const *ip_address)
+{
+	unsigned int ip[4];
+
+	if (sscanf(ip_address,
+					"%u.%u.%u.%u",
+					&ip[0], &ip[1], &ip[2], &ip[3]) != 4)
+		return 0;
+	return ip[3] + ip[2] * 0x100 + ip[1] * 0x10000ul + ip[0] * 0x1000000ul;
+}
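+
+/*
+ * Worked example (illustrative): "169.254.10.240" parses to 0xa9fe0af0,
+ * i.e. (169 << 24) | (254 << 16) | (10 << 8) | 240.
+ */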
+
+void
+ark_pmd_pktchkr_setup(ark_pkt_chkr_t handle)
+{
+	uint32_t hdr[7];
+	int32_t dst_ip = parse_ipv4_string(options("dst_ip")->v.STR);
+
+	if (!options("stop")->v.BOOL && options("configure")->v.BOOL) {
+		ark_pmd_pktchkr_set_payload_byte(handle,
+			options("payload_byte")->v.INT);
+		ark_pmd_pktchkr_set_src_mac_addr(handle,
+			options("src_mac_addr")->v.LONG);
+		ark_pmd_pktchkr_set_dst_mac_addr(handle,
+			options("dst_mac_addr")->v.LONG);
+
+		ark_pmd_pktchkr_set_eth_type(handle,
+			options("eth_type")->v.INT);
+		if (options("dg-mode")->v.BOOL) {
+			hdr[0] = options("hdr_dW0")->v.INT;
+			hdr[1] = options("hdr_dW1")->v.INT;
+			hdr[2] = options("hdr_dW2")->v.INT;
+			hdr[3] = options("hdr_dW3")->v.INT;
+			hdr[4] = options("hdr_dW4")->v.INT;
+			hdr[5] = options("hdr_dW5")->v.INT;
+			hdr[6] = options("hdr_dW6")->v.INT;
+		} else {
+			hdr[0] = dst_ip;
+			hdr[1] = options("dst_port")->v.INT;
+			hdr[2] = options("src_port")->v.INT;
+			hdr[3] = 0;
+			hdr[4] = 0;
+			hdr[5] = 0;
+			hdr[6] = 0;
+		}
+		ark_pmd_pktchkr_set_hdr_dW(handle, hdr);
+		ark_pmd_pktchkr_set_num_pkts(handle,
+			options("num_pkts")->v.LONG);
+		ark_pmd_pktchkr_set_pkt_size_min(handle,
+			options("pkt_size_min")->v.INT);
+		ark_pmd_pktchkr_set_pkt_size_max(handle,
+			options("pkt_size_max")->v.INT);
+		ark_pmd_pktchkr_set_pkt_size_incr(handle,
+			options("pkt_size_incr")->v.INT);
+		ark_pmd_pktchkr_set_pkt_ctrl(handle,
+			options("gen_forever")->v.BOOL,
+			options("vary_length")->v.BOOL,
+			options("incr_payload")->v.BOOL,
+			options("incr_first_byte")->v.BOOL,
+			options("ins_seq_num")->v.INT,
+			options("ins_udp_hdr")->v.BOOL,
+			options("en_resync")->v.BOOL,
+			options("tuser_err_val")->v.INT,
+			options("ins_time_stamp")->v.INT);
+	}
+
+	if (options("stop")->v.BOOL)
+		ark_pmd_pktchkr_stop(handle);
+
+	if (options("run")->v.BOOL) {
+		ARK_DEBUG_TRACE("Starting packet checker on port %d\n",
+				options("port")->v.INT);
+		ark_pmd_pktchkr_run(handle);
+	}
+}
diff --git a/drivers/net/ark/ark_pktchkr.h b/drivers/net/ark/ark_pktchkr.h
new file mode 100644
index 0000000..5d62bc6
--- /dev/null
+++ b/drivers/net/ark/ark_pktchkr.h
@@ -0,0 +1,114 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_PKTCHKR_H_
+#define _ARK_PKTCHKR_H_
+
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_eal.h>
+
+#include <rte_ethdev.h>
+#include <rte_cycles.h>
+#include <rte_lcore.h>
+#include <rte_mbuf.h>
+#include <rte_malloc.h>
+
+#define ARK_PKTCHKR_BASE_ADR  0x90000
+
+typedef void *ark_pkt_chkr_t;
+
+struct ark_pkt_chkr_stat_regs {
+	uint32_t r0;
+	uint32_t pkt_start_stop;
+	uint32_t pkt_ctrl;
+	uint32_t pkts_rcvd;
+	uint64_t bytes_rcvd;
+	uint32_t pkts_ok;
+	uint32_t pkts_mismatch;
+	uint32_t pkts_err;
+	uint32_t first_mismatch;
+	uint32_t resync_events;
+	uint32_t pkts_missing;
+	uint32_t min_latency;
+	uint32_t max_latency;
+} __attribute__ ((packed));
+
+struct ark_pkt_chkr_ctl_regs {
+	uint32_t pkt_ctrl;
+	uint32_t pkt_payload;
+	uint32_t pkt_size_min;
+	uint32_t pkt_size_max;
+	uint32_t pkt_size_incr;
+	uint32_t num_pkts;
+	uint32_t pkts_sent;
+	uint32_t src_mac_addr_l;
+	uint32_t src_mac_addr_h;
+	uint32_t dst_mac_addr_l;
+	uint32_t dst_mac_addr_h;
+	uint32_t eth_type;
+	uint32_t hdr_dw[7];
+} __attribute__ ((packed));
+
+struct ark_pkt_chkr_inst {
+	struct rte_eth_dev_info *dev_info;
+	volatile struct ark_pkt_chkr_stat_regs *sregs;
+	volatile struct ark_pkt_chkr_ctl_regs *cregs;
+	int l2_mode;
+	int ordinal;
+};
+
+/*  packet checker functions */
+ark_pkt_chkr_t ark_pmd_pktchkr_init(void *addr, int ord, int l2_mode);
+void ark_pmd_pktchkr_uninit(ark_pkt_chkr_t handle);
+void ark_pmd_pktchkr_run(ark_pkt_chkr_t handle);
+int ark_pmd_pktchkr_stopped(ark_pkt_chkr_t handle);
+void ark_pmd_pktchkr_stop(ark_pkt_chkr_t handle);
+int ark_pmd_pktchkr_is_running(ark_pkt_chkr_t handle);
+int ark_pmd_pktchkr_get_pkts_sent(ark_pkt_chkr_t handle);
+void ark_pmd_pktchkr_set_payload_byte(ark_pkt_chkr_t handle, uint32_t b);
+void ark_pmd_pktchkr_set_pkt_size_min(ark_pkt_chkr_t handle, uint32_t x);
+void ark_pmd_pktchkr_set_pkt_size_max(ark_pkt_chkr_t handle, uint32_t x);
+void ark_pmd_pktchkr_set_pkt_size_incr(ark_pkt_chkr_t handle, uint32_t x);
+void ark_pmd_pktchkr_set_num_pkts(ark_pkt_chkr_t handle, uint32_t x);
+void ark_pmd_pktchkr_set_src_mac_addr(ark_pkt_chkr_t handle, uint64_t mac_addr);
+void ark_pmd_pktchkr_set_dst_mac_addr(ark_pkt_chkr_t handle, uint64_t mac_addr);
+void ark_pmd_pktchkr_set_eth_type(ark_pkt_chkr_t handle, uint32_t x);
+void ark_pmd_pktchkr_set_hdr_dW(ark_pkt_chkr_t handle, uint32_t *hdr);
+void ark_pmd_pktchkr_parse(char *args);
+void ark_pmd_pktchkr_setup(ark_pkt_chkr_t handle);
+void ark_pmd_pktchkr_dump_stats(ark_pkt_chkr_t handle);
+int ark_pmd_pktchkr_wait_done(ark_pkt_chkr_t handle);
+
+#endif
diff --git a/drivers/net/ark/ark_pktdir.c b/drivers/net/ark/ark_pktdir.c
new file mode 100644
index 0000000..3a6ccdf
--- /dev/null
+++ b/drivers/net/ark/ark_pktdir.c
@@ -0,0 +1,79 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <inttypes.h>
+
+#include "ark_global.h"
+
+ark_pkt_dir_t
+ark_pmd_pktdir_init(void *base)
+{
+	struct ark_pkt_dir_inst *inst =
+		rte_malloc("ark_pkt_dir_inst",
+			   sizeof(struct ark_pkt_dir_inst), 0);
+	inst->regs->ctrl = 0x00110110;	/* POR state */
+	return inst;
+}
+
+void
+ark_pmd_pktdir_uninit(ark_pkt_dir_t handle)
+{
+	struct ark_pkt_dir_inst *inst = (struct ark_pkt_dir_inst *)handle;
+
+	rte_free(inst);
+}
+
+void
+ark_pmd_pktdir_setup(ark_pkt_dir_t handle, uint32_t v)
+{
+	struct ark_pkt_dir_inst *inst = (struct ark_pkt_dir_inst *)handle;
+
+	inst->regs->ctrl = v;
+}
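+
+/*
+ * Usage sketch (illustrative): restore the power-on routing value noted
+ * in ark_pmd_pktdir_init() after a test run:
+ *   ark_pmd_pktdir_setup(handle, 0x00110110);
+ */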
+
+uint32_t
+ark_pmd_pktdir_status(ark_pkt_dir_t handle)
+{
+	struct ark_pkt_dir_inst *inst = (struct ark_pkt_dir_inst *)handle;
+
+	return inst->regs->ctrl;
+}
+
+uint32_t
+ark_pmd_pktdir_stall_cnt(ark_pkt_dir_t handle)
+{
+	struct ark_pkt_dir_inst *inst = (struct ark_pkt_dir_inst *)handle;
+
+	return inst->regs->stall_cnt;
+}
diff --git a/drivers/net/ark/ark_pktdir.h b/drivers/net/ark/ark_pktdir.h
new file mode 100644
index 0000000..3f01e5c
--- /dev/null
+++ b/drivers/net/ark/ark_pktdir.h
@@ -0,0 +1,68 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_PKTDIR_H_
+#define _ARK_PKTDIR_H_
+
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_eal.h>
+
+#include <rte_ethdev.h>
+#include <rte_cycles.h>
+#include <rte_lcore.h>
+#include <rte_mbuf.h>
+#include <rte_malloc.h>
+
+#define ARK_PKTDIR_BASE_ADR  0xa0000
+
+typedef void *ark_pkt_dir_t;
+
+struct ark_pkt_dir_regs {
+	uint32_t ctrl;
+	uint32_t status;
+	uint32_t stall_cnt;
+} __attribute__ ((packed));
+
+struct ark_pkt_dir_inst {
+	volatile struct ark_pkt_dir_regs *regs;
+};
+
+ark_pkt_dir_t ark_pmd_pktdir_init(void *base);
+void ark_pmd_pktdir_uninit(ark_pkt_dir_t handle);
+void ark_pmd_pktdir_setup(ark_pkt_dir_t handle, uint32_t v);
+uint32_t ark_pmd_pktdir_stall_cnt(ark_pkt_dir_t handle);
+uint32_t ark_pmd_pktdir_status(ark_pkt_dir_t handle);
+
+#endif
diff --git a/drivers/net/ark/ark_pktgen.c b/drivers/net/ark/ark_pktgen.c
new file mode 100644
index 0000000..7679064
--- /dev/null
+++ b/drivers/net/ark/ark_pktgen.c
@@ -0,0 +1,482 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <getopt.h>
+#include <sys/time.h>
+#include <locale.h>
+#include <unistd.h>
+
+#include "ark_pktgen.h"
+#include "ark_debug.h"
+
+#define ARK_MAX_STR_LEN 64
+union OPTV {
+	int INT;
+	int BOOL;
+	uint64_t LONG;
+	char STR[ARK_MAX_STR_LEN];
+};
+
+enum OPTYPE {
+	OTINT,
+	OTLONG,
+	OTBOOL,
+	OTSTRING
+};
+
+struct OPTIONS {
+	char opt[ARK_MAX_STR_LEN];
+	enum OPTYPE t;
+	union OPTV v;
+};
+
+static struct OPTIONS toptions[] = {
+	{{"configure"}, OTBOOL, {1} },
+	{{"port"}, OTINT, {0} },
+	{{"dg-mode"}, OTBOOL, {1} },
+	{{"run"}, OTBOOL, {0} },
+	{{"pause"}, OTBOOL, {0} },
+	{{"reset"}, OTBOOL, {0} },
+	{{"dump"}, OTBOOL, {0} },
+	{{"gen_forever"}, OTBOOL, {0} },
+	{{"en_slaved_start"}, OTBOOL, {0} },
+	{{"vary_length"}, OTBOOL, {0} },
+	{{"incr_payload"}, OTBOOL, {0} },
+	{{"incr_first_byte"}, OTBOOL, {0} },
+	{{"ins_seq_num"}, OTBOOL, {0} },
+	{{"ins_time_stamp"}, OTBOOL, {1} },
+	{{"ins_udp_hdr"}, OTBOOL, {0} },
+	{{"num_pkts"}, OTLONG, .v.LONG = 100000000},
+	{{"payload_byte"}, OTINT, {0x55} },
+	{{"pkt_spacing"}, OTINT, {130} },
+	{{"pkt_size_min"}, OTINT, {2006} },
+	{{"pkt_size_max"}, OTINT, {1514} },
+	{{"pkt_size_incr"}, OTINT, {1} },
+	{{"eth_type"}, OTINT, {0x0800} },
+	{{"src_mac_addr"}, OTLONG, .v.LONG = 0xdc3cf6425060ULL},
+	{{"dst_mac_addr"}, OTLONG, .v.LONG = 0x112233445566ULL},
+	{{"hdr_dW0"}, OTINT, {0x0016e319} },
+	{{"hdr_dW1"}, OTINT, {0x27150004} },
+	{{"hdr_dW2"}, OTINT, {0x76967bda} },
+	{{"hdr_dW3"}, OTINT, {0x08004500} },
+	{{"hdr_dW4"}, OTINT, {0x005276ed} },
+	{{"hdr_dW5"}, OTINT, {0x40004006} },
+	{{"hdr_dW6"}, OTINT, {0x56cfc0a8} },
+	{{"start_offset"}, OTINT, {0} },
+	{{"bytes_per_cycle"}, OTINT, {10} },
+	{{"shaping"}, OTBOOL, {0} },
+	{{"dst_ip"}, OTSTRING, .v.STR = "169.254.10.240"},
+	{{"dst_port"}, OTINT, {65536} },
+	{{"src_port"}, OTINT, {65536} },
+};
+
+ark_pkt_gen_t
+ark_pmd_pktgen_init(void *adr, int ord, int l2_mode)
+{
+	struct ark_pkt_gen_inst *inst =
+		rte_malloc("ark_pkt_gen_inst_pmd",
+			   sizeof(struct ark_pkt_gen_inst), 0);
+	inst->regs = (struct ark_pkt_gen_regs *)adr;
+	inst->ordinal = ord;
+	inst->l2_mode = l2_mode;
+	return inst;
+}
+
+void
+ark_pmd_pktgen_uninit(ark_pkt_gen_t handle)
+{
+	rte_free(handle);
+}
+
+void
+ark_pmd_pktgen_run(ark_pkt_gen_t handle)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+
+	inst->regs->pkt_start_stop = 1;
+}
+
+uint32_t
+ark_pmd_pktgen_paused(ark_pkt_gen_t handle)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	uint32_t r = inst->regs->pkt_start_stop;
+
+	return (((r >> 16) & 1) == 1);
+}
+
+void
+ark_pmd_pktgen_pause(ark_pkt_gen_t handle)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	int cnt = 0;
+
+	inst->regs->pkt_start_stop = 0;
+
+	while (!ark_pmd_pktgen_paused(handle)) {
+		usleep(1000);
+		if (cnt++ > 100) {
+			PMD_DRV_LOG(ERR, "pktgen %d failed to pause.\n",
+						inst->ordinal);
+			break;
+		}
+	}
+	ARK_DEBUG_TRACE("pktgen %d paused.\n", inst->ordinal);
+}
+
+void
+ark_pmd_pktgen_reset(ark_pkt_gen_t handle)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+
+	if (!ark_pmd_pktgen_is_running(handle) &&
+		!ark_pmd_pktgen_paused(handle)) {
+		ARK_DEBUG_TRACE
+			("pktgen %d is not running and is not paused. No need to reset.\n",
+			 inst->ordinal);
+		return;
+	}
+
+	if (ark_pmd_pktgen_is_running(handle) &&
+		!ark_pmd_pktgen_paused(handle)) {
+		ARK_DEBUG_TRACE("pktgen %d is not paused. Pausing first.\n",
+						inst->ordinal);
+		ark_pmd_pktgen_pause(handle);
+	}
+
+	ARK_DEBUG_TRACE("Resetting pktgen %d.\n", inst->ordinal);
+	inst->regs->pkt_start_stop = (1 << 8);
+}
+
+uint32_t
+ark_pmd_pktgen_tx_done(ark_pkt_gen_t handle)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	uint32_t r = inst->regs->pkt_start_stop;
+
+	return (((r >> 24) & 1) == 1);
+}
+
+uint32_t
+ark_pmd_pktgen_is_running(ark_pkt_gen_t handle)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	uint32_t r = inst->regs->pkt_start_stop;
+
+	return ((r & 1) == 1);
+}
+
+uint32_t
+ark_pmd_pktgen_is_gen_forever(ark_pkt_gen_t handle)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	uint32_t r = inst->regs->pkt_ctrl;
+
+	return (((r >> 24) & 1) == 1);
+}
+
+void
+ark_pmd_pktgen_wait_done(ark_pkt_gen_t handle)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	int wait_cycle = 10;
+
+	if (ark_pmd_pktgen_is_gen_forever(handle))
+		PMD_DRV_LOG(ERR,
+			"wait_done will not terminate because gen_forever=1\n");
+
+	while (!ark_pmd_pktgen_tx_done(handle) && (wait_cycle > 0)) {
+		usleep(1000);
+		wait_cycle--;
+		ARK_DEBUG_TRACE("Waiting for pktgen %d to finish sending...\n",
+						inst->ordinal);
+	}
+	ARK_DEBUG_TRACE("pktgen %d done.\n", inst->ordinal);
+}
+
+uint32_t
+ark_pmd_pktgen_get_pkts_sent(ark_pkt_gen_t handle)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	return inst->regs->pkts_sent;
+}
+
+void
+ark_pmd_pktgen_set_payload_byte(ark_pkt_gen_t handle, uint32_t b)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	inst->regs->pkt_payload = b;
+}
+
+void
+ark_pmd_pktgen_set_pkt_spacing(ark_pkt_gen_t handle, uint32_t x)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	inst->regs->pkt_spacing = x;
+}
+
+void
+ark_pmd_pktgen_set_pkt_size_min(ark_pkt_gen_t handle, uint32_t x)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	inst->regs->pkt_size_min = x;
+}
+
+void
+ark_pmd_pktgen_set_pkt_size_max(ark_pkt_gen_t handle, uint32_t x)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	inst->regs->pkt_size_max = x;
+}
+
+void
+ark_pmd_pktgen_set_pkt_size_incr(ark_pkt_gen_t handle, uint32_t x)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	inst->regs->pkt_size_incr = x;
+}
+
+void
+ark_pmd_pktgen_set_num_pkts(ark_pkt_gen_t handle, uint32_t x)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	inst->regs->num_pkts = x;
+}
+
+void
+ark_pmd_pktgen_set_src_mac_addr(ark_pkt_gen_t handle, uint64_t mac_addr)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	inst->regs->src_mac_addr_h = (mac_addr >> 32) & 0xffff;
+	inst->regs->src_mac_addr_l = mac_addr & 0xffffffff;
+}
+
+void
+ark_pmd_pktgen_set_dst_mac_addr(ark_pkt_gen_t handle, uint64_t mac_addr)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	inst->regs->dst_mac_addr_h = (mac_addr >> 32) & 0xffff;
+	inst->regs->dst_mac_addr_l = mac_addr & 0xffffffff;
+}
+
+void
+ark_pmd_pktgen_set_eth_type(ark_pkt_gen_t handle, uint32_t x)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+	inst->regs->eth_type = x;
+}
+
+void
+ark_pmd_pktgen_set_hdr_dW(ark_pkt_gen_t handle, uint32_t *hdr)
+{
+	uint32_t i;
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+
+	for (i = 0; i < 7; i++)
+		inst->regs->hdr_dw[i] = hdr[i];
+}
+
+void
+ark_pmd_pktgen_set_start_offset(ark_pkt_gen_t handle, uint32_t x)
+{
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+
+	inst->regs->start_offset = x;
+}
+
+static struct OPTIONS *
+options(const char *id)
+{
+	unsigned int i;
+
+	for (i = 0; i < sizeof(toptions) / sizeof(struct OPTIONS); i++) {
+		if (strcmp(id, toptions[i].opt) == 0)
+			return &toptions[i];
+	}
+
+	PMD_DRV_LOG(ERR,
+		    "pktgen: Could not find requested option !!, option = %s\n",
+		    id);
+	return NULL;
+}
+
+static int pmd_set_arg(char *arg, char *val);
+static int
+pmd_set_arg(char *arg, char *val)
+{
+	struct OPTIONS *o = options(arg);
+
+	if (o) {
+		switch (o->t) {
+		case OTINT:
+		case OTBOOL:
+			o->v.INT = atoi(val);
+			break;
+		case OTLONG:
+			o->v.LONG = atoll(val);
+			break;
+		case OTSTRING:
+			strncpy(o->v.STR, val, ARK_MAX_STR_LEN - 1);
+			o->v.STR[ARK_MAX_STR_LEN - 1] = '\0';
+			break;
+		}
+		return 1;
+	}
+	return 0;
+}
+
+/******
+ * Arg format = "opt0=v opt_n=v ..." (tokens are split on '=' and
+ * whitespace)
+ ******/
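+/*
+ * Example (hypothetical option values; the buffer must be writable
+ * because the parser uses strtok()):
+ *   char args[] = "run=1 shaping=1 bytes_per_cycle=10";
+ *   ark_pmd_pktgen_parse(args);
+ */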
+void
+ark_pmd_pktgen_parse(char *args)
+{
+	char *argv, *v;
+	const char toks[] = " =\n\t\v\f \r";
+
+	argv = strtok(args, toks);
+	v = strtok(NULL, toks);
+	while (argv && v) {
+		pmd_set_arg(argv, v);
+		argv = strtok(NULL, toks);
+		v = strtok(NULL, toks);
+	}
+}
+
+static int32_t parse_ipv4_string(char const *ip_address);
+static int32_t
+parse_ipv4_string(char const *ip_address)
+{
+	unsigned int ip[4];
+
+	if (sscanf(ip_address,
+			   "%u.%u.%u.%u",
+			   &ip[0], &ip[1], &ip[2], &ip[3]) != 4)
+		return 0;
+	return ip[3] + ip[2] * 0x100 + ip[1] * 0x10000ul + ip[0] * 0x1000000ul;
+}
+
+static void
+ark_pmd_pktgen_set_pkt_ctrl(ark_pkt_gen_t handle, uint32_t gen_forever,
+	uint32_t en_slaved_start, uint32_t vary_length, uint32_t incr_payload,
+	uint32_t incr_first_byte, uint32_t ins_seq_num, uint32_t ins_udp_hdr,
+	uint32_t ins_time_stamp)
+{
+	uint32_t r;
+	struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)handle;
+
+	if (!inst->l2_mode)
+		ins_udp_hdr = 0;
+
+	r = (gen_forever << 24) | (en_slaved_start << 20) |
+		(vary_length << 16) |
+	(incr_payload << 12) | (incr_first_byte << 8) |
+	(ins_time_stamp << 5) | (ins_seq_num << 4) | ins_udp_hdr;
+
+	inst->regs->bytes_per_cycle = options("bytes_per_cycle")->v.INT;
+	if (options("shaping")->v.BOOL)
+		r = r | (1 << 28);	/* enable shaping */
+
+	inst->regs->pkt_ctrl = r;
+}
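+
+/*
+ * Worked example (illustrative): gen_forever = 1 with the "shaping"
+ * option set yields pkt_ctrl = (1 << 28) | (1 << 24) = 0x11000000.
+ */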
+
+void
+ark_pmd_pktgen_setup(ark_pkt_gen_t handle)
+{
+	uint32_t hdr[7];
+	int32_t dst_ip = parse_ipv4_string(options("dst_ip")->v.STR);
+
+	if (!options("pause")->v.BOOL &&
+	    !options("reset")->v.BOOL &&
+	    options("configure")->v.BOOL) {
+		ark_pmd_pktgen_set_payload_byte(handle,
+			options("payload_byte")->v.INT);
+		ark_pmd_pktgen_set_src_mac_addr(handle,
+			options("src_mac_addr")->v.LONG);
+		ark_pmd_pktgen_set_dst_mac_addr(handle,
+			options("dst_mac_addr")->v.LONG);
+		ark_pmd_pktgen_set_eth_type(handle,
+			options("eth_type")->v.INT);
+
+		if (options("dg-mode")->v.BOOL) {
+			hdr[0] = options("hdr_dW0")->v.INT;
+			hdr[1] = options("hdr_dW1")->v.INT;
+			hdr[2] = options("hdr_dW2")->v.INT;
+			hdr[3] = options("hdr_dW3")->v.INT;
+			hdr[4] = options("hdr_dW4")->v.INT;
+			hdr[5] = options("hdr_dW5")->v.INT;
+			hdr[6] = options("hdr_dW6")->v.INT;
+		} else {
+			hdr[0] = dst_ip;
+			hdr[1] = options("dst_port")->v.INT;
+			hdr[2] = options("src_port")->v.INT;
+			hdr[3] = 0;
+			hdr[4] = 0;
+			hdr[5] = 0;
+			hdr[6] = 0;
+		}
+		ark_pmd_pktgen_set_hdr_dW(handle, hdr);
+		ark_pmd_pktgen_set_num_pkts(handle,
+			options("num_pkts")->v.LONG);
+		ark_pmd_pktgen_set_pkt_size_min(handle,
+			options("pkt_size_min")->v.INT);
+		ark_pmd_pktgen_set_pkt_size_max(handle,
+			options("pkt_size_max")->v.INT);
+		ark_pmd_pktgen_set_pkt_size_incr(handle,
+			options("pkt_size_incr")->v.INT);
+		ark_pmd_pktgen_set_pkt_spacing(handle,
+			options("pkt_spacing")->v.INT);
+		ark_pmd_pktgen_set_start_offset(handle,
+			options("start_offset")->v.INT);
+		ark_pmd_pktgen_set_pkt_ctrl(handle,
+			options("gen_forever")->v.BOOL,
+			options("en_slaved_start")->v.BOOL,
+			options("vary_length")->v.BOOL,
+			options("incr_payload")->v.BOOL,
+			options("incr_first_byte")->v.BOOL,
+			options("ins_seq_num")->v.INT,
+			options("ins_udp_hdr")->v.BOOL,
+			options("ins_time_stamp")->v.INT);
+	}
+
+	if (options("pause")->v.BOOL)
+		ark_pmd_pktgen_pause(handle);
+
+	if (options("reset")->v.BOOL)
+		ark_pmd_pktgen_reset(handle);
+
+	if (options("run")->v.BOOL) {
+		ARK_DEBUG_TRACE("Starting packet generator on port %d\n",
+				options("port")->v.INT);
+		ark_pmd_pktgen_run(handle);
+	}
+}
diff --git a/drivers/net/ark/ark_pktgen.h b/drivers/net/ark/ark_pktgen.h
new file mode 100644
index 0000000..92994ce
--- /dev/null
+++ b/drivers/net/ark/ark_pktgen.h
@@ -0,0 +1,106 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_PKTGEN_H_
+#define _ARK_PKTGEN_H_
+
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_eal.h>
+
+#include <rte_ethdev.h>
+#include <rte_cycles.h>
+#include <rte_lcore.h>
+#include <rte_mbuf.h>
+#include <rte_malloc.h>
+
+#define ARK_PKTGEN_BASE_ADR  0x10000
+
+typedef void *ark_pkt_gen_t;
+
+struct ark_pkt_gen_regs {
+	uint32_t r0;
+	volatile uint32_t pkt_start_stop;
+	volatile uint32_t pkt_ctrl;
+	uint32_t pkt_payload;
+	uint32_t pkt_spacing;
+	uint32_t pkt_size_min;
+	uint32_t pkt_size_max;
+	uint32_t pkt_size_incr;
+	volatile uint32_t num_pkts;
+	volatile uint32_t pkts_sent;
+	uint32_t src_mac_addr_l;
+	uint32_t src_mac_addr_h;
+	uint32_t dst_mac_addr_l;
+	uint32_t dst_mac_addr_h;
+	uint32_t eth_type;
+	uint32_t hdr_dw[7];
+	uint32_t start_offset;
+	uint32_t bytes_per_cycle;
+} __attribute__ ((packed));
+
+struct ark_pkt_gen_inst {
+	struct rte_eth_dev_info *dev_info;
+	struct ark_pkt_gen_regs *regs;
+	int l2_mode;
+	int ordinal;
+};
+
+/*  packet generator functions */
+ark_pkt_gen_t ark_pmd_pktgen_init(void *arg, int ord, int l2_mode);
+void ark_pmd_pktgen_uninit(ark_pkt_gen_t handle);
+void ark_pmd_pktgen_run(ark_pkt_gen_t handle);
+void ark_pmd_pktgen_pause(ark_pkt_gen_t handle);
+uint32_t ark_pmd_pktgen_paused(ark_pkt_gen_t handle);
+uint32_t ark_pmd_pktgen_is_gen_forever(ark_pkt_gen_t handle);
+uint32_t ark_pmd_pktgen_is_running(ark_pkt_gen_t handle);
+uint32_t ark_pmd_pktgen_tx_done(ark_pkt_gen_t handle);
+void ark_pmd_pktgen_reset(ark_pkt_gen_t handle);
+void ark_pmd_pktgen_wait_done(ark_pkt_gen_t handle);
+uint32_t ark_pmd_pktgen_get_pkts_sent(ark_pkt_gen_t handle);
+void ark_pmd_pktgen_set_payload_byte(ark_pkt_gen_t handle, uint32_t b);
+void ark_pmd_pktgen_set_pkt_spacing(ark_pkt_gen_t handle, uint32_t x);
+void ark_pmd_pktgen_set_pkt_size_min(ark_pkt_gen_t handle, uint32_t x);
+void ark_pmd_pktgen_set_pkt_size_max(ark_pkt_gen_t handle, uint32_t x);
+void ark_pmd_pktgen_set_pkt_size_incr(ark_pkt_gen_t handle, uint32_t x);
+void ark_pmd_pktgen_set_num_pkts(ark_pkt_gen_t handle, uint32_t x);
+void ark_pmd_pktgen_set_src_mac_addr(ark_pkt_gen_t handle, uint64_t mac_addr);
+void ark_pmd_pktgen_set_dst_mac_addr(ark_pkt_gen_t handle, uint64_t mac_addr);
+void ark_pmd_pktgen_set_eth_type(ark_pkt_gen_t handle, uint32_t x);
+void ark_pmd_pktgen_set_hdr_dW(ark_pkt_gen_t handle, uint32_t *hdr);
+void ark_pmd_pktgen_set_start_offset(ark_pkt_gen_t handle, uint32_t x);
+void ark_pmd_pktgen_parse(char *argv);
+void ark_pmd_pktgen_setup(ark_pkt_gen_t handle);
+
+#endif
diff --git a/drivers/net/ark/ark_rqp.c b/drivers/net/ark/ark_rqp.c
new file mode 100644
index 0000000..ece6044
--- /dev/null
+++ b/drivers/net/ark/ark_rqp.c
@@ -0,0 +1,92 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_rqp.h"
+#include "ark_debug.h"
+
+/* ************************************************************************* */
+void
+ark_rqp_stats_reset(struct ark_rqpace_t *rqp)
+{
+	rqp->stats_clear = 1;
+	/* POR 992 */
+	/* rqp->cpld_max = 992; */
+	/* POR 64 */
+	/* rqp->cplh_max = 64; */
+}
+
+/* ************************************************************************* */
+void
+ark_rqp_dump(struct ark_rqpace_t *rqp)
+{
+	if (rqp->err_count_other != 0)
+		fprintf(stderr,
+			"ARKP RQP Errors noted: ctrl: %d cplh_hmax %d cpld_max %d"
+			ARK_SU32 ARK_SU32 "\n",
+			rqp->ctrl, rqp->cplh_max, rqp->cpld_max,
+			"Error Count", rqp->err_cnt,
+			"Error General", rqp->err_count_other);
+
+	ARK_DEBUG_STATS
+		("ARKP RQP Dump: ctrl: %d cplh_hmax %d cpld_max %d" ARK_SU32
+		 ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32
+		 ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32
+		 ARK_SU32 ARK_SU32 ARK_SU32
+		 ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32 "\n",
+		 rqp->ctrl, rqp->cplh_max, rqp->cpld_max,
+		 "Error Count", rqp->err_cnt,
+		 "Error General", rqp->err_count_other,
+		 "stall_pS", rqp->stall_ps,
+		 "stall_pS Min", rqp->stall_ps_min,
+		 "stall_pS Max", rqp->stall_ps_max,
+		 "req_pS", rqp->req_ps,
+		 "req_pS Min", rqp->req_ps_min,
+		 "req_pS Max", rqp->req_ps_max,
+		 "req_dWPS", rqp->req_dw_ps,
+		 "req_dWPS Min", rqp->req_dw_ps_min,
+		 "req_dWPS Max", rqp->req_dw_ps_max,
+		 "cpl_pS", rqp->cpl_ps,
+		 "cpl_pS Min", rqp->cpl_ps_min,
+		 "cpl_pS Max", rqp->cpl_ps_max,
+		 "cpl_dWPS", rqp->cpl_dw_ps,
+		 "cpl_dWPS Min", rqp->cpl_dw_ps_min,
+		 "cpl_dWPS Max", rqp->cpl_dw_ps_max,
+		 "cplh pending", rqp->cplh_pending,
+		 "cpld pending", rqp->cpld_pending,
+		 "cplh pending max", rqp->cplh_pending_max,
+		 "cpld pending max", rqp->cpld_pending_max);
+}
diff --git a/drivers/net/ark/ark_rqp.h b/drivers/net/ark/ark_rqp.h
new file mode 100644
index 0000000..4376d76
--- /dev/null
+++ b/drivers/net/ark/ark_rqp.h
@@ -0,0 +1,75 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_RQP_H_
+#define _ARK_RQP_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+
+/*
+ * RQ Pacing core hardware structure
+ */
+struct ark_rqpace_t {
+	volatile uint32_t ctrl;
+	volatile uint32_t stats_clear;
+	volatile uint32_t cplh_max;
+	volatile uint32_t cpld_max;
+	volatile uint32_t err_cnt;
+	volatile uint32_t stall_ps;
+	volatile uint32_t stall_ps_min;
+	volatile uint32_t stall_ps_max;
+	volatile uint32_t req_ps;
+	volatile uint32_t req_ps_min;
+	volatile uint32_t req_ps_max;
+	volatile uint32_t req_dw_ps;
+	volatile uint32_t req_dw_ps_min;
+	volatile uint32_t req_dw_ps_max;
+	volatile uint32_t cpl_ps;
+	volatile uint32_t cpl_ps_min;
+	volatile uint32_t cpl_ps_max;
+	volatile uint32_t cpl_dw_ps;
+	volatile uint32_t cpl_dw_ps_min;
+	volatile uint32_t cpl_dw_ps_max;
+	volatile uint32_t cplh_pending;
+	volatile uint32_t cpld_pending;
+	volatile uint32_t cplh_pending_max;
+	volatile uint32_t cpld_pending_max;
+	volatile uint32_t err_count_other;
+};
+
+void ark_rqp_dump(struct ark_rqpace_t *rqp);
+void ark_rqp_stats_reset(struct ark_rqpace_t *rqp);
+
+#endif
diff --git a/drivers/net/ark/ark_udm.c b/drivers/net/ark/ark_udm.c
new file mode 100644
index 0000000..decd48c
--- /dev/null
+++ b/drivers/net/ark/ark_udm.c
@@ -0,0 +1,221 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_debug.h"
+#include "ark_udm.h"
+
+int
+ark_udm_verify(struct ark_udm_t *udm)
+{
+	if (sizeof(struct ark_udm_t) != ARK_UDM_EXPECT_SIZE) {
+		fprintf(stderr, "  UDM structure looks incorrect %d vs %zu\n",
+				ARK_UDM_EXPECT_SIZE, sizeof(struct ark_udm_t));
+		return -1;
+	}
+
+	if (udm->setup.const0 != ARK_UDM_CONST) {
+		fprintf(stderr, "  UDM module not found as expected 0x%08x\n",
+				udm->setup.const0);
+		return -1;
+	}
+	return 0;
+}
+
+int
+ark_udm_stop(struct ark_udm_t *udm, const int wait)
+{
+	int cnt = 0;
+
+	udm->cfg.command = 2;
+
+	while (wait && (udm->cfg.stop_flushed & 0x01) == 0) {
+		if (cnt++ > 1000)
+			return 1;
+
+		usleep(10);
+	}
+	return 0;
+}
+
+int
+ark_udm_reset(struct ark_udm_t *udm)
+{
+	int status;
+
+	status = ark_udm_stop(udm, 1);
+	if (status != 0) {
+		ARK_DEBUG_TRACE("ARKP: %s  stop failed  doing forced reset\n",
+						__func__);
+		udm->cfg.command = 4;
+		usleep(10);
+		udm->cfg.command = 3;
+		status = ark_udm_stop(udm, 0);
+		ARK_DEBUG_TRACE
+			("ARKP: %s  stop status %d post failure and forced reset\n",
+			 __func__, status);
+	} else {
+		udm->cfg.command = 3;
+	}
+
+	return status;
+}
+
+void
+ark_udm_start(struct ark_udm_t *udm)
+{
+	udm->cfg.command = 1;
+}
+
+void
+ark_udm_stats_reset(struct ark_udm_t *udm)
+{
+	udm->pcibp.pci_clear = 1;
+	udm->tlp_ps.tlp_clear = 1;
+}
+
+void
+ark_udm_configure(struct ark_udm_t *udm, uint32_t headroom, uint32_t dataroom,
+	uint32_t write_interval_ns)
+{
+	/* headroom and data room are in DWs in the UDM */
+	udm->cfg.dataroom = dataroom / 4;
+	udm->cfg.headroom = headroom / 4;
+
+	/* 4 NS period ns */
+	udm->rt_cfg.write_interval = write_interval_ns / 4;
+}
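+
+/*
+ * Worked example (illustrative): headroom = 32 and dataroom = 2048 bytes
+ * program 8 and 512 DWs; write_interval_ns = ARK_RX_WRITE_TIME_NS (2500)
+ * programs a write_interval of 625 of the 4 ns cycles.
+ */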
+
+void
+ark_udm_write_addr(struct ark_udm_t *udm, phys_addr_t addr)
+{
+	udm->rt_cfg.hw_prod_addr = addr;
+}
+
+int
+ark_udm_is_flushed(struct ark_udm_t *udm)
+{
+	return (udm->cfg.stop_flushed & 0x01) != 0;
+}
+
+uint64_t
+ark_udm_dropped(struct ark_udm_t *udm)
+{
+	return udm->qstats.q_pkt_drop;
+}
+
+uint64_t
+ark_udm_bytes(struct ark_udm_t *udm)
+{
+	return udm->qstats.q_byte_count;
+}
+
+uint64_t
+ark_udm_packets(struct ark_udm_t *udm)
+{
+	return udm->qstats.q_ff_packet_count;
+}
+
+void
+ark_udm_dump_stats(struct ark_udm_t *udm, const char *msg)
+{
+	ARK_DEBUG_STATS("ARKP UDM Stats: %s" ARK_SU64 ARK_SU64 ARK_SU64 ARK_SU64
+		ARK_SU64 "\n", msg, "Pkts Received", udm->stats.rx_packet_count,
+		"Pkts Finalized", udm->stats.rx_sent_packets, "Pkts Dropped",
+		udm->tlp.pkt_drop, "Bytes Count", udm->stats.rx_byte_count, "MBuf Count",
+		udm->stats.rx_mbuf_count);
+}
+
+void
+ark_udm_dump_queue_stats(struct ark_udm_t *udm, const char *msg, uint16_t qid)
+{
+	ARK_DEBUG_STATS
+		("ARKP UDM Queue %3u Stats: %s"
+		 ARK_SU64 ARK_SU64
+		 ARK_SU64 ARK_SU64
+		 ARK_SU64 "\n",
+		 qid, msg,
+		 "Pkts Received", udm->qstats.q_packet_count,
+		 "Pkts Finalized", udm->qstats.q_ff_packet_count,
+		 "Pkts Dropped", udm->qstats.q_pkt_drop,
+		 "Bytes Count", udm->qstats.q_byte_count,
+		 "MBuf Count", udm->qstats.q_mbuf_count);
+}
+
+void
+ark_udm_dump(struct ark_udm_t *udm, const char *msg)
+{
+	ARK_DEBUG_TRACE("ARKP UDM Dump: %s Stopped: %d\n", msg,
+			udm->cfg.stop_flushed);
+}
+
+void
+ark_udm_dump_setup(struct ark_udm_t *udm, uint16_t q_id)
+{
+	ARK_DEBUG_TRACE
+		("UDM Setup Q: %u"
+		 ARK_SU64X ARK_SU32 "\n",
+		 q_id,
+		 "hw_prod_addr", udm->rt_cfg.hw_prod_addr,
+		 "prod_idx", udm->rt_cfg.prod_idx);
+}
+
+void
+ark_udm_dump_perf(struct ark_udm_t *udm, const char *msg)
+{
+	struct ark_udm_pcibp_t *bp = &udm->pcibp;
+
+	ARK_DEBUG_STATS
+		("ARKP UDM Performance %s"
+		 ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32 ARK_SU32 "\n",
+		 msg,
+		 "PCI Empty", bp->pci_empty,
+		 "PCI Q1", bp->pci_q1,
+		 "PCI Q2", bp->pci_q2,
+		 "PCI Q3", bp->pci_q3,
+		 "PCI Q4", bp->pci_q4,
+		 "PCI Full", bp->pci_full);
+}
+
+void
+ark_udm_queue_stats_reset(struct ark_udm_t *udm)
+{
+	udm->qstats.q_byte_count = 1;
+}
+
+void
+ark_udm_queue_enable(struct ark_udm_t *udm, int enable)
+{
+	udm->qstats.q_enable = enable ? 1 : 0;
+}
diff --git a/drivers/net/ark/ark_udm.h b/drivers/net/ark/ark_udm.h
new file mode 100644
index 0000000..d05b925
--- /dev/null
+++ b/drivers/net/ark/ark_udm.h
@@ -0,0 +1,175 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_UDM_H_
+#define _ARK_UDM_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+
+/*
+ * UDM hardware structures
+ */
+
+#define ARK_RX_WRITE_TIME_NS 2500
+#define ARK_UDM_SETUP 0
+#define ARK_UDM_CONST 0xbacecace
+struct ark_udm_setup_t {
+	uint32_t r0;
+	uint32_t r4;
+	volatile uint32_t cycle_count;
+	uint32_t const0;
+};
+
+#define ARK_UDM_CFG 0x010
+struct ark_udm_cfg_t {
+	volatile uint32_t stop_flushed;	/* RO */
+	volatile uint32_t command;
+	uint32_t dataroom;
+	uint32_t headroom;
+};
+
+typedef enum {
+	ARK_UDM_START = 0x1,
+	ARK_UDM_STOP = 0x2,
+	ARK_UDM_RESET = 0x3
+} ark_udm_commands;
+
+#define ARK_UDM_STATS 0x020
+struct ark_udm_stats_t {
+	volatile uint64_t rx_byte_count;
+	volatile uint64_t rx_packet_count;
+	volatile uint64_t rx_mbuf_count;
+	volatile uint64_t rx_sent_packets;
+};
+
+#define ARK_UDM_PQ 0x040
+struct ark_udm_queue_stats_t {
+	volatile uint64_t q_byte_count;
+	volatile uint64_t q_packet_count;	/* includes drops */
+	volatile uint64_t q_mbuf_count;
+	volatile uint64_t q_ff_packet_count;
+	volatile uint64_t q_pkt_drop;
+	uint32_t q_enable;
+};
+
+#define ARK_UDM_TLP 0x0070
+struct ark_udm_tlp_t {
+	volatile uint64_t pkt_drop;	/* global */
+	volatile uint32_t tlp_q1;
+	volatile uint32_t tlp_q2;
+	volatile uint32_t tlp_q3;
+	volatile uint32_t tlp_q4;
+	volatile uint32_t tlp_full;
+};
+
+#define ARK_UDM_PCIBP 0x00a0
+struct ark_udm_pcibp_t {
+	volatile uint32_t pci_clear;
+	volatile uint32_t pci_empty;
+	volatile uint32_t pci_q1;
+	volatile uint32_t pci_q2;
+	volatile uint32_t pci_q3;
+	volatile uint32_t pci_q4;
+	volatile uint32_t pci_full;
+};
+
+#define ARK_UDM_TLP_PS 0x00bc
+struct ark_udm_tlp_ps_t {
+	volatile uint32_t tlp_clear;
+	volatile uint32_t tlp_ps_min;
+	volatile uint32_t tlp_ps_max;
+	volatile uint32_t tlp_full_ps_min;
+	volatile uint32_t tlp_full_ps_max;
+	volatile uint32_t tlp_dw_ps_min;
+	volatile uint32_t tlp_dw_ps_max;
+	volatile uint32_t tlp_pldw_ps_min;
+	volatile uint32_t tlp_pldw_ps_max;
+};
+
+#define ARK_UDM_RT_CFG 0x00e0
+struct ark_udm_rt_cfg_t {
+	phys_addr_t hw_prod_addr;
+	uint32_t write_interval;	/* 4ns cycles */
+	volatile uint32_t prod_idx;	/* RO */
+};
+
+/*  Consolidated structure */
+struct ark_udm_t {
+	struct ark_udm_setup_t setup;
+	struct ark_udm_cfg_t cfg;
+	struct ark_udm_stats_t stats;
+	struct ark_udm_queue_stats_t qstats;
+	uint8_t reserved1[(ARK_UDM_TLP - ARK_UDM_PQ) -
+					  sizeof(struct ark_udm_queue_stats_t)];
+	struct ark_udm_tlp_t tlp;
+	uint8_t reserved2[(ARK_UDM_PCIBP - ARK_UDM_TLP) -
+					  sizeof(struct ark_udm_tlp_t)];
+	struct ark_udm_pcibp_t pcibp;
+	struct ark_udm_tlp_ps_t tlp_ps;
+	struct ark_udm_rt_cfg_t rt_cfg;
+	int8_t reserved3[(0x100 - ARK_UDM_RT_CFG) -
+					  sizeof(struct ark_udm_rt_cfg_t)];
+};
+
+#define ARK_UDM_EXPECT_SIZE (0x00fc + 4)
+#define ARK_UDM_QOFFSET ARK_UDM_EXPECT_SIZE
+
+int ark_udm_verify(struct ark_udm_t *udm);
+int ark_udm_stop(struct ark_udm_t *udm, int wait);
+void ark_udm_start(struct ark_udm_t *udm);
+int ark_udm_reset(struct ark_udm_t *udm);
+void ark_udm_configure(struct ark_udm_t *udm,
+					   uint32_t headroom,
+					   uint32_t dataroom,
+					   uint32_t write_interval_ns);
+void ark_udm_write_addr(struct ark_udm_t *udm, phys_addr_t addr);
+void ark_udm_stats_reset(struct ark_udm_t *udm);
+void ark_udm_dump_stats(struct ark_udm_t *udm, const char *msg);
+void ark_udm_dump_queue_stats(struct ark_udm_t *udm, const char *msg,
+							  uint16_t qid);
+void ark_udm_dump(struct ark_udm_t *udm, const char *msg);
+void ark_udm_dump_perf(struct ark_udm_t *udm, const char *msg);
+void ark_udm_dump_setup(struct ark_udm_t *udm, uint16_t q_id);
+int ark_udm_is_flushed(struct ark_udm_t *udm);
+
+/* Per queue data */
+uint64_t ark_udm_dropped(struct ark_udm_t *udm);
+uint64_t ark_udm_bytes(struct ark_udm_t *udm);
+uint64_t ark_udm_packets(struct ark_udm_t *udm);
+
+void ark_udm_queue_stats_reset(struct ark_udm_t *udm);
+void ark_udm_queue_enable(struct ark_udm_t *udm, int enable);
+
+#endif
diff --git a/drivers/net/ark/rte_pmd_ark_version.map b/drivers/net/ark/rte_pmd_ark_version.map
new file mode 100644
index 0000000..7f84780
--- /dev/null
+++ b/drivers/net/ark/rte_pmd_ark_version.map
@@ -0,0 +1,4 @@
+DPDK_2.0 {
+	 local: *;
+
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 0e0b600..da23898 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -104,6 +104,7 @@ ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
 # plugins (link only if static libraries)
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET)  += -lrte_pmd_af_packet
+_LDLIBS-$(CONFIG_RTE_LIBRTE_ARK_PMD)        += -lrte_pmd_ark
 _LDLIBS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD)      += -lrte_pmd_bnx2x -lz
 _LDLIBS-$(CONFIG_RTE_LIBRTE_BNXT_PMD)       += -lrte_pmd_bnxt
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)       += -lrte_pmd_bond
-- 
1.9.1

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v11 08/18] lib: add symbol versioning to distributor
  2017-03-20 10:08  2%                 ` [dpdk-dev] [PATCH v11 0/18] distributor lib performance enhancements David Hunt
  2017-03-20 10:08  1%                   ` [dpdk-dev] [PATCH v11 01/18] lib: rename legacy distributor lib files David Hunt
@ 2017-03-20 10:08  2%                   ` David Hunt
  2017-03-27 13:02  3%                     ` Thomas Monjalon
  1 sibling, 1 reply; 200+ results
From: David Hunt @ 2017-03-20 10:08 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, David Hunt

Also bumped up the ABI version number (LIBABIVER) in the Makefile.
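
The intent of the versioning is to keep the original packet-at-a-time
functions available to binaries built against DPDK 2.0, at the DPDK_2.0
version node, while newly built applications bind by default to the
burst-capable DPDK_17.05 implementations. As a rough sketch (GNU
toolchain assumed; this snippet is illustrative, not part of the patch),
an application can even pin itself to the legacy behaviour explicitly:

/* Hypothetical application-side snippet: force this reference to
 * resolve to the 2.0 ABI rather than the 17.05 default. */
__asm__(".symver rte_distributor_create, rte_distributor_create@DPDK_2.0");

Anything linked without such a directive picks up the default version.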

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_distributor/Makefile                    |  2 +-
 lib/librte_distributor/rte_distributor.c           | 57 +++++++++++---
 lib/librte_distributor/rte_distributor_v1705.h     | 89 ++++++++++++++++++++++
 lib/librte_distributor/rte_distributor_v20.c       | 10 +++
 lib/librte_distributor/rte_distributor_version.map | 14 ++++
 5 files changed, 162 insertions(+), 10 deletions(-)
 create mode 100644 lib/librte_distributor/rte_distributor_v1705.h

diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
index 2b28eff..2f05cf3 100644
--- a/lib/librte_distributor/Makefile
+++ b/lib/librte_distributor/Makefile
@@ -39,7 +39,7 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
 
 EXPORT_MAP := rte_distributor_version.map
 
-LIBABIVER := 1
+LIBABIVER := 2
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor_v20.c
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 6e1debf..06df13d 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -36,6 +36,7 @@
 #include <rte_mbuf.h>
 #include <rte_memory.h>
 #include <rte_cycles.h>
+#include <rte_compat.h>
 #include <rte_memzone.h>
 #include <rte_errno.h>
 #include <rte_string_fns.h>
@@ -44,6 +45,7 @@
 #include "rte_distributor_private.h"
 #include "rte_distributor.h"
 #include "rte_distributor_v20.h"
+#include "rte_distributor_v1705.h"
 
 TAILQ_HEAD(rte_dist_burst_list, rte_distributor);
 
@@ -57,7 +59,7 @@ EAL_REGISTER_TAILQ(rte_dist_burst_tailq)
 /**** Burst Packet APIs called by workers ****/
 
 void
-rte_distributor_request_pkt(struct rte_distributor *d,
+rte_distributor_request_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **oldpkt,
 		unsigned int count)
 {
@@ -102,9 +104,14 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 	 */
 	*retptr64 |= RTE_DISTRIB_GET_BUF;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_request_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(void rte_distributor_request_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **oldpkt,
+		unsigned int count),
+		rte_distributor_request_pkt_v1705);
 
 int
-rte_distributor_poll_pkt(struct rte_distributor *d,
+rte_distributor_poll_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **pkts)
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
@@ -138,9 +145,13 @@ rte_distributor_poll_pkt(struct rte_distributor *d,
 
 	return count;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_poll_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_poll_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **pkts),
+		rte_distributor_poll_pkt_v1705);
 
 int
-rte_distributor_get_pkt(struct rte_distributor *d,
+rte_distributor_get_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **pkts,
 		struct rte_mbuf **oldpkt, unsigned int return_count)
 {
@@ -168,9 +179,14 @@ rte_distributor_get_pkt(struct rte_distributor *d,
 	}
 	return count;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_get_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_get_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **pkts,
+		struct rte_mbuf **oldpkt, unsigned int return_count),
+		rte_distributor_get_pkt_v1705);
 
 int
-rte_distributor_return_pkt(struct rte_distributor *d,
+rte_distributor_return_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **oldpkt, int num)
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
@@ -197,6 +213,10 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 
 	return 0;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_return_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_return_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **oldpkt, int num),
+		rte_distributor_return_pkt_v1705);
 
 /**** APIs called on distributor core ***/
 
@@ -342,7 +362,7 @@ release(struct rte_distributor *d, unsigned int wkr)
 
 /* process a set of packets to distribute them to workers */
 int
-rte_distributor_process(struct rte_distributor *d,
+rte_distributor_process_v1705(struct rte_distributor *d,
 		struct rte_mbuf **mbufs, unsigned int num_mbufs)
 {
 	unsigned int next_idx = 0;
@@ -476,10 +496,14 @@ rte_distributor_process(struct rte_distributor *d,
 
 	return num_mbufs;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_process, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_process(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int num_mbufs),
+		rte_distributor_process_v1705);
 
 /* return to the caller, packets returned from workers */
 int
-rte_distributor_returned_pkts(struct rte_distributor *d,
+rte_distributor_returned_pkts_v1705(struct rte_distributor *d,
 		struct rte_mbuf **mbufs, unsigned int max_mbufs)
 {
 	struct rte_distributor_returned_pkts *returns = &d->returns;
@@ -504,6 +528,10 @@ rte_distributor_returned_pkts(struct rte_distributor *d,
 
 	return retval;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_returned_pkts, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_returned_pkts(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int max_mbufs),
+		rte_distributor_returned_pkts_v1705);
 
 /*
  * Return the number of packets in-flight in a distributor, i.e. packets
@@ -525,7 +553,7 @@ total_outstanding(const struct rte_distributor *d)
  * queued up.
  */
 int
-rte_distributor_flush(struct rte_distributor *d)
+rte_distributor_flush_v1705(struct rte_distributor *d)
 {
 	const unsigned int flushed = total_outstanding(d);
 	unsigned int wkr;
@@ -549,10 +577,13 @@ rte_distributor_flush(struct rte_distributor *d)
 
 	return flushed;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_flush, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_flush(struct rte_distributor *d),
+		rte_distributor_flush_v1705);
 
 /* clears the internal returns array in the distributor */
 void
-rte_distributor_clear_returns(struct rte_distributor *d)
+rte_distributor_clear_returns_v1705(struct rte_distributor *d)
 {
 	unsigned int wkr;
 
@@ -565,10 +596,13 @@ rte_distributor_clear_returns(struct rte_distributor *d)
 	for (wkr = 0; wkr < d->num_workers; wkr++)
 		d->bufs[wkr].retptr64[0] = 0;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_clear_returns, _v1705, 17.05);
+MAP_STATIC_SYMBOL(void rte_distributor_clear_returns(struct rte_distributor *d),
+		rte_distributor_clear_returns_v1705);
 
 /* creates a distributor instance */
 struct rte_distributor *
-rte_distributor_create(const char *name,
+rte_distributor_create_v1705(const char *name,
 		unsigned int socket_id,
 		unsigned int num_workers,
 		unsigned int alg_type)
@@ -638,3 +672,8 @@ rte_distributor_create(const char *name,
 
 	return d;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_create, _v1705, 17.05);
+MAP_STATIC_SYMBOL(struct rte_distributor *rte_distributor_create(
+		const char *name, unsigned int socket_id,
+		unsigned int num_workers, unsigned int alg_type),
+		rte_distributor_create_v1705);
diff --git a/lib/librte_distributor/rte_distributor_v1705.h b/lib/librte_distributor/rte_distributor_v1705.h
new file mode 100644
index 0000000..81b2691
--- /dev/null
+++ b/lib/librte_distributor/rte_distributor_v1705.h
@@ -0,0 +1,89 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_DISTRIB_V1705_H_
+#define _RTE_DISTRIB_V1705_H_
+
+/**
+ * @file
+ * RTE distributor
+ *
+ * The distributor is a component which is designed to pass packets
+ * one-at-a-time to workers, with dynamic load balancing.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct rte_distributor *
+rte_distributor_create_v1705(const char *name, unsigned int socket_id,
+		unsigned int num_workers,
+		unsigned int alg_type);
+
+int
+rte_distributor_process_v1705(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int num_mbufs);
+
+int
+rte_distributor_returned_pkts_v1705(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int max_mbufs);
+
+int
+rte_distributor_flush_v1705(struct rte_distributor *d);
+
+void
+rte_distributor_clear_returns_v1705(struct rte_distributor *d);
+
+int
+rte_distributor_get_pkt_v1705(struct rte_distributor *d,
+	unsigned int worker_id, struct rte_mbuf **pkts,
+	struct rte_mbuf **oldpkt, unsigned int retcount);
+
+int
+rte_distributor_return_pkt_v1705(struct rte_distributor *d,
+	unsigned int worker_id, struct rte_mbuf **oldpkt, int num);
+
+void
+rte_distributor_request_pkt_v1705(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **oldpkt,
+		unsigned int count);
+
+int
+rte_distributor_poll_pkt_v1705(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **mbufs);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/librte_distributor/rte_distributor_v20.c b/lib/librte_distributor/rte_distributor_v20.c
index 1f406c5..bb6c5d7 100644
--- a/lib/librte_distributor/rte_distributor_v20.c
+++ b/lib/librte_distributor/rte_distributor_v20.c
@@ -38,6 +38,7 @@
 #include <rte_memory.h>
 #include <rte_memzone.h>
 #include <rte_errno.h>
+#include <rte_compat.h>
 #include <rte_string_fns.h>
 #include <rte_eal_memconfig.h>
 #include "rte_distributor_v20.h"
@@ -63,6 +64,7 @@ rte_distributor_request_pkt_v20(struct rte_distributor_v20 *d,
 		rte_pause();
 	buf->bufptr64 = req;
 }
+VERSION_SYMBOL(rte_distributor_request_pkt, _v20, 2.0);
 
 struct rte_mbuf *
 rte_distributor_poll_pkt_v20(struct rte_distributor_v20 *d,
@@ -76,6 +78,7 @@ rte_distributor_poll_pkt_v20(struct rte_distributor_v20 *d,
 	int64_t ret = buf->bufptr64 >> RTE_DISTRIB_FLAG_BITS;
 	return (struct rte_mbuf *)((uintptr_t)ret);
 }
+VERSION_SYMBOL(rte_distributor_poll_pkt, _v20, 2.0);
 
 struct rte_mbuf *
 rte_distributor_get_pkt_v20(struct rte_distributor_v20 *d,
@@ -87,6 +90,7 @@ rte_distributor_get_pkt_v20(struct rte_distributor_v20 *d,
 		rte_pause();
 	return ret;
 }
+VERSION_SYMBOL(rte_distributor_get_pkt, _v20, 2.0);
 
 int
 rte_distributor_return_pkt_v20(struct rte_distributor_v20 *d,
@@ -98,6 +102,7 @@ rte_distributor_return_pkt_v20(struct rte_distributor_v20 *d,
 	buf->bufptr64 = req;
 	return 0;
 }
+VERSION_SYMBOL(rte_distributor_return_pkt, _v20, 2.0);
 
 /**** APIs called on distributor core ***/
 
@@ -314,6 +319,7 @@ rte_distributor_process_v20(struct rte_distributor_v20 *d,
 	d->returns.count = ret_count;
 	return num_mbufs;
 }
+VERSION_SYMBOL(rte_distributor_process, _v20, 2.0);
 
 /* return to the caller, packets returned from workers */
 int
@@ -334,6 +340,7 @@ rte_distributor_returned_pkts_v20(struct rte_distributor_v20 *d,
 
 	return retval;
 }
+VERSION_SYMBOL(rte_distributor_returned_pkts, _v20, 2.0);
 
 /* return the number of packets in-flight in a distributor, i.e. packets
  * being workered on or queued up in a backlog. */
@@ -362,6 +369,7 @@ rte_distributor_flush_v20(struct rte_distributor_v20 *d)
 
 	return flushed;
 }
+VERSION_SYMBOL(rte_distributor_flush, _v20, 2.0);
 
 /* clears the internal returns array in the distributor */
 void
@@ -372,6 +380,7 @@ rte_distributor_clear_returns_v20(struct rte_distributor_v20 *d)
 	memset(d->returns.mbufs, 0, sizeof(d->returns.mbufs));
 #endif
 }
+VERSION_SYMBOL(rte_distributor_clear_returns, _v20, 2.0);
 
 /* creates a distributor instance */
 struct rte_distributor_v20 *
@@ -415,3 +424,4 @@ rte_distributor_create_v20(const char *name,
 
 	return d;
 }
+VERSION_SYMBOL(rte_distributor_create, _v20, 2.0);
diff --git a/lib/librte_distributor/rte_distributor_version.map b/lib/librte_distributor/rte_distributor_version.map
index 73fdc43..3a285b3 100644
--- a/lib/librte_distributor/rte_distributor_version.map
+++ b/lib/librte_distributor/rte_distributor_version.map
@@ -13,3 +13,17 @@ DPDK_2.0 {
 
 	local: *;
 };
+
+DPDK_17.05 {
+	global:
+
+	rte_distributor_clear_returns;
+	rte_distributor_create;
+	rte_distributor_flush;
+	rte_distributor_get_pkt;
+	rte_distributor_poll_pkt;
+	rte_distributor_process;
+	rte_distributor_request_pkt;
+	rte_distributor_return_pkt;
+	rte_distributor_returned_pkts;
+} DPDK_2.0;
-- 
2.7.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v11 01/18] lib: rename legacy distributor lib files
  2017-03-20 10:08  2%                 ` [dpdk-dev] [PATCH v11 0/18] distributor lib performance enhancements David Hunt
@ 2017-03-20 10:08  1%                   ` David Hunt
  2017-03-20 10:08  2%                   ` [dpdk-dev] [PATCH v11 08/18] lib: add symbol versioning to distributor David Hunt
  1 sibling, 0 replies; 200+ results
From: David Hunt @ 2017-03-20 10:08 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, David Hunt

Move the files out of the way so that we can replace them with
new versions of the distributor library. The files are named in
such a way as to match the symbol versioning that we will
apply for backward ABI compatibility.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_distributor/Makefile                    |   3 +-
 lib/librte_distributor/rte_distributor.h           | 210 +-----------------
 .../{rte_distributor.c => rte_distributor_v20.c}   |   2 +-
 lib/librte_distributor/rte_distributor_v20.h       | 247 +++++++++++++++++++++
 4 files changed, 251 insertions(+), 211 deletions(-)
 rename lib/librte_distributor/{rte_distributor.c => rte_distributor_v20.c} (99%)
 create mode 100644 lib/librte_distributor/rte_distributor_v20.h

diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
index 4c9af17..b314ca6 100644
--- a/lib/librte_distributor/Makefile
+++ b/lib/librte_distributor/Makefile
@@ -42,10 +42,11 @@ EXPORT_MAP := rte_distributor_version.map
 LIBABIVER := 1
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor.c
+SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor_v20.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include := rte_distributor.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include += rte_distributor_v20.h
 
 # this lib needs eal
 DEPDIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += lib/librte_eal
diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
index 7d36bc8..e41d522 100644
--- a/lib/librte_distributor/rte_distributor.h
+++ b/lib/librte_distributor/rte_distributor.h
@@ -34,214 +34,6 @@
 #ifndef _RTE_DISTRIBUTE_H_
 #define _RTE_DISTRIBUTE_H_
 
-/**
- * @file
- * RTE distributor
- *
- * The distributor is a component which is designed to pass packets
- * one-at-a-time to workers, with dynamic load balancing.
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#define RTE_DISTRIBUTOR_NAMESIZE 32 /**< Length of name for instance */
-
-struct rte_distributor;
-struct rte_mbuf;
-
-/**
- * Function to create a new distributor instance
- *
- * Reserves the memory needed for the distributor operation and
- * initializes the distributor to work with the configured number of workers.
- *
- * @param name
- *   The name to be given to the distributor instance.
- * @param socket_id
- *   The NUMA node on which the memory is to be allocated
- * @param num_workers
- *   The maximum number of workers that will request packets from this
- *   distributor
- * @return
- *   The newly created distributor instance
- */
-struct rte_distributor *
-rte_distributor_create(const char *name, unsigned socket_id,
-		unsigned num_workers);
-
-/*  *** APIS to be called on the distributor lcore ***  */
-/*
- * The following APIs are the public APIs which are designed for use on a
- * single lcore which acts as the distributor lcore for a given distributor
- * instance. These functions cannot be called on multiple cores simultaneously
- * without using locking to protect access to the internals of the distributor.
- *
- * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
- * for the same distributor instance, otherwise deadlock will result.
- */
-
-/**
- * Process a set of packets by distributing them among workers that request
- * packets. The distributor will ensure that no two packets that have the
- * same flow id, or tag, in the mbuf will be procesed at the same time.
- *
- * The user is advocated to set tag for each mbuf before calling this function.
- * If user doesn't set the tag, the tag value can be various values depending on
- * driver implementation and configuration.
- *
- * This is not multi-thread safe and should only be called on a single lcore.
- *
- * @param d
- *   The distributor instance to be used
- * @param mbufs
- *   The mbufs to be distributed
- * @param num_mbufs
- *   The number of mbufs in the mbufs array
- * @return
- *   The number of mbufs processed.
- */
-int
-rte_distributor_process(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned num_mbufs);
-
-/**
- * Get a set of mbufs that have been returned to the distributor by workers
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- * @param mbufs
- *   The mbufs pointer array to be filled in
- * @param max_mbufs
- *   The size of the mbufs array
- * @return
- *   The number of mbufs returned in the mbufs array.
- */
-int
-rte_distributor_returned_pkts(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned max_mbufs);
-
-/**
- * Flush the distributor component, so that there are no in-flight or
- * backlogged packets awaiting processing
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- * @return
- *   The number of queued/in-flight packets that were completed by this call.
- */
-int
-rte_distributor_flush(struct rte_distributor *d);
-
-/**
- * Clears the array of returned packets used as the source for the
- * rte_distributor_returned_pkts() API call.
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- */
-void
-rte_distributor_clear_returns(struct rte_distributor *d);
-
-/*  *** APIS to be called on the worker lcores ***  */
-/*
- * The following APIs are the public APIs which are designed for use on
- * multiple lcores which act as workers for a distributor. Each lcore should use
- * a unique worker id when requesting packets.
- *
- * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
- * for the same distributor instance, otherwise deadlock will result.
- */
-
-/**
- * API called by a worker to get a new packet to process. Any previous packet
- * given to the worker is assumed to have completed processing, and may be
- * optionally returned to the distributor via the oldpkt parameter.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param oldpkt
- *   The previous packet, if any, being processed by the worker
- *
- * @return
- *   A new packet to be processed by the worker thread.
- */
-struct rte_mbuf *
-rte_distributor_get_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt);
-
-/**
- * API called by a worker to return a completed packet without requesting a
- * new packet, for example, because a worker thread is shutting down
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param mbuf
- *   The previous packet being processed by the worker
- */
-int
-rte_distributor_return_pkt(struct rte_distributor *d, unsigned worker_id,
-		struct rte_mbuf *mbuf);
-
-/**
- * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
- * processing, and may be optionally returned to the distributor via
- * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt(), this function does not wait for a new
- * packet to be provided by the distributor.
- *
- * NOTE: after calling this function, rte_distributor_poll_pkt() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt()
- * API should *not* be used to try and retrieve the new packet.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param oldpkt
- *   The previous packet, if any, being processed by the worker
- */
-void
-rte_distributor_request_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt);
-
-/**
- * API called by a worker to check for a new packet that was previously
- * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
- * not yet been fulfilled by the distributor.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- *
- * @return
- *   A new packet to be processed by the worker thread, or NULL if no
- *   packet is yet available.
- */
-struct rte_mbuf *
-rte_distributor_poll_pkt(struct rte_distributor *d,
-		unsigned worker_id);
-
-#ifdef __cplusplus
-}
-#endif
+#include <rte_distributor_v20.h>
 
 #endif
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor_v20.c
similarity index 99%
rename from lib/librte_distributor/rte_distributor.c
rename to lib/librte_distributor/rte_distributor_v20.c
index f3f778c..b890947 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor_v20.c
@@ -40,7 +40,7 @@
 #include <rte_errno.h>
 #include <rte_string_fns.h>
 #include <rte_eal_memconfig.h>
-#include "rte_distributor.h"
+#include "rte_distributor_v20.h"
 
 #define NO_FLAGS 0
 #define RTE_DISTRIB_PREFIX "DT_"
diff --git a/lib/librte_distributor/rte_distributor_v20.h b/lib/librte_distributor/rte_distributor_v20.h
new file mode 100644
index 0000000..b69aa27
--- /dev/null
+++ b/lib/librte_distributor/rte_distributor_v20.h
@@ -0,0 +1,247 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_DISTRIB_V20_H_
+#define _RTE_DISTRIB_V20_H_
+
+/**
+ * @file
+ * RTE distributor
+ *
+ * The distributor is a component which is designed to pass packets
+ * one-at-a-time to workers, with dynamic load balancing.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_DISTRIBUTOR_NAMESIZE 32 /**< Length of name for instance */
+
+struct rte_distributor;
+struct rte_mbuf;
+
+/**
+ * Function to create a new distributor instance
+ *
+ * Reserves the memory needed for the distributor operation and
+ * initializes the distributor to work with the configured number of workers.
+ *
+ * @param name
+ *   The name to be given to the distributor instance.
+ * @param socket_id
+ *   The NUMA node on which the memory is to be allocated
+ * @param num_workers
+ *   The maximum number of workers that will request packets from this
+ *   distributor
+ * @return
+ *   The newly created distributor instance
+ */
+struct rte_distributor *
+rte_distributor_create(const char *name, unsigned int socket_id,
+		unsigned int num_workers);
+
+/*  *** APIS to be called on the distributor lcore ***  */
+/*
+ * The following APIs are the public APIs which are designed for use on a
+ * single lcore which acts as the distributor lcore for a given distributor
+ * instance. These functions cannot be called on multiple cores simultaneously
+ * without using locking to protect access to the internals of the distributor.
+ *
+ * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
+ * for the same distributor instance, otherwise deadlock will result.
+ */
+
+/**
+ * Process a set of packets by distributing them among workers that request
+ * packets. The distributor will ensure that no two packets that have the
+ * same flow id, or tag, in the mbuf will be processed at the same time.
+ *
+ * The user is advocated to set tag for each mbuf before calling this function.
+ * If user doesn't set the tag, the tag value can be various values depending on
+ * driver implementation and configuration.
+ *
+ * This is not multi-thread safe and should only be called on a single lcore.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param mbufs
+ *   The mbufs to be distributed
+ * @param num_mbufs
+ *   The number of mbufs in the mbufs array
+ * @return
+ *   The number of mbufs processed.
+ */
+int
+rte_distributor_process(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int num_mbufs);
+
+/**
+ * Get a set of mbufs that have been returned to the distributor by workers
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param mbufs
+ *   The mbufs pointer array to be filled in
+ * @param max_mbufs
+ *   The size of the mbufs array
+ * @return
+ *   The number of mbufs returned in the mbufs array.
+ */
+int
+rte_distributor_returned_pkts(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int max_mbufs);
+
+/**
+ * Flush the distributor component, so that there are no in-flight or
+ * backlogged packets awaiting processing
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @return
+ *   The number of queued/in-flight packets that were completed by this call.
+ */
+int
+rte_distributor_flush(struct rte_distributor *d);
+
+/**
+ * Clears the array of returned packets used as the source for the
+ * rte_distributor_returned_pkts() API call.
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ */
+void
+rte_distributor_clear_returns(struct rte_distributor *d);
+
+/*  *** APIS to be called on the worker lcores ***  */
+/*
+ * The following APIs are the public APIs which are designed for use on
+ * multiple lcores which act as workers for a distributor. Each lcore should use
+ * a unique worker id when requesting packets.
+ *
+ * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
+ * for the same distributor instance, otherwise deadlock will result.
+ */
+
+/**
+ * API called by a worker to get a new packet to process. Any previous packet
+ * given to the worker is assumed to have completed processing, and may be
+ * optionally returned to the distributor via the oldpkt parameter.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less that num_workers passed
+ *   at distributor creation time.
+ * @param oldpkt
+ *   The previous packet, if any, being processed by the worker
+ *
+ * @return
+ *   A new packet to be processed by the worker thread.
+ */
+struct rte_mbuf *
+rte_distributor_get_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf *oldpkt);
+
+/**
+ * API called by a worker to return a completed packet without requesting a
+ * new packet, for example, because a worker thread is shutting down
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less that num_workers passed
+ *   at distributor creation time.
+ * @param mbuf
+ *   The previous packet being processed by the worker
+ */
+int
+rte_distributor_return_pkt(struct rte_distributor *d, unsigned int worker_id,
+		struct rte_mbuf *mbuf);
+
+/**
+ * API called by a worker to request a new packet to process.
+ * Any previous packet given to the worker is assumed to have completed
+ * processing, and may be optionally returned to the distributor via
+ * the oldpkt parameter.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for a new
+ * packet to be provided by the distributor.
+ *
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packet requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packet.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less that num_workers passed
+ *   at distributor creation time.
+ * @param oldpkt
+ *   The previous packet, if any, being processed by the worker
+ */
+void
+rte_distributor_request_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf *oldpkt);
+
+/**
+ * API called by a worker to check for a new packet that was previously
+ * requested by a call to rte_distributor_request_pkt(). It does not wait
+ * for the new packet to be available, but returns NULL if the request has
+ * not yet been fulfilled by the distributor.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less that num_workers passed
+ *   at distributor creation time.
+ *
+ * @return
+ *   A new packet to be processed by the worker thread, or NULL if no
+ *   packet is yet available.
+ */
+struct rte_mbuf *
+rte_distributor_poll_pkt(struct rte_distributor *d,
+		unsigned int worker_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
2.7.4

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v11 0/18] distributor lib performance enhancements
  2017-03-15  6:19  1%               ` [dpdk-dev] [PATCH v10 01/18] lib: rename legacy distributor lib files David Hunt
@ 2017-03-20 10:08  2%                 ` David Hunt
  2017-03-20 10:08  1%                   ` [dpdk-dev] [PATCH v11 01/18] lib: rename legacy distributor lib files David Hunt
  2017-03-20 10:08  2%                   ` [dpdk-dev] [PATCH v11 08/18] lib: add symbol versioning to distributor David Hunt
  0 siblings, 2 replies; 200+ results
From: David Hunt @ 2017-03-20 10:08 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson

This patch set aims to improve the throughput of the distributor library.

It uses a similar handshake mechanism to the previous version of
the library, in that bits are used to indicate when packets are ready
to be sent to a worker and ready to be returned from a worker. One main
difference is that instead of sending one packet in a cache line, it makes
use of the 7 free spaces in the same cache line in order to send up to
8 packets at a time to/from a worker.
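
As a rough picture only (the identifiers below are illustrative, not the
library's real internal names), each worker's exchange area is a single
64-byte cache line of eight 64-bit slots, with the mbuf pointer in the
high bits of each slot and the handshake flags in the low bits:

#include <stdint.h>
#include <rte_memory.h>

#define SKETCH_BURST 8	/* 8 x 8B slots == one 64B cache line */

/* Illustrative layout -- not the actual internal definition. */
struct dist_burst_line {
	/* mbuf pointer shifted up, handshake flag bits in the LSBs */
	volatile int64_t slot[SKETCH_BURST];
} __rte_cache_aligned;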

The flow matching algorithm has had significant rework: it now keeps an
array of inflight flows and an array of backlog flows, and matches incoming
flows to the inflight/backlog flows of all workers so that flow pinning to
workers can be maintained.
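
In outline, the scalar matcher just compares each incoming tag against
every worker's in-flight and backlog tags (a simplified sketch, not the
exact library code; the array sizes are illustrative):

#include <stdint.h>

/* Return the worker already pinned to this flow tag, or -1 if none. */
static int
match_flow(uint16_t tag, uint16_t flows[][8], unsigned int num_workers)
{
	unsigned int w, i;

	for (w = 0; w < num_workers; w++)
		for (i = 0; i < 8; i++)
			if (flows[w][i] == tag)
				return (int)w;
	return -1;
}

The vector version performs the same comparisons, but on 8 tags per SSE2
instruction instead of one at a time.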

The flow match algorithm has both scalar and vector versions, and a
function pointer is used to select the most appropriate version at run time,
depending on the presence of the SSE2 CPU flag. On non-x86 platforms,
the scalar match function is selected, which should still give a good boost
in performance over the non-burst API.
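
The selection itself is the usual function-pointer dispatch.
rte_cpu_get_flag_enabled() is the real EAL call; the matcher names below
are placeholders, not the series' actual identifiers:

#include <rte_cpuflags.h>

struct rte_distributor;
/* Placeholder matcher implementations (sketch only). */
static void find_match_scalar(struct rte_distributor *d);
static void find_match_vec(struct rte_distributor *d);

static void (*find_match)(struct rte_distributor *d);

static void
select_match_fn(void)
{
	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2))
		find_match = find_match_vec;	/* SSE2 burst matcher */
	else
		find_match = find_match_scalar;	/* portable fallback */
}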

v11 changes:
   * Fixed RTE_DIST_BURST_SIZE so it compiles on Arm platforms
   * Fixed compile issue in rte_distributor_match_generic.c on Arm platforms
   * Tweaked distributor_app docs based on review and added John's Ack

v10 changes:
   * Addressed all review comments from v9 (thanks, Bruce)
   * Inherited v9 series Ack by Bruce, except new suggested addition
     for example app documentation (17/18)
   * Squashed the two patches containing distributor structs and code
   * Renamed confusing rte_distributor_v1705.h to rte_distributor_next.h
   * Added usleep in main so as to be a little more gentle with that core
   * Fixed some patch titles and improved some descriptions
   * Updated sample app guide documentation
   * Removed un-needed code limiting Tx rings and cleaned up patch

v9 changes:
   * fixed symbol versioning so it will compile on CentOS and RedHat

v8 changes:
   * Changed the patch set to have a more logical ordering of
     the changes, but the end result is basically the same.
   * Fixed broken shared library build.
   * Split down the updates to example app more
   * No longer changes the test app and sample app to use a temporary
     API.
   * No longer temporarily re-names the functions in the
     version.map file.

v7 changes:
   * Reorganised patch so there's a more natural progression in the
     changes, and divided them down into easier to review chunks.
   * Previous versions of this patch set were effectively two APIs.
     We now have a single API. Legacy functionality can
     be used by using the rte_distributor_create API call with the
     RTE_DISTRIBUTOR_SINGLE flag when creating a distributor instance.
   * Added symbol versioning for old API so that ABI is preserved.

v6 changes:
   * Fixed intermittent segfault where num pkts not divisible
     by BURST_SIZE
   * Cleanup due to review comments on mailing list
   * Renamed _priv.h to _private.h.

v5 changes:
   * Removed some un-needed code around retries in worker API calls
   * Cleanup due to review comments on mailing list
   * Cleanup of non-x86 platform compilation, fallback to scalar match

v4 changes:
   * fixed issue building shared libraries

v3 changes:
  * Addressed mailing list review comments
  * Test code removal
  * Split out SSE match into separate file to facilitate NEON addition
  * Cleaned up conditional compilation flags for SSE2
  * Addressed c99 style compilation errors
  * rebased on latest head (Jan 2 2017, Happy New Year to all)

v2 changes:
  * Created a common distributor_priv.h header file with common
    definitions and structures.
  * Added a scalar version so it can be built and used on machines without
    sse2 instruction set
  * Added unit autotests
  * Added perf autotest

Notes:
   Apps must now work in bursts, as up to 8 packets are given to a worker at a time.
   For performance in matching, flow IDs are 15 bits.
   If 32-bit flow IDs are required, use the packet-at-a-time (SINGLE)
   mode.
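
   A minimal sketch of the implications (the alg_type flag name is as used
   in this series; num_workers and flow_id are placeholders):

   /* Packet-at-a-time instance for apps needing full 32-bit tags. */
   struct rte_distributor *d = rte_distributor_create("dist",
		rte_socket_id(), num_workers, RTE_DISTRIBUTOR_SINGLE);

   /* In burst mode, mask the tag down to 15 bits before distributing. */
   mbuf->hash.usr = flow_id & 0x7fff;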

Performance Gains
   Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
   2 x XL710 40GbE NICs to 2 x 40Gb/s traffic generator channels, 64B packets
   separate cores for Rx, Tx and distributor
    1 worker  - up to 4.8x
    4 workers - up to 2.9x
    8 workers - up to 1.8x
   12 workers - up to 2.1x
   16 workers - up to 1.8x

[01/18] lib: rename legacy distributor lib files
[02/18] lib: create private header file
[03/18] lib: add new distributor code
[04/18] lib: add SIMD flow matching to distributor
[05/18] test/distributor: extra params for autotests
[06/18] lib: switch distributor over to new API
[07/18] lib: make v20 header file private
[08/18] lib: add symbol versioning to distributor
[09/18] test: test single and burst distributor API
[10/18] test: add perf test for distributor burst mode
[11/18] examples/distributor: allow for extra stats
[12/18] examples/distributor: wait for ports to come up
[13/18] examples/distributor: add dedicated core for dist
[14/18] examples/distributor: tweaks for performance
[15/18] examples/distributor: give Rx thread a core
[16/18] doc: distributor library changes for new burst API
[17/18] doc: distributor app changes for new burst API
[18/18] maintainers: add to distributor lib maintainers

^ permalink raw reply	[relevance 2%]

* Re: [dpdk-dev] [PATCH v2 00/13] introduce fail-safe PMD
  @ 2017-03-18 19:51  3%             ` Neil Horman
  0 siblings, 0 replies; 200+ results
From: Neil Horman @ 2017-03-18 19:51 UTC (permalink / raw)
  To: Gaëtan Rivet
  Cc: Thomas Monjalon, Bruce Richardson, dev, Adrien Mazarguil, techboard

On Fri, Mar 17, 2017 at 11:56:21AM +0100, Gaëtan Rivet wrote:
> On Thu, Mar 16, 2017 at 04:50:43PM -0400, Neil Horman wrote:
> > On Wed, Mar 15, 2017 at 03:25:37PM +0100, Gaëtan Rivet wrote:
> > > On Wed, Mar 15, 2017 at 12:15:56PM +0100, Thomas Monjalon wrote:
> > > > 2017-03-15 03:28, Bruce Richardson:
> > > > > On Tue, Mar 14, 2017 at 03:49:47PM +0100, Gaëtan Rivet wrote:
> > > > > > - In the bonding, the init and configuration steps are still the
> > > > > >  responsibility of the application and no one else. The bonding PMD
> > > > > >  captures the device, re-applies its configuration upon dev_configure()
> > > > > >  which is actually re-applying part of the configuration already  present
> > > > > > within the slave eth_dev (cf rte_eth_dev_config_restore).
> > > > > >
> > > > > > - In the fail-safe, the init and configuration are both the
> > > > > >  responsibilities of the fail-safe PMD itself, not the application
> > > > > >  anymore. This handling of these responsibilities in lieu of the
> > > > > >  application is the whole point of the "deferred hot-plug" support, of
> > > > > >  proposing a simple implementation to the user.
> > > > > >
> > > > > > This change in responsibilities is the bulk of the fail-safe code. It
> > > > > > would have to be added as-is to the bonding. Verifying the correctness
> > > > > > of the sync of the initialization phase (acceptable states of a device
> > > > > > following several events registered by the fail-safe PMD) and the
> > > > > > configuration items between the state the application believes it is in
> > > > > > and the fail-safe knows it is in, is the bulk of the fail-safe code.
> > > > > >
> > > > > > This function is not overlapping with that of the bonding. The reason I
> > > > > > did not add this whole architecture to the bonding is that when I tried
> > > > > > to do so, I found that I only had two possibilities:
> > > > > >
> > > > > > - The current slave handling path is kept, and we only add a new one
> > > > > >  with additional functionalities: full init and conf handling with
> > > > > >  extended parsing capabilities.
> > > > > >
> > > > > > - The current slave handling is scrapped and replaced entirely by the new
> > > > > >  slave management. The old capturing of existing device is not done
> > > > > >  anymore.
> > > > > >
> > > > > > The first solution is not acceptable, because we effectively end up with
> > > > > > a maintenance nightmare by having to validate two types of slaves with
> > > > > > differing capabilities, differing initialization paths and differing
> > > > > > configuration code.  This is extremely awkward and architecturally
> > > > > > unsound. This is essentially the same as having the exact code of the
> > > > > > fail-safe as an aside in the bonding, maintaining exactly the same
> > > > > > breadth of code while having muddier interfaces and organization.
> > > > > >
> > > > > > The second solution is not acceptable, because we are bending the whole
> > > > > > existing bonding API to our whim. We could just as well simply rename
> > > > > > the fail-safe PMD as bonding, add a few grouping capabilities and call
> > > > > > it a day. This is not acceptable for users.
> > > > > >
> > > > > If the first solution is indeed not an option, why do you think this
> > > > > second one would be unacceptable for users? If the functionality remains
> > > > > the same, I don't see how it matters much for users which driver
> > > > > provides it or where the code originates.
> > > > >
> > > 
> > > The problem with the second solution is also that bonding is not only a PMD.
> > > It exposes its own public API that existing applications rely on, see
> > > rte_eth_bond_*() definitions in rte_eth_bond.h.
> > > 
> > > Although bonding instances can be set up through command-line options,
> > > target "users" are mainly applications explicitly written to use it.
> > > This must be preserved if for no other reason than that it hasn't been deprecated.
> > > 
> > I fail to see how either of your points is relevant.  The fact that the bonding
> > PMD exposes an API to the application has no bearing on its ability to implement
> > a hot plug function.
> > 
> 
> This depends on the API making sense in the context of the new
> functionality.
> 
Well, the API should always make sense in the context of any added
functionality, but it seems to me that's just another way of saying you might
need to make some modifications to the bonding API.  I'm not saying that's
necessarily going to have to be the case, but while I'm a big proponent of ABI
stability, I would support an API change to add valid functionality if it were a
big enough feature.  Though again, I don't think that's really necessary if you
rethink the model a little bit.

> This API allows adding and removing slaves to/from a bonding instance and configuring them.
> In the fail-safe arch, it is not possible to add and remove slaves from the
> grouping. Doing so would mean adding and removing devices from internal EAL
> structures.
> 
Ok, so you update the bonding option parser to include a fail-safe mode, which
only supports two slaves, one of which is an instance of the null PMD and the
other is to be added at a later date in response to a hot plug event.  In
fail-safe mode, after initial option parsing, bonds return an error to the
application when/if it attempts to remove a slave from the bond.  That doesn't
seem so hard to me.
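
Purely as an illustration of that model (no such bonding option exists
today; 'mode=failsafe' is hypothetical, while the vdev name and slave=
keys mirror existing bonding devargs), it might look like:

--vdev 'net_bonding0,mode=failsafe,slave=net_null0,slave=0000:84:00.0'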

> It is also invalid to try to configure a fail-safe slave. An application
> only configures a fail-safe device, which will in turn configure its slaves.
> This separation follows from the nature of a device failover.
> 
See my previous mail: you augment the null PMD to allow storage of arbitrary
configuration strings or key/value pairs when operating as a bonded slave.  The
bond is then capable of retrieving configuration from that null instance and
pushing it to the real slave on a hot plug event.
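
Again illustrative only ('stash=' is not an existing null-PMD option):
the stashed key/value pairs would simply ride along on the null slave's
devargs until a real device appears to consume them, e.g.:

--vdev 'net_null0,stash=rxq:4;txq:4;mtu:9000'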

> As seen previously, the fail-safe PMD handles different responsibilities
> from the bonding PMD. It is thus necessary to make different assumptions
> concerning what it can and cannot do with a slave.
> 
Only because you seem to refuse to think about the failsafe model in any other
way than the one you have implemented.

> > > Also, trying to implement this API for the device failover function would
> > > imply a device capture down to the devargs parsing level. This means that
> > > a PMD could request taking over a device, messing with the internals of the
> > > EAL: devargs list and busses lists of devices. This seems unacceptable.
> > > 
> > Why?  You just said yourself above that, while there is a devargs interface to
> > the bonding driver, there is also an API, which is the more used method to
> > configure bonding.  I'm not sure I agree with that, but I think it's beside the
> > point.  Your PMD also requires configuration, and it appears necessary that
> > you do so from the command line (you need to specifically enumerate the
> > subdevices that you intend to provide failsafe behavior to).  I see no reason
> > why such a feature can't be added to bonding, and the null PMD used as a
> > stand-in device, should the enumerated device not yet exist.
> > 
> > To your argument regarding taking over a device, I don't see how you find
> > that unacceptable, as it is precisely what the bonding driver does today, in the
> > sense that it allows an application to assign a master/slave relationship to
> > devices right now.  I see no reason that we can't convey the right and ability
> > for bonding to do that dynamically based on configuration.
> > 
> 
> No, the bonding PMD does not take over a device. It only cares about the
> ether layer for its link failover. It does not care about parsing parameters
> of a slave, probing devices, detaching drivers. It does not remove a device
> from the pci_device_list in the EAL for example.
> 
Again, please take a moment and think about how else your failsafe model might
be implemented in the context of bonding.  Right now, you are asserting that
your failsafe model can't be implemented in any other way because of the
decisions you have made.  If you change how you think of failsafe, you'll see it
can be done in other ways.

> Doing so would imply exposing private internal structures from the EAL,
> messing with elements reserved while doing the EAL init. This requires
> controlling a priority in the device initialization order to create the
> over-arching ones last (which is a hacky solution). It would wreak havoc
> with the DPDK arch.
> 
> The fail-safe PMD does not rely on the EAL for handling its slaves. This is
> what I explained before, when I touched upon the differing responsibilities
> implied by the differences in nature between a link failover and a device
> failover.
> 
> > > The bonding API is thus in conflict with the concept of a device failover in
> > > the context of the current DPDK arch.
> > > 
> > I really don't see how you get to this from your argument above.
> > 
> 
> The current DPDK arch does not expose EAL elements to be modified by PMDs,
> and with good reason. In this context, it is not possible to handle slaves
> correctly for a device failover in the bonding PMD,
> because the bonding PMD from the get-go expects the EAL to handle its slaves
> on a device level.
> 
> > > > > Despite all the discussion, it still just doesn't make sense to me to
> > > > > have more than one DPDK driver to handle failover - be it link or
> > > > > device. If nothing else, it's going to be awkward to explain to users
> > > > > that if they want fail-over for when a link goes down they have to use
> > > > > driver A, but if they want fail-over when a NIC gets hotplugged they use
> > > > > driver B, and if they want both kinds of failover - which would surely
> > > > > be the expected case - they need to use both drivers. The usability is
> > > > > a problem here.
> > > 
> > > Having both kinds of failover in the same PMD will always lead to the first
> > > solution in some form or another.
> > > 
> > It really isn't, because you can model hotplug behavior as a trivial form of the
> > failover that bonding does now (i.e. failover between a null device and a
> > preferred real device).
> > 
> 
> The preferred real device still has to be created / destroyed. It still
> relies on EAL entry points for handling. It still puts additional
> responsibilities on a PMD. Those responsibilities are expressed in sub
> layers clearly defined in the fail-safe PMD. You would have to create these
> sub-layers in some form in the bonding for it to be able to create a
> preferred real device at some point. This additional way of handling slaves
> has already been discussed as inducing a messy architecture to the bonding
> PMD.
> 
> > > I am sure we can document all this in a way that does not cause users
> > > confusion, with the help of community feedback such as yours.
> > > 
> > > Perhaps "net_failsafe" is a misnomer? We also thought about "net_persistent"
> > > or "net_hotplug". Any other ideas?
> > > 
> > > It is also possible for me to remove the failover support from this series,
> > > only providing deferred hot-plug handling at first. I could then send the
> > > failover support as separate patches to better assert that it is a useful,
> > > secondary feature that is essentially free to implement.
> > > 
> > I think that's solving the wrong problem.  I've no issue with the functionality
> > in this patch; it's really the implementation that we are all arguing against.
> > 
> > > >
> > > > It seems everybody agrees on the need for the failsafe code.
> > > > We are just discussing the right place to implement it.
> > > >
> > > > Gaetan, moving this code in the bonding PMD means replacing the bonding
> > > > API design by the failsafe design, right?
> > > > With the failsafe design in the bonding PMD, is it possible to keep other
> > > > bonding features?
> > > 
> > > As seen previously, the bonding API is incompatible with device failover.
> > > 
> > It's not been seen previously; you asserted it to be so, and I certainly disagree
> > with that assertion.  I think others might too.
> > 
> 
> I also explained at length my assertion. I can certainly expand further if
> necessary, but you need to point out the elements you disagree with.
> 
> > Additionally, it's not really in line with this discussion, but in looking at
> > your hotplug detection code, I think it somewhat lacking.  Currently you seem to
> > implement this with a timer that wakes up and checks for device existence, which
> > is pretty substandard in my mind.  That's going to waste CPU cycles that might
> > lead to packet loss.  I'd really prefer to see you augment the EAL library with
> > event handling code (it can tie into udev on Linux and kqueue on BSD), and
> > create a generic event hook that we can use to detect device adds/removes
> > without having to wake up constantly to see if anything has changed.
> > 
> > 
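For reference, the hook being suggested above is only a handful of
libudev calls (a sketch with error handling omitted, not existing EAL
code; eal_udev_event_loop() is a made-up name):

#include <libudev.h>
#include <poll.h>

static void
eal_udev_event_loop(void)
{
	struct udev *udev = udev_new();
	struct udev_monitor *mon = udev_monitor_new_from_netlink(udev, "udev");
	struct pollfd pfd;

	udev_monitor_filter_add_match_subsystem_devtype(mon, "pci", NULL);
	udev_monitor_enable_receiving(mon);
	pfd.fd = udev_monitor_get_fd(mon);
	pfd.events = POLLIN;

	while (poll(&pfd, 1, -1) > 0) {
		struct udev_device *dev = udev_monitor_receive_device(mon);

		if (dev == NULL)
			continue;
		/* udev_device_get_action(dev) yields "add" or "remove";
		 * this is where registered EAL hotplug callbacks would fire. */
		udev_device_unref(dev);
	}
}
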
> 
> I think it's fine. We can discuss it further once we agree on the form the
> hot-plug implementation will take in the DPDK.
> 


Well, until then, I feel like we're talking past one another at this point, and
so I'll just say this driver is a NAK for me.

Neil


> > > Having some features enabled solely for one kind of failover, while having
> > > specific code paths for both, seems unnecessarily complicated to me;
> > > following suit with my previous points about the first solution.
> > > 
> > > >
> > > > In case we do not have a consensus in the following days, I suggest to add
> > > > this topic in the next techboard meeting agenda.
> 
> Best regards,
> -- 
> Gaëtan Rivet
> 6WIND
> 

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH] net/ark: poll-mode driver for AtomicRules Arkville
@ 2017-03-17 21:15  1% Ed Czeck
  0 siblings, 0 replies; 200+ results
From: Ed Czeck @ 2017-03-17 21:15 UTC (permalink / raw)
  To: dev; +Cc: Ed Czeck, Shepard Siegel, John Miller

This is the PMD for Atomic Rules' Arkville ARK family of devices.
See doc/guides/nics/ark.rst for a detailed description.


Signed-off-by: Shepard Siegel <shepard.siegel@atomicrules.com>
Signed-off-by: John Miller <john.miller@atomicrules.com>
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
---
 MAINTAINERS                                 |   8 +
 config/common_base                          |  10 +
 config/defconfig_arm-armv7a-linuxapp-gcc    |   1 +
 config/defconfig_ppc_64-power8-linuxapp-gcc |   1 +
 doc/guides/nics/ark.rst                     | 238 +++++++
 doc/guides/nics/features/ark.ini            |  15 +
 doc/guides/nics/index.rst                   |   1 +
 drivers/net/Makefile                        |   1 +
 drivers/net/ark/Makefile                    |  73 +++
 drivers/net/ark/ark_ddm.c                   | 150 +++++
 drivers/net/ark/ark_ddm.h                   | 154 +++++
 drivers/net/ark/ark_debug.h                 |  72 ++
 drivers/net/ark/ark_ethdev.c                | 982 ++++++++++++++++++++++++++++
 drivers/net/ark/ark_ethdev.h                |  75 +++
 drivers/net/ark/ark_ethdev.o.pmd.c          |   2 +
 drivers/net/ark/ark_ethdev_rx.c             | 671 +++++++++++++++++++
 drivers/net/ark/ark_ethdev_tx.c             | 479 ++++++++++++++
 drivers/net/ark/ark_ext.h                   |  71 ++
 drivers/net/ark/ark_global.h                | 164 +++++
 drivers/net/ark/ark_mpu.c                   | 168 +++++
 drivers/net/ark/ark_mpu.h                   | 143 ++++
 drivers/net/ark/ark_pktchkr.c               | 445 +++++++++++++
 drivers/net/ark/ark_pktchkr.h               | 114 ++++
 drivers/net/ark/ark_pktdir.c                |  79 +++
 drivers/net/ark/ark_pktdir.h                |  68 ++
 drivers/net/ark/ark_pktgen.c                | 477 ++++++++++++++
 drivers/net/ark/ark_pktgen.h                | 106 +++
 drivers/net/ark/ark_rqp.c                   |  93 +++
 drivers/net/ark/ark_rqp.h                   |  75 +++
 drivers/net/ark/ark_udm.c                   | 221 +++++++
 drivers/net/ark/ark_udm.h                   | 175 +++++
 drivers/net/ark/rte_pmd_ark_version.map     |   4 +
 mk/rte.app.mk                               |   1 +
 33 files changed, 5337 insertions(+)
 create mode 100644 doc/guides/nics/ark.rst
 create mode 100644 doc/guides/nics/features/ark.ini
 create mode 100644 drivers/net/ark/Makefile
 create mode 100644 drivers/net/ark/ark_ddm.c
 create mode 100644 drivers/net/ark/ark_ddm.h
 create mode 100644 drivers/net/ark/ark_debug.h
 create mode 100644 drivers/net/ark/ark_ethdev.c
 create mode 100644 drivers/net/ark/ark_ethdev.h
 create mode 100644 drivers/net/ark/ark_ethdev.o.pmd.c
 create mode 100644 drivers/net/ark/ark_ethdev_rx.c
 create mode 100644 drivers/net/ark/ark_ethdev_tx.c
 create mode 100644 drivers/net/ark/ark_ext.h
 create mode 100644 drivers/net/ark/ark_global.h
 create mode 100644 drivers/net/ark/ark_mpu.c
 create mode 100644 drivers/net/ark/ark_mpu.h
 create mode 100644 drivers/net/ark/ark_pktchkr.c
 create mode 100644 drivers/net/ark/ark_pktchkr.h
 create mode 100644 drivers/net/ark/ark_pktdir.c
 create mode 100644 drivers/net/ark/ark_pktdir.h
 create mode 100644 drivers/net/ark/ark_pktgen.c
 create mode 100644 drivers/net/ark/ark_pktgen.h
 create mode 100644 drivers/net/ark/ark_rqp.c
 create mode 100644 drivers/net/ark/ark_rqp.h
 create mode 100644 drivers/net/ark/ark_udm.c
 create mode 100644 drivers/net/ark/ark_udm.h
 create mode 100644 drivers/net/ark/rte_pmd_ark_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 39bc78e..6f6136a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -280,6 +280,14 @@ M: Evgeny Schemeilin <evgenys@amazon.com>
 F: drivers/net/ena/
 F: doc/guides/nics/ena.rst
 
+Atomic Rules ark
+M: Shepard Siegel <shepard.siegel@atomicrules.com>
+M: Ed Czeck       <ed.czeck@atomicrules.com>
+M: John Miller    <john.miller@atomicrules.com>
+F: /drivers/net/ark/
+F: doc/guides/nics/ark.rst
+F: doc/guides/nics/features/ark.ini
+
 Broadcom bnxt
 M: Stephen Hurd <stephen.hurd@broadcom.com>
 M: Ajit Khaparde <ajit.khaparde@broadcom.com>
diff --git a/config/common_base b/config/common_base
index aeee13e..e64cd83 100644
--- a/config/common_base
+++ b/config/common_base
@@ -348,6 +348,16 @@ CONFIG_RTE_LIBRTE_QEDE_FW=""
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
 #
+# Compile ARK PMD
+#
+CONFIG_RTE_LIBRTE_ARK_PMD=y
+CONFIG_RTE_LIBRTE_ARK_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS=n
+CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE=n
+
+
+#
 # Compile the TAP PMD
 # It is enabled by default for Linux only.
 #
diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc
index d9bd2a8..6d2b5e0 100644
--- a/config/defconfig_arm-armv7a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7a-linuxapp-gcc
@@ -61,6 +61,7 @@ CONFIG_RTE_SCHED_VECTOR=n
 
 # cannot use those on ARM
 CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_ARK_PMD=n
 CONFIG_RTE_LIBRTE_EM_PMD=n
 CONFIG_RTE_LIBRTE_IGB_PMD=n
 CONFIG_RTE_LIBRTE_CXGBE_PMD=n
diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
index 35f7fb6..89bc396 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -48,6 +48,7 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
 
 # Note: Initially, all of the PMD drivers compilation are turned off on Power
 # Will turn on them only after the successful testing on Power
+CONFIG_RTE_LIBRTE_ARK_PMD=n
 CONFIG_RTE_LIBRTE_IXGBE_PMD=n
 CONFIG_RTE_LIBRTE_I40E_PMD=n
 CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
diff --git a/doc/guides/nics/ark.rst b/doc/guides/nics/ark.rst
new file mode 100644
index 0000000..e1e1c5c
--- /dev/null
+++ b/doc/guides/nics/ark.rst
@@ -0,0 +1,238 @@
+.. BSD LICENSE
+
+    Copyright (c) 2015-2017 Atomic Rules LLC
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Atomic Rules LLC nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ARK Poll Mode Driver
+====================
+
+The ARK PMD is a DPDK poll-mode driver for the Atomic Rules Arkville
+(ARK) family of devices.
+
+More information can be found at the `Atomic Rules website
+<http://atomicrules.com>`_.
+
+Overview
+--------
+
+The Atomic Rules Arkville product is a DPDK- and AXI-compliant product
+that marshals packets across a PCIe conduit between host DPDK mbufs and
+FPGA AXI streams.
+
+The ARK PMD, and the spirit of the overall Arkville product,
+has been to take the DPDK API/ABI as a fixed specification;
+then implement much of the business logic in FPGA RTL circuits.
+The approach of *working backwards* from the DPDK API/ABI and having
+the GPP host software *dictate*, while the FPGA hardware *copes*,
+results in significant performance gains over a naive implementation.
+
+While this document describes the ARK PMD software, it is helpful to
+understand what the FPGA hardware is and is not. The Arkville RTL
+component provides a single PCIe Physical Function (PF) supporting
+some number of RX/Ingress and TX/Egress Queues. The ARK PMD controls
+the Arkville core through a dedicated opaque Core BAR (CBAR).
+To allow users full freedom for their own FPGA application IP,
+an independent FPGA Application BAR (ABAR) is provided.
+
+One popular way to imagine Arkville's FPGA hardware aspect is as the
+FPGA PCIe-facing side of a so-called Smart NIC. The Arkville core does
+not contain any MACs, and is link-speed independent, as well as
+agnostic to the number of physical ports the application chooses to
+use. The ARK driver exposes the familiar PMD interface to allow packet
+movement to and from mbufs across multiple queues.
+
+However, FPGA RTL applications may contain a universe of added
+functionality that the Arkville RTL core does not provide or cannot
+anticipate. To allow for this expectation of user-defined
+innovation, the ARK PMD provides a dynamic mechanism for adding
+capabilities without modifying the ARK PMD itself.
+
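+As a minimal sketch of such a user extension (not the normative API,
+but following the hook signatures resolved by ``check_for_ext()`` in
+``ark_ethdev.c``), the PMD locates the shared object through the
+``ARK_EXT_PATH`` environment variable and looks up entry points such as
+``dev_init`` and ``dev_get_port_count`` by name:
+
+.. code-block:: c
+
+   #include <rte_ethdev.h>
+
+   /* Called once per port; the returned pointer is handed back to
+    * every later hook as user_data. */
+   void *
+   dev_init(struct rte_eth_dev *dev, void *abar, int port_id)
+   {
+           (void)dev;
+           (void)port_id;
+           /* Stash the Application BAR for later hooks. */
+           return abar;
+   }
+
+   int
+   dev_get_port_count(struct rte_eth_dev *dev, void *user_data)
+   {
+           (void)dev;
+           (void)user_data;
+           return 1; /* single-port example */
+   }
+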
+The ARK PMD is intended to support all instances of the Arkville
+RTL Core, regardless of configuration, FPGA vendor, or target
+board. While specific capabilities such as number of physical
+hardware queue-pairs are negotiated; the driver is designed to
+remain constant over a broad and extendable feature set.
+
+Intentionally, Arkville by itself DOES NOT provide common NIC
+capabilities such as offload or receive-side scaling (RSS).
+These capabilities would be viewed as a gate-level "tax" on
+Green-box FPGA applications that do not require such function.
+Instead, they can be added as needed with essentially no
+overhead to the FPGA Application.
+
+Data Path Interface
+-------------------
+
+Ingress RX and Egress TX operations use the nominal DPDK API.
+The driver supports single-port, multi-queue for both RX and TX.
+
+Refer to ``ark_ethdev.h`` for the list of supported methods to
+act upon RX and TX Queues.
+
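+As a minimal sketch (standard ethdev calls, nothing ARK-specific), a
+polling loop over one queue pair looks like:
+
+.. code-block:: c
+
+   #include <rte_ethdev.h>
+   #include <rte_mbuf.h>
+
+   #define BURST 32
+
+   static void
+   fwd_loop(uint8_t port)
+   {
+           struct rte_mbuf *pkts[BURST];
+
+           for (;;) {
+                   uint16_t nb = rte_eth_rx_burst(port, 0, pkts, BURST);
+                   uint16_t tx = rte_eth_tx_burst(port, 0, pkts, nb);
+
+                   /* Free anything the TX ring did not accept. */
+                   while (tx < nb)
+                           rte_pktmbuf_free(pkts[tx++]);
+           }
+   }
+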
+Configuration Information
+-------------------------
+
+**DPDK Configuration Parameters**
+
+  The following configuration options are available for the ARK PMD:
+
+   * **CONFIG_RTE_LIBRTE_ARK_PMD** (default y): Enables or disables inclusion
+     of the ARK PMD driver in the DPDK compilation.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_RX** (default n): Enables or disables debug
+     logging and internal checking of RX ingress logic within the ARK PMD driver.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TX** (default n): Enables or disables debug
+     logging and internal checking of TX egress logic within the ARK PMD driver.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS** (default n): Enables or disables debug
+     logging of detailed packet and performance statistics gathered in
+     the PMD and FPGA.
+
+   * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE** (default n): Enables or disables debug
+     logging of detailed PMD events and status.
+
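+  The PMD also accepts device arguments, parsed in ``ark_ethdev.c`` and
+  intended primarily for internal regression testing: ``PktDir`` (a hex
+  packet-director value) plus ``PktGen`` and ``PktChkr`` (paths to
+  configuration files for the built-in packet generator and checker).
+  Assuming the standard EAL whitelist syntax, an example would be
+  ``-w 0000:01:00.0,PktDir=0x3``.
+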
+
+Building DPDK
+-------------
+
+See the :ref:`DPDK Getting Started Guide for Linux <linux_gsg>` for
+instructions on how to build DPDK.
+
+By default the ARK PMD library will be built into the DPDK library.
+
+For configuring and using UIO and VFIO frameworks, please also refer :ref:`the
+documentation that comes with DPDK suite <linux_gsg>`.
+
+Supported ARK RTL PCIe Instances
+--------------------------------
+
+The ARK PMD supports the following Arkville RTL PCIe instances:
+
+* ``1d6c:100d`` - AR-ARKA-FX0 [Arkville 32B DPDK Data Mover]
+* ``1d6c:100e`` - AR-ARKA-FX1 [Arkville 64B DPDK Data Mover]
+
+Supported Operating Systems
+---------------------------
+
+Any Linux distribution fulfilling the conditions described in the ``System
+Requirements`` section of :ref:`the DPDK documentation <linux_gsg>`. See also
+the *DPDK Release Notes*.
+
+Supported Features
+------------------
+
+* Dynamic ARK PMD extensions
+* Multiple receive and transmit queues
+* Jumbo frames up to 9K
+* Hardware Statistics
+
+Unsupported Features
+--------------------
+
+Features that may be part of, or become part of, the Arkville RTL IP that are
+not currently supported or exposed by the ARK PMD include:
+
+* PCIe SR-IOV Virtual Functions (VFs)
+* Arkville's Packet Generator Control and Status
+* Arkville's Packet Director Control and Status
+* Arkville's Packet Checker Control and Status
+* Arkville's Timebase Management
+
+Pre-Requisites
+--------------
+
+#. Prepare the system as recommended by the DPDK suite.  This includes environment
+   variables, hugepage configuration and tool-chains.
+
+#. Insert the igb_uio kernel module using the command ``modprobe igb_uio``.
+
+#. Bind the intended ARK device to the igb_uio module.
+
+At this point the system should be ready to run DPDK applications. Once the
+application runs to completion, the ARK device can be unbound from igb_uio if necessary.
+
+Usage Example
+-------------
+
+This section demonstrates how to launch **testpmd** with Atomic Rules ARK
+devices managed by librte_pmd_ark.
+
+#. Load the kernel modules:
+
+   .. code-block:: console
+
+      modprobe uio
+      insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
+
+   .. note::
+
+      The ARK PMD driver depends upon the igb_uio user space I/O kernel module
+
+#. Mount and request huge pages:
+
+   .. code-block:: console
+
+      mount -t hugetlbfs nodev /mnt/huge
+      echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
+
+#. Bind UIO driver to ARK device at 0000:01:00.0 (using dpdk-devbind.py):
+
+   .. code-block:: console
+
+      ./usertools/dpdk-devbind.py --bind=igb_uio 0000:01:00.0
+
+   .. note::
+
+      The last argument to dpdk-devbind.py is the 4-tuple that identifies a specific PCIe
+      device. You can use lspci -d 1d6c: to identify all Atomic Rules devices in the system,
+      and thus determine the correct 4-tuple argument to dpdk-devbind.py.
+
+#. Start testpmd with basic parameters:
+
+   .. code-block:: console
+
+      ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 -- -i
+
+   Example output:
+
+   .. code-block:: console
+
+      [...]
+      EAL: PCI device 0000:01:00.0 on NUMA socket -1
+      EAL:   probe driver: 1d6c:100e rte_ark_pmd
+      EAL:   PCI memory mapped at 0x7f9b6c400000
+      PMD: eth_ark_dev_init(): Initializing 0:2:0.1
+      ARKP PMD CommitID: 378f3a67
+      Configuring Port 0 (socket 0)
+      Port 0: DC:3C:F6:00:00:01
+      Checking link statuses...
+      Port 0 Link Up - speed 100000 Mbps - full-duplex
+      Done
+      testpmd>
+
diff --git a/doc/guides/nics/features/ark.ini b/doc/guides/nics/features/ark.ini
new file mode 100644
index 0000000..dc8a0e2
--- /dev/null
+++ b/doc/guides/nics/features/ark.ini
@@ -0,0 +1,15 @@
+;
+; Supported features of the 'ark' poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Queue start/stop     = Y
+Jumbo frame          = Y
+Scattered Rx         = Y
+Basic stats          = Y
+Stats per queue      = Y
+FW version           = Y
+Linux UIO            = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 87f9334..381d82c 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     :numbered:
 
     overview
+    ark
     bnx2x
     bnxt
     cxgbe
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index a16f25e..ea9868b 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -32,6 +32,7 @@
 include $(RTE_SDK)/mk/rte.vars.mk
 
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += af_packet
+DIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark
 DIRS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += bonding
 DIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe
diff --git a/drivers/net/ark/Makefile b/drivers/net/ark/Makefile
new file mode 100644
index 0000000..5d70ef3
--- /dev/null
+++ b/drivers/net/ark/Makefile
@@ -0,0 +1,73 @@
+# BSD LICENSE
+#
+# Copyright (c) 2015-2017 Atomic Rules LLC
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of copyright holder nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_ark.a
+
+CFLAGS += -O3 -I./
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_ark_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD)
+#
+
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_ethdev.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_ethdev_rx.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_ethdev_tx.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_pktgen.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_pktchkr.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_pktdir.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_mpu.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_ddm.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_udm.c
+SRCS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark_rqp.c
+
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_kvargs
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/librte_mempool
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/libpthread
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += lib/libdl
+
+
+include $(RTE_SDK)/mk/rte.lib.mk
+
diff --git a/drivers/net/ark/ark_ddm.c b/drivers/net/ark/ark_ddm.c
new file mode 100644
index 0000000..a8e20a2
--- /dev/null
+++ b/drivers/net/ark/ark_ddm.c
@@ -0,0 +1,150 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_debug.h"
+#include "ark_ddm.h"
+
+/* ************************************************************************* */
+int
+ark_ddm_verify(struct ark_ddm_t *ddm)
+{
+	if (sizeof(struct ark_ddm_t) != ARK_DDM_EXPECTED_SIZE) {
+	fprintf(stderr, "  DDM structure looks incorrect %#x vs %#lx\n",
+		ARK_DDM_EXPECTED_SIZE, sizeof(struct ark_ddm_t));
+	return -1;
+	}
+
+	if (ddm->cfg.const0 != ARK_DDM_CONST) {
+	fprintf(stderr, "  DDM module not found as expected 0x%08x\n",
+		ddm->cfg.const0);
+	return -1;
+	}
+	return 0;
+}
+
+void
+ark_ddm_start(struct ark_ddm_t *ddm)
+{
+	ddm->cfg.command = 1;
+}
+
+int
+ark_ddm_stop(struct ark_ddm_t *ddm, const int wait)
+{
+	int cnt = 0;
+
+	ddm->cfg.command = 2;
+	while (wait && (ddm->cfg.stopFlushed & 0x01) == 0) {
+	if (cnt++ > 1000)
+		return 1;
+
+	usleep(10);
+	}
+	return 0;
+}
+
+void
+ark_ddm_reset(struct ark_ddm_t *ddm)
+{
+	int status;
+
+	/* reset only works if ddm has stopped properly. */
+	status = ark_ddm_stop(ddm, 1);
+
+	if (status != 0) {
+	ARK_DEBUG_TRACE("ARKP: %s  stop failed  doing forced reset\n",
+		__func__);
+	ddm->cfg.command = 4;
+	usleep(10);
+	}
+	ddm->cfg.command = 3;
+}
+
+void
+ark_ddm_setup(struct ark_ddm_t *ddm, phys_addr_t consAddr, uint32_t interval)
+{
+	ddm->setup.consWriteIndexAddr = consAddr;
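+	/* interval is given in ns and converted to the HW's 4 ns tick period. */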
+	ddm->setup.writeIndexInterval = interval / 4;	/* 4 ns period */
+}
+
+void
+ark_ddm_stats_reset(struct ark_ddm_t *ddm)
+{
+	ddm->cfg.tlpStatsClear = 1;
+}
+
+void
+ark_ddm_dump(struct ark_ddm_t *ddm, const char *msg)
+{
+	ARK_DEBUG_TRACE("ARKP DDM Dump: %s Stopped: %d\n", msg,
+	ark_ddm_is_stopped(ddm)
+	);
+}
+
+void
+ark_ddm_dump_stats(struct ark_ddm_t *ddm, const char *msg)
+{
+	struct ark_ddm_stats_t *stats = &ddm->stats;
+
+	ARK_DEBUG_STATS("ARKP DDM Stats: %s"
+					FMT_SU64 FMT_SU64 FMT_SU64
+					"\n", msg,
+	"Bytes:", stats->txByteCount,
+	"Packets:", stats->txPktCount, "MBufs", stats->txMbufCount);
+}
+
+int
+ark_ddm_is_stopped(struct ark_ddm_t *ddm)
+{
+	return (ddm->cfg.stopFlushed & 0x01) != 0;
+}
+
+uint64_t
+ark_ddm_queue_byte_count(struct ark_ddm_t *ddm)
+{
+	return ddm->queue_stats.byteCount;
+}
+
+uint64_t
+ark_ddm_queue_pkt_count(struct ark_ddm_t *ddm)
+{
+	return ddm->queue_stats.pktCount;
+}
+
+void
+ark_ddm_queue_reset_stats(struct ark_ddm_t *ddm)
+{
+	ddm->queue_stats.byteCount = 1;
+}
diff --git a/drivers/net/ark/ark_ddm.h b/drivers/net/ark/ark_ddm.h
new file mode 100644
index 0000000..311dbc5
--- /dev/null
+++ b/drivers/net/ark/ark_ddm.h
@@ -0,0 +1,154 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_DDM_H_
+#define _ARK_DDM_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+
+/* DDM core hardware structures */
+#define ARK_DDM_CFG 0x0000
+#define ARK_DDM_CONST 0xfacecafe
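+/* cfg.command values written by ark_ddm.c: 1 = start, 2 = stop,
+ * 3 = reset, 4 = forced reset. */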
+struct ark_ddm_cfg_t {
+	uint32_t r0;
+	volatile uint32_t tlpStatsClear;
+	uint32_t const0;
+	volatile uint32_t tag_max;
+	volatile uint32_t command;
+	volatile uint32_t stopFlushed;
+};
+
+#define ARK_DDM_STATS 0x0020
+struct ark_ddm_stats_t {
+	volatile uint64_t txByteCount;
+	volatile uint64_t txPktCount;
+	volatile uint64_t txMbufCount;
+};
+
+#define ARK_DDM_MRDQ 0x0040
+struct ark_ddm_mrdq_t {
+	volatile uint32_t mrd_q1;
+	volatile uint32_t mrd_q2;
+	volatile uint32_t mrd_q3;
+	volatile uint32_t mrd_q4;
+	volatile uint32_t mrd_full;
+};
+
+#define ARK_DDM_CPLDQ 0x0068
+struct ark_ddm_cpldq_t {
+	volatile uint32_t cpld_q1;
+	volatile uint32_t cpld_q2;
+	volatile uint32_t cpld_q3;
+	volatile uint32_t cpld_q4;
+	volatile uint32_t cpld_full;
+};
+
+#define ARK_DDM_MRD_PS 0x0090
+struct ark_ddm_mrd_ps_t {
+	volatile uint32_t mrd_ps_min;
+	volatile uint32_t mrd_ps_max;
+	volatile uint32_t mrd_full_ps_min;
+	volatile uint32_t mrd_full_ps_max;
+	volatile uint32_t mrd_dw_ps_min;
+	volatile uint32_t mrd_dw_ps_max;
+};
+
+#define ARK_DDM_QUEUE_STATS 0x00a8
+struct ark_ddm_qstats_t {
+	volatile uint64_t byteCount;
+	volatile uint64_t pktCount;
+	volatile uint64_t mbufCount;
+};
+
+#define ARK_DDM_CPLD_PS 0x00c0
+struct ark_ddm_cpld_ps_t {
+	volatile uint32_t cpld_ps_min;
+	volatile uint32_t cpld_ps_max;
+	volatile uint32_t cpld_full_ps_min;
+	volatile uint32_t cpld_full_ps_max;
+	volatile uint32_t cpld_dw_ps_min;
+	volatile uint32_t cpld_dw_ps_max;
+};
+
+#define ARK_DDM_SETUP  0x00e0
+struct ark_ddm_setup_t {
+	phys_addr_t consWriteIndexAddr;
+	uint32_t writeIndexInterval;	/* 4ns each */
+	volatile uint32_t consIndex;
+};
+
+/*  Consolidated structure */
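+/* The reservedN members pad each register block out to the fixed HW
+ * byte offsets (ARK_DDM_* above); ark_ddm_verify() checks that the
+ * total equals ARK_DDM_EXPECTED_SIZE. */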
+struct ark_ddm_t {
+	struct ark_ddm_cfg_t cfg;
+	uint8_t reserved0[(ARK_DDM_STATS - ARK_DDM_CFG) -
+					  sizeof(struct ark_ddm_cfg_t)];
+	struct ark_ddm_stats_t stats;
+	uint8_t reserved1[(ARK_DDM_MRDQ - ARK_DDM_STATS) -
+					  sizeof(struct ark_ddm_stats_t)];
+	struct ark_ddm_mrdq_t mrdq;
+	uint8_t reserved2[(ARK_DDM_CPLDQ - ARK_DDM_MRDQ) -
+					  sizeof(struct ark_ddm_mrdq_t)];
+	struct ark_ddm_cpldq_t cpldq;
+	uint8_t reserved3[(ARK_DDM_MRD_PS - ARK_DDM_CPLDQ) -
+					  sizeof(struct ark_ddm_cpldq_t)];
+	struct ark_ddm_mrd_ps_t mrd_ps;
+	struct ark_ddm_qstats_t queue_stats;
+	struct ark_ddm_cpld_ps_t cpld_ps;
+	uint8_t reserved5[(ARK_DDM_SETUP - ARK_DDM_CPLD_PS) -
+					  sizeof(struct ark_ddm_cpld_ps_t)];
+	struct ark_ddm_setup_t setup;
+	uint8_t reservedP[(256 - ARK_DDM_SETUP)
+					  - sizeof(struct ark_ddm_setup_t)];
+};
+
+#define ARK_DDM_EXPECTED_SIZE 256
+#define ARK_DDM_QOFFSET ARK_DDM_EXPECTED_SIZE
+
+/* DDM function prototype */
+int ark_ddm_verify(struct ark_ddm_t *ddm);
+void ark_ddm_start(struct ark_ddm_t *ddm);
+int ark_ddm_stop(struct ark_ddm_t *ddm, const int wait);
+void ark_ddm_reset(struct ark_ddm_t *ddm);
+void ark_ddm_stats_reset(struct ark_ddm_t *ddm);
+void ark_ddm_setup(struct ark_ddm_t *ddm, phys_addr_t consAddr,
+	uint32_t interval);
+void ark_ddm_dump_stats(struct ark_ddm_t *ddm, const char *msg);
+void ark_ddm_dump(struct ark_ddm_t *ddm, const char *msg);
+int ark_ddm_is_stopped(struct ark_ddm_t *ddm);
+uint64_t ark_ddm_queue_byte_count(struct ark_ddm_t *ddm);
+uint64_t ark_ddm_queue_pkt_count(struct ark_ddm_t *ddm);
+void ark_ddm_queue_reset_stats(struct ark_ddm_t *ddm);
+
+#endif
diff --git a/drivers/net/ark/ark_debug.h b/drivers/net/ark/ark_debug.h
new file mode 100644
index 0000000..d50557f
--- /dev/null
+++ b/drivers/net/ark/ark_debug.h
@@ -0,0 +1,72 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_DEBUG_H_
+#define _ARK_DEBUG_H_
+
+#include <rte_log.h>
+
+/* Format specifiers for string data pairs */
+#define FMT_SU32  "\n\t%-20s    %'20u"
+#define FMT_SU64  "\n\t%-20s    %'20lu"
+#define FMT_SPTR  "\n\t%-20s    %20p"
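+/* Each FMT_* specifier pairs a left-justified label string with a
+ * grouped value, e.g. ARK_DEBUG_STATS("Stats:" FMT_SU64, "Bytes:", n); */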
+
+#define ARK_TRACE_ON(fmt, ...) \
+  fprintf(stderr, fmt, ##__VA_ARGS__)
+
+#define ARK_TRACE_OFF(fmt, ...) \
+  do {if (0) fprintf(stderr, fmt, ##__VA_ARGS__); } while (0)
+
+/* Debug macro for reporting Packet stats */
+#ifdef RTE_LIBRTE_ARK_DEBUG_STATS
+#define ARK_DEBUG_STATS(fmt, ...) ARK_TRACE_ON(fmt, ##__VA_ARGS__)
+#else
+#define ARK_DEBUG_STATS(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
+#endif
+
+/* Debug macro for tracing full behavior*/
+#ifdef RTE_LIBRTE_ARK_DEBUG_TRACE
+#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_ON(fmt, ##__VA_ARGS__)
+#else
+#define ARK_DEBUG_TRACE(fmt, ...)  ARK_TRACE_OFF(fmt, ##__VA_ARGS__)
+#endif
+
+#ifdef ARK_STD_LOG
+#define PMD_DRV_LOG(level, fmt, args...) \
+  fprintf(stderr, fmt, args)
+#else
+#define PMD_DRV_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s(): " fmt, __func__, ## args)
+#endif
+
+#endif
diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
new file mode 100644
index 0000000..8479435
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev.c
@@ -0,0 +1,982 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+#include <sys/stat.h>
+#include <dlfcn.h>
+
+#include <rte_kvargs.h>
+#include <rte_vdev.h>
+
+#include "ark_global.h"
+#include "ark_debug.h"
+#include "ark_ethdev.h"
+#include "ark_mpu.h"
+#include "ark_ddm.h"
+#include "ark_udm.h"
+#include "ark_rqp.h"
+#include "ark_pktdir.h"
+#include "ark_pktgen.h"
+#include "ark_pktchkr.h"
+
+/*  Internal prototypes */
+static int eth_ark_check_args(const char *params);
+static int eth_ark_dev_init(struct rte_eth_dev *dev);
+static int ark_config_device(struct rte_eth_dev *dev);
+static int eth_ark_dev_uninit(struct rte_eth_dev *eth_dev);
+static int eth_ark_dev_configure(struct rte_eth_dev *dev);
+static int eth_ark_dev_start(struct rte_eth_dev *dev);
+static void eth_ark_dev_stop(struct rte_eth_dev *dev);
+static void eth_ark_dev_close(struct rte_eth_dev *dev);
+static void eth_ark_dev_info_get(struct rte_eth_dev *dev,
+	struct rte_eth_dev_info *dev_info);
+static int eth_ark_dev_link_update(struct rte_eth_dev *dev,
+	int wait_to_complete);
+static int eth_ark_dev_set_link_up(struct rte_eth_dev *dev);
+static int eth_ark_dev_set_link_down(struct rte_eth_dev *dev);
+static void eth_ark_dev_stats_get(struct rte_eth_dev *dev,
+	struct rte_eth_stats *stats);
+static void eth_ark_dev_stats_reset(struct rte_eth_dev *dev);
+static void eth_ark_set_default_mac_addr(struct rte_eth_dev *dev,
+	struct ether_addr *mac_addr);
+static void eth_ark_macaddr_add(struct rte_eth_dev *dev,
+	struct ether_addr *mac_addr, uint32_t index, uint32_t pool);
+static void eth_ark_macaddr_remove(struct rte_eth_dev *dev, uint32_t index);
+
+#define ARK_DEV_TO_PCI(eth_dev) \
+	RTE_DEV_TO_PCI((eth_dev)->device)
+
+#define ARK_MAX_ARG_LEN 256
+static uint32_t pktDirV;
+static char pktGenArgs[ARK_MAX_ARG_LEN];
+static char pktChkrArgs[ARK_MAX_ARG_LEN];
+
+#define ARK_PKTGEN_ARG "PktGen"
+#define ARK_PKTCHKR_ARG "PktChkr"
+#define ARK_PKTDIR_ARG "PktDir"
+
+static const char *valid_arguments[] = {
+	ARK_PKTGEN_ARG,
+	ARK_PKTCHKR_ARG,
+	ARK_PKTDIR_ARG,
+	"iface",
+	NULL
+};
+
+#define MAX_ARK_PHYS 16
+struct ark_adapter *gark[MAX_ARK_PHYS];
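+/* One slot per physical ARK device; only gark[0] is populated for now
+ * (see the TODO in eth_ark_check_args()). */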
+
+static const struct rte_pci_id pci_id_ark_map[] = {
+	{RTE_PCI_DEVICE(0x1d6c, 0x100d)},
+	{RTE_PCI_DEVICE(0x1d6c, 0x100e)},
+	{.vendor_id = 0, /* sentinel */ },
+};
+
+static struct eth_driver rte_ark_pmd = {
+	.pci_drv = {
+		.probe = rte_eth_dev_pci_probe,
+		.remove = rte_eth_dev_pci_remove,
+		.id_table = pci_id_ark_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC},
+	.eth_dev_init = eth_ark_dev_init,
+	.eth_dev_uninit = eth_ark_dev_uninit,
+	.dev_private_size = sizeof(struct ark_adapter),
+};
+
+static const struct eth_dev_ops ark_eth_dev_ops = {
+	.dev_configure = eth_ark_dev_configure,
+	.dev_start = eth_ark_dev_start,
+	.dev_stop = eth_ark_dev_stop,
+	.dev_close = eth_ark_dev_close,
+
+	.dev_infos_get = eth_ark_dev_info_get,
+
+	.rx_queue_setup = eth_ark_dev_rx_queue_setup,
+	.rx_queue_count = eth_ark_dev_rx_queue_count,
+	.tx_queue_setup = eth_ark_tx_queue_setup,
+
+	.link_update = eth_ark_dev_link_update,
+	.dev_set_link_up = eth_ark_dev_set_link_up,
+	.dev_set_link_down = eth_ark_dev_set_link_down,
+
+	.rx_queue_start = eth_ark_rx_start_queue,
+	.rx_queue_stop = eth_ark_rx_stop_queue,
+
+	.tx_queue_start = eth_ark_tx_queue_start,
+	.tx_queue_stop = eth_ark_tx_queue_stop,
+
+	.stats_get = eth_ark_dev_stats_get,
+	.stats_reset = eth_ark_dev_stats_reset,
+
+	.mac_addr_add = eth_ark_macaddr_add,
+	.mac_addr_remove = eth_ark_macaddr_remove,
+	.mac_addr_set = eth_ark_set_default_mac_addr,
+
+};
+
+int
+ark_get_port_id(struct rte_eth_dev *dev, struct ark_adapter *ark)
+{
+	int n = ark->num_ports;
+	int i;
+
+	/* There has to be a smarter way to do this ... */
+	for (i = 0; i < n; i++) {
+	if (ark->port[i].eth_dev == dev)
+		return i;
+	}
+	ARK_DEBUG_TRACE("ARK: Device is NOT associated with a port !!");
+	return -1;
+}
+
+static int
+check_for_ext(struct rte_eth_dev *dev __rte_unused,
+	struct ark_adapter *ark)
+{
+	int found = 0;
+
+	/* Get the env */
+	const char *dllpath = getenv("ARK_EXT_PATH");
+
+	if (dllpath == NULL) {
+	ARK_DEBUG_TRACE("ARK EXT NO dll path specified \n");
+	return 0;
+	}
+	ARK_DEBUG_TRACE("ARK EXT found dll path at %s\n", dllpath);
+
+	/* Open and load the .so */
+	ark->dHandle = dlopen(dllpath, RTLD_LOCAL | RTLD_LAZY);
+	if (ark->dHandle == NULL) {
+	PMD_DRV_LOG(ERR, "Could not load user extension %s\n", dllpath);
+	return -1;
+	}
+	ARK_DEBUG_TRACE("SUCCESS: loaded user extension %s\n", dllpath);
+
+	/* Get the entry points */
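+	/* Each hook is optional: dlsym() returns NULL for a missing symbol
+	 * and every call site tests the pointer before invoking it. */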
+	ark->user_ext.dev_init =
+	(void *(*)(struct rte_eth_dev *, void *, int)) dlsym(ark->dHandle,
+	"dev_init");
+	ARK_DEBUG_TRACE("device ext init pointer = %p\n", ark->user_ext.dev_init);
+	ark->user_ext.dev_get_port_count =
+	(int (*)(struct rte_eth_dev *, void *)) dlsym(ark->dHandle,
+	"dev_get_port_count");
+	ark->user_ext.dev_uninit =
+	(void (*)(struct rte_eth_dev *, void *)) dlsym(ark->dHandle,
+	"dev_uninit");
+	ark->user_ext.dev_configure =
+	(int (*)(struct rte_eth_dev *, void *)) dlsym(ark->dHandle,
+	"dev_configure");
+	ark->user_ext.dev_start =
+	(int (*)(struct rte_eth_dev *, void *)) dlsym(ark->dHandle,
+	"dev_start");
+	ark->user_ext.dev_stop =
+	(void (*)(struct rte_eth_dev *, void *)) dlsym(ark->dHandle,
+	"dev_stop");
+	ark->user_ext.dev_close =
+	(void (*)(struct rte_eth_dev *, void *)) dlsym(ark->dHandle,
+	"dev_close");
+	ark->user_ext.link_update =
+	(int (*)(struct rte_eth_dev *, int, void *)) dlsym(ark->dHandle,
+	"link_update");
+	ark->user_ext.dev_set_link_up =
+	(int (*)(struct rte_eth_dev *, void *)) dlsym(ark->dHandle,
+	"dev_set_link_up");
+	ark->user_ext.dev_set_link_down =
+	(int (*)(struct rte_eth_dev *, void *)) dlsym(ark->dHandle,
+	"dev_set_link_down");
+	ark->user_ext.stats_get =
+	(void (*)(struct rte_eth_dev *, struct rte_eth_stats *,
+		void *)) dlsym(ark->dHandle, "stats_get");
+	ark->user_ext.stats_reset =
+	(void (*)(struct rte_eth_dev *, void *)) dlsym(ark->dHandle,
+	"stats_reset");
+	ark->user_ext.mac_addr_add =
+	(void (*)(struct rte_eth_dev *, struct ether_addr *, uint32_t,
+		uint32_t, void *)) dlsym(ark->dHandle, "mac_addr_add");
+	ark->user_ext.mac_addr_remove =
+	(void (*)(struct rte_eth_dev *, uint32_t, void *)) dlsym(ark->dHandle,
+	"mac_addr_remove");
+	ark->user_ext.mac_addr_set =
+	(void (*)(struct rte_eth_dev *, struct ether_addr *,
+		void *)) dlsym(ark->dHandle, "mac_addr_set");
+
+	return found;
+}
+
+static int
+eth_ark_dev_init(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+	struct rte_pci_device *pci_dev;
+	int ret;
+
+	ark->eth_dev = dev;
+
+	ARK_DEBUG_TRACE("eth_ark_dev_init(struct rte_eth_dev *dev)");
+	gark[0] = ark;
+
+	/* Check to see if there is an extension that we need to load */
+	check_for_ext(dev, ark);
+	pci_dev = ARK_DEV_TO_PCI(dev);
+	rte_eth_copy_pci_info(dev, pci_dev);
+
+	if (pci_dev->device.devargs)
+	eth_ark_check_args(pci_dev->device.devargs->args);
+	else
+	PMD_DRV_LOG(INFO, "No Device args found\n");
+
+	/* Use dummy function until setup */
+	dev->rx_pkt_burst = &eth_ark_recv_pkts_noop;
+	dev->tx_pkt_burst = &eth_ark_xmit_pkts_noop;
+
+	ark->bar0 = (uint8_t *) pci_dev->mem_resource[0].addr;
+	ark->Abar = (uint8_t *) pci_dev->mem_resource[2].addr;
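+	/* BAR0 is the opaque Core BAR (CBAR); BAR2 is the independent FPGA
+	 * Application BAR (ABAR) described in ark.rst. */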
+
+	SetPtr(bar0, ark, sysctrl, ARK_SYSCTRL_BASE);
+	SetPtr(bar0, ark, mpurx, ARK_MPURx_BASE);
+	SetPtr(bar0, ark, udm, ARK_UDM_BASE);
+	SetPtr(bar0, ark, mputx, ARK_MPUTx_BASE);
+	SetPtr(bar0, ark, ddm, ARK_DDM_BASE);
+	SetPtr(bar0, ark, cmac, ARK_CMAC_BASE);
+	SetPtr(bar0, ark, external, ARK_EXTERNAL_BASE);
+	SetPtr(bar0, ark, pktdir, ARK_PKTDIR_BASE);
+	SetPtr(bar0, ark, pktgen, ARK_PKTGEN_BASE);
+	SetPtr(bar0, ark, pktchkr, ARK_PKTCHKR_BASE);
+
+	ark->rqpacing = (struct ark_rqpace_t *) (ark->bar0 + ARK_RCPACING_BASE);
+	ark->started = 0;
+
+	ARK_DEBUG_TRACE("Sys Ctrl Const = 0x%x  DEV CommitID: %08x\n",
+	ark->sysctrl.t32[4], rte_be_to_cpu_32(ark->sysctrl.t32[0x20 / 4]));
+	PMD_DRV_LOG(INFO, "ARKP PMD  CommitID: %08x\n",
+	rte_be_to_cpu_32(ark->sysctrl.t32[0x20 / 4]));
+
+	/* If HW sanity test fails, return an error */
+	if (ark->sysctrl.t32[4] != 0xcafef00d) {
+	PMD_DRV_LOG(ERR,
+		"HW Sanity test has failed, expected constant 0x%x, read 0x%x (%s)\n",
+		0xcafef00d, ark->sysctrl.t32[4], __func__);
+	return -1;
+	} else {
+	PMD_DRV_LOG(INFO,
+		"HW Sanity test has PASSED, expected constant 0x%x, read 0x%x (%s)\n",
+		0xcafef00d, ark->sysctrl.t32[4], __func__);
+	}
+
+	/* We are a single function multi-port device. */
+	const unsigned int numa_node = rte_socket_id();
+	struct ether_addr adr;
+
+	ret = ark_config_device(dev);
+	dev->dev_ops = &ark_eth_dev_ops;
+
+	dev->data->mac_addrs = rte_zmalloc("ark", ETHER_ADDR_LEN, 0);
+	if (!dev->data->mac_addrs) {
+	PMD_DRV_LOG(ERR, "Failed to allocate memory for storing MAC address\n");
+	return -1;
+	}
+	ether_addr_copy((struct ether_addr *) &adr, &dev->data->mac_addrs[0]);
+
+	if (ark->user_ext.dev_init) {
+	ark->user_data = ark->user_ext.dev_init(dev, ark->Abar, 0);
+	if (!ark->user_data) {
+		PMD_DRV_LOG(INFO,
+		"Failed to initialize PMD extension !!, continuing without it\n");
+		memset(&ark->user_ext, 0, sizeof(struct ark_user_ext));
+		dlclose(ark->dHandle);
+	}
+	}
+
+	/* We will create additional devices based on the number of requested
+	 * ports */
+	int pc = 1;
+	int p;
+
+	if (ark->user_ext.dev_get_port_count) {
+	pc = ark->user_ext.dev_get_port_count(dev, ark->user_data);
+	ark->num_ports = pc;
+	} else {
+	ark->num_ports = 1;
+	}
+	for (p = 0; p < pc; p++) {
+	struct ark_port *port;
+
+	port = &ark->port[p];
+	struct rte_eth_dev_data *data = NULL;
+
+	port->id = p;
+
+	char name[RTE_ETH_NAME_MAX_LEN];
+
+	snprintf(name, sizeof(name), "arketh%d", dev->data->port_id + p);
+
+	if (p == 0) {
+		/* First port is already allocated by DPDK */
+		port->eth_dev = ark->eth_dev;
+		continue;
+	}
+
+	/* reserve an ethdev entry */
+	port->eth_dev = rte_eth_dev_allocate(name);
+	if (!port->eth_dev) {
+		PMD_DRV_LOG(ERR, "Could not allocate eth_dev for port %d\n", p);
+		goto error;
+	}
+
+	data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
+	if (!data) {
+		PMD_DRV_LOG(ERR, "Could not allocate eth_dev for port %d\n", p);
+		goto error;
+	}
+	data->port_id = ark->eth_dev->data->port_id + p;
+	port->eth_dev->data = data;
+	port->eth_dev->device = &pci_dev->device;
+	port->eth_dev->data->dev_private = ark;
+	port->eth_dev->driver = ark->eth_dev->driver;
+	port->eth_dev->dev_ops = ark->eth_dev->dev_ops;
+	port->eth_dev->tx_pkt_burst = ark->eth_dev->tx_pkt_burst;
+	port->eth_dev->rx_pkt_burst = ark->eth_dev->rx_pkt_burst;
+
+	rte_eth_copy_pci_info(port->eth_dev, pci_dev);
+
+	port->eth_dev->data->mac_addrs = rte_zmalloc(name, ETHER_ADDR_LEN, 0);
+	if (!port->eth_dev->data->mac_addrs) {
+		PMD_DRV_LOG(ERR, "Memory allocation for MAC failed !, exiting\n");
+		goto error;
+	}
+	ether_addr_copy((struct ether_addr *) &adr,
+		&port->eth_dev->data->mac_addrs[0]);
+
+	if (ark->user_ext.dev_init) {
+		ark->user_data = ark->user_ext.dev_init(dev, ark->Abar, p);
+	}
+	}
+
+	return ret;
+
+error:
+	if (dev->data->mac_addrs)
+	rte_free(dev->data->mac_addrs);
+
+	/* Port 0 shares dev->data, which is owned by the ethdev layer. */
+	for (p = 1; p < pc; p++) {
+	if (ark->port[p].eth_dev == NULL || ark->port[p].eth_dev->data == NULL)
+		continue;
+	rte_free(ark->port[p].eth_dev->data->mac_addrs);
+	rte_free(ark->port[p].eth_dev->data);
+	}
+
+	return -1;
+
+}
+
+/* Initial device configuration when device is opened
+ setup the DDM, and UDM
+ Called once per PCIE device
+*/
+static int
+ark_config_device(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+	uint16_t numQ, i;
+	struct ark_mpu_t *mpu;
+
+	/* Make sure that the packet director, generator and checker are in a
+	 * known state */
+	ark->start_pg = 0;
+	ark->pg = ark_pmd_pktgen_init(ark->pktgen.v, 0, 1);
+	ark_pmd_pktgen_reset(ark->pg);
+	ark->pc = ark_pmd_pktchkr_init(ark->pktchkr.v, 0, 1);
+	ark_pmd_pktchkr_stop(ark->pc);
+	ark->pd = ark_pmd_pktdir_init(ark->pktdir.v);
+
+	/* Verify HW */
+	if (ark_udm_verify(ark->udm.v)) {
+	return -1;
+	}
+	if (ark_ddm_verify(ark->ddm.v)) {
+	return -1;
+	}
+
+	/* UDM */
+	if (ark_udm_reset(ark->udm.v)) {
+	PMD_DRV_LOG(ERR, "Unable to stop and reset UDM \n");
+	return -1;
+	}
+	/* Keep in reset until the MPU are cleared */
+
+	/* MPU reset */
+	mpu = ark->mpurx.v;
+	numQ = ark_api_num_queues(mpu);
+	ark->rxQueues = numQ;
+	for (i = 0; i < numQ; i++) {
+	ark_mpu_reset(mpu);
+	mpu = RTE_PTR_ADD(mpu, ARK_MPU_QOFFSET);
+	}
+
+	ark_udm_stop(ark->udm.v, 0);
+	ark_udm_configure(ark->udm.v, RTE_PKTMBUF_HEADROOM,
+	RTE_MBUF_DEFAULT_DATAROOM, ARK_RX_WRITE_TIME_NS);
+	ark_udm_stats_reset(ark->udm.v);
+	ark_udm_stop(ark->udm.v, 0);
+
+	/* TX -- DDM */
+	if (ark_ddm_stop(ark->ddm.v, 1)) {
+	PMD_DRV_LOG(ERR, "Unable to stop DDM \n");
+	};
+
+	mpu = ark->mputx.v;
+	numQ = ark_api_num_queues(mpu);
+	ark->txQueues = numQ;
+	for (i = 0; i < numQ; i++) {
+	ark_mpu_reset(mpu);
+	mpu = RTE_PTR_ADD(mpu, ARK_MPU_QOFFSET);
+	}
+
+	ark_ddm_reset(ark->ddm.v);
+	ark_ddm_stats_reset(ark->ddm.v);
+	/* ark_ddm_dump(ark->ddm.v, "Config"); */
+	/* ark_ddm_dump_stats(ark->ddm.v, "Config"); */
+
+	/* MPU reset */
+	ark_ddm_stop(ark->ddm.v, 0);
+	ark_rqp_stats_reset(ark->rqpacing);
+
+	return 0;
+}
+
+static int
+eth_ark_dev_uninit(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+	return 0;
+
+	if (ark->user_ext.dev_uninit) {
+	ark->user_ext.dev_uninit(dev, ark->user_data);
+	}
+
+	ark_pmd_pktgen_uninit(ark->pg);
+	ark_pmd_pktchkr_uninit(ark->pc);
+
+	dev->dev_ops = NULL;
+	dev->rx_pkt_burst = NULL;
+	dev->tx_pkt_burst = NULL;
+	if (dev->data->mac_addrs)
+	rte_free(dev->data->mac_addrs);
+	if (dev->data)
+	rte_free(dev->data);
+
+	return 0;
+}
+
+static int
+eth_ark_dev_configure(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+
+	ARK_DEBUG_TRACE("ARKP: In eth_ark_dev_configure\n");
+
+	eth_ark_dev_set_link_up(dev);
+	if (ark->user_ext.dev_configure) {
+	return ark->user_ext.dev_configure(dev, ark->user_data);
+	}
+	return 0;
+}
+
+static void *
+delay_pg_start(void *arg)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) arg;
+
+	/* This function is used exclusively for regression testing; we perform a
+	 * blind sleep here to ensure that the external test application has time
+	 * to set up the test before we generate packets. */
+	usleep(100000);
+	ark_pmd_pktgen_run(ark->pg);
+	return NULL;
+}
+
+static int
+eth_ark_dev_start(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+	int i;
+
+	ARK_DEBUG_TRACE("ARKP: In eth_ark_dev_start\n");
+
+	/* RX Side */
+	/* start UDM */
+	ark_udm_start(ark->udm.v);
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+	eth_ark_rx_start_queue(dev, i);
+	}
+
+	/* TX Side */
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+	eth_ark_tx_queue_start(dev, i);
+	}
+
+	/* start DDM */
+	ark_ddm_start(ark->ddm.v);
+
+	ark->started = 1;
+	/* set xmit and receive function */
+	dev->rx_pkt_burst = &eth_ark_recv_pkts;
+	dev->tx_pkt_burst = &eth_ark_xmit_pkts;
+
+	if (ark->start_pg) {
+	ark_pmd_pktchkr_run(ark->pc);
+	}
+
+	if (ark->start_pg && (ark_get_port_id(dev, ark) == 0)) {
+	pthread_t thread;
+
+	/* Start the packet generator from a separate thread after a short
+	 * delay, giving the regression-test harness time to finish setup
+	 * (see delay_pg_start()). */
+	pthread_create(&thread, NULL, delay_pg_start, ark);
+	}
+
+	if (ark->user_ext.dev_start) {
+	ark->user_ext.dev_start(dev, ark->user_data);
+	}
+
+	return 0;
+}
+
+static void
+eth_ark_dev_stop(struct rte_eth_dev *dev)
+{
+	uint16_t i;
+	int status;
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+	struct ark_mpu_t *mpu;
+
+	ARK_DEBUG_TRACE("ARKP: In eth_ark_dev_stop\n");
+
+	if (ark->started == 0)
+	return;
+	ark->started = 0;
+
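+	/* Teardown order: extension hooks first, then the packet generator,
+	 * the TX path (DDM/MPU), and finally the RX path (UDM), dumping
+	 * state on any stop anomaly. */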
+	/* Stop the extension first */
+	if (ark->user_ext.dev_stop) {
+	ark->user_ext.dev_stop(dev, ark->user_data);
+	}
+
+	/* Stop the packet generator */
+	if (ark->start_pg) {
+	ark_pmd_pktgen_pause(ark->pg);
+	}
+
+	dev->rx_pkt_burst = &eth_ark_recv_pkts_noop;
+	dev->tx_pkt_burst = &eth_ark_xmit_pkts_noop;
+
+	/* STOP TX Side */
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+	status = eth_ark_tx_queue_stop(dev, i);
+	if (status != 0) {
+		uint8_t port = dev->data->port_id;
+
+		fprintf(stderr, "ARKP tx_queue stop anomaly port %u, queue %u\n",
+		port, i);
+	}
+	}
+
+	/* Stop DDM */
+	/* Wait up to 0.1 second; each stop attempt is up to 1000 * 10 microseconds */
+	for (i = 0; i < 10; i++) {
+	status = ark_ddm_stop(ark->ddm.v, 1);
+	if (status == 0)
+		break;
+	}
+	if (status || i != 0) {
+	PMD_DRV_LOG(ERR, "DDM stop anomaly. status: %d iter: %u. (%s)\n",
+		status, i, __func__);
+	ark_ddm_dump(ark->ddm.v, "Stop anomaly");
+
+	mpu = ark->mputx.v;
+	for (i = 0; i < ark->txQueues; i++) {
+		ark_mpu_dump(mpu, "DDM failure dump", i);
+		mpu = RTE_PTR_ADD(mpu, ARK_MPU_QOFFSET);
+	}
+	}
+
+	/* STOP RX Side */
+	/* Stop UDM */
+	for (i = 0; i < 10; i++) {
+	status = ark_udm_stop(ark->udm.v, 1);
+	if (status == 0)
+		break;
+	}
+	if (status || i != 0) {
+	PMD_DRV_LOG(ERR, "UDM stop anomaly. status %d iter: %u. (%s)\n",
+		status, i, __func__);
+	ark_udm_dump(ark->udm.v, "Stop anomaly");
+
+	mpu = ark->mpurx.v;
+	for (i = 0; i < ark->rxQueues; i++) {
+		ark_mpu_dump(mpu, "UDM Stop anomaly", i);
+		mpu = RTE_PTR_ADD(mpu, ARK_MPU_QOFFSET);
+	}
+	}
+
+	ark_udm_dump_stats(ark->udm.v, "Post stop");
+	ark_udm_dump_perf(ark->udm.v, "Post stop");
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+	eth_ark_rx_dump_queue(dev, i, __func__);
+	}
+
+	/* Stop the packet checker if it is running */
+	if (ark->start_pg) {
+	ark_pmd_pktchkr_dump_stats(ark->pc);
+	ark_pmd_pktchkr_stop(ark->pc);
+	}
+}
+
+static void
+eth_ark_dev_close(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+	uint16_t i;
+
+	if (ark->user_ext.dev_close) {
+	ark->user_ext.dev_close(dev, ark->user_data);
+	}
+
+	eth_ark_dev_stop(dev);
+	eth_ark_udm_force_close(dev);
+
+	/* TODO This should only be called once for the device during shutdown */
+	ark_rqp_dump(ark->rqpacing);
+
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+	eth_ark_tx_queue_release(dev->data->tx_queues[i]);
+	dev->data->tx_queues[i] = 0;
+	}
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+	eth_ark_dev_rx_queue_release(dev->data->rx_queues[i]);
+	dev->data->rx_queues[i] = 0;
+	}
+
+}
+
+static void
+eth_ark_dev_info_get(struct rte_eth_dev *dev,
+	struct rte_eth_dev_info *dev_info)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+	struct ark_mpu_t *tx_mpu = RTE_PTR_ADD(ark->bar0, ARK_MPUTx_BASE);
+	struct ark_mpu_t *rx_mpu = RTE_PTR_ADD(ark->bar0, ARK_MPURx_BASE);
+
+	uint16_t ports = ark->num_ports;
+
+	/* device specific configuration */
+	memset(dev_info, 0, sizeof(*dev_info));
+
+	dev_info->max_rx_queues = ark_api_num_queues_per_port(rx_mpu, ports);
+	dev_info->max_tx_queues = ark_api_num_queues_per_port(tx_mpu, ports);
+	dev_info->max_mac_addrs = 0;
+	dev_info->if_index = 0;
+	dev_info->max_rx_pktlen = (16 * 1024) - 128;
+	dev_info->min_rx_bufsize = 1024;
+	dev_info->rx_offload_capa = 0;
+	dev_info->tx_offload_capa = 0;
+
+	dev_info->rx_desc_lim = (struct rte_eth_desc_lim) {
+	.nb_max = 4096 * 4,
+	.nb_min = 512,	/* HW Q size for RX */
+	.nb_align = 2,};
+
+	dev_info->tx_desc_lim = (struct rte_eth_desc_lim) {
+	.nb_max = 4096 * 4,
+	.nb_min = 256,	/* HW Q size for TX */
+	.nb_align = 2,};
+
+	/* ARK PMD supports all line rates, how do we indicate that here ?? */
+	dev_info->speed_capa =
+	ETH_LINK_SPEED_1G | ETH_LINK_SPEED_10G | ETH_LINK_SPEED_25G |
+	ETH_LINK_SPEED_40G | ETH_LINK_SPEED_50G | ETH_LINK_SPEED_100G;
+	dev_info->pci_dev = ARK_DEV_TO_PCI(dev);
+	dev_info->driver_name = dev->data->drv_name;
+
+}
+
+static int
+eth_ark_dev_link_update(struct rte_eth_dev *dev, int wait_to_complete)
+{
+	ARK_DEBUG_TRACE("ARKP: link status = %d\n",
+	dev->data->dev_link.link_status);
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+
+	if (ark->user_ext.link_update) {
+	return ark->user_ext.link_update(dev, wait_to_complete,
+		ark->user_data);
+	}
+	return 0;
+}
+
+static int
+eth_ark_dev_set_link_up(struct rte_eth_dev *dev)
+{
+	dev->data->dev_link.link_status = 1;
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+
+	if (ark->user_ext.dev_set_link_up) {
+	return ark->user_ext.dev_set_link_up(dev, ark->user_data);
+	}
+	return 0;
+}
+
+static int
+eth_ark_dev_set_link_down(struct rte_eth_dev *dev)
+{
+	dev->data->dev_link.link_status = 0;
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+
+	if (ark->user_ext.dev_set_link_down) {
+	return ark->user_ext.dev_set_link_down(dev, ark->user_data);
+	}
+	return 0;
+}
+
+static void
+eth_ark_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	uint16_t i;
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->imissed = 0;
+	stats->oerrors = 0;
+
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+	eth_tx_queue_stats_get(dev->data->tx_queues[i], stats);
+	}
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+	eth_rx_queue_stats_get(dev->data->rx_queues[i], stats);
+	}
+
+	if (ark->user_ext.stats_get) {
+	ark->user_ext.stats_get(dev, stats, ark->user_data);
+	}
+
+}
+
+static void
+eth_ark_dev_stats_reset(struct rte_eth_dev *dev)
+{
+	uint16_t i;
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+	eth_tx_queue_stats_reset(dev->data->tx_queues[i]);
+	}
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+	eth_rx_queue_stats_reset(dev->data->rx_queues[i]);
+	}
+
+	if (ark->user_ext.stats_reset) {
+	ark->user_ext.stats_reset(dev, ark->user_data);
+	}
+
+}
+
+static void
+eth_ark_macaddr_add(struct rte_eth_dev *dev,
+	struct ether_addr *mac_addr, uint32_t index, uint32_t pool)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+
+	if (ark->user_ext.mac_addr_add) {
+	ark->user_ext.mac_addr_add(dev, mac_addr, index, pool, ark->user_data);
+	}
+}
+
+static void
+eth_ark_macaddr_remove(struct rte_eth_dev *dev, uint32_t index)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+
+	if (ark->user_ext.mac_addr_remove) {
+	ark->user_ext.mac_addr_remove(dev, index, ark->user_data);
+	}
+}
+
+static void
+eth_ark_set_default_mac_addr(struct rte_eth_dev *dev,
+	struct ether_addr *mac_addr)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+
+	if (ark->user_ext.mac_addr_set) {
+	ark->user_ext.mac_addr_set(dev, mac_addr, ark->user_data);
+	}
+}
+
+static inline int
+process_pktdir_arg(const char *key, const char *value,
+	void *extra_args __rte_unused)
+{
+	ARK_DEBUG_TRACE("**** IN process_pktdir_arg, key = %s, value = %s\n", key,
+	value);
+	pktDirV = strtol(value, NULL, 16);
+	ARK_DEBUG_TRACE("pktDirV = 0x%x\n", pktDirV);
+	return 0;
+}
+
+static inline int
+process_file_args(const char *key, const char *value, void *extra_args)
+{
+	ARK_DEBUG_TRACE("**** IN process_pktgen_arg, key = %s, value = %s\n", key,
+	value);
+	char *args = (char *) extra_args;
+
+	/* Open the configuration file */
+	FILE *file = fopen(value, "r");
+	char line[256];
+	int first = 1;
+
+	if (file == NULL) {
+	PMD_DRV_LOG(ERR, "Unable to open config file %s\n", value);
+	return -1;
+	}
+
+	while (fgets(line, sizeof(line), file)) {
+	if (first) {
+		strncpy(args, line, ARK_MAX_ARG_LEN - 1);
+		args[ARK_MAX_ARG_LEN - 1] = '\0';
+		first = 0;
+	} else {
+		strncat(args, line, ARK_MAX_ARG_LEN - strlen(args) - 1);
+	}
+	}
+	ARK_DEBUG_TRACE("file = %s\n", args);
+	fclose(file);
+	return 0;
+}
+
+static int
+eth_ark_check_args(const char *params)
+{
+	struct rte_kvargs *kvlist;
+	unsigned k_idx;
+	struct rte_kvargs_pair *pair = NULL;
+
+	/* TODO: the index of gark[index] should be associated with phy dev map */
+	struct ark_adapter *ark = gark[0];
+
+	kvlist = rte_kvargs_parse(params, valid_arguments);
+	if (kvlist == NULL)
+	return 0;
+
+	pktGenArgs[0] = 0;
+	pktChkrArgs[0] = 0;
+
+	for (k_idx = 0; k_idx < kvlist->count; k_idx++) {
+		pair = &kvlist->pairs[k_idx];
+		ARK_DEBUG_TRACE("**** Arg passed to PMD = %s:%s\n", pair->key,
+			pair->value);
+	}
+
+	if (rte_kvargs_process(kvlist, ARK_PKTDIR_ARG,
+		&process_pktdir_arg, NULL) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTDIR_ARG);
+	}
+
+	if (rte_kvargs_process(kvlist, ARK_PKTGEN_ARG,
+		&process_file_args, pktGenArgs) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTGEN_ARG);
+	}
+
+	if (rte_kvargs_process(kvlist, ARK_PKTCHKR_ARG,
+		&process_file_args, pktChkrArgs) != 0) {
+		PMD_DRV_LOG(ERR, "Unable to parse arg %s\n", ARK_PKTCHKR_ARG);
+	}
+
+	/* Setup the packet director */
+	ark_pmd_pktdir_setup(ark->pd, pktDirV);
+	ARK_DEBUG_TRACE("INFO: packet director set to 0x%x\n", pktDirV);
+
+	/* Setup the packet generator */
+	if (pktGenArgs[0]) {
+		PMD_DRV_LOG(INFO, "Setting up the packet generator\n");
+		ark_pmd_pktgen_parse(pktGenArgs);
+		ark_pmd_pktgen_reset(ark->pg);
+		ark_pmd_pktgen_setup(ark->pg);
+		ark->start_pg = 1;
+	}
+
+	/* Setup the packet checker */
+	if (pktChkrArgs[0]) {
+		ark_pmd_pktchkr_parse(pktChkrArgs);
+		ark_pmd_pktchkr_setup(ark->pc);
+	}
+
+	rte_kvargs_free(kvlist);
+
+	return 1;
+}
+
+static int
+pmd_ark_probe(const char *name, const char *params)
+{
+	RTE_LOG(INFO, PMD, "Initializing pmd_ark for %s params = %s\n", name,
+	params);
+
+	/* Parse off the v index */
+
+	eth_ark_check_args(params);
+	return 0;
+}
+
+static int
+pmd_ark_remove(const char *name)
+{
+	RTE_LOG(INFO, PMD, "Closing ark %s ethdev on numa socket %u\n", name,
+	rte_socket_id());
+	return 1;
+}
+
+static struct rte_vdev_driver pmd_ark_drv = {
+	.probe = pmd_ark_probe,
+	.remove = pmd_ark_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_ark, pmd_ark_drv);
+RTE_PMD_REGISTER_ALIAS(net_ark, eth_ark);
+RTE_PMD_REGISTER_PCI(eth_ark, rte_ark_pmd.pci_drv);
+RTE_PMD_REGISTER_KMOD_DEP(net_ark, "* igb_uio | uio_pci_generic ");
+RTE_PMD_REGISTER_PCI_TABLE(eth_ark, pci_id_ark_map);
diff --git a/drivers/net/ark/ark_ethdev.h b/drivers/net/ark/ark_ethdev.h
new file mode 100644
index 0000000..9167181
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev.h
@@ -0,0 +1,75 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_ETHDEV_H_
+#define _ARK_ETHDEV_H_
+
+int ark_get_port_id(struct rte_eth_dev *dev, struct ark_adapter *ark);
+
+/* RX functions */
+int eth_ark_dev_rx_queue_setup(struct rte_eth_dev *dev,
+	uint16_t queue_idx,
+	uint16_t nb_desc,
+	unsigned int socket_id,
+	const struct rte_eth_rxconf *rx_conf, struct rte_mempool *mp);
+uint32_t eth_ark_dev_rx_queue_count(struct rte_eth_dev *dev,
+	uint16_t rx_queue_id);
+int eth_ark_rx_stop_queue(struct rte_eth_dev *dev, uint16_t queue_id);
+int eth_ark_rx_start_queue(struct rte_eth_dev *dev, uint16_t queue_id);
+uint16_t eth_ark_recv_pkts_noop(void *rx_queue, struct rte_mbuf **rx_pkts,
+	uint16_t nb_pkts);
+uint16_t eth_ark_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+	uint16_t nb_pkts);
+void eth_ark_dev_rx_queue_release(void *rx_queue);
+void eth_rx_queue_stats_get(void *vqueue, struct rte_eth_stats *stats);
+void eth_rx_queue_stats_reset(void *vqueue);
+void eth_ark_rx_dump_queue(struct rte_eth_dev *dev, uint16_t queue_id,
+	const char *msg);
+
+void eth_ark_udm_force_close(struct rte_eth_dev *dev);
+
+/* TX functions */
+uint16_t eth_ark_xmit_pkts_noop(void *txq, struct rte_mbuf **tx_pkts,
+	uint16_t nb_pkts);
+uint16_t eth_ark_xmit_pkts(void *txq, struct rte_mbuf **tx_pkts,
+	uint16_t nb_pkts);
+int eth_ark_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+	uint16_t nb_desc, unsigned int socket_id,
+	const struct rte_eth_txconf *tx_conf);
+void eth_ark_tx_queue_release(void *tx_queue);
+int eth_ark_tx_queue_stop(struct rte_eth_dev *dev, uint16_t queue_id);
+int eth_ark_tx_queue_start(struct rte_eth_dev *dev, uint16_t queue_id);
+void eth_tx_queue_stats_get(void *vqueue, struct rte_eth_stats *stats);
+void eth_tx_queue_stats_reset(void *vqueue);
+
+#endif
diff --git a/drivers/net/ark/ark_ethdev_rx.c b/drivers/net/ark/ark_ethdev_rx.c
new file mode 100644
index 0000000..3c3ba0f
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev_rx.c
@@ -0,0 +1,671 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_global.h"
+#include "ark_debug.h"
+#include "ark_ethdev.h"
+#include "ark_mpu.h"
+#include "ark_udm.h"
+
+#define ARK_RX_META_SIZE 32
+#define ARK_RX_META_OFFSET (RTE_PKTMBUF_HEADROOM - ARK_RX_META_SIZE)
+#define ARK_RX_MAX_NOCHAIN (RTE_MBUF_DEFAULT_DATAROOM)
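+
+/* The FPGA deposits a 32-byte metadata block directly ahead of the packet
+ * data, i.e. in the tail of the mbuf headroom; ARK_RX_META_OFFSET is where
+ * that block begins relative to buf_addr.
+ */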
+
+#ifdef RTE_LIBRTE_ARK_DEBUG_RX
+#define ARK_RX_DEBUG 1
+#define ARK_FULL_DEBUG 1
+#else
+#define ARK_RX_DEBUG 0
+#define ARK_FULL_DEBUG 0
+#endif
+
+/* Forward declarations */
+struct ark_rx_queue;
+struct ark_rx_meta;
+
+static void dump_mbuf_data(struct rte_mbuf *mbuf, uint16_t lo, uint16_t hi);
+static void ark_ethdev_rx_dump(const char *name, struct ark_rx_queue *queue);
+static uint32_t eth_ark_rx_jumbo(struct ark_rx_queue *queue,
+	struct ark_rx_meta *meta, struct rte_mbuf *mbuf0, uint32_t consIndex);
+static inline int eth_ark_rx_seed_mbufs(struct ark_rx_queue *queue);
+
+/* ************************************************************************* */
+struct ark_rx_queue {
+
+	/* array of mbufs to populate */
+	struct rte_mbuf **reserveQ;
+	/* array of physical addresses of the mbuf data pointers; */
+	/* the array itself lives at a virtual address */
+	phys_addr_t *paddressQ;
+	struct rte_mempool *mb_pool;
+
+	struct ark_udm_t *udm;
+	struct ark_mpu_t *mpu;
+
+	uint32_t queueSize;
+	uint32_t queueMask;
+
+	uint32_t seedIndex;		/* 1 set with an empty mbuf */
+	uint32_t consIndex;		/* 3 consumed by the driver */
+
+	/* The queue Id is used to identify the HW Q */
+	uint16_t phys_qid;
+
+	/* The queue Index is used within the dpdk device structures */
+	uint16_t queueIndex;
+
+	uint32_t pad1;
+
+	/* separate cache line */
+	/* second cache line - fields only used in slow path */
+	MARKER cacheline1 __rte_cache_min_aligned;
+
+	volatile uint32_t prodIndex;	/* 2 filled by the HW */
+
+} __rte_cache_aligned;
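+
+/* Index lifecycle (free-running 32-bit counters, masked with queueMask):
+ *   1. seedIndex - software has posted an empty mbuf at this slot
+ *   2. prodIndex - hardware has filled descriptors up to this slot
+ *   3. consIndex - software has consumed packets up to this slot
+ * The numbers on the field comments above reflect this ordering.
+ */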
+
+/* ************************************************************************* */
+
+/* MATCHES struct in UDMDefines.bsv */
+
+/* TODO move to ark_udm.h */
+struct ark_rx_meta {
+	uint64_t timestamp;
+	uint64_t userData;
+	uint8_t port;
+	uint8_t dstQueue;
+	uint16_t pktLen;
+};
+
+/* ************************************************************************* */
+
+/* TODO  pick a better function name */
+static int
+eth_ark_rx_queue_setup(struct rte_eth_dev *dev,
+	struct ark_rx_queue *queue,
+	uint16_t rx_queue_id __rte_unused, uint16_t rx_queue_idx)
+{
+	phys_addr_t queueBase;
+	phys_addr_t physAddrQBase;
+	phys_addr_t physAddrProdIndex;
+
+	queueBase = rte_malloc_virt2phy(queue);
+	physAddrProdIndex = queueBase +
+		offsetof(struct ark_rx_queue, prodIndex);
+
+	physAddrQBase = rte_malloc_virt2phy(queue->paddressQ);
+
+	/* Verify HW */
+	if (ark_mpu_verify(queue->mpu, sizeof(phys_addr_t))) {
+		PMD_DRV_LOG(ERR, "ARKP: Illegal configuration rx queue\n");
+		return -1;
+	}
+
+	/* Stop and Reset and configure MPU */
+	ark_mpu_configure(queue->mpu, physAddrQBase, queue->queueSize, 0);
+
+	ark_udm_write_addr(queue->udm, physAddrProdIndex);
+
+	/* advance the valid pointer, but don't start until the queue starts */
+	ark_mpu_reset_stats(queue->mpu);
+
+	/* The seed is the producer index for the HW */
+	ark_mpu_set_producer(queue->mpu, queue->seedIndex);
+	dev->data->rx_queue_state[rx_queue_idx] = RTE_ETH_QUEUE_STATE_STOPPED;
+
+	return 0;
+}
+
+static inline void
+eth_ark_rx_update_consIndex(struct ark_rx_queue *queue, uint32_t consIndex)
+{
+	queue->consIndex = consIndex;
+	eth_ark_rx_seed_mbufs(queue);
+	ark_mpu_set_producer(queue->mpu, queue->seedIndex);
+}
+
+/* ************************************************************************* */
+int
+eth_ark_dev_rx_queue_setup(struct rte_eth_dev *dev,
+	uint16_t queue_idx,
+	uint16_t nb_desc,
+	unsigned int socket_id,
+	const struct rte_eth_rxconf *rx_conf, struct rte_mempool *mb_pool)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+	static int warning1;		/* = 0 */
+
+	struct ark_rx_queue *queue;
+	uint32_t i;
+	int status;
+
+	int port = ark_get_port_id(dev, ark);
+	int qidx = port + queue_idx;	/* TODO FIXME */
+
+	/* TODO: We may already be set up; check here if there is nothing to do */
+	/* Free memory prior to re-allocation if needed */
+	if (dev->data->rx_queues[queue_idx] != NULL) {
+		/* TODO: release any allocated queues */
+		dev->data->rx_queues[queue_idx] = NULL;
+	}
+
+	if (rx_conf != NULL && warning1 == 0) {
+		warning1 = 1;
+		PMD_DRV_LOG(INFO,
+			"ARKP: Arkville PMD ignores rte_eth_rxconf argument.\n");
+	}
+
+	if (RTE_PKTMBUF_HEADROOM < ARK_RX_META_SIZE) {
+		PMD_DRV_LOG(ERR,
+			"Error: DPDK Arkville requires head room > %d bytes (%s)\n",
+			ARK_RX_META_SIZE, __func__);
+		return -1;		/* ERROR CODE */
+	}
+
+	if (!rte_is_power_of_2(nb_desc)) {
+		PMD_DRV_LOG(ERR,
+			"DPDK Arkville configuration queue size must be power of two %u (%s)\n",
+			nb_desc, __func__);
+		return -1;		/* ERROR CODE */
+	}
+
+	/* Allocate queue struct */
+	queue = rte_zmalloc_socket("ArkRXQueue", sizeof(struct ark_rx_queue),
+		64, socket_id);
+	if (queue == 0) {
+		PMD_DRV_LOG(ERR, "Failed to allocate memory in %s\n", __func__);
+		return -ENOMEM;
+	}
+
+	/* NOTE zmalloc is used, no need to 0 indexes, etc. */
+	queue->mb_pool = mb_pool;
+	queue->phys_qid = qidx;
+	queue->queueIndex = queue_idx;
+	queue->queueSize = nb_desc;
+	queue->queueMask = nb_desc - 1;
+
+	queue->reserveQ =
+		rte_zmalloc_socket("ArkRXQueue mbuf",
+			nb_desc * sizeof(struct rte_mbuf *), 64, socket_id);
+	queue->paddressQ =
+		rte_zmalloc_socket("ArkRXQueue paddr",
+			nb_desc * sizeof(phys_addr_t), 64, socket_id);
+	if (queue->reserveQ == 0 || queue->paddressQ == 0) {
+		PMD_DRV_LOG(ERR, "Failed to allocate queue memory in %s\n",
+			__func__);
+		rte_free(queue->reserveQ);
+		rte_free(queue->paddressQ);
+		rte_free(queue);
+		return -ENOMEM;
+	}
+
+	dev->data->rx_queues[queue_idx] = queue;
+	queue->udm = RTE_PTR_ADD(ark->udm.v, qidx * ARK_UDM_QOFFSET);
+	queue->mpu = RTE_PTR_ADD(ark->mpurx.v, qidx * ARK_MPU_QOFFSET);
+
+	/* populate mbuf reserve */
+	status = eth_ark_rx_seed_mbufs(queue);
+
+	/* MPU Setup */
+	if (status == 0)
+		status = eth_ark_rx_queue_setup(dev, queue, qidx, queue_idx);
+
+	if (unlikely(status != 0)) {
+		struct rte_mbuf *mbuf;
+
+		PMD_DRV_LOG(ERR, "ARKP Failed to initialize RX queue %d %s\n",
+			qidx, __func__);
+		/* Free the mbufs allocated */
+		for (i = 0; i < nb_desc; ++i) {
+			mbuf = queue->reserveQ[i];
+			if (mbuf != 0)
+				rte_pktmbuf_free(mbuf);
+		}
+		rte_free(queue->reserveQ);
+		rte_free(queue->paddressQ);
+		rte_free(queue);
+		return -1;		/* ERROR CODE */
+	}
+
+	return 0;
+}
+
+/* ************************************************************************* */
+uint16_t
+eth_ark_recv_pkts_noop(void *rx_queue __rte_unused,
+	struct rte_mbuf **rx_pkts __rte_unused, uint16_t nb_pkts __rte_unused)
+{
+	return 0;
+}
+
+/* ************************************************************************* */
+uint16_t
+eth_ark_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+{
+	struct ark_rx_queue *queue;
+	register uint32_t consIndex, prodIndex;
+	uint16_t nb;
+	uint64_t rx_bytes = 0;
+	struct rte_mbuf *mbuf;
+	struct ark_rx_meta *meta;
+
+	queue = (struct ark_rx_queue *) rx_queue;
+	if (unlikely(queue == 0))
+		return 0;
+	if (unlikely(nb_pkts == 0))
+		return 0;
+	prodIndex = queue->prodIndex;
+	consIndex = queue->consIndex;
+	nb = 0;
+
+	while (prodIndex != consIndex) {
+		mbuf = queue->reserveQ[consIndex & queue->queueMask];
+		/* prefetch mbuf ? */
+		rte_mbuf_prefetch_part1(mbuf);
+		rte_mbuf_prefetch_part2(mbuf);
+
+		/* META DATA buried in the buffer headroom */
+		meta = RTE_PTR_ADD(mbuf->buf_addr, ARK_RX_META_OFFSET);
+
+		mbuf->port = meta->port;
+		mbuf->pkt_len = meta->pktLen;
+		mbuf->data_len = meta->pktLen;
+		mbuf->data_off = RTE_PKTMBUF_HEADROOM;
+		mbuf->udata64 = meta->userData;
+		if (ARK_RX_DEBUG) {	/* debug use */
+			if ((meta->pktLen > (1024 * 16)) ||
+				(meta->pktLen == 0)) {
+				PMD_DRV_LOG(INFO,
+						"ARKP RX: Bad Meta Q: %u cons: %u prod: %u\n",
+						queue->phys_qid,
+						consIndex,
+						queue->prodIndex);
+
+				PMD_DRV_LOG(INFO, "       :  cons: %u prod: %u seedIndex %u\n",
+						consIndex,
+						queue->prodIndex,
+						queue->seedIndex);
+
+				PMD_DRV_LOG(INFO, "       :  UDM prod: %u  len: %u\n",
+						queue->udm->rt_cfg.prodIdx,
+						meta->pktLen);
+				ark_mpu_dump(queue->mpu,
+							 "    ",
+							 queue->phys_qid);
+
+				dump_mbuf_data(mbuf, 0, 256);
+				/* the length is clearly bogus; clamp it so the mbuf stays usable */
+				mbuf->pkt_len = 63;
+				meta->pktLen = 63;
+			}
+			mbuf->seqn = consIndex;
+		}
+
+		rx_bytes += meta->pktLen;	/* TEMP stats */
+
+		if (unlikely(meta->pktLen > ARK_RX_MAX_NOCHAIN))
+			consIndex = eth_ark_rx_jumbo
+				(queue, meta, mbuf, consIndex + 1);
+		else
+			consIndex += 1;
+
+		rx_pkts[nb] = mbuf;
+		nb++;
+		if (nb >= nb_pkts)
+			break;
+	}
+
+	if (unlikely(nb != 0))
+		/* report next free to FPGA */
+		eth_ark_rx_update_consIndex(queue, consIndex);
+
+	return nb;
+}
+
+/* ************************************************************************* */
+static uint32_t
+eth_ark_rx_jumbo(struct ark_rx_queue *queue,
+	struct ark_rx_meta *meta, struct rte_mbuf *mbuf0, uint32_t consIndex)
+{
+	struct rte_mbuf *mbuf_prev;
+	struct rte_mbuf *mbuf;
+
+	uint16_t remaining;
+	uint16_t data_len;
+	uint8_t segments;
+
+	/* first buf populated by the caller */
+	mbuf_prev = mbuf0;
+	segments = 1;
+	data_len = RTE_MIN(meta->pktLen, RTE_MBUF_DEFAULT_DATAROOM);
+	remaining = meta->pktLen - data_len;
+	mbuf0->data_len = data_len;
+
+	/* TODO check that the data does not exceed prodIndex! */
+	while (remaining != 0) {
+		data_len =
+			RTE_MIN(remaining,
+					RTE_MBUF_DEFAULT_DATAROOM +
+					RTE_PKTMBUF_HEADROOM);
+
+		remaining -= data_len;
+		segments += 1;
+
+		mbuf = queue->reserveQ[consIndex & queue->queueMask];
+		mbuf_prev->next = mbuf;
+		mbuf_prev = mbuf;
+		mbuf->data_len = data_len;
+		mbuf->data_off = 0;
+		if (ARK_RX_DEBUG)
+			mbuf->seqn = consIndex;	/* for debug only */
+
+		consIndex += 1;
+	}
+
+	mbuf0->nb_segs = segments;
+	return consIndex;
+}
+
+/* Drain the internal queue allowing hw to clear out. */
+static void
+eth_ark_rx_queue_drain(struct ark_rx_queue *queue)
+{
+	register uint32_t consIndex;
+	struct rte_mbuf *mbuf;
+
+	consIndex = queue->consIndex;
+
+	/* NOT performance optimized, since this is a one-shot call */
+	while ((consIndex ^ queue->prodIndex) & queue->queueMask) {
+		mbuf = queue->reserveQ[consIndex & queue->queueMask];
+		rte_pktmbuf_free(mbuf);
+		consIndex++;
+		eth_ark_rx_update_consIndex(queue, consIndex);
+	}
+}
+
+uint32_t
+eth_ark_dev_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct ark_rx_queue *queue;
+
+	queue = dev->data->rx_queues[queue_id];
+	return (queue->prodIndex - queue->consIndex);	/* mod arith */
+}
+
+/* ************************************************************************* */
+int
+eth_ark_rx_start_queue(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct ark_rx_queue *queue;
+
+	queue = dev->data->rx_queues[queue_id];
+	if (queue == 0)
+		return -1;
+
+	dev->data->rx_queue_state[queue_id] = RTE_ETH_QUEUE_STATE_STARTED;
+
+	ark_mpu_set_producer(queue->mpu, queue->seedIndex);
+	ark_mpu_start(queue->mpu);
+
+	ark_udm_queue_enable(queue->udm, 1);
+
+	return 0;
+}
+
+/* ************************************************************************* */
+
+/* Queue can be restarted.   data remains
+ */
+int
+eth_ark_rx_stop_queue(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct ark_rx_queue *queue;
+
+	queue = dev->data->rx_queues[queue_id];
+	if (queue == 0)
+		return -1;
+
+	ark_udm_queue_enable(queue->udm, 0);
+
+	dev->data->rx_queue_state[queue_id] = RTE_ETH_QUEUE_STATE_STOPPED;
+
+	return 0;
+}
+
+/* ************************************************************************* */
+static inline int
+eth_ark_rx_seed_mbufs(struct ark_rx_queue *queue)
+{
+	uint32_t limit = queue->consIndex + queue->queueSize;
+	uint32_t seedIndex = queue->seedIndex;
+
+	uint32_t count = 0;
+	uint32_t seedM = queue->seedIndex & queue->queueMask;
+
+	uint32_t nb = limit - seedIndex;
+
+	/* Handle wrap around -- remainder is filled on the next call */
+	if (unlikely(seedM + nb > queue->queueSize))
+		nb = queue->queueSize - seedM;
+
+	struct rte_mbuf **mbufs = &queue->reserveQ[seedM];
+	int status = rte_pktmbuf_alloc_bulk(queue->mb_pool, mbufs, nb);
+
+	if (unlikely(status != 0))
+		return -1;
+
+	if (ARK_RX_DEBUG) {		/* DEBUG */
+		while (count != nb) {
+			struct rte_mbuf *mbuf_init =
+				queue->reserveQ[seedM + count];
+
+			memset(mbuf_init->buf_addr, -1, 512);
+			*((uint32_t *) mbuf_init->buf_addr) = seedIndex + count;
+			*(uint16_t *) RTE_PTR_ADD(mbuf_init->buf_addr, 4) =
+				queue->phys_qid;
+			count++;
+		}
+		count = 0;
+	}
+	/* DEBUG */
+	queue->seedIndex += nb;
+
+	/* Duff's device https://en.wikipedia.org/wiki/Duff's_device */
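+	/* The switch jumps into the middle of the unrolled while loop so the
+	 * (nb % 4) leftover copies are done first; every later pass of the
+	 * loop then copies four physical addresses at a time.
+	 */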
+	switch (nb % 4) {
+	case 0:
+		while (count != nb) {
+			queue->paddressQ[seedM++] = (*mbufs++)->buf_physaddr;
+			count++;
+			/* FALLTHROUGH */
+	case 3:
+			queue->paddressQ[seedM++] = (*mbufs++)->buf_physaddr;
+			count++;
+			/* FALLTHROUGH */
+	case 2:
+			queue->paddressQ[seedM++] = (*mbufs++)->buf_physaddr;
+			count++;
+			/* FALLTHROUGH */
+	case 1:
+			queue->paddressQ[seedM++] = (*mbufs++)->buf_physaddr;
+			count++;
+		} /* while (count != nb) */
+	} /* switch */
+
+	return 0;
+}
+
+void
+eth_ark_rx_dump_queue(struct rte_eth_dev *dev, uint16_t queue_id,
+	const char *msg)
+{
+	struct ark_rx_queue *queue;
+
+	queue = dev->data->rx_queues[queue_id];
+
+	ark_ethdev_rx_dump(msg, queue);
+}
+
+/* ************************************************************************* */
+
+/* Call on device closed no user API, queue is stopped */
+void
+eth_ark_dev_rx_queue_release(void *vqueue)
+{
+	struct ark_rx_queue *queue;
+	uint32_t i;
+
+	queue = (struct ark_rx_queue *) vqueue;
+	if (queue == 0)
+		return;
+
+	ark_udm_queue_enable(queue->udm, 0);
+	/* Stop the MPU since pointer are going away */
+	ark_mpu_stop(queue->mpu);
+
+	/* Need to clear out mbufs here, dropping packets along the way */
+	eth_ark_rx_queue_drain(queue);
+
+	for (i = 0; i < queue->queueSize; ++i)
+		rte_pktmbuf_free(queue->reserveQ[i]);
+
+	rte_free(queue->reserveQ);
+	rte_free(queue->paddressQ);
+	rte_free(queue);
+}
+
+void
+eth_rx_queue_stats_get(void *vqueue, struct rte_eth_stats *stats)
+{
+	struct ark_rx_queue *queue;
+	struct ark_udm_t *udm;
+
+	queue = vqueue;
+	if (queue == 0)
+		return;
+	udm = queue->udm;
+
+	uint64_t ibytes = ark_udm_bytes(udm);
+	uint64_t ipackets = ark_udm_packets(udm);
+	uint64_t idropped = ark_udm_dropped(udm);
+
+	stats->q_ipackets[queue->queueIndex] = ipackets;
+	stats->q_ibytes[queue->queueIndex] = ibytes;
+	stats->q_errors[queue->queueIndex] = idropped;
+	stats->ipackets += ipackets;
+	stats->ibytes += ibytes;
+	stats->imissed += idropped;
+}
+
+void
+eth_rx_queue_stats_reset(void *vqueue)
+{
+	struct ark_rx_queue *queue;
+
+	queue = vqueue;
+	if (queue == 0)
+		return;
+
+	ark_mpu_reset_stats(queue->mpu);
+	ark_udm_queue_stats_reset(queue->udm);
+}
+
+void
+eth_ark_udm_force_close(struct rte_eth_dev *dev)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+	struct ark_rx_queue *queue;
+	uint32_t index;
+	uint16_t i;
+
+	if (!ark_udm_is_flushed(ark->udm.v)) {
+		/* restart the MPUs */
+		fprintf(stderr, "ARK: %s UDM not flushed\n", __func__);
+		for (i = 0; i < dev->data->nb_rx_queues; i++) {
+			queue = (struct ark_rx_queue *) dev->data->rx_queues[i];
+			if (queue == 0)
+				continue;
+
+			ark_mpu_start(queue->mpu);
+			/* Add some buffers */
+			index = 100000 + queue->seedIndex;
+			ark_mpu_set_producer(queue->mpu, index);
+		}
+		/* Wait to allow data to pass */
+		usleep(100);
+
+		ARK_DEBUG_TRACE("UDM forced flush attempt, stopped = %d\n",
+			ark_udm_is_flushed(ark->udm.v));
+	}
+	ark_udm_reset(ark->udm.v);
+
+}
+
+static void
+ark_ethdev_rx_dump(const char *name, struct ark_rx_queue *queue)
+{
+	if (queue == NULL)
+		return;
+	ARK_DEBUG_TRACE("RX QUEUE %d -- %s", queue->phys_qid, name);
+	ARK_DEBUG_TRACE(FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32 "\n",
+		"queueSize", queue->queueSize,
+		"seedIndex", queue->seedIndex,
+		"prodIndex", queue->prodIndex,
+		"consIndex", queue->consIndex);
+
+	ark_mpu_dump(queue->mpu, name, queue->phys_qid);
+	ark_mpu_dump_setup(queue->mpu, queue->phys_qid);
+	ark_udm_dump(queue->udm, name);
+	ark_udm_dump_setup(queue->udm, queue->phys_qid);
+
+}
+
+static void
+dump_mbuf_data(struct rte_mbuf *mbuf, uint16_t lo, uint16_t hi)
+{
+	uint16_t i, j;
+
+	fprintf(stderr, " MBUF: %p len %d, off: %d, seq: %u\n", mbuf,
+	mbuf->pkt_len, mbuf->data_off, mbuf->seqn);
+	for (i = lo; i < hi; i += 16) {
+		uint8_t *dp = RTE_PTR_ADD(mbuf->buf_addr, i);
+
+		fprintf(stderr, "  %6d:  ", i);
+		for (j = 0; j < 16; j++)
+			fprintf(stderr, " %02x", dp[j]);
+
+		fprintf(stderr, "\n");
+	}
+}
diff --git a/drivers/net/ark/ark_ethdev_tx.c b/drivers/net/ark/ark_ethdev_tx.c
new file mode 100644
index 0000000..457057e
--- /dev/null
+++ b/drivers/net/ark/ark_ethdev_tx.c
@@ -0,0 +1,479 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_global.h"
+#include "ark_mpu.h"
+#include "ark_ddm.h"
+#include "ark_ethdev.h"
+#include "ark_debug.h"
+
+#define ARK_TX_META_SIZE   32
+#define ARK_TX_META_OFFSET (RTE_PKTMBUF_HEADROOM - ARK_TX_META_SIZE)
+#define ARK_TX_MAX_NOCHAIN (RTE_MBUF_DEFAULT_DATAROOM)
+#define ARK_TX_PAD_TO_60   1
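+/* 60 bytes is the minimum Ethernet frame length excluding the FCS; when
+ * set, short packets are zero-padded in eth_ark_xmit_pkts() before being
+ * handed to the DDM.
+ */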
+
+#ifdef RTE_LIBRTE_ARK_DEBUG_TX
+#define ARK_TX_DEBUG       1
+#define ARK_TX_DEBUG_JUMBO 1
+#else
+#define ARK_TX_DEBUG       0
+#define ARK_TX_DEBUG_JUMBO 0
+#endif
+
+/* ************************************************************************* */
+
+/* struct fixed in FPGA -- 16 bytes */
+
+/* TODO move to ark_ddm.h */
+struct ark_tx_meta {
+	uint64_t physaddr;
+	uint32_t delta_ns;
+	uint16_t data_len;		/* of this MBUF */
+#define   ARK_DDM_EOP   0x01
+#define   ARK_DDM_SOP   0x02
+	uint8_t flags;		/* bit 0 indicates last mbuf in chain. */
+	uint8_t reserved[1];
+};
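+
+/* For a chained (multi-segment) packet the per-descriptor flag sequence is
+ * SOP, 0, ..., 0, EOP; a single-segment packet carries SOP | EOP.
+ */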
+
+/* ************************************************************************* */
+struct ark_tx_queue {
+
+	struct ark_tx_meta *metaQ;
+	struct rte_mbuf **bufs;
+
+	/* handles for hw objects */
+	struct ark_mpu_t *mpu;
+	struct ark_ddm_t *ddm;
+
+	/* Stats HW tracks bytes and packets, need to count send errors */
+	uint64_t tx_errors;
+
+	uint32_t queueSize;
+	uint32_t queueMask;
+
+	/* Three indexes into the paired data rings. */
+	uint32_t prodIndex;		/* where to put the next one */
+	uint32_t freeIndex;		/* mbuf has been freed */
+
+	/* The queue Id is used to identify the HW Q */
+	uint16_t phys_qid;
+	/* The queue Index within the dpdk device structures */
+	uint16_t queueIndex;
+
+	uint32_t pad[1];
+
+	/* second cache line - fields only used in slow path */
+	MARKER cacheline1 __rte_cache_min_aligned;
+	uint32_t consIndex;		/* hw is done, can be freed */
+} __rte_cache_aligned;
+
+/* Forward declarations */
+static int eth_ark_tx_jumbo(struct ark_tx_queue *queue,
+	struct rte_mbuf *mbuf);
+static int eth_ark_tx_hw_queue_config(struct ark_tx_queue *queue);
+static void free_completed_tx(struct ark_tx_queue *queue);
+
+static inline void
+ark_tx_hw_queue_stop(struct ark_tx_queue *queue)
+{
+	ark_mpu_stop(queue->mpu);
+}
+
+/* ************************************************************************* */
+static inline void
+eth_ark_tx_meta_from_mbuf(struct ark_tx_meta *meta,
+	const struct rte_mbuf *mbuf, uint8_t flags)
+{
+	meta->physaddr = rte_mbuf_data_dma_addr(mbuf);
+	meta->delta_ns = 0;
+	meta->data_len = rte_pktmbuf_data_len(mbuf);
+	meta->flags = flags;
+}
+
+/* ************************************************************************* */
+uint16_t
+eth_ark_xmit_pkts_noop(void *vtxq __rte_unused,
+	struct rte_mbuf **tx_pkts __rte_unused, uint16_t nb_pkts __rte_unused)
+{
+	return 0;
+}
+
+/* ************************************************************************* */
+uint16_t
+eth_ark_xmit_pkts(void *vtxq, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+	struct ark_tx_queue *queue;
+	struct rte_mbuf *mbuf;
+	struct ark_tx_meta *meta;
+
+	uint32_t idx;
+	uint32_t prodIndexLimit;
+	int stat;
+	uint16_t nb;
+
+	queue = (struct ark_tx_queue *) vtxq;
+
+	/* free any packets after the HW is done with them */
+	free_completed_tx(queue);
+
+	prodIndexLimit = queue->queueSize + queue->freeIndex;
+
+	for (nb = 0;
+		 (nb < nb_pkts) && (queue->prodIndex != prodIndexLimit);
+		 ++nb) {
+		mbuf = tx_pkts[nb];
+
+		if (ARK_TX_PAD_TO_60) {
+			if (unlikely(rte_pktmbuf_pkt_len(mbuf) < 60)) {
+				/* this packet even if it is small can be split,
+				 * be sure to add to the end
+				 */
+				uint16_t toAdd = 60 - rte_pktmbuf_pkt_len(mbuf);
+				char *appended = rte_pktmbuf_append(mbuf, toAdd);
+
+				if (appended == 0) {
+					/* This packet is in error, we cannot send it so just
+					 * count it and delete it.
+					 */
+					queue->tx_errors += 1;
+					rte_pktmbuf_free(mbuf);
+					continue;
+				}
+				memset(appended, 0, toAdd);
+			}
+		}
+
+		if (unlikely(mbuf->nb_segs != 1)) {
+			stat = eth_ark_tx_jumbo(queue, mbuf);
+			if (unlikely(stat != 0))
+				break;		/* Queue is full */
+		} else {
+			idx = queue->prodIndex & queue->queueMask;
+			queue->bufs[idx] = mbuf;
+			meta = &queue->metaQ[idx];
+			eth_ark_tx_meta_from_mbuf(meta, mbuf,
+									  ARK_DDM_SOP | ARK_DDM_EOP);
+			queue->prodIndex++;
+		}
+	}
+
+	if (ARK_TX_DEBUG) {
+		if (nb != nb_pkts) {
+			PMD_DRV_LOG(ERR,
+						"ARKP TX: Failure to send: req: %u sent: %u prod: %u cons: %u free: %u\n",
+						nb_pkts, nb, queue->prodIndex, queue->consIndex,
+						queue->freeIndex);
+			ark_mpu_dump(queue->mpu, "TX Failure MPU: ", queue->phys_qid);
+		}
+	}
+
+	/* let fpga know producer index.  */
+	if (likely(nb != 0))
+		ark_mpu_set_producer(queue->mpu, queue->prodIndex);
+
+	return nb;
+}
+
+/* ************************************************************************* */
+static int
+eth_ark_tx_jumbo(struct ark_tx_queue *queue, struct rte_mbuf *mbuf)
+{
+	struct rte_mbuf *next;
+	struct ark_tx_meta *meta;
+	uint32_t freeQueueSpace;
+	uint32_t idx;
+	uint8_t flags = ARK_DDM_SOP;
+
+	freeQueueSpace = queue->queueMask - (queue->prodIndex - queue->freeIndex);
+	if (unlikely(freeQueueSpace < mbuf->nb_segs))
+		return -1;
+
+	if (ARK_TX_DEBUG_JUMBO) {
+		PMD_DRV_LOG(ERR,
+			"ARKP  JUMBO TX len: %u segs: %u prod: %u cons: %u free: %u freeSpace: %u\n",
+			mbuf->pkt_len, mbuf->nb_segs, queue->prodIndex,
+			queue->consIndex, queue->freeIndex, freeQueueSpace);
+	}
+
+	while (mbuf != NULL) {
+		next = mbuf->next;
+
+		idx = queue->prodIndex & queue->queueMask;
+		queue->bufs[idx] = mbuf;
+		meta = &queue->metaQ[idx];
+
+		flags |= (next == NULL) ? ARK_DDM_EOP : 0;
+		eth_ark_tx_meta_from_mbuf(meta, mbuf, flags);
+		queue->prodIndex++;
+
+		flags &= ~ARK_DDM_SOP;	/* drop the SOP flag after the first segment */
+		mbuf = next;
+	}
+
+	return 0;
+}
+
+/* ************************************************************************* */
+int
+eth_ark_tx_queue_setup(struct rte_eth_dev *dev,
+	uint16_t queue_idx,
+	uint16_t nb_desc,
+	unsigned int socket_id, const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct ark_adapter *ark = (struct ark_adapter *) dev->data->dev_private;
+	struct ark_tx_queue *queue;
+	int status;
+
+	/* TODO: divide the Q's evenly with the Vports */
+	int port = ark_get_port_id(dev, ark);
+	int qidx = port + queue_idx;	/* FIXME for multi queue */
+
+	if (!rte_is_power_of_2(nb_desc)) {
+		PMD_DRV_LOG(ERR,
+					"DPDK Arkville configuration queue size must be power of two %u (%s)\n",
+					nb_desc, __func__);
+		return -1;
+	}
+
+	/* TODO: We may already be set up; if so, free the previously
+	 * allocated queue memory before re-allocating.
+	 */
+
+	/* Allocate queue struct */
+	queue = rte_zmalloc_socket("ArkTXQueue", sizeof(struct ark_tx_queue),
+		64, socket_id);
+	if (queue == 0) {
+		PMD_DRV_LOG(ERR, "ARKP Failed to allocate tx queue memory in %s\n",
+			__func__);
+		return -ENOMEM;
+	}
+
+	/* we use zmalloc no need to initialize fields */
+	queue->queueSize = nb_desc;
+	queue->queueMask = nb_desc - 1;
+	queue->phys_qid = qidx;
+	queue->queueIndex = queue_idx;
+	dev->data->tx_queues[queue_idx] = queue;
+
+	queue->metaQ =
+		rte_zmalloc_socket("ArkTXQueue meta",
+			nb_desc * sizeof(struct ark_tx_meta), 64, socket_id);
+	queue->bufs =
+		rte_zmalloc_socket("ArkTXQueue bufs",
+			nb_desc * sizeof(struct rte_mbuf *), 64, socket_id);
+
+	if (queue->metaQ == 0 || queue->bufs == 0) {
+		PMD_DRV_LOG(ERR, "Failed to allocate queue memory in %s\n", __func__);
+		rte_free(queue->metaQ);
+		rte_free(queue->bufs);
+		rte_free(queue);
+		return -ENOMEM;
+	}
+
+	queue->ddm = RTE_PTR_ADD(ark->ddm.v, qidx * ARK_DDM_QOFFSET);
+	queue->mpu = RTE_PTR_ADD(ark->mputx.v, qidx * ARK_MPU_QOFFSET);
+
+	status = eth_ark_tx_hw_queue_config(queue);
+
+	if (unlikely(status != 0)) {
+		rte_free(queue->metaQ);
+		rte_free(queue->bufs);
+		rte_free(queue);
+		return -1;		/* ERROR CODE */
+	}
+
+	return 0;
+}
+
+/* ************************************************************************* */
+static int
+eth_ark_tx_hw_queue_config(struct ark_tx_queue *queue)
+{
+	phys_addr_t queueBase, ringBase, prodIndexAddr;
+	uint32_t writeInterval_ns;
+
+	/* Verify HW -- MPU */
+	if (ark_mpu_verify(queue->mpu, sizeof(struct ark_tx_meta)))
+		return -1;
+
+	queueBase = rte_malloc_virt2phy(queue);
+	ringBase = rte_malloc_virt2phy(queue->metaQ);
+	prodIndexAddr = queueBase + offsetof(struct ark_tx_queue, consIndex);
+
+	ark_mpu_stop(queue->mpu);
+	ark_mpu_reset(queue->mpu);
+
+	/* Stop and Reset and configure MPU */
+	ark_mpu_configure(queue->mpu, ringBase, queue->queueSize, 1);
+
+	/* Adjust the write interval based on queue size --
+	 * increase PCIe traffic when the mbuf count is low.
+	 */
+	switch (queue->queueSize) {
+	case 128:
+		writeInterval_ns = 500;
+		break;
+	case 256:
+		writeInterval_ns = 500;
+		break;
+	case 512:
+		writeInterval_ns = 1000;
+		break;
+	default:
+		writeInterval_ns = 2000;
+		break;
+	}
+
+	/* Completion address the DDM writes back to */
+	ark_ddm_setup(queue->ddm, prodIndexAddr, writeInterval_ns);
+
+	return 0;
+}
+
+/* ************************************************************************* */
+void
+eth_ark_tx_queue_release(void *vtx_queue)
+{
+	struct ark_tx_queue *queue;
+
+	queue = (struct ark_tx_queue *) vtx_queue;
+
+	ark_tx_hw_queue_stop(queue);
+
+	queue->consIndex = queue->prodIndex;
+	free_completed_tx(queue);
+
+	rte_free(queue->metaQ);
+	rte_free(queue->bufs);
+	rte_free(queue);
+
+}
+
+/* ************************************************************************* */
+int
+eth_ark_tx_queue_stop(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct ark_tx_queue *queue;
+	int cnt = 0;
+
+	queue = dev->data->tx_queues[queue_id];
+
+	/* Wait for DDM to send out all packets. */
+	while (queue->consIndex != queue->prodIndex) {
+		usleep(100);
+		if (cnt++ > 10000)
+			return -1;
+	}
+
+	ark_mpu_stop(queue->mpu);
+	free_completed_tx(queue);
+
+	dev->data->tx_queue_state[queue_id] = RTE_ETH_QUEUE_STATE_STOPPED;
+
+	return 0;
+}
+
+int
+eth_ark_tx_queue_start(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct ark_tx_queue *queue;
+
+	queue = dev->data->tx_queues[queue_id];
+	if (dev->data->tx_queue_state[queue_id] == RTE_ETH_QUEUE_STATE_STARTED)
+		return 0;
+
+	ark_mpu_start(queue->mpu);
+	dev->data->tx_queue_state[queue_id] = RTE_ETH_QUEUE_STATE_STARTED;
+
+	return 0;
+}
+
+/* ************************************************************************* */
+static void
+free_completed_tx(struct ark_tx_queue *queue)
+{
+	struct rte_mbuf *mbuf;
+	struct ark_tx_meta *meta;
+	uint32_t topIndex;
+
+	topIndex = queue->consIndex;	/* read once */
+	while (queue->freeIndex != topIndex) {
+		meta = &queue->metaQ[queue->freeIndex & queue->queueMask];
+		mbuf = queue->bufs[queue->freeIndex & queue->queueMask];
+
+		if (likely((meta->flags & ARK_DDM_SOP) != 0)) {
+			/* ref count of the mbuf is checked in this call. */
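+			/* rte_pktmbuf_free() releases the entire segment
+			 * chain, so a jumbo frame is freed once via its SOP
+			 * slot and the non-SOP slots need no explicit free.
+			 */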
+			rte_pktmbuf_free(mbuf);
+		}
+		queue->freeIndex++;
+	}
+}
+
+/* ************************************************************************* */
+void
+eth_tx_queue_stats_get(void *vqueue, struct rte_eth_stats *stats)
+{
+	struct ark_tx_queue *queue;
+	struct ark_ddm_t *ddm;
+	uint64_t bytes, pkts;
+
+	queue = vqueue;
+	ddm = queue->ddm;
+
+	bytes = ark_ddm_queue_byte_count(ddm);
+	pkts = ark_ddm_queue_pkt_count(ddm);
+
+	stats->q_opackets[queue->queueIndex] = pkts;
+	stats->q_obytes[queue->queueIndex] = bytes;
+	stats->opackets += pkts;
+	stats->obytes += bytes;
+	stats->oerrors += queue->tx_errors;
+}
+
+void
+eth_tx_queue_stats_reset(void *vqueue)
+{
+	struct ark_tx_queue *queue;
+	struct ark_ddm_t *ddm;
+
+	queue = vqueue;
+	ddm = queue->ddm;
+
+	ark_ddm_queue_reset_stats(ddm);
+	queue->tx_errors = 0;
+}
diff --git a/drivers/net/ark/ark_ext.h b/drivers/net/ark/ark_ext.h
new file mode 100644
index 0000000..0b5b9ba
--- /dev/null
+++ b/drivers/net/ark/ark_ext.h
@@ -0,0 +1,71 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_EXT_H_
+#define _ARK_EXT_H_
+
+/*
+ * Called post PMD init. The implementation returns its private data,
+ * which gets passed into all other functions as user_data.
+ * The ARK extension implementation MUST implement this function.
+ */
+void *dev_init(struct rte_eth_dev *dev, void *Abar, int port_id);
+
+/* Called during device shutdown */
+void dev_uninit(struct rte_eth_dev *dev, void *user_data);
+
+/* This call is optional and allows the extension to specify the number
+ * of supported ports.
+ */
+uint8_t dev_get_port_count(struct rte_eth_dev *dev, void *user_data);
+
+/*
+ * The following functions are optional and are directly mapped from the
+ * DPDK PMD ops structure. Each function, if implemented, is called after
+ * the ARK PMD implementation executes.
+ */
+int dev_configure(struct rte_eth_dev *dev, void *user_data);
+int dev_start(struct rte_eth_dev *dev, void *user_data);
+void dev_stop(struct rte_eth_dev *dev, void *user_data);
+void dev_close(struct rte_eth_dev *dev, void *user_data);
+int link_update(struct rte_eth_dev *dev, int wait_to_complete,
+	void *user_data);
+int dev_set_link_up(struct rte_eth_dev *dev, void *user_data);
+int dev_set_link_down(struct rte_eth_dev *dev, void *user_data);
+void stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats,
+	void *user_data);
+void stats_reset(struct rte_eth_dev *dev, void *user_data);
+void mac_addr_add(struct rte_eth_dev *dev,
+	struct ether_addr *macadr, uint32_t index, uint32_t pool, void *user_data);
+void mac_addr_remove(struct rte_eth_dev *dev, uint32_t index, void *user_data);
+void mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
+	void *user_data);
+
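+/*
+ * A minimal extension sketch (illustrative names, not part of this patch):
+ * dev_init() is the only mandatory hook; it returns the state the PMD
+ * later hands back as user_data.
+ *
+ *	struct my_ext { int pkts_seen; };
+ *
+ *	void *dev_init(struct rte_eth_dev *dev, void *abar, int port_id)
+ *	{
+ *		struct my_ext *e = rte_zmalloc("my_ext", sizeof(*e), 0);
+ *		return e;
+ *	}
+ *
+ *	void dev_uninit(struct rte_eth_dev *dev, void *user_data)
+ *	{
+ *		rte_free(user_data);
+ *	}
+ */
+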
+#endif
diff --git a/drivers/net/ark/ark_global.h b/drivers/net/ark/ark_global.h
new file mode 100644
index 0000000..78c61de
--- /dev/null
+++ b/drivers/net/ark/ark_global.h
@@ -0,0 +1,164 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_GLOBAL_H_
+#define _ARK_GLOBAL_H_
+
+#include <time.h>
+#include <assert.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_malloc.h>
+#include <rte_memcpy.h>
+#include <rte_string_fns.h>
+#include <rte_cycles.h>
+#include <rte_kvargs.h>
+#include <rte_dev.h>
+#include <rte_version.h>
+
+#include "ark_pktdir.h"
+#include "ark_pktgen.h"
+#include "ark_pktchkr.h"
+
+#define ETH_ARK_ARG_MAXLEN	64
+#define ARK_SYSCTRL_BASE  0x0
+#define ARK_PKTGEN_BASE   0x10000
+#define ARK_MPURx_BASE    0x20000
+#define ARK_UDM_BASE      0x30000
+#define ARK_MPUTx_BASE    0x40000
+#define ARK_DDM_BASE      0x60000
+#define ARK_CMAC_BASE     0x80000
+#define ARK_PKTDIR_BASE   0xA0000
+#define ARK_PKTCHKR_BASE  0x90000
+#define ARK_RCPACING_BASE 0xB0000
+#define ARK_EXTERNAL_BASE 0x100000
+#define ARK_MPU_QOFFSET   0x00100
+#define ARK_MAX_PORTS     8
+
+#define Offset8(n)    (n)
+#define Offset16(n)   ((n) / 2)
+#define Offset32(n)   ((n) / 4)
+#define Offset64(n)   ((n) / 8)
+
+/*
+ * Structure to store private data for each PF/VF instance.
+ */
+#define DefPtr(type, name)	\
+  union type {      \
+	uint64_t *t64; \
+	uint32_t *t32; \
+	uint16_t *t16; \
+	uint8_t  *t8;  \
+	void     *v;  \
+  } name
+
+#define SetPtr(bar, ark, mem, off) {	    \
+  ark->mem.t64 = (uint64_t *)&ark->bar[off]; \
+  ark->mem.t32 = (uint32_t *)&ark->bar[off]; \
+  ark->mem.t16 = (uint16_t *)&ark->bar[off]; \
+  ark->mem.t8  = (uint8_t *)&ark->bar[off]; \
+  }
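+
+/*
+ * Example (illustrative): DefPtr(UDM, udm) declares a union of typed
+ * pointers named "udm"; SetPtr(bar0, ark, udm, ARK_UDM_BASE) then points
+ * every member of ark->udm at offset ARK_UDM_BASE within BAR 0.
+ */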
+
+struct ark_port {
+	struct rte_eth_dev *eth_dev;
+	int id;
+};
+
+struct ark_user_ext {
+	void *(*dev_init) (struct rte_eth_dev *, void *abar, int port_id);
+	void (*dev_uninit) (struct rte_eth_dev *, void *);
+	int (*dev_get_port_count) (struct rte_eth_dev *, void *);
+	int (*dev_configure) (struct rte_eth_dev *, void *);
+	int (*dev_start) (struct rte_eth_dev *, void *);
+	void (*dev_stop) (struct rte_eth_dev *, void *);
+	void (*dev_close) (struct rte_eth_dev *, void *);
+	int (*link_update) (struct rte_eth_dev *, int wait_to_complete, void *);
+	int (*dev_set_link_up) (struct rte_eth_dev *, void *);
+	int (*dev_set_link_down) (struct rte_eth_dev *, void *);
+	void (*stats_get) (struct rte_eth_dev *, struct rte_eth_stats *, void *);
+	void (*stats_reset) (struct rte_eth_dev *, void *);
+	void (*mac_addr_add) (struct rte_eth_dev *,
+	struct ether_addr *, uint32_t, uint32_t, void *);
+	void (*mac_addr_remove) (struct rte_eth_dev *, uint32_t, void *);
+	void (*mac_addr_set) (struct rte_eth_dev *, struct ether_addr *, void *);
+};
+
+struct ark_adapter {
+
+	/* User extension private data */
+	void *user_data;
+
+	/* Pointers to packet generator and checker */
+	int start_pg;
+	ArkPktGen_t pg;
+	ArkPktChkr_t pc;
+	ArkPktDir_t pd;
+
+	struct ark_port port[ARK_MAX_PORTS];
+	int num_ports;
+
+	/* Common for both PF and VF */
+	struct rte_eth_dev *eth_dev;
+
+	void *dHandle;
+	struct ark_user_ext user_ext;
+
+	/* Our Bar 0 */
+	uint8_t *bar0;
+
+	/* A Bar */
+	uint8_t *Abar;
+
+	/* Arkville demo block offsets */
+	DefPtr(SysCtrl, sysctrl);
+	DefPtr(PktGen, pktgen);
+	DefPtr(MpuRx, mpurx);
+	DefPtr(UDM, udm);
+	DefPtr(MpuTx, mputx);
+	DefPtr(DDM, ddm);
+	DefPtr(CMAC, cmac);
+	DefPtr(External, external);
+	DefPtr(PktDir, pktdir);
+	DefPtr(PktChkr, pktchkr);
+
+	int started;
+	uint16_t rxQueues;
+	uint16_t txQueues;
+
+	struct ark_rqpace_t *rqpacing;
+};
+
+typedef uint32_t *ark_t;
+
+#endif
diff --git a/drivers/net/ark/ark_mpu.c b/drivers/net/ark/ark_mpu.c
new file mode 100644
index 0000000..b206d59
--- /dev/null
+++ b/drivers/net/ark/ark_mpu.c
@@ -0,0 +1,168 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_debug.h"
+#include "ark_mpu.h"
+
+uint16_t
+ark_api_num_queues(struct ark_mpu_t *mpu)
+{
+	return mpu->hw.numQueues;
+}
+
+uint16_t
+ark_api_num_queues_per_port(struct ark_mpu_t *mpu, uint16_t ark_ports)
+{
+	return mpu->hw.numQueues / ark_ports;
+}
+
+int
+ark_mpu_verify(struct ark_mpu_t *mpu, uint32_t objSize)
+{
+	uint32_t version;
+
+	version = mpu->id.vernum & 0x0000FF00;
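+	/* idnum 0x2055504d is ASCII "MPU " read little-endian */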
+	if ((mpu->id.idnum != 0x2055504d) || (mpu->hw.objSize != objSize) ||
+		version != 0x00003100) {
+		fprintf(stderr,
+			"   MPU module not found as expected %08x \"%c%c%c%c"
+			"%c%c%c%c\"\n", mpu->id.idnum, mpu->id.id[0],
+			mpu->id.id[1], mpu->id.id[2], mpu->id.id[3],
+			mpu->id.ver[0], mpu->id.ver[1], mpu->id.ver[2],
+			mpu->id.ver[3]);
+		fprintf(stderr,
+			"   MPU HW numQueues: %u hwDepth %u, objSize: %u, objPerMRR: %u Expected size %u\n",
+			mpu->hw.numQueues, mpu->hw.hwDepth, mpu->hw.objSize,
+			mpu->hw.objPerMRR, objSize);
+		return -1;
+	}
+	return 0;
+}
+
+void
+ark_mpu_stop(struct ark_mpu_t *mpu)
+{
+	mpu->cfg.command = MPU_CMD_Stop;
+}
+
+void
+ark_mpu_start(struct ark_mpu_t *mpu)
+{
+	mpu->cfg.command = MPU_CMD_Run;	/* run state */
+}
+
+int
+ark_mpu_reset(struct ark_mpu_t *mpu)
+{
+	int cnt = 0;
+
+	mpu->cfg.command = MPU_CMD_Reset;	/* reset */
+
+	while (mpu->cfg.command != MPU_CMD_Idle) {
+		if (cnt++ > 1000)
+			break;
+		usleep(10);
+	}
+	if (mpu->cfg.command != MPU_CMD_Idle) {
+		mpu->cfg.command = MPU_CMD_ForceReset;	/* forced reset */
+		usleep(10);
+	}
+	ark_mpu_reset_stats(mpu);
+	return mpu->cfg.command != MPU_CMD_Idle;
+}
+
+void
+ark_mpu_reset_stats(struct ark_mpu_t *mpu)
+{
+	mpu->stats.pciRequest = 1;	/* reset stats */
+}
+
+int
+ark_mpu_configure(struct ark_mpu_t *mpu, phys_addr_t ring, uint32_t ringSize,
+	int isTx)
+{
+	ark_mpu_reset(mpu);
+
+	if (!rte_is_power_of_2(ringSize)) {
+		fprintf(stderr, "ARKP Invalid ring size for MPU %u\n", ringSize);
+		return -1;
+	}
+
+	mpu->cfg.ringBase = ring;
+	mpu->cfg.ringSize = ringSize;
+	mpu->cfg.ringMask = ringSize - 1;
+	mpu->cfg.minHostMove = isTx ? 1 : mpu->hw.objPerMRR;
+	mpu->cfg.minHWMove = mpu->hw.objPerMRR;
+	mpu->cfg.swProdIndex = 0;
+	mpu->cfg.hwConsIndex = 0;
+	return 0;
+}
+
+void
+ark_mpu_dump(struct ark_mpu_t *mpu, const char *code, uint16_t qid)
+{
+	/* DUMP to see that we have started */
+	ARK_DEBUG_TRACE
+		("ARKP MPU: %s Q: %3u swProd %u, hwCons: %u\n", code, qid,
+		 mpu->cfg.swProdIndex, mpu->cfg.hwConsIndex);
+	ARK_DEBUG_TRACE
+		("ARKP MPU: %s state: %d count %d, reserved %d data 0x%08x_%08x 0x%08x_%08x\n",
+		 code, mpu->debug.state, mpu->debug.count, mpu->debug.reserved,
+		 mpu->debug.peek[1], mpu->debug.peek[0], mpu->debug.peek[3],
+		 mpu->debug.peek[2]
+		 );
+	ARK_DEBUG_STATS
+		("ARKP MPU: %s Q: %3u" FMT_SU64 FMT_SU64 FMT_SU64 FMT_SU64
+		 FMT_SU64 FMT_SU64 FMT_SU64 "\n", code, qid,
+		 "PCI Request:", mpu->stats.pciRequest,
+		 "QueueEmpty", mpu->stats.qEmpty,
+		 "QueueQ1", mpu->stats.qQ1,
+		 "QueueQ2", mpu->stats.qQ2,
+		 "QueueQ3", mpu->stats.qQ3,
+		 "QueueQ4", mpu->stats.qQ4,
+		 "QueueFull", mpu->stats.qFull
+		 );
+}
+
+void
+ark_mpu_dump_setup(struct ark_mpu_t *mpu, uint16_t qId)
+{
+	ARK_DEBUG_TRACE
+		("MPU Setup Q: %u"
+		 FMT_SPTR "\n", qId,
+		 "ringBase", (void *) mpu->cfg.ringBase
+		 );
+
+}
diff --git a/drivers/net/ark/ark_mpu.h b/drivers/net/ark/ark_mpu.h
new file mode 100644
index 0000000..54b4b60
--- /dev/null
+++ b/drivers/net/ark/ark_mpu.h
@@ -0,0 +1,143 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_MPU_H_
+#define _ARK_MPU_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+
+/*
+ * MPU hardware structures
+ */
+
+#define ARK_MPU_ID 0x00
+struct ark_mpu_id_t {
+	union {
+		char id[4];
+		uint32_t idnum;
+	};
+	union {
+		char ver[4];
+		uint32_t vernum;
+	};
+	uint32_t physId;
+	uint32_t mrrCode;
+};
+
+#define ARK_MPU_HW 0x010
+struct ark_mpu_hw_t {
+	uint16_t numQueues;
+	uint16_t reserved;
+	uint32_t hwDepth;
+	uint32_t objSize;
+	uint32_t objPerMRR;
+};
+
+#define ARK_MPU_CFG 0x040
+struct ark_mpu_cfg_t {
+	phys_addr_t ringBase;	/* phys_addr_t is a uint64_t */
+	uint32_t ringSize;
+	uint32_t ringMask;
+	uint32_t minHostMove;
+	uint32_t minHWMove;
+	volatile uint32_t swProdIndex;
+	volatile uint32_t hwConsIndex;
+	volatile uint32_t command;
+};
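+
+/* swProdIndex is written by software to publish new ring entries;
+ * hwConsIndex is written back by the hardware as entries are consumed.
+ */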
+enum ARK_MPU_COMMAND {
+	MPU_CMD_Idle = 1,
+	MPU_CMD_Run = 2,
+	MPU_CMD_Stop = 4,
+	MPU_CMD_Reset = 8,
+	MPU_CMD_ForceReset = 16,
+	MPU_COMMAND_LIMIT = 0xFFFFFFFF
+};
+
+#define ARK_MPU_STATS 0x080
+struct ark_mpu_stats_t {
+	volatile uint64_t pciRequest;
+	volatile uint64_t qEmpty;
+	volatile uint64_t qQ1;
+	volatile uint64_t qQ2;
+	volatile uint64_t qQ3;
+	volatile uint64_t qQ4;
+	volatile uint64_t qFull;
+};
+
+#define ARK_MPU_DEBUG 0x0C0
+struct ark_mpu_debug_t {
+	volatile uint32_t state;
+	uint32_t reserved;
+	volatile uint32_t count;
+	volatile uint32_t take;
+	volatile uint32_t peek[4];
+};
+
+/*  Consolidated structure */
+struct ark_mpu_t {
+	struct ark_mpu_id_t id;
+	uint8_t reserved0[(ARK_MPU_HW - ARK_MPU_ID)
+					  - sizeof(struct ark_mpu_id_t)];
+	struct ark_mpu_hw_t hw;
+	uint8_t reserved1[(ARK_MPU_CFG - ARK_MPU_HW) -
+					  sizeof(struct ark_mpu_hw_t)];
+	struct ark_mpu_cfg_t cfg;
+	uint8_t reserved2[(ARK_MPU_STATS - ARK_MPU_CFG) -
+					  sizeof(struct ark_mpu_cfg_t)];
+	struct ark_mpu_stats_t stats;
+	uint8_t reserved3[(ARK_MPU_DEBUG - ARK_MPU_STATS) -
+					  sizeof(struct ark_mpu_stats_t)];
+	struct ark_mpu_debug_t debug;
+};
+
+uint16_t ark_api_num_queues(struct ark_mpu_t *mpu);
+uint16_t ark_api_num_queues_per_port(struct ark_mpu_t *mpu,
+	uint16_t ark_ports);
+int ark_mpu_verify(struct ark_mpu_t *mpu, uint32_t objSize);
+void ark_mpu_stop(struct ark_mpu_t *mpu);
+void ark_mpu_start(struct ark_mpu_t *mpu);
+int ark_mpu_reset(struct ark_mpu_t *mpu);
+int ark_mpu_configure(struct ark_mpu_t *mpu, phys_addr_t ring,
+	uint32_t ringSize, int isTx);
+
+void ark_mpu_dump(struct ark_mpu_t *mpu, const char *msg, uint16_t idx);
+void ark_mpu_dump_setup(struct ark_mpu_t *mpu, uint16_t qid);
+void ark_mpu_reset_stats(struct ark_mpu_t *mpu);
+
+static inline void
+ark_mpu_set_producer(struct ark_mpu_t *mpu, uint32_t idx)
+{
+	mpu->cfg.swProdIndex = idx;
+}
+
+#endif
diff --git a/drivers/net/ark/ark_pktchkr.c b/drivers/net/ark/ark_pktchkr.c
new file mode 100644
index 0000000..47b75a0
--- /dev/null
+++ b/drivers/net/ark/ark_pktchkr.c
@@ -0,0 +1,445 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <getopt.h>
+#include <sys/time.h>
+#include <locale.h>
+#include <unistd.h>
+
+#include "ark_pktchkr.h"
+#include "ark_debug.h"
+
+static int setArg(char *arg, char *val);
+static int ark_pmd_pktchkr_isGenForever(ArkPktChkr_t handle);
+
+#define ARK_MAX_STR_LEN 64
+union OptV {
+	int Int;
+	int Bool;
+	uint64_t Long;
+	char Str[ARK_MAX_STR_LEN];
+};
+
+enum OPType {
+	OTInt,
+	OTLong,
+	OTBool,
+	OTString
+};
+
+struct Options {
+	char opt[ARK_MAX_STR_LEN];
+	enum OPType t;
+	union OptV v;
+};
+
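+/* Default settings; individual fields are overridden at runtime by
+ * "opt=value" strings fed through ark_pmd_pktchkr_parse().
+ */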
+static struct Options toptions[] = {
+	{{"configure"}, OTBool, {1} },
+	{{"port"}, OTInt, {0} },
+	{{"mac-dump"}, OTBool, {0} },
+	{{"dg-mode"}, OTBool, {1} },
+	{{"run"}, OTBool, {0} },
+	{{"stop"}, OTBool, {0} },
+	{{"dump"}, OTBool, {0} },
+	{{"enResync"}, OTBool, {0} },
+	{{"tuserErrVal"}, OTInt, {1} },
+	{{"genForever"}, OTBool, {0} },
+	{{"enSlavedStart"}, OTBool, {0} },
+	{{"varyLength"}, OTBool, {0} },
+	{{"incrPayload"}, OTInt, {0} },
+	{{"incrFirstByte"}, OTBool, {0} },
+	{{"insSeqNum"}, OTBool, {0} },
+	{{"insTimeStamp"}, OTBool, {1} },
+	{{"insUDPHdr"}, OTBool, {0} },
+	{{"numPkts"}, OTLong, .v.Long = 10000000000000L},
+	{{"payloadByte"}, OTInt, {0x55} },
+	{{"pktSpacing"}, OTInt, {60} },
+	{{"pktSizeMin"}, OTInt, {2005} },
+	{{"pktSizeMax"}, OTInt, {1514} },
+	{{"pktSizeIncr"}, OTInt, {1} },
+	{{"ethType"}, OTInt, {0x0800} },
+	{{"srcMACAddr"}, OTLong, .v.Long = 0xDC3CF6425060L},
+	{{"dstMACAddr"}, OTLong, .v.Long = 0x112233445566L},
+	{{"hdrDW0"}, OTInt, {0x0016e319} },
+	{{"hdrDW1"}, OTInt, {0x27150004} },
+	{{"hdrDW2"}, OTInt, {0x76967bda} },
+	{{"hdrDW3"}, OTInt, {0x08004500} },
+	{{"hdrDW4"}, OTInt, {0x005276ed} },
+	{{"hdrDW5"}, OTInt, {0x40004006} },
+	{{"hdrDW6"}, OTInt, {0x56cfc0a8} },
+	{{"startOffset"}, OTInt, {0} },
+	{{"dstIP"}, OTString, .v.Str = "169.254.10.240"},
+	{{"dstPort"}, OTInt, {65536} },
+	{{"srcPort"}, OTInt, {65536} },
+};
+
+ArkPktChkr_t
+ark_pmd_pktchkr_init(void *addr, int ord, int l2_mode)
+{
+	struct ArkPktChkrInst *inst =
+		rte_malloc("ArkPktChkrInst", sizeof(struct ArkPktChkrInst), 0);
+	inst->sregs = (struct ArkPktChkrStatRegs *) addr;
+	inst->cregs = (struct ArkPktChkrCtlRegs *) (((uint8_t *) addr) + 0x100);
+	inst->ordinal = ord;
+	inst->l2_mode = l2_mode;
+	return inst;
+}
+
+void
+ark_pmd_pktchkr_uninit(ArkPktChkr_t handle)
+{
+	rte_free(handle);
+}
+
+void
+ark_pmd_pktchkr_run(ArkPktChkr_t handle)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	inst->sregs->pktStartStop = 0;
+	inst->sregs->pktStartStop = 0x1;
+}
+
+int
+ark_pmd_pktchkr_stopped(ArkPktChkr_t handle)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+	uint32_t r = inst->sregs->pktStartStop;
+
+	return (((r >> 16) & 1) == 1);
+}
+
+void
+ark_pmd_pktchkr_stop(ArkPktChkr_t handle)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+	int waitCycle = 10;
+
+	inst->sregs->pktStartStop = 0;
+	while (!ark_pmd_pktchkr_stopped(handle) && (waitCycle > 0)) {
+	usleep(1000);
+	waitCycle--;
+	ARK_DEBUG_TRACE("Waiting for pktchk %d to stop...\n", inst->ordinal);
+	}
+	ARK_DEBUG_TRACE("pktchk %d stopped.\n", inst->ordinal);
+}
+
+int
+ark_pmd_pktchkr_isRunning(ArkPktChkr_t handle)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+	uint32_t r = inst->sregs->pktStartStop;
+
+	return ((r & 1) == 1);
+}
+
+static void
+ark_pmd_pktchkr_setPktCtrl(ArkPktChkr_t handle, uint32_t genForever,
+	uint32_t varyLength, uint32_t incrPayload, uint32_t incrFirstByte,
+	uint32_t insSeqNum, uint32_t insUDPHdr, uint32_t enResync,
+	uint32_t tuserErrVal, uint32_t insTimeStamp)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+	uint32_t r = (tuserErrVal << 16) | (enResync << 0);
+
+	inst->sregs->pktCtrl = r;
+	if (!inst->l2_mode) {
+	insUDPHdr = 0;
+	}
+	r = (genForever << 24) | (varyLength << 16) |
+	(incrPayload << 12) | (incrFirstByte << 8) |
+	(insTimeStamp << 5) | (insSeqNum << 4) | insUDPHdr;
+	inst->cregs->pktCtrl = r;
+}
+
+static int
+ark_pmd_pktchkr_isGenForever(ArkPktChkr_t handle)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+	uint32_t r = inst->cregs->pktCtrl;
+
+	return (((r >> 24) & 1) == 1);
+}
+
+int
+ark_pmd_pktchkr_waitDone(ArkPktChkr_t handle)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	if (ark_pmd_pktchkr_isGenForever(handle)) {
+	ARK_DEBUG_TRACE
+		("Error: waitDone will not terminate because genForever=1\n");
+	return -1;
+	}
+	int waitCycle = 10;
+
+	while (!ark_pmd_pktchkr_stopped(handle) && (waitCycle > 0)) {
+	usleep(1000);
+	waitCycle--;
+	ARK_DEBUG_TRACE
+		("Waiting for packet checker %d's internal pktgen to finish sending...\n",
+		inst->ordinal);
+	}
+	ARK_DEBUG_TRACE("pktchk %d's pktgen done.\n", inst->ordinal);
+	return 0;
+}
+
+int
+ark_pmd_pktchkr_getPktsSent(ArkPktChkr_t handle)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	return inst->cregs->pktsSent;
+}
+
+void
+ark_pmd_pktchkr_setPayloadByte(ArkPktChkr_t handle, uint32_t b)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	inst->cregs->pktPayload = b;
+}
+
+void
+ark_pmd_pktchkr_setPktSizeMin(ArkPktChkr_t handle, uint32_t x)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	inst->cregs->pktSizeMin = x;
+}
+
+void
+ark_pmd_pktchkr_setPktSizeMax(ArkPktChkr_t handle, uint32_t x)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	inst->cregs->pktSizeMax = x;
+}
+
+void
+ark_pmd_pktchkr_setPktSizeIncr(ArkPktChkr_t handle, uint32_t x)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	inst->cregs->pktSizeIncr = x;
+}
+
+void
+ark_pmd_pktchkr_setNumPkts(ArkPktChkr_t handle, uint32_t x)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	inst->cregs->numPkts = x;
+}
+
+void
+ark_pmd_pktchkr_setSrcMACAddr(ArkPktChkr_t handle, uint64_t macAddr)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	inst->cregs->srcMACAddrH = (macAddr >> 32) & 0xffff;
+	inst->cregs->srcMACAddrL = macAddr & 0xffffffff;
+}
+
+void
+ark_pmd_pktchkr_setDstMACAddr(ArkPktChkr_t handle, uint64_t macAddr)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	inst->cregs->dstMACAddrH = (macAddr >> 32) & 0xffff;
+	inst->cregs->dstMACAddrL = macAddr & 0xffffffff;
+}
+
+void
+ark_pmd_pktchkr_setEthType(ArkPktChkr_t handle, uint32_t x)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	inst->cregs->ethType = x;
+}
+
+void
+ark_pmd_pktchkr_setHdrDW(ArkPktChkr_t handle, uint32_t *hdr)
+{
+	uint32_t i;
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	for (i = 0; i < 7; i++) {
+	inst->cregs->hdrDW[i] = hdr[i];
+	}
+}
+
+void
+ark_pmd_pktchkr_dump_stats(ArkPktChkr_t handle)
+{
+	struct ArkPktChkrInst *inst = (struct ArkPktChkrInst *) handle;
+
+	fprintf(stderr, "pktsRcvd      = (%'u)\n", inst->sregs->pktsRcvd);
+	fprintf(stderr, "bytesRcvd     = (%'" PRIu64 ")\n",
+	inst->sregs->bytesRcvd);
+	fprintf(stderr, "pktsOK        = (%'u)\n", inst->sregs->pktsOK);
+	fprintf(stderr, "pktsMismatch  = (%'u)\n", inst->sregs->pktsMismatch);
+	fprintf(stderr, "pktsErr       = (%'u)\n", inst->sregs->pktsErr);
+	fprintf(stderr, "firstMismatch = (%'u)\n", inst->sregs->firstMismatch);
+	fprintf(stderr, "resyncEvents  = (%'u)\n", inst->sregs->resyncEvents);
+	fprintf(stderr, "pktsMissing   = (%'u)\n", inst->sregs->pktsMissing);
+	fprintf(stderr, "minLatency    = (%'u)\n", inst->sregs->minLatency);
+	fprintf(stderr, "maxLatency    = (%'u)\n", inst->sregs->maxLatency);
+}
+
+static struct Options *
+OPTIONS(const char *id)
+{
+	unsigned i;
+
+	for (i = 0; i < sizeof(toptions) / sizeof(struct Options); i++) {
+	if (strcmp(id, toptions[i].opt) == 0) {
+		return &toptions[i];
+	}
+	}
+	PMD_DRV_LOG(ERR,
+	"pktgen: Could not find requested option !!, option = %s\n", id);
+	return NULL;
+}
+
+static int
+setArg(char *arg, char *val)
+{
+	struct Options *o = OPTIONS(arg);
+
+	if (o) {
+	switch (o->t) {
+	case OTInt:
+	case OTBool:
+		o->v.Int = atoi(val);
+		break;
+	case OTLong:
+		o->v.Long = atoll(val);
+		break;
+	case OTString:
+		strncpy(o->v.Str, val, ARK_MAX_STR_LEN - 1);
+		break;
+	}
+	return 1;
+	}
+	return 0;
+}
+
+/******
+ * Arg format = "opt0=v,optN=v ..."
+ ******/
+void
+ark_pmd_pktchkr_parse(char *args)
+{
+	char *argv, *v;
+	const char toks[] = "= \n\t\v\f\r";
+
+	argv = strtok(args, toks);
+	v = strtok(NULL, toks);
+	setArg(argv, v);
+	while (argv && v) {
+	argv = strtok(NULL, toks);
+	v = strtok(NULL, toks);
+	if (argv && v)
+		setArg(argv, v);
+	}
+}
+
+static int32_t parseIPV4string(char const *ipAddress);
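+/* dotted-quad string to host-byte-order IPv4 value; returns 0 on parse failure */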
+static int32_t
+parseIPV4string(char const *ipAddress)
+{
+	unsigned int ip[4];
+
+	if (4 != sscanf(ipAddress, "%u.%u.%u.%u", &ip[0], &ip[1], &ip[2], &ip[3]))
+	return 0;
+	return ip[3] + ip[2] * 0x100 + ip[1] * 0x10000ul + ip[0] * 0x1000000ul;
+}
+
+void
+ark_pmd_pktchkr_setup(ArkPktChkr_t handle)
+{
+	uint32_t hdr[7];
+	int32_t dstIp = parseIPV4string(OPTIONS("dstIP")->v.Str);
+
+	if (!OPTIONS("stop")->v.Bool && OPTIONS("configure")->v.Bool) {
+
+	ark_pmd_pktchkr_setPayloadByte(handle, OPTIONS("payloadByte")->v.Int);
+	ark_pmd_pktchkr_setSrcMACAddr(handle, OPTIONS("srcMACAddr")->v.Long);
+	ark_pmd_pktchkr_setDstMACAddr(handle, OPTIONS("dstMACAddr")->v.Long);
+
+	ark_pmd_pktchkr_setEthType(handle, OPTIONS("ethType")->v.Int);
+	if (OPTIONS("dg-mode")->v.Bool) {
+		hdr[0] = OPTIONS("hdrDW0")->v.Int;
+		hdr[1] = OPTIONS("hdrDW1")->v.Int;
+		hdr[2] = OPTIONS("hdrDW2")->v.Int;
+		hdr[3] = OPTIONS("hdrDW3")->v.Int;
+		hdr[4] = OPTIONS("hdrDW4")->v.Int;
+		hdr[5] = OPTIONS("hdrDW5")->v.Int;
+		hdr[6] = OPTIONS("hdrDW6")->v.Int;
+	} else {
+		hdr[0] = dstIp;
+		hdr[1] = OPTIONS("dstPort")->v.Int;
+		hdr[2] = OPTIONS("srcPort")->v.Int;
+		hdr[3] = 0;
+		hdr[4] = 0;
+		hdr[5] = 0;
+		hdr[6] = 0;
+	}
+	ark_pmd_pktchkr_setHdrDW(handle, hdr);
+	ark_pmd_pktchkr_setNumPkts(handle, OPTIONS("numPkts")->v.Long);
+	ark_pmd_pktchkr_setPktSizeMin(handle, OPTIONS("pktSizeMin")->v.Int);
+	ark_pmd_pktchkr_setPktSizeMax(handle, OPTIONS("pktSizeMax")->v.Int);
+	ark_pmd_pktchkr_setPktSizeIncr(handle, OPTIONS("pktSizeIncr")->v.Int);
+	ark_pmd_pktchkr_setPktCtrl(handle,
+		OPTIONS("genForever")->v.Bool,
+		OPTIONS("varyLength")->v.Bool,
+		OPTIONS("incrPayload")->v.Bool,
+		OPTIONS("incrFirstByte")->v.Bool,
+		OPTIONS("insSeqNum")->v.Int,
+		OPTIONS("insUDPHdr")->v.Bool,
+		OPTIONS("enResync")->v.Bool,
+		OPTIONS("tuserErrVal")->v.Int, OPTIONS("insTimeStamp")->v.Int);
+	}
+
+	if (OPTIONS("stop")->v.Bool)
+	ark_pmd_pktchkr_stop(handle);
+
+	if (OPTIONS("run")->v.Bool) {
+	ARK_DEBUG_TRACE("Starting packet checker on port %d\n",
+		OPTIONS("port")->v.Int);
+	ark_pmd_pktchkr_run(handle);
+	}
+
+}
diff --git a/drivers/net/ark/ark_pktchkr.h b/drivers/net/ark/ark_pktchkr.h
new file mode 100644
index 0000000..5c2a60d
--- /dev/null
+++ b/drivers/net/ark/ark_pktchkr.h
@@ -0,0 +1,114 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_PKTCHKR_H_
+#define _ARK_PKTCHKR_H_
+
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_eal.h>
+
+#include <rte_ethdev.h>
+#include <rte_cycles.h>
+#include <rte_lcore.h>
+#include <rte_mbuf.h>
+#include <rte_malloc.h>
+
+#define ARK_PKTCHKR_BASE_ADR  0x90000
+
+typedef void *ArkPktChkr_t;
+
+struct ArkPktChkrStatRegs {
+	uint32_t r0;
+	uint32_t pktStartStop;
+	uint32_t pktCtrl;
+	uint32_t pktsRcvd;
+	uint64_t bytesRcvd;
+	uint32_t pktsOK;
+	uint32_t pktsMismatch;
+	uint32_t pktsErr;
+	uint32_t firstMismatch;
+	uint32_t resyncEvents;
+	uint32_t pktsMissing;
+	uint32_t minLatency;
+	uint32_t maxLatency;
+} __attribute__ ((packed));
+
+struct ArkPktChkrCtlRegs {
+	uint32_t pktCtrl;
+	uint32_t pktPayload;
+	uint32_t pktSizeMin;
+	uint32_t pktSizeMax;
+	uint32_t pktSizeIncr;
+	uint32_t numPkts;
+	uint32_t pktsSent;
+	uint32_t srcMACAddrL;
+	uint32_t srcMACAddrH;
+	uint32_t dstMACAddrL;
+	uint32_t dstMACAddrH;
+	uint32_t ethType;
+	uint32_t hdrDW[7];
+} __attribute__ ((packed));
+
+struct ArkPktChkrInst {
+	struct rte_eth_dev_info *dev_info;
+	volatile struct ArkPktChkrStatRegs *sregs;
+	volatile struct ArkPktChkrCtlRegs *cregs;
+	int l2_mode;
+	int ordinal;
+};
+
+/*  packet checker functions */
+ArkPktChkr_t ark_pmd_pktchkr_init(void *addr, int ord, int l2_mode);
+void ark_pmd_pktchkr_uninit(ArkPktChkr_t handle);
+void ark_pmd_pktchkr_run(ArkPktChkr_t handle);
+int ark_pmd_pktchkr_stopped(ArkPktChkr_t handle);
+void ark_pmd_pktchkr_stop(ArkPktChkr_t handle);
+int ark_pmd_pktchkr_isRunning(ArkPktChkr_t handle);
+int ark_pmd_pktchkr_getPktsSent(ArkPktChkr_t handle);
+void ark_pmd_pktchkr_setPayloadByte(ArkPktChkr_t handle, uint32_t b);
+void ark_pmd_pktchkr_setPktSizeMin(ArkPktChkr_t handle, uint32_t x);
+void ark_pmd_pktchkr_setPktSizeMax(ArkPktChkr_t handle, uint32_t x);
+void ark_pmd_pktchkr_setPktSizeIncr(ArkPktChkr_t handle, uint32_t x);
+void ark_pmd_pktchkr_setNumPkts(ArkPktChkr_t handle, uint32_t x);
+void ark_pmd_pktchkr_setSrcMACAddr(ArkPktChkr_t handle, uint64_t macAddr);
+void ark_pmd_pktchkr_setDstMACAddr(ArkPktChkr_t handle, uint64_t macAddr);
+void ark_pmd_pktchkr_setEthType(ArkPktChkr_t handle, uint32_t x);
+void ark_pmd_pktchkr_setHdrDW(ArkPktChkr_t handle, uint32_t *hdr);
+void ark_pmd_pktchkr_parse(char *args);
+void ark_pmd_pktchkr_setup(ArkPktChkr_t handle);
+void ark_pmd_pktchkr_dump_stats(ArkPktChkr_t handle);
+int ark_pmd_pktchkr_waitDone(ArkPktChkr_t handle);
+
+#endif
diff --git a/drivers/net/ark/ark_pktdir.c b/drivers/net/ark/ark_pktdir.c
new file mode 100644
index 0000000..e68ff4e
--- /dev/null
+++ b/drivers/net/ark/ark_pktdir.c
@@ -0,0 +1,79 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <inttypes.h>
+
+#include "ark_global.h"
+
+ArkPktDir_t
+ark_pmd_pktdir_init(void *base)
+{
+	struct ArkPktDirInst *inst =
+	rte_malloc("ArkPktDirInst", sizeof(struct ArkPktDirInst), 0);
+	inst->regs = (struct ArkPktDirRegs *) base;
+	inst->regs->ctrl = 0x00110110;	/* POR state */
+	return inst;
+}
+
+void
+ark_pmd_pktdir_uninit(ArkPktDir_t handle)
+{
+	struct ArkPktDirInst *inst = (struct ArkPktDirInst *) handle;
+
+	rte_free(inst);
+}
+
+void
+ark_pmd_pktdir_setup(ArkPktDir_t handle, uint32_t v)
+{
+	struct ArkPktDirInst *inst = (struct ArkPktDirInst *) handle;
+
+	inst->regs->ctrl = v;
+}
+
+uint32_t
+ark_pmd_pktdir_status(ArkPktDir_t handle)
+{
+	struct ArkPktDirInst *inst = (struct ArkPktDirInst *) handle;
+
+	return inst->regs->ctrl;
+}
+
+uint32_t
+ark_pmd_pktdir_stallCnt(ArkPktDir_t handle)
+{
+	struct ArkPktDirInst *inst = (struct ArkPktDirInst *) handle;
+
+	return inst->regs->stallCnt;
+}
diff --git a/drivers/net/ark/ark_pktdir.h b/drivers/net/ark/ark_pktdir.h
new file mode 100644
index 0000000..6c0c634
--- /dev/null
+++ b/drivers/net/ark/ark_pktdir.h
@@ -0,0 +1,68 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_PKTDIR_H_
+#define _ARK_PKTDIR_H_
+
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_eal.h>
+
+#include <rte_ethdev.h>
+#include <rte_cycles.h>
+#include <rte_lcore.h>
+#include <rte_mbuf.h>
+#include <rte_malloc.h>
+
+#define ARK_PKTDIR_BASE_ADR  0xA0000
+
+typedef void *ArkPktDir_t;
+
+struct ArkPktDirRegs {
+	uint32_t ctrl;
+	uint32_t status;
+	uint32_t stallCnt;
+} __attribute__ ((packed));
+
+struct ArkPktDirInst {
+	volatile struct ArkPktDirRegs *regs;
+};
+
+ArkPktDir_t ark_pmd_pktdir_init(void *base);
+void ark_pmd_pktdir_uninit(ArkPktDir_t handle);
+void ark_pmd_pktdir_setup(ArkPktDir_t handle, uint32_t v);
+uint32_t ark_pmd_pktdir_stallCnt(ArkPktDir_t handle);
+uint32_t ark_pmd_pktdir_status(ArkPktDir_t handle);
+
+#endif
diff --git a/drivers/net/ark/ark_pktgen.c b/drivers/net/ark/ark_pktgen.c
new file mode 100644
index 0000000..743aacf
--- /dev/null
+++ b/drivers/net/ark/ark_pktgen.c
@@ -0,0 +1,477 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <getopt.h>
+#include <sys/time.h>
+#include <locale.h>
+#include <unistd.h>
+
+#include "ark_pktgen.h"
+#include "ark_debug.h"
+
+#define ARK_MAX_STR_LEN 64
+union OptV {
+	int Int;
+	int Bool;
+	uint64_t Long;
+	char Str[ARK_MAX_STR_LEN];
+};
+
+enum OPType {
+	OTInt,
+	OTLong,
+	OTBool,
+	OTString
+};
+
+struct Options {
+	char opt[ARK_MAX_STR_LEN];
+	enum OPType t;
+	union OptV v;
+};
+
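+/* Default settings; individual fields are overridden at runtime by
+ * "opt=value" strings fed through ark_pmd_pktgen_parse().
+ */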
+static struct Options toptions[] = {
+	{{"configure"}, OTBool, {1} },
+	{{"dg-mode"}, OTBool, {1} },
+	{{"run"}, OTBool, {0} },
+	{{"pause"}, OTBool, {0} },
+	{{"reset"}, OTBool, {0} },
+	{{"dump"}, OTBool, {0} },
+	{{"genForever"}, OTBool, {0} },
+	{{"enSlavedStart"}, OTBool, {0} },
+	{{"varyLength"}, OTBool, {0} },
+	{{"incrPayload"}, OTBool, {0} },
+	{{"incrFirstByte"}, OTBool, {0} },
+	{{"insSeqNum"}, OTBool, {0} },
+	{{"insTimeStamp"}, OTBool, {1} },
+	{{"insUDPHdr"}, OTBool, {0} },
+	{{"numPkts"}, OTLong, .v.Long = 100000000},
+	{{"payloadByte"}, OTInt, {0x55} },
+	{{"pktSpacing"}, OTInt, {130} },
+	{{"pktSizeMin"}, OTInt, {2006} },
+	{{"pktSizeMax"}, OTInt, {1514} },
+	{{"pktSizeIncr"}, OTInt, {1} },
+	{{"ethType"}, OTInt, {0x0800} },
+	{{"srcMACAddr"}, OTLong, .v.Long = 0xDC3CF6425060L},
+	{{"dstMACAddr"}, OTLong, .v.Long = 0x112233445566L},
+	{{"hdrDW0"}, OTInt, {0x0016e319} },
+	{{"hdrDW1"}, OTInt, {0x27150004} },
+	{{"hdrDW2"}, OTInt, {0x76967bda} },
+	{{"hdrDW3"}, OTInt, {0x08004500} },
+	{{"hdrDW4"}, OTInt, {0x005276ed} },
+	{{"hdrDW5"}, OTInt, {0x40004006} },
+	{{"hdrDW6"}, OTInt, {0x56cfc0a8} },
+	{{"startOffset"}, OTInt, {0} },
+	{{"bytesPerCycle"}, OTInt, {10} },
+	{{"shaping"}, OTBool, {0} },
+	{{"dstIP"}, OTString, .v.Str = "169.254.10.240"},
+	{{"dstPort"}, OTInt, {65536} },
+	{{"srcPort"}, OTInt, {65536} },
+};
+
+ArkPktGen_t
+ark_pmd_pktgen_init(void *adr, int ord, int l2_mode)
+{
+	struct ArkPktGenInst *inst =
+		rte_malloc("ArkPktGenInstPMD", sizeof(struct ArkPktGenInst), 0);
+	inst->regs = (struct ArkPktGenRegs *) adr;
+	inst->ordinal = ord;
+	inst->l2_mode = l2_mode;
+	return inst;
+}
+
+void
+ark_pmd_pktgen_uninit(ArkPktGen_t handle)
+{
+	rte_free(handle);
+}
+
+void
+ark_pmd_pktgen_run(ArkPktGen_t handle)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	inst->regs->pktStartStop = 1;
+}
+
+uint32_t
+ark_pmd_pktgen_paused(ArkPktGen_t handle)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+	uint32_t r = inst->regs->pktStartStop;
+
+	return (((r >> 16) & 1) == 1);
+}
+
+void
+ark_pmd_pktgen_pause(ArkPktGen_t handle)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+	int cnt = 0;
+
+	inst->regs->pktStartStop = 0;
+
+	while (!ark_pmd_pktgen_paused(handle)) {
+		usleep(1000);
+		if (cnt++ > 100) {
+			PMD_DRV_LOG(ERR, "pktgen %d failed to pause.\n", inst->ordinal);
+			break;
+		}
+	}
+	ARK_DEBUG_TRACE("pktgen %d paused.\n", inst->ordinal);
+}
+
+void
+ark_pmd_pktgen_reset(ArkPktGen_t handle)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	if (!ark_pmd_pktgen_isRunning(handle) && !ark_pmd_pktgen_paused(handle)) {
+		ARK_DEBUG_TRACE
+			("pktgen %d is not running and is not paused. No need to reset.\n",
+			 inst->ordinal);
+		return;
+	}
+
+	if (ark_pmd_pktgen_isRunning(handle) && !ark_pmd_pktgen_paused(handle)) {
+		ARK_DEBUG_TRACE("pktgen %d is not paused. Pausing first.\n",
+						inst->ordinal);
+		ark_pmd_pktgen_pause(handle);
+	}
+
+	ARK_DEBUG_TRACE("Resetting pktgen %d.\n", inst->ordinal);
+	inst->regs->pktStartStop = (1 << 8);
+}
+
+uint32_t
+ark_pmd_pktgen_txDone(ArkPktGen_t handle)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+	uint32_t r = inst->regs->pktStartStop;
+
+	return (((r >> 24) & 1) == 1);
+}
+
+uint32_t
+ark_pmd_pktgen_isRunning(ArkPktGen_t handle)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+	uint32_t r = inst->regs->pktStartStop;
+
+	return ((r & 1) == 1);
+}
+
+uint32_t
+ark_pmd_pktgen_isGenForever(ArkPktGen_t handle)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+	uint32_t r = inst->regs->pktCtrl;
+
+	return (((r >> 24) & 1) == 1);
+}
+
+void
+ark_pmd_pktgen_waitDone(ArkPktGen_t handle)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	if (ark_pmd_pktgen_isGenForever(handle)) {
+	PMD_DRV_LOG(ERR, "waitDone will not terminate because genForever=1\n");
+	}
+	int waitCycle = 10;
+
+	while (!ark_pmd_pktgen_txDone(handle) && (waitCycle > 0)) {
+	usleep(1000);
+	waitCycle--;
+	ARK_DEBUG_TRACE("Waiting for pktgen %d to finish sending...\n",
+		inst->ordinal);
+	}
+	ARK_DEBUG_TRACE("pktgen %d done.\n", inst->ordinal);
+}
+
+uint32_t
+ark_pmd_pktgen_getPktsSent(ArkPktGen_t handle)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	return inst->regs->pktsSent;
+}
+
+void
+ark_pmd_pktgen_setPayloadByte(ArkPktGen_t handle, uint32_t b)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	inst->regs->pktPayload = b;
+}
+
+void
+ark_pmd_pktgen_setPktSpacing(ArkPktGen_t handle, uint32_t x)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	inst->regs->pktSpacing = x;
+}
+
+void
+ark_pmd_pktgen_setPktSizeMin(ArkPktGen_t handle, uint32_t x)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	inst->regs->pktSizeMin = x;
+}
+
+void
+ark_pmd_pktgen_setPktSizeMax(ArkPktGen_t handle, uint32_t x)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	inst->regs->pktSizeMax = x;
+}
+
+void
+ark_pmd_pktgen_setPktSizeIncr(ArkPktGen_t handle, uint32_t x)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	inst->regs->pktSizeIncr = x;
+}
+
+void
+ark_pmd_pktgen_setNumPkts(ArkPktGen_t handle, uint32_t x)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	inst->regs->numPkts = x;
+}
+
+void
+ark_pmd_pktgen_setSrcMACAddr(ArkPktGen_t handle, uint64_t macAddr)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	inst->regs->srcMACAddrH = (macAddr >> 32) & 0xffff;
+	inst->regs->srcMACAddrL = macAddr & 0xffffffff;
+}
+
+void
+ark_pmd_pktgen_setDstMACAddr(ArkPktGen_t handle, uint64_t macAddr)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	inst->regs->dstMACAddrH = (macAddr >> 32) & 0xffff;
+	inst->regs->dstMACAddrL = macAddr & 0xffffffff;
+}
+
+void
+ark_pmd_pktgen_setEthType(ArkPktGen_t handle, uint32_t x)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	inst->regs->ethType = x;
+}
+
+void
+ark_pmd_pktgen_setHdrDW(ArkPktGen_t handle, uint32_t *hdr)
+{
+	uint32_t i;
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	for (i = 0; i < 7; i++)
+		inst->regs->hdrDW[i] = hdr[i];
+}
+
+void
+ark_pmd_pktgen_setStartOffset(ArkPktGen_t handle, uint32_t x)
+{
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	inst->regs->startOffset = x;
+}
+
+static struct Options *
+OPTIONS(const char *id)
+{
+	unsigned i;
+
+	for (i = 0; i < sizeof(toptions) / sizeof(struct Options); i++) {
+		if (strcmp(id, toptions[i].opt) == 0)
+			return &toptions[i];
+	}
+
+	PMD_DRV_LOG(ERR,
+		"pktgen: Could not find requested option !!, option = %s\n", id);
+	return NULL;
+}
+
+static int pmd_setArg(char *arg, char *val);
+static int
+pmd_setArg(char *arg, char *val)
+{
+	struct Options *o = OPTIONS(arg);
+
+	if (o) {
+	switch (o->t) {
+	case OTInt:
+	case OTBool:
+		o->v.Int = atoi(val);
+		break;
+	case OTLong:
+		o->v.Long = atoll(val);
+		break;
+	case OTString:
+		strncpy(o->v.Str, val, ARK_MAX_STR_LEN - 1);
+		break;
+	}
+	return 1;
+	}
+	return 0;
+}
+
+/******
+ * Arg format = "opt0=v,optN=v ..."
+ ******/
+void
+ark_pmd_pktgen_parse(char *args)
+{
+	char *argv, *v;
+	const char toks[] = " =\n\t\v\f\r";
+
+	argv = strtok(args, toks);
+	v = strtok(NULL, toks);
+	pmd_setArg(argv, v);
+	while (argv && v) {
+	argv = strtok(NULL, toks);
+	v = strtok(NULL, toks);
+	if (argv && v)
+		pmd_setArg(argv, v);
+	}
+}
+
+static int32_t parseIPV4string(char const *ipAddress);
+static int32_t
+parseIPV4string(char const *ipAddress)
+{
+	unsigned int ip[4];
+
+	if (4 != sscanf(ipAddress, "%u.%u.%u.%u", &ip[0], &ip[1], &ip[2], &ip[3]))
+	return 0;
+	return ip[3] + ip[2] * 0x100 + ip[1] * 0x10000ul + ip[0] * 0x1000000ul;
+}
+
+static void
+ark_pmd_pktgen_setPktCtrl(ArkPktGen_t handle, uint32_t genForever,
+	uint32_t enSlavedStart, uint32_t varyLength, uint32_t incrPayload,
+	uint32_t incrFirstByte, uint32_t insSeqNum, uint32_t insUDPHdr,
+	uint32_t insTimeStamp)
+{
+	uint32_t r;
+	struct ArkPktGenInst *inst = (struct ArkPktGenInst *) handle;
+
+	if (!inst->l2_mode)
+		insUDPHdr = 0;
+
+	r = (genForever << 24) | (enSlavedStart << 20) | (varyLength << 16) |
+	(incrPayload << 12) | (incrFirstByte << 8) |
+	(insTimeStamp << 5) | (insSeqNum << 4) | insUDPHdr;
+
+	inst->regs->bytesPerCycle = OPTIONS("bytesPerCycle")->v.Int;
+	if (OPTIONS("shaping")->v.Bool)
+		r = r | (1 << 28);	/* enable shaping */
+
+	inst->regs->pktCtrl = r;
+}
+
+void
+ark_pmd_pktgen_setup(ArkPktGen_t handle)
+{
+	uint32_t hdr[7];
+	int32_t dstIp = parseIPV4string(OPTIONS("dstIP")->v.Str);
+
+	if (!OPTIONS("pause")->v.Bool && (!OPTIONS("reset")->v.Bool
+		&& (OPTIONS("configure")->v.Bool))) {
+
+	ark_pmd_pktgen_setPayloadByte(handle, OPTIONS("payloadByte")->v.Int);
+	ark_pmd_pktgen_setSrcMACAddr(handle, OPTIONS("srcMACAddr")->v.Long);
+	ark_pmd_pktgen_setDstMACAddr(handle, OPTIONS("dstMACAddr")->v.Long);
+	ark_pmd_pktgen_setEthType(handle, OPTIONS("ethType")->v.Int);
+
+	if (OPTIONS("dg-mode")->v.Bool) {
+		hdr[0] = OPTIONS("hdrDW0")->v.Int;
+		hdr[1] = OPTIONS("hdrDW1")->v.Int;
+		hdr[2] = OPTIONS("hdrDW2")->v.Int;
+		hdr[3] = OPTIONS("hdrDW3")->v.Int;
+		hdr[4] = OPTIONS("hdrDW4")->v.Int;
+		hdr[5] = OPTIONS("hdrDW5")->v.Int;
+		hdr[6] = OPTIONS("hdrDW6")->v.Int;
+	} else {
+		hdr[0] = dstIp;
+		hdr[1] = OPTIONS("dstPort")->v.Int;
+		hdr[2] = OPTIONS("srcPort")->v.Int;
+		hdr[3] = 0;
+		hdr[4] = 0;
+		hdr[5] = 0;
+		hdr[6] = 0;
+	}
+	ark_pmd_pktgen_setHdrDW(handle, hdr);
+	ark_pmd_pktgen_setNumPkts(handle, OPTIONS("numPkts")->v.Long);
+	ark_pmd_pktgen_setPktSizeMin(handle, OPTIONS("pktSizeMin")->v.Int);
+	ark_pmd_pktgen_setPktSizeMax(handle, OPTIONS("pktSizeMax")->v.Int);
+	ark_pmd_pktgen_setPktSizeIncr(handle, OPTIONS("pktSizeIncr")->v.Int);
+	ark_pmd_pktgen_setPktSpacing(handle, OPTIONS("pktSpacing")->v.Int);
+	ark_pmd_pktgen_setStartOffset(handle, OPTIONS("startOffset")->v.Int);
+	ark_pmd_pktgen_setPktCtrl(handle,
+		OPTIONS("genForever")->v.Bool,
+		OPTIONS("enSlavedStart")->v.Bool,
+		OPTIONS("varyLength")->v.Bool,
+		OPTIONS("incrPayload")->v.Bool,
+		OPTIONS("incrFirstByte")->v.Bool,
+		OPTIONS("insSeqNum")->v.Int,
+		OPTIONS("insUDPHdr")->v.Bool, OPTIONS("insTimeStamp")->v.Int);
+	}
+
+	if (OPTIONS("pause")->v.Bool)
+	ark_pmd_pktgen_pause(handle);
+
+	if (OPTIONS("reset")->v.Bool)
+	ark_pmd_pktgen_reset(handle);
+
+	if (OPTIONS("run")->v.Bool) {
+	ARK_DEBUG_TRACE("Starting packet generator on port %d\n",
+		OPTIONS("port")->v.Int);
+	ark_pmd_pktgen_run(handle);
+	}
+}
diff --git a/drivers/net/ark/ark_pktgen.h b/drivers/net/ark/ark_pktgen.h
new file mode 100644
index 0000000..3d81d60
--- /dev/null
+++ b/drivers/net/ark/ark_pktgen.h
@@ -0,0 +1,106 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_PKTGEN_H_
+#define _ARK_PKTGEN_H_
+
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_eal.h>
+
+#include <rte_ethdev.h>
+#include <rte_cycles.h>
+#include <rte_lcore.h>
+#include <rte_mbuf.h>
+#include <rte_malloc.h>
+
+#define ARK_PKTGEN_BASE_ADR  0x10000
+
+typedef void *ArkPktGen_t;
+
+struct ArkPktGenRegs {
+	uint32_t r0;
+	volatile uint32_t pktStartStop;
+	volatile uint32_t pktCtrl;
+	uint32_t pktPayload;
+	uint32_t pktSpacing;
+	uint32_t pktSizeMin;
+	uint32_t pktSizeMax;
+	uint32_t pktSizeIncr;
+	volatile uint32_t numPkts;
+	volatile uint32_t pktsSent;
+	uint32_t srcMACAddrL;
+	uint32_t srcMACAddrH;
+	uint32_t dstMACAddrL;
+	uint32_t dstMACAddrH;
+	uint32_t ethType;
+	uint32_t hdrDW[7];
+	uint32_t startOffset;
+	uint32_t bytesPerCycle;
+} __attribute__ ((packed));
+
+struct ArkPktGenInst {
+	struct rte_eth_dev_info *dev_info;
+	struct ArkPktGenRegs *regs;
+	int l2_mode;
+	int ordinal;
+};
+
+/*  packet generator functions */
+ArkPktGen_t ark_pmd_pktgen_init(void *, int ord, int l2_mode);
+void ark_pmd_pktgen_uninit(ArkPktGen_t handle);
+void ark_pmd_pktgen_run(ArkPktGen_t handle);
+void ark_pmd_pktgen_pause(ArkPktGen_t handle);
+uint32_t ark_pmd_pktgen_paused(ArkPktGen_t handle);
+uint32_t ark_pmd_pktgen_isGenForever(ArkPktGen_t handle);
+uint32_t ark_pmd_pktgen_isRunning(ArkPktGen_t handle);
+uint32_t ark_pmd_pktgen_txDone(ArkPktGen_t handle);
+void ark_pmd_pktgen_reset(ArkPktGen_t handle);
+void ark_pmd_pktgen_waitDone(ArkPktGen_t handle);
+uint32_t ark_pmd_pktgen_getPktsSent(ArkPktGen_t handle);
+void ark_pmd_pktgen_setPayloadByte(ArkPktGen_t handle, uint32_t b);
+void ark_pmd_pktgen_setPktSpacing(ArkPktGen_t handle, uint32_t x);
+void ark_pmd_pktgen_setPktSizeMin(ArkPktGen_t handle, uint32_t x);
+void ark_pmd_pktgen_setPktSizeMax(ArkPktGen_t handle, uint32_t x);
+void ark_pmd_pktgen_setPktSizeIncr(ArkPktGen_t handle, uint32_t x);
+void ark_pmd_pktgen_setNumPkts(ArkPktGen_t handle, uint32_t x);
+void ark_pmd_pktgen_setSrcMACAddr(ArkPktGen_t handle, uint64_t macAddr);
+void ark_pmd_pktgen_setDstMACAddr(ArkPktGen_t handle, uint64_t macAddr);
+void ark_pmd_pktgen_setEthType(ArkPktGen_t handle, uint32_t x);
+void ark_pmd_pktgen_setHdrDW(ArkPktGen_t handle, uint32_t *hdr);
+void ark_pmd_pktgen_setStartOffset(ArkPktGen_t handle, uint32_t x);
+void ark_pmd_pktgen_parse(char *argv);
+void ark_pmd_pktgen_setup(ArkPktGen_t handle);
+
+#endif
diff --git a/drivers/net/ark/ark_rqp.c b/drivers/net/ark/ark_rqp.c
new file mode 100644
index 0000000..468c6d4
--- /dev/null
+++ b/drivers/net/ark/ark_rqp.c
@@ -0,0 +1,93 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_rqp.h"
+#include "ark_debug.h"
+
+/* ************************************************************************* */
+void
+ark_rqp_stats_reset(struct ark_rqpace_t *rqp)
+{
+	rqp->statsClear = 1;
+	/* POR 992 */
+	/* rqp->cpld_max = 992; */
+	/* POR 64 */
+	/* rqp->cplh_max = 64; */
+}
+
+/* ************************************************************************* */
+void
+ark_rqp_dump(struct ark_rqpace_t *rqp)
+{
+	if (rqp->errCountOther != 0)
+	fprintf
+		(stderr,
+		"ARKP RQP Errors noted: ctrl: %d cplh_hmax %d cpld_max %d"
+		 FMT_SU32
+		FMT_SU32 "\n",
+		 rqp->ctrl, rqp->cplh_max, rqp->cpld_max,
+		"Error Count", rqp->errCnt,
+		 "Error General", rqp->errCountOther);
+
+	ARK_DEBUG_STATS
+		("ARKP RQP Dump: ctrl: %d cplh_hmax %d cpld_max %d" FMT_SU32
+		 FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32
+		 FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32
+		 FMT_SU32 FMT_SU32 FMT_SU32
+		 FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32 "\n",
+		 rqp->ctrl, rqp->cplh_max, rqp->cpld_max,
+		 "Error Count", rqp->errCnt,
+		 "Error General", rqp->errCountOther,
+		 "stallPS", rqp->stallPS,
+		 "stallPS Min", rqp->stallPSMin,
+		 "stallPS Max", rqp->stallPSMax,
+		 "reqPS", rqp->reqPS,
+		 "reqPS Min", rqp->reqPSMin,
+		 "reqPS Max", rqp->reqPSMax,
+		 "reqDWPS", rqp->reqDWPS,
+		 "reqDWPS Min", rqp->reqDWPSMin,
+		 "reqDWPS Max", rqp->reqDWPSMax,
+		 "cplPS", rqp->cplPS,
+		 "cplPS Min", rqp->cplPSMin,
+		 "cplPS Max", rqp->cplPSMax,
+		 "cplDWPS", rqp->cplDWPS,
+		 "cplDWPS Min", rqp->cplDWPSMin,
+		 "cplDWPS Max", rqp->cplDWPSMax,
+		 "cplh pending", rqp->cplh_pending,
+		 "cpld pending", rqp->cpld_pending,
+		 "cplh pending max", rqp->cplh_pending_max,
+		 "cpld pending max", rqp->cpld_pending_max);
+}
diff --git a/drivers/net/ark/ark_rqp.h b/drivers/net/ark/ark_rqp.h
new file mode 100644
index 0000000..1c11c41
--- /dev/null
+++ b/drivers/net/ark/ark_rqp.h
@@ -0,0 +1,75 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_RQP_H_
+#define _ARK_RQP_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+
+/*
+ * RQ Pacing core hardware structure
+ */
+struct ark_rqpace_t {
+	volatile uint32_t ctrl;
+	volatile uint32_t statsClear;
+	volatile uint32_t cplh_max;
+	volatile uint32_t cpld_max;
+	volatile uint32_t errCnt;
+	volatile uint32_t stallPS;
+	volatile uint32_t stallPSMin;
+	volatile uint32_t stallPSMax;
+	volatile uint32_t reqPS;
+	volatile uint32_t reqPSMin;
+	volatile uint32_t reqPSMax;
+	volatile uint32_t reqDWPS;
+	volatile uint32_t reqDWPSMin;
+	volatile uint32_t reqDWPSMax;
+	volatile uint32_t cplPS;
+	volatile uint32_t cplPSMin;
+	volatile uint32_t cplPSMax;
+	volatile uint32_t cplDWPS;
+	volatile uint32_t cplDWPSMin;
+	volatile uint32_t cplDWPSMax;
+	volatile uint32_t cplh_pending;
+	volatile uint32_t cpld_pending;
+	volatile uint32_t cplh_pending_max;
+	volatile uint32_t cpld_pending_max;
+	volatile uint32_t errCountOther;
+};
+
+void ark_rqp_dump(struct ark_rqpace_t *rqp);
+void ark_rqp_stats_reset(struct ark_rqpace_t *rqp);
+
+#endif
diff --git a/drivers/net/ark/ark_udm.c b/drivers/net/ark/ark_udm.c
new file mode 100644
index 0000000..a239c54
--- /dev/null
+++ b/drivers/net/ark/ark_udm.c
@@ -0,0 +1,221 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include "ark_debug.h"
+#include "ark_udm.h"
+
+int
+ark_udm_verify(struct ark_udm_t *udm)
+{
+	if (sizeof(struct ark_udm_t) != ARK_UDM_EXPECT_SIZE) {
+		fprintf(stderr, "  UDM structure looks incorrect %#x vs %#lx\n",
+				ARK_UDM_EXPECT_SIZE, sizeof(struct ark_udm_t));
+		return -1;
+	}
+
+	if (udm->setup.const0 != ARK_UDM_CONST) {
+		fprintf(stderr, "  UDM module not found as expected 0x%08x\n",
+				udm->setup.const0);
+		return -1;
+	}
+	return 0;
+}
+
+int
+ark_udm_stop(struct ark_udm_t *udm, const int wait)
+{
+	int cnt = 0;
+
+	udm->cfg.command = 2;
+
+	while (wait && (udm->cfg.stopFlushed & 0x01) == 0) {
+	if (cnt++ > 1000)
+		return 1;
+
+	usleep(10);
+	}
+	return 0;
+}
+
+int
+ark_udm_reset(struct ark_udm_t *udm)
+{
+	int status;
+
+	status = ark_udm_stop(udm, 1);
+	if (status != 0) {
+		ARK_DEBUG_TRACE("ARKP: %s  stop failed  doing forced reset\n",
+						__func__);
+		udm->cfg.command = 4;
+		usleep(10);
+		udm->cfg.command = 3;
+		status = ark_udm_stop(udm, 0);
+		ARK_DEBUG_TRACE
+			("ARKP: %s  stop status %d post failure and forced reset\n",
+			 __func__, status);
+	} else {
+		udm->cfg.command = 3;
+	}
+
+	return status;
+}
+
+void
+ark_udm_start(struct ark_udm_t *udm)
+{
+	udm->cfg.command = 1;
+}
+
+void
+ark_udm_stats_reset(struct ark_udm_t *udm)
+{
+	udm->pcibp.pci_clear = 1;
+	udm->tlp_ps.tlp_clear = 1;
+}
+
+void
+ark_udm_configure(struct ark_udm_t *udm, uint32_t headroom, uint32_t dataroom,
+	uint32_t write_interval_ns)
+{
+	/* headroom and data room are in DWs in the UDM */
+	udm->cfg.dataroom = dataroom / 4;
+	udm->cfg.headroom = headroom / 4;
+
+	/* 4 NS period ns */
+	udm->rt_cfg.writeInterval = write_interval_ns / 4;
+}
+
+void
+ark_udm_write_addr(struct ark_udm_t *udm, phys_addr_t addr)
+{
+	udm->rt_cfg.hwProdAddr = addr;
+}
+
+int
+ark_udm_is_flushed(struct ark_udm_t *udm)
+{
+	return (udm->cfg.stopFlushed & 0x01) != 0;
+}
+
+uint64_t
+ark_udm_dropped(struct ark_udm_t *udm)
+{
+	return udm->qstats.qPktDrop;
+}
+
+uint64_t
+ark_udm_bytes(struct ark_udm_t *udm)
+{
+	return udm->qstats.qByteCount;
+}
+
+uint64_t
+ark_udm_packets(struct ark_udm_t *udm)
+{
+	return udm->qstats.qFFPacketCount;
+}
+
+void
+ark_udm_dump_stats(struct ark_udm_t *udm, const char *msg)
+{
+	ARK_DEBUG_STATS("ARKP UDM Stats: %s" FMT_SU64 FMT_SU64 FMT_SU64 FMT_SU64
+	FMT_SU64 "\n", msg, "Pkts Received", udm->stats.rxPacketCount,
+	"Pkts Finalized", udm->stats.rxSentPackets, "Pkts Dropped",
+	udm->tlp.pkt_drop, "Bytes Count", udm->stats.rxByteCount, "MBuf Count",
+	udm->stats.rxMBufCount);
+}
+
+void
+ark_udm_dump_queue_stats(struct ark_udm_t *udm, const char *msg, uint16_t qid)
+{
+	ARK_DEBUG_STATS
+		("ARKP UDM Queue %3u Stats: %s"
+		 FMT_SU64 FMT_SU64
+		 FMT_SU64 FMT_SU64
+		 FMT_SU64 "\n",
+		 qid, msg,
+		 "Pkts Received", udm->qstats.qPacketCount,
+		 "Pkts Finalized", udm->qstats.qFFPacketCount,
+		 "Pkts Dropped", udm->qstats.qPktDrop,
+		 "Bytes Count", udm->qstats.qByteCount,
+		 "MBuf Count", udm->qstats.qMbufCount);
+}
+
+void
+ark_udm_dump(struct ark_udm_t *udm, const char *msg)
+{
+	ARK_DEBUG_TRACE("ARKP UDM Dump: %s Stopped: %d\n", msg,
+	udm->cfg.stopFlushed);
+}
+
+void
+ark_udm_dump_setup(struct ark_udm_t *udm, uint16_t qId)
+{
+	ARK_DEBUG_TRACE
+		("UDM Setup Q: %u"
+		 FMT_SPTR FMT_SU32 "\n",
+		 qId,
+		 "hwProdAddr", (void *) udm->rt_cfg.hwProdAddr,
+		 "prodIdx", udm->rt_cfg.prodIdx);
+}
+
+void
+ark_udm_dump_perf(struct ark_udm_t *udm, const char *msg)
+{
+	struct ark_udm_pcibp_t *bp = &udm->pcibp;
+
+	ARK_DEBUG_STATS
+		("ARKP UDM Performance %s"
+		 FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32 FMT_SU32 "\n",
+		 msg,
+		 "PCI Empty", bp->pci_empty,
+		 "PCI Q1", bp->pci_q1,
+		 "PCI Q2", bp->pci_q2,
+		 "PCI Q3", bp->pci_q3,
+		 "PCI Q4", bp->pci_q4,
+		 "PCI Full", bp->pci_full);
+}
+
+void
+ark_udm_queue_stats_reset(struct ark_udm_t *udm)
+{
+	udm->qstats.qByteCount = 1;
+}
+
+void
+ark_udm_queue_enable(struct ark_udm_t *udm, int enable)
+{
+	udm->qstats.qEnable = enable ? 1 : 0;
+}
diff --git a/drivers/net/ark/ark_udm.h b/drivers/net/ark/ark_udm.h
new file mode 100644
index 0000000..89dcf8a
--- /dev/null
+++ b/drivers/net/ark/ark_udm.h
@@ -0,0 +1,175 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2015-2017 Atomic Rules LLC
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _ARK_UDM_H_
+#define _ARK_UDM_H_
+
+#include <stdint.h>
+
+#include <rte_memory.h>
+
+/*
+ * UDM hardware structures
+ */
+
+#define ARK_RX_WRITE_TIME_NS 2500
+#define ARK_UDM_SETUP 0
+#define ARK_UDM_CONST 0xBACECACE
+struct ark_udm_setup_t {
+	uint32_t r0;
+	uint32_t r4;
+	volatile uint32_t cycleCount;
+	uint32_t const0;
+};
+
+#define ARK_UDM_CFG 0x010
+struct ark_udm_cfg_t {
+	volatile uint32_t stopFlushed;	/* RO */
+	volatile uint32_t command;
+	uint32_t dataroom;
+	uint32_t headroom;
+};
+
+typedef enum {
+	ARK_UDM_START = 0x1,
+	ARK_UDM_STOP = 0x2,
+	ARK_UDM_RESET = 0x3
+} ArkUdmCommands;
+
+#define ARK_UDM_STATS 0x020
+struct ark_udm_stats_t {
+	volatile uint64_t rxByteCount;
+	volatile uint64_t rxPacketCount;
+	volatile uint64_t rxMBufCount;
+	volatile uint64_t rxSentPackets;
+};
+
+#define ARK_UDM_PQ 0x040
+struct ark_udm_queue_stats_t {
+	volatile uint64_t qByteCount;
+	volatile uint64_t qPacketCount;	/* includes drops */
+	volatile uint64_t qMbufCount;
+	volatile uint64_t qFFPacketCount;
+	volatile uint64_t qPktDrop;
+	uint32_t qEnable;
+};
+
+#define ARK_UDM_TLP 0x0070
+struct ark_udm_tlp_t {
+	volatile uint64_t pkt_drop;	/* global */
+	volatile uint32_t tlp_q1;
+	volatile uint32_t tlp_q2;
+	volatile uint32_t tlp_q3;
+	volatile uint32_t tlp_q4;
+	volatile uint32_t tlp_full;
+};
+
+#define ARK_UDM_PCIBP 0x00a0
+struct ark_udm_pcibp_t {
+	volatile uint32_t pci_clear;
+	volatile uint32_t pci_empty;
+	volatile uint32_t pci_q1;
+	volatile uint32_t pci_q2;
+	volatile uint32_t pci_q3;
+	volatile uint32_t pci_q4;
+	volatile uint32_t pci_full;
+};
+
+#define ARK_UDM_TLP_PS 0x00bc
+struct ark_udm_tlp_ps_t {
+	volatile uint32_t tlp_clear;
+	volatile uint32_t tlp_ps_min;
+	volatile uint32_t tlp_ps_max;
+	volatile uint32_t tlp_full_ps_min;
+	volatile uint32_t tlp_full_ps_max;
+	volatile uint32_t tlp_dw_ps_min;
+	volatile uint32_t tlp_dw_ps_max;
+	volatile uint32_t tlp_pldw_ps_min;
+	volatile uint32_t tlp_pldw_ps_max;
+};
+
+#define ARK_UDM_RT_CFG 0x00E0
+struct ark_udm_rt_cfg_t {
+	phys_addr_t hwProdAddr;
+	uint32_t writeInterval;	/* 4ns cycles */
+	volatile uint32_t prodIdx;	/* RO */
+};
+
+/*  Consolidated structure */
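+/* reservedN pads keep each register block at its fixed hardware offset */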
+struct ark_udm_t {
+	struct ark_udm_setup_t setup;
+	struct ark_udm_cfg_t cfg;
+	struct ark_udm_stats_t stats;
+	struct ark_udm_queue_stats_t qstats;
+	uint8_t reserved1[(ARK_UDM_TLP - ARK_UDM_PQ) -
+					  sizeof(struct ark_udm_queue_stats_t)];
+	struct ark_udm_tlp_t tlp;
+	uint8_t reserved2[(ARK_UDM_PCIBP - ARK_UDM_TLP) -
+					  sizeof(struct ark_udm_tlp_t)];
+	struct ark_udm_pcibp_t pcibp;
+	struct ark_udm_tlp_ps_t tlp_ps;
+	struct ark_udm_rt_cfg_t rt_cfg;
+	int8_t reserved3[(0x100 - ARK_UDM_RT_CFG) -
+					  sizeof(struct ark_udm_rt_cfg_t)];
+};
+
+#define ARK_UDM_EXPECT_SIZE (0x00fc + 4)
+#define ARK_UDM_QOFFSET ARK_UDM_EXPECT_SIZE
+
+int ark_udm_verify(struct ark_udm_t *udm);
+int ark_udm_stop(struct ark_udm_t *udm, int wait);
+void ark_udm_start(struct ark_udm_t *udm);
+int ark_udm_reset(struct ark_udm_t *udm);
+void ark_udm_configure(struct ark_udm_t *udm,
+					   uint32_t headroom,
+					   uint32_t dataroom,
+					   uint32_t write_interval_ns);
+void ark_udm_write_addr(struct ark_udm_t *udm, phys_addr_t addr);
+void ark_udm_stats_reset(struct ark_udm_t *udm);
+void ark_udm_dump_stats(struct ark_udm_t *udm, const char *msg);
+void ark_udm_dump_queue_stats(struct ark_udm_t *udm, const char *msg,
+							  uint16_t qid);
+void ark_udm_dump(struct ark_udm_t *udm, const char *msg);
+void ark_udm_dump_perf(struct ark_udm_t *udm, const char *msg);
+void ark_udm_dump_setup(struct ark_udm_t *udm, uint16_t qId);
+int ark_udm_is_flushed(struct ark_udm_t *udm);
+
+/* Per queue data */
+uint64_t ark_udm_dropped(struct ark_udm_t *udm);
+uint64_t ark_udm_bytes(struct ark_udm_t *udm);
+uint64_t ark_udm_packets(struct ark_udm_t *udm);
+
+void ark_udm_queue_stats_reset(struct ark_udm_t *udm);
+void ark_udm_queue_enable(struct ark_udm_t *udm, int enable);
+
+#endif
diff --git a/drivers/net/ark/rte_pmd_ark_version.map b/drivers/net/ark/rte_pmd_ark_version.map
new file mode 100644
index 0000000..7f84780
--- /dev/null
+++ b/drivers/net/ark/rte_pmd_ark_version.map
@@ -0,0 +1,4 @@
+DPDK_2.0 {
+	 local: *;
+
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 0e0b600..da23898 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -104,6 +104,7 @@ ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
 # plugins (link only if static libraries)
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET)  += -lrte_pmd_af_packet
+_LDLIBS-$(CONFIG_RTE_LIBRTE_ARK_PMD)        += -lrte_pmd_ark
 _LDLIBS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD)      += -lrte_pmd_bnx2x -lz
 _LDLIBS-$(CONFIG_RTE_LIBRTE_BNXT_PMD)       += -lrte_pmd_bnxt
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)       += -lrte_pmd_bond
-- 
1.9.1
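
A side note on ark_udm.h above: the reserved arrays pad each block so that
it lands at its documented ARK_UDM_* offset, with ARK_UDM_EXPECT_SIZE as the
sanity value for the whole register window. As an illustration only (these
asserts are not part of the patch, which presumably does an equivalent
run-time check in ark_udm_verify()), the layout intent could be pinned down
at compile time like this:

#include <stddef.h>

/* Hypothetical compile-time checks for the register-overlay struct;
 * the offsets come from the ARK_UDM_* defines in the patch above. */
_Static_assert(offsetof(struct ark_udm_t, stats) == ARK_UDM_STATS,
		"stats block must sit at its hardware offset");
_Static_assert(offsetof(struct ark_udm_t, pcibp) == ARK_UDM_PCIBP,
		"PCI backpressure block must sit at its hardware offset");
_Static_assert(sizeof(struct ark_udm_t) == ARK_UDM_EXPECT_SIZE,
		"overlay must match the device register window size");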

^ permalink raw reply	[relevance 1%]

* Re: [dpdk-dev] [PATCH v2] mk: Provide option to set Major ABI version
  2017-03-17  8:27  4%                 ` Christian Ehrhardt
@ 2017-03-17  9:16  4%                   ` Jan Blunck
  0 siblings, 0 replies; 200+ results
From: Jan Blunck @ 2017-03-17  9:16 UTC (permalink / raw)
  To: Christian Ehrhardt
  Cc: Thomas Monjalon, dev, cjcollier, ricardo.salveti, Luca Boccassi

On Fri, Mar 17, 2017 at 9:27 AM, Christian Ehrhardt
<christian.ehrhardt@canonical.com> wrote:
>
> On Thu, Mar 16, 2017 at 6:19 PM, Thomas Monjalon <thomas.monjalon@6wind.com>
> wrote:
>>
>> Not sure about how it can be used in distributions, but it does not hurt
>> to provide the config option.
>> Are you going to link applications against a fixed DPDK version for
>> every library?
>
>
> > Kind of, yes - we can still update "inside" a major version, e.g. stable
> > releases, just fine.
> > That helps a lot when transitioning from one release to the next.

Exactly. It enables me to roll out a new major release without being
blocked on the (external) consumers picking up the changes. Now I can
do this one by one with reduced risk because the runtimes are clearly
separated.

> And I already heard that even other downstreams are using it to simplify
> their dependencies.
>

Downstream?! Pha! ;)

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] mk: Provide option to set Major ABI version
  2017-03-16 17:19  4%               ` Thomas Monjalon
@ 2017-03-17  8:27  4%                 ` Christian Ehrhardt
  2017-03-17  9:16  4%                   ` Jan Blunck
  0 siblings, 1 reply; 200+ results
From: Christian Ehrhardt @ 2017-03-17  8:27 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Jan Blunck, cjcollier, ricardo.salveti, Luca Boccassi

On Thu, Mar 16, 2017 at 6:19 PM, Thomas Monjalon <thomas.monjalon@6wind.com>
wrote:

> Not sure about how it can be used in distributions, but it does not hurt
> to provide the config option.
> Are you going to link applications against a fixed DPDK version for
> every library?
>

Kind of, yes - we can still update "inside" a major version, e.g. stable
releases, just fine.
That helps a lot when transitioning from one release to the next.

I have an RFC patch for the Debian packaging making use of this function
that we will likely refresh and pick up on our next merge of a DPDK version.
And I already heard that even other downstreams are using it to simplify
their dependencies.

Applied, thanks
>

Thanks !


-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD vs virtio? - ivshmem is back
  2017-03-17  0:08  0%               ` Wiles, Keith
@ 2017-03-17  0:15  0%                 ` O'Driscoll, Tim
  0 siblings, 0 replies; 200+ results
From: O'Driscoll, Tim @ 2017-03-17  0:15 UTC (permalink / raw)
  To: Wiles, Keith, Vincent JARDIN
  Cc: Stephen Hemminger, Legacy, Allain (Wind River),
	Yigit, Ferruh, dev, Jolliffe, Ian (Wind River),
	Richardson, Bruce, Mcnamara, John, thomas.monjalon, jerin.jacob,
	3chas3, stefanha, Markus Armbruster

> From: Wiles, Keith
> 
> 
> > On Mar 17, 2017, at 7:41 AM, Vincent JARDIN <vincent.jardin@6wind.com>
> wrote:
> >
> > Let's go back to 2014 and Qemu's thoughts on it,
> > +Stefan
> > https://lists.linuxfoundation.org/pipermail/virtualization/2014-
> June/026767.html
> >
> > and
> > +Markus
> > https://lists.linuxfoundation.org/pipermail/virtualization/2014-
> June/026713.html
> >
> >> 6. Device models belong into QEMU
> >>
> >>   Say you build an actual interface on top of ivshmem.  Then ivshmem
> in
> >>   QEMU together with the supporting host code outside QEMU (see 3.)
> and
> >>   the lower layer of the code using it in guests (kernel + user
> space)
> >>   provide something that to me very much looks like a device model.
> >>
> >>   Device models belong into QEMU.  It's what QEMU does.
> >
> >
> > On 17/03/2017 at 00:17, Stephen Hemminger wrote:
> >> On Wed, 15 Mar 2017 04:10:56 +0000
> >> "O'Driscoll, Tim" <tim.odriscoll@intel.com> wrote:
> >>
> >>> I've included a couple of specific comments inline below, and a
> general comment here.
> >>>
> >>> We have somebody proposing to add a new driver to DPDK. It's
> standalone and doesn't affect any of the core libraries. They're willing
> to maintain the driver and have included a patch to update the
> maintainers file. They've also included the relevant documentation
> changes. I haven't seen any negative comment on the patches themselves
> except for a request from John McNamara for an update to the Release
> Notes that was addressed in a later version. I think we should be
> welcoming this into DPDK rather than questioning/rejecting it.
> >>>
> >>> I'd suggest that this is a good topic for the next Tech Board
> meeting.
> >>
> >> This is a virtualization driver for supporting DPDK on a platform that
> provides an alternative
> >> virtual network driver. I see no reason it shouldn't be part of DPDK.
> Given the unstable
> >> ABI for drivers, supporting out of tree DPDK drivers is difficult.
> The DPDK should try
> >> to be inclusive and support as many environments as possible.
> 
> 
> +2!! for Stephen’s comment.

+1 (I'm only half as important as Keith :-)

I don't think there's any doubt over the benefit of virtio and the fact that it should be our preferred solution. I'm sure everybody agrees on that. The issue is whether we should block alternative solutions. I don't think we should.

> 
> >>
> >
> > On Qemu mailing list, back to 2014, I did push to build models of
> devices over ivshmem, like AVP, but folks did not want that we abuse of
> it. The Qemu community wants that we avoid unfocusing. So, by being too
> much inclusive, we abuse of the Qemu's capabilities.
> >
> > So, because of being "inclusive", we should allow any cases, it is not
> a proper way to make sure that virtio gets all the focuses it deserves.
> >
> >
> 
> Regards,
> Keith


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD vs virtio? - ivshmem is back
  2017-03-17  0:11  0%               ` Wiles, Keith
@ 2017-03-17  0:14  0%                 ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2017-03-17  0:14 UTC (permalink / raw)
  To: Wiles, Keith
  Cc: Vincent JARDIN, O'Driscoll, Tim, Legacy, Allain (Wind River),
	Yigit, Ferruh, dev, Jolliffe, Ian (Wind River),
	Richardson, Bruce, Mcnamara, John, thomas.monjalon, jerin.jacob,
	3chas3, stefanha, Markus Armbruster

On Fri, 17 Mar 2017 00:11:10 +0000
"Wiles, Keith" <keith.wiles@intel.com> wrote:

> > On Mar 17, 2017, at 7:41 AM, Vincent JARDIN <vincent.jardin@6wind.com> wrote:
> > 
> > Let's go back to 2014 and Qemu's thoughts on it,
> > +Stefan
> > https://lists.linuxfoundation.org/pipermail/virtualization/2014-June/026767.html
> > 
> > and
> > +Markus
> > https://lists.linuxfoundation.org/pipermail/virtualization/2014-June/026713.html
> >   
> >> 6. Device models belong into QEMU
> >> 
> >>   Say you build an actual interface on top of ivshmem.  Then ivshmem in
> >>   QEMU together with the supporting host code outside QEMU (see 3.) and
> >>   the lower layer of the code using it in guests (kernel + user space)
> >>   provide something that to me very much looks like a device model.
> >> 
> >>   Device models belong into QEMU.  It's what QEMU does.  
> > 
> > 
> > On 17/03/2017 at 00:17, Stephen Hemminger wrote:
> >> On Wed, 15 Mar 2017 04:10:56 +0000
> >> "O'Driscoll, Tim" <tim.odriscoll@intel.com> wrote:
> >>   
> >>> I've included a couple of specific comments inline below, and a general comment here.
> >>> 
> >>> We have somebody proposing to add a new driver to DPDK. It's standalone and doesn't affect any of the core libraries. They're willing to maintain the driver and have included a patch to update the maintainers file. They've also included the relevant documentation changes. I haven't seen any negative comment on the patches themselves except for a request from John McNamara for an update to the Release Notes that was addressed in a later version. I think we should be welcoming this into DPDK rather than questioning/rejecting it.
> >>> 
> >>> I'd suggest that this is a good topic for the next Tech Board meeting.  
> >> 
> >> This is a virtualization driver for supporting DPDK on a platform that provides an alternative
> >> virtual network driver. I see no reason it shouldn't be part of DPDK. Given the unstable
> >> ABI for drivers, supporting out of tree DPDK drivers is difficult. The DPDK should try
> >> to be inclusive and support as many environments as possible.
> >>   
> > 
> > On the Qemu mailing list, back in 2014, I did push to build device models over ivshmem, like AVP, but folks did not want us to abuse it. The Qemu community wants us to stay focused. So, by being too inclusive, we would abuse Qemu's capabilities.
> > 
> > So, if being "inclusive" means we should allow any case, that is not a proper way to make sure that virtio gets all the focus it deserves.
> > 
> >   
> 
> Why are we bringing QEMU into the picture? It does not make a lot of sense to me. Stephen stated it well above, and I hope my comments were stating the same conclusion. I do not see your real reasons for not allowing this driver into DPDK; it seems like some other hidden agenda is at play here, but I am a paranoid person :-)

I am thinking of people already using Wind River systems. One can argue all you want
that they should be using QEMU/KVM/Virtio or 6Wind Virtual Accelerator, but it is not
the role of DPDK to be used to influence customers' architecture decisions.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD vs virtio? - ivshmem is back
  2017-03-16 23:41  0%             ` [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD vs virtio? - ivshmem is back Vincent JARDIN
  2017-03-17  0:08  0%               ` Wiles, Keith
@ 2017-03-17  0:11  0%               ` Wiles, Keith
  2017-03-17  0:14  0%                 ` Stephen Hemminger
  1 sibling, 1 reply; 200+ results
From: Wiles, Keith @ 2017-03-17  0:11 UTC (permalink / raw)
  To: Vincent JARDIN
  Cc: Stephen Hemminger, O'Driscoll, Tim, Legacy,
	Allain (Wind River),
	Yigit, Ferruh, dev, Jolliffe, Ian (Wind River),
	Richardson, Bruce, Mcnamara, John, thomas.monjalon, jerin.jacob,
	3chas3, stefanha, Markus Armbruster


> On Mar 17, 2017, at 7:41 AM, Vincent JARDIN <vincent.jardin@6wind.com> wrote:
> 
> Let's go back to 2014 and Qemu's thoughts on it,
> +Stefan
> https://lists.linuxfoundation.org/pipermail/virtualization/2014-June/026767.html
> 
> and
> +Markus
> https://lists.linuxfoundation.org/pipermail/virtualization/2014-June/026713.html
> 
>> 6. Device models belong into QEMU
>> 
>>   Say you build an actual interface on top of ivshmem.  Then ivshmem in
>>   QEMU together with the supporting host code outside QEMU (see 3.) and
>>   the lower layer of the code using it in guests (kernel + user space)
>>   provide something that to me very much looks like a device model.
>> 
>>   Device models belong into QEMU.  It's what QEMU does.
> 
> 
> On 17/03/2017 at 00:17, Stephen Hemminger wrote:
>> On Wed, 15 Mar 2017 04:10:56 +0000
>> "O'Driscoll, Tim" <tim.odriscoll@intel.com> wrote:
>> 
>>> I've included a couple of specific comments inline below, and a general comment here.
>>> 
>>> We have somebody proposing to add a new driver to DPDK. It's standalone and doesn't affect any of the core libraries. They're willing to maintain the driver and have included a patch to update the maintainers file. They've also included the relevant documentation changes. I haven't seen any negative comment on the patches themselves except for a request from John McNamara for an update to the Release Notes that was addressed in a later version. I think we should be welcoming this into DPDK rather than questioning/rejecting it.
>>> 
>>> I'd suggest that this is a good topic for the next Tech Board meeting.
>> 
>> This is a virtualization driver for supporting DPDK on a platform that provides an alternative
>> virtual network driver. I see no reason it shouldn't be part of DPDK. Given the unstable
>> ABI for drivers, supporting out of tree DPDK drivers is difficult. The DPDK should try
>> to be inclusive and support as many environments as possible.
>> 
> 
> On the Qemu mailing list, back in 2014, I did push to build device models over ivshmem, like AVP, but folks did not want us to abuse it. The Qemu community wants us to stay focused. So, by being too inclusive, we would abuse Qemu's capabilities.
> 
> So, if being "inclusive" means we should allow any case, that is not a proper way to make sure that virtio gets all the focus it deserves.
> 
> 

Why are we bringing QEMU into the picture? It does not make a lot of sense to me. Stephen stated it well above, and I hope my comments were stating the same conclusion. I do not see your real reasons for not allowing this driver into DPDK; it seems like some other hidden agenda is at play here, but I am a paranoid person :-)


Regards,
Keith


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD vs virtio? - ivshmem is back
  2017-03-16 23:41  0%             ` [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD vs virtio? - ivshmem is back Vincent JARDIN
@ 2017-03-17  0:08  0%               ` Wiles, Keith
  2017-03-17  0:15  0%                 ` O'Driscoll, Tim
  2017-03-17  0:11  0%               ` Wiles, Keith
  1 sibling, 1 reply; 200+ results
From: Wiles, Keith @ 2017-03-17  0:08 UTC (permalink / raw)
  To: Vincent JARDIN
  Cc: Stephen Hemminger, O'Driscoll, Tim, Legacy,
	Allain (Wind River),
	Yigit, Ferruh, dev, Jolliffe, Ian (Wind River),
	Richardson, Bruce, Mcnamara, John, thomas.monjalon, jerin.jacob,
	3chas3, stefanha, Markus Armbruster


> On Mar 17, 2017, at 7:41 AM, Vincent JARDIN <vincent.jardin@6wind.com> wrote:
> 
> Let's go back to 2014 and Qemu's thoughts on it,
> +Stefan
> https://lists.linuxfoundation.org/pipermail/virtualization/2014-June/026767.html
> 
> and
> +Markus
> https://lists.linuxfoundation.org/pipermail/virtualization/2014-June/026713.html
> 
>> 6. Device models belong into QEMU
>> 
>>   Say you build an actual interface on top of ivshmem.  Then ivshmem in
>>   QEMU together with the supporting host code outside QEMU (see 3.) and
>>   the lower layer of the code using it in guests (kernel + user space)
>>   provide something that to me very much looks like a device model.
>> 
>>   Device models belong into QEMU.  It's what QEMU does.
> 
> 
> On 17/03/2017 at 00:17, Stephen Hemminger wrote:
>> On Wed, 15 Mar 2017 04:10:56 +0000
>> "O'Driscoll, Tim" <tim.odriscoll@intel.com> wrote:
>> 
>>> I've included a couple of specific comments inline below, and a general comment here.
>>> 
>>> We have somebody proposing to add a new driver to DPDK. It's standalone and doesn't affect any of the core libraries. They're willing to maintain the driver and have included a patch to update the maintainers file. They've also included the relevant documentation changes. I haven't seen any negative comment on the patches themselves except for a request from John McNamara for an update to the Release Notes that was addressed in a later version. I think we should be welcoming this into DPDK rather than questioning/rejecting it.
>>> 
>>> I'd suggest that this is a good topic for the next Tech Board meeting.
>> 
>> This is a virtualization driver for supporting DPDK on a platform that provides an alternative
>> virtual network driver. I see no reason it shouldn't be part of DPDK. Given the unstable
>> ABI for drivers, supporting out of tree DPDK drivers is difficult. The DPDK should try
>> to be inclusive and support as many environments as possible.


+2!! for Stephen’s comment.

>> 
> 
> On the Qemu mailing list, back in 2014, I did push to build device models over ivshmem, like AVP, but folks did not want us to abuse it. The Qemu community wants us to stay focused. So, by being too inclusive, we would abuse Qemu's capabilities.
> 
> So, if being "inclusive" means we should allow any case, that is not a proper way to make sure that virtio gets all the focus it deserves.
> 
> 

Regards,
Keith


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD vs virtio? - ivshmem is back
  2017-03-16 23:17  3%           ` Stephen Hemminger
@ 2017-03-16 23:41  0%             ` Vincent JARDIN
  2017-03-17  0:08  0%               ` Wiles, Keith
  2017-03-17  0:11  0%               ` Wiles, Keith
  0 siblings, 2 replies; 200+ results
From: Vincent JARDIN @ 2017-03-16 23:41 UTC (permalink / raw)
  To: Stephen Hemminger, O'Driscoll, Tim, Legacy, Allain (Wind River)
  Cc: Yigit, Ferruh, dev, Jolliffe, Ian (Wind River),
	Richardson, Bruce, Mcnamara, John, Wiles, Keith, thomas.monjalon,
	jerin.jacob, 3chas3, stefanha, Markus Armbruster

Let's go back to 2014 and Qemu's thoughts on it,
+Stefan
 
https://lists.linuxfoundation.org/pipermail/virtualization/2014-June/026767.html

and
+Markus
 
https://lists.linuxfoundation.org/pipermail/virtualization/2014-June/026713.html

> 6. Device models belong into QEMU
>
>    Say you build an actual interface on top of ivshmem.  Then ivshmem in
>    QEMU together with the supporting host code outside QEMU (see 3.) and
>    the lower layer of the code using it in guests (kernel + user space)
>    provide something that to me very much looks like a device model.
>
>    Device models belong into QEMU.  It's what QEMU does.


On 17/03/2017 at 00:17, Stephen Hemminger wrote:
> On Wed, 15 Mar 2017 04:10:56 +0000
> "O'Driscoll, Tim" <tim.odriscoll@intel.com> wrote:
>
>> I've included a couple of specific comments inline below, and a general comment here.
>>
>> We have somebody proposing to add a new driver to DPDK. It's standalone and doesn't affect any of the core libraries. They're willing to maintain the driver and have included a patch to update the maintainers file. They've also included the relevant documentation changes. I haven't seen any negative comment on the patches themselves except for a request from John McNamara for an update to the Release Notes that was addressed in a later version. I think we should be welcoming this into DPDK rather than questioning/rejecting it.
>>
>> I'd suggest that this is a good topic for the next Tech Board meeting.
>
> This is a virtualization driver for supporting DPDK on a platform that provides an alternative
> virtual network driver. I see no reason it shouldn't be part of DPDK. Given the unstable
> ABI for drivers, supporting out of tree DPDK drivers is difficult. The DPDK should try
> to be inclusive and support as many environments as possible.
>

On the Qemu mailing list, back in 2014, I did push to build device
models over ivshmem, like AVP, but folks did not want us to abuse it.
The Qemu community wants us to stay focused. So, by being too
inclusive, we would abuse Qemu's capabilities.

So, if being "inclusive" means we should allow any case, that is not a
proper way to make sure that virtio gets all the focus it deserves.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD vs virtio?
  @ 2017-03-16 23:17  3%           ` Stephen Hemminger
  2017-03-16 23:41  0%             ` [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD vs virtio? - ivshmem is back Vincent JARDIN
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2017-03-16 23:17 UTC (permalink / raw)
  To: O'Driscoll, Tim
  Cc: Vincent JARDIN, Legacy, Allain (Wind River),
	Yigit, Ferruh, dev, Jolliffe, Ian (Wind River),
	Richardson, Bruce, Mcnamara, John, Wiles, Keith, thomas.monjalon,
	jerin.jacob, 3chas3

On Wed, 15 Mar 2017 04:10:56 +0000
"O'Driscoll, Tim" <tim.odriscoll@intel.com> wrote:

> I've included a couple of specific comments inline below, and a general comment here.
> 
> We have somebody proposing to add a new driver to DPDK. It's standalone and doesn't affect any of the core libraries. They're willing to maintain the driver and have included a patch to update the maintainers file. They've also included the relevant documentation changes. I haven't seen any negative comment on the patches themselves except for a request from John McNamara for an update to the Release Notes that was addressed in a later version. I think we should be welcoming this into DPDK rather than questioning/rejecting it.
> 
> I'd suggest that this is a good topic for the next Tech Board meeting.

This is a virtualization driver for supporting DPDK on a platform that provides an alternative
virtual network driver. I see no reason it shouldn't be part of DPDK. Given the unstable
ABI for drivers, supporting out of tree DPDK drivers is difficult. The DPDK should try
to be inclusive and support as many environments as possible.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2] mk: Provide option to set Major ABI version
  2017-03-01 14:35  4%             ` Jan Blunck
@ 2017-03-16 17:19  4%               ` Thomas Monjalon
  2017-03-17  8:27  4%                 ` Christian Ehrhardt
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2017-03-16 17:19 UTC (permalink / raw)
  To: Christian Ehrhardt
  Cc: dev, Jan Blunck, cjcollier, ricardo.salveti, Luca Boccassi

2017-03-01 15:35, Jan Blunck:
> On Wed, Mar 1, 2017 at 10:34 AM, Christian Ehrhardt
> <christian.ehrhardt@canonical.com> wrote:
> > Downstreams might want to provide different DPDK releases at the same
> > time to support multiple consumers of DPDK linked against older and newer
> > sonames.
> >
> > Also due to the interdependencies that DPDK libraries can have applications
> > might end up with an executable space in which multiple versions of a
> > library are mapped by ld.so.
> >
> > Think of LibA that got an ABI bump and LibB that did not get an ABI bump
> > but is depending on LibA.
> >
> >     Application
> >     \-> LibA.old
> >     \-> LibB.new -> LibA.new
> >
> > That is a conflict which can be avoided by setting CONFIG_RTE_MAJOR_ABI.
> > If set CONFIG_RTE_MAJOR_ABI overwrites any LIBABIVER value.
> > An example might be ``CONFIG_RTE_MAJOR_ABI=16.11`` which will make all
> > libraries librte<?>.so.16.11 instead of librte<?>.so.<LIBABIVER>.
[...]
> >
> > Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> 
> Reviewed-by: Jan Blunck <jblunck@infradead.org>
> Tested-by: Jan Blunck <jblunck@infradead.org>

Not sure about how it can be used in distributions, but it does not hurt
to provide the config option.
Are you going to link applications against a fixed DPDK version for
every library?

Applied, thanks

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v4 04/17] net/avp: add PMD version map file
  2017-03-13 19:16  3%       ` [dpdk-dev] [PATCH v4 04/17] net/avp: add PMD version map file Allain Legacy
@ 2017-03-16 14:52  0%         ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2017-03-16 14:52 UTC (permalink / raw)
  To: Allain Legacy
  Cc: dev, ian.jolliffe, bruce.richardson, john.mcnamara, keith.wiles,
	thomas.monjalon, vincent.jardin, jerin.jacob, stephen, 3chas3

On 3/13/2017 7:16 PM, Allain Legacy wrote:
> Adds a default ABI version file for the AVP PMD.
> 
> Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
> Signed-off-by: Matt Peters <matt.peters@windriver.com>
> ---
>  drivers/net/avp/rte_pmd_avp_version.map | 4 ++++
>  1 file changed, 4 insertions(+)
>  create mode 100644 drivers/net/avp/rte_pmd_avp_version.map
> 
> diff --git a/drivers/net/avp/rte_pmd_avp_version.map b/drivers/net/avp/rte_pmd_avp_version.map
> new file mode 100644
> index 0000000..af8f3f4
> --- /dev/null
> +++ b/drivers/net/avp/rte_pmd_avp_version.map
> @@ -0,0 +1,4 @@
> +DPDK_17.05 {
> +
> +    local: *;
> +};
> 

Hi Allain,

Instead of adding files per patch, may I suggest a different ordering:
First add skeleton files in a patch, later add functional pieces one by
one, like:

Merge patches 1/17, 3/17, this patch (4/17), and 6/17 (removing SYMLINK)
into a single patch and make it the first AVP patch. This will be the
skeleton patch.

The second patch can introduce the public headers (2/17) and update the
Makefile to include them.

Third, the debug log patch (5/17).

Patch 7/17 and later can stay the same.

What do you think?

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v10 08/18] lib: add symbol versioning to distributor
  2017-03-15  6:19  2%             ` [dpdk-dev] [PATCH v10 0/18] distributor library performance enhancements David Hunt
  2017-03-15  6:19  1%               ` [dpdk-dev] [PATCH v10 01/18] lib: rename legacy distributor lib files David Hunt
@ 2017-03-15  6:19  2%               ` David Hunt
  1 sibling, 0 replies; 200+ results
From: David Hunt @ 2017-03-15  6:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, David Hunt

Also bumped up the ABI version number in the Makefile

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_distributor/Makefile                    |  2 +-
 lib/librte_distributor/rte_distributor.c           | 57 +++++++++++---
 lib/librte_distributor/rte_distributor_v1705.h     | 89 ++++++++++++++++++++++
 lib/librte_distributor/rte_distributor_v20.c       | 10 +++
 lib/librte_distributor/rte_distributor_version.map | 14 ++++
 5 files changed, 162 insertions(+), 10 deletions(-)
 create mode 100644 lib/librte_distributor/rte_distributor_v1705.h

diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
index 2b28eff..2f05cf3 100644
--- a/lib/librte_distributor/Makefile
+++ b/lib/librte_distributor/Makefile
@@ -39,7 +39,7 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
 
 EXPORT_MAP := rte_distributor_version.map
 
-LIBABIVER := 1
+LIBABIVER := 2
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor_v20.c
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 6e1debf..06df13d 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -36,6 +36,7 @@
 #include <rte_mbuf.h>
 #include <rte_memory.h>
 #include <rte_cycles.h>
+#include <rte_compat.h>
 #include <rte_memzone.h>
 #include <rte_errno.h>
 #include <rte_string_fns.h>
@@ -44,6 +45,7 @@
 #include "rte_distributor_private.h"
 #include "rte_distributor.h"
 #include "rte_distributor_v20.h"
+#include "rte_distributor_v1705.h"
 
 TAILQ_HEAD(rte_dist_burst_list, rte_distributor);
 
@@ -57,7 +59,7 @@ EAL_REGISTER_TAILQ(rte_dist_burst_tailq)
 /**** Burst Packet APIs called by workers ****/
 
 void
-rte_distributor_request_pkt(struct rte_distributor *d,
+rte_distributor_request_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **oldpkt,
 		unsigned int count)
 {
@@ -102,9 +104,14 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 	 */
 	*retptr64 |= RTE_DISTRIB_GET_BUF;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_request_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(void rte_distributor_request_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **oldpkt,
+		unsigned int count),
+		rte_distributor_request_pkt_v1705);
 
 int
-rte_distributor_poll_pkt(struct rte_distributor *d,
+rte_distributor_poll_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **pkts)
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
@@ -138,9 +145,13 @@ rte_distributor_poll_pkt(struct rte_distributor *d,
 
 	return count;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_poll_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_poll_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **pkts),
+		rte_distributor_poll_pkt_v1705);
 
 int
-rte_distributor_get_pkt(struct rte_distributor *d,
+rte_distributor_get_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **pkts,
 		struct rte_mbuf **oldpkt, unsigned int return_count)
 {
@@ -168,9 +179,14 @@ rte_distributor_get_pkt(struct rte_distributor *d,
 	}
 	return count;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_get_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_get_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **pkts,
+		struct rte_mbuf **oldpkt, unsigned int return_count),
+		rte_distributor_get_pkt_v1705);
 
 int
-rte_distributor_return_pkt(struct rte_distributor *d,
+rte_distributor_return_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **oldpkt, int num)
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
@@ -197,6 +213,10 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 
 	return 0;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_return_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_return_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **oldpkt, int num),
+		rte_distributor_return_pkt_v1705);
 
 /**** APIs called on distributor core ***/
 
@@ -342,7 +362,7 @@ release(struct rte_distributor *d, unsigned int wkr)
 
 /* process a set of packets to distribute them to workers */
 int
-rte_distributor_process(struct rte_distributor *d,
+rte_distributor_process_v1705(struct rte_distributor *d,
 		struct rte_mbuf **mbufs, unsigned int num_mbufs)
 {
 	unsigned int next_idx = 0;
@@ -476,10 +496,14 @@ rte_distributor_process(struct rte_distributor *d,
 
 	return num_mbufs;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_process, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_process(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int num_mbufs),
+		rte_distributor_process_v1705);
 
 /* return to the caller, packets returned from workers */
 int
-rte_distributor_returned_pkts(struct rte_distributor *d,
+rte_distributor_returned_pkts_v1705(struct rte_distributor *d,
 		struct rte_mbuf **mbufs, unsigned int max_mbufs)
 {
 	struct rte_distributor_returned_pkts *returns = &d->returns;
@@ -504,6 +528,10 @@ rte_distributor_returned_pkts(struct rte_distributor *d,
 
 	return retval;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_returned_pkts, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_returned_pkts(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int max_mbufs),
+		rte_distributor_returned_pkts_v1705);
 
 /*
  * Return the number of packets in-flight in a distributor, i.e. packets
@@ -525,7 +553,7 @@ total_outstanding(const struct rte_distributor *d)
  * queued up.
  */
 int
-rte_distributor_flush(struct rte_distributor *d)
+rte_distributor_flush_v1705(struct rte_distributor *d)
 {
 	const unsigned int flushed = total_outstanding(d);
 	unsigned int wkr;
@@ -549,10 +577,13 @@ rte_distributor_flush(struct rte_distributor *d)
 
 	return flushed;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_flush, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_flush(struct rte_distributor *d),
+		rte_distributor_flush_v1705);
 
 /* clears the internal returns array in the distributor */
 void
-rte_distributor_clear_returns(struct rte_distributor *d)
+rte_distributor_clear_returns_v1705(struct rte_distributor *d)
 {
 	unsigned int wkr;
 
@@ -565,10 +596,13 @@ rte_distributor_clear_returns(struct rte_distributor *d)
 	for (wkr = 0; wkr < d->num_workers; wkr++)
 		d->bufs[wkr].retptr64[0] = 0;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_clear_returns, _v1705, 17.05);
+MAP_STATIC_SYMBOL(void rte_distributor_clear_returns(struct rte_distributor *d),
+		rte_distributor_clear_returns_v1705);
 
 /* creates a distributor instance */
 struct rte_distributor *
-rte_distributor_create(const char *name,
+rte_distributor_create_v1705(const char *name,
 		unsigned int socket_id,
 		unsigned int num_workers,
 		unsigned int alg_type)
@@ -638,3 +672,8 @@ rte_distributor_create(const char *name,
 
 	return d;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_create, _v1705, 17.05);
+MAP_STATIC_SYMBOL(struct rte_distributor *rte_distributor_create(
+		const char *name, unsigned int socket_id,
+		unsigned int num_workers, unsigned int alg_type),
+		rte_distributor_create_v1705);
diff --git a/lib/librte_distributor/rte_distributor_v1705.h b/lib/librte_distributor/rte_distributor_v1705.h
new file mode 100644
index 0000000..81b2691
--- /dev/null
+++ b/lib/librte_distributor/rte_distributor_v1705.h
@@ -0,0 +1,89 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_DISTRIB_V1705_H_
+#define _RTE_DISTRIB_V1705_H_
+
+/**
+ * @file
+ * RTE distributor
+ *
+ * The distributor is a component which is designed to pass packets
+ * one-at-a-time to workers, with dynamic load balancing.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct rte_distributor *
+rte_distributor_create_v1705(const char *name, unsigned int socket_id,
+		unsigned int num_workers,
+		unsigned int alg_type);
+
+int
+rte_distributor_process_v1705(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int num_mbufs);
+
+int
+rte_distributor_returned_pkts_v1705(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int max_mbufs);
+
+int
+rte_distributor_flush_v1705(struct rte_distributor *d);
+
+void
+rte_distributor_clear_returns_v1705(struct rte_distributor *d);
+
+int
+rte_distributor_get_pkt_v1705(struct rte_distributor *d,
+	unsigned int worker_id, struct rte_mbuf **pkts,
+	struct rte_mbuf **oldpkt, unsigned int retcount);
+
+int
+rte_distributor_return_pkt_v1705(struct rte_distributor *d,
+	unsigned int worker_id, struct rte_mbuf **oldpkt, int num);
+
+void
+rte_distributor_request_pkt_v1705(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **oldpkt,
+		unsigned int count);
+
+int
+rte_distributor_poll_pkt_v1705(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **mbufs);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/librte_distributor/rte_distributor_v20.c b/lib/librte_distributor/rte_distributor_v20.c
index 1f406c5..bb6c5d7 100644
--- a/lib/librte_distributor/rte_distributor_v20.c
+++ b/lib/librte_distributor/rte_distributor_v20.c
@@ -38,6 +38,7 @@
 #include <rte_memory.h>
 #include <rte_memzone.h>
 #include <rte_errno.h>
+#include <rte_compat.h>
 #include <rte_string_fns.h>
 #include <rte_eal_memconfig.h>
 #include "rte_distributor_v20.h"
@@ -63,6 +64,7 @@ rte_distributor_request_pkt_v20(struct rte_distributor_v20 *d,
 		rte_pause();
 	buf->bufptr64 = req;
 }
+VERSION_SYMBOL(rte_distributor_request_pkt, _v20, 2.0);
 
 struct rte_mbuf *
 rte_distributor_poll_pkt_v20(struct rte_distributor_v20 *d,
@@ -76,6 +78,7 @@ rte_distributor_poll_pkt_v20(struct rte_distributor_v20 *d,
 	int64_t ret = buf->bufptr64 >> RTE_DISTRIB_FLAG_BITS;
 	return (struct rte_mbuf *)((uintptr_t)ret);
 }
+VERSION_SYMBOL(rte_distributor_poll_pkt, _v20, 2.0);
 
 struct rte_mbuf *
 rte_distributor_get_pkt_v20(struct rte_distributor_v20 *d,
@@ -87,6 +90,7 @@ rte_distributor_get_pkt_v20(struct rte_distributor_v20 *d,
 		rte_pause();
 	return ret;
 }
+VERSION_SYMBOL(rte_distributor_get_pkt, _v20, 2.0);
 
 int
 rte_distributor_return_pkt_v20(struct rte_distributor_v20 *d,
@@ -98,6 +102,7 @@ rte_distributor_return_pkt_v20(struct rte_distributor_v20 *d,
 	buf->bufptr64 = req;
 	return 0;
 }
+VERSION_SYMBOL(rte_distributor_return_pkt, _v20, 2.0);
 
 /**** APIs called on distributor core ***/
 
@@ -314,6 +319,7 @@ rte_distributor_process_v20(struct rte_distributor_v20 *d,
 	d->returns.count = ret_count;
 	return num_mbufs;
 }
+VERSION_SYMBOL(rte_distributor_process, _v20, 2.0);
 
 /* return to the caller, packets returned from workers */
 int
@@ -334,6 +340,7 @@ rte_distributor_returned_pkts_v20(struct rte_distributor_v20 *d,
 
 	return retval;
 }
+VERSION_SYMBOL(rte_distributor_returned_pkts, _v20, 2.0);
 
 /* return the number of packets in-flight in a distributor, i.e. packets
  * being workered on or queued up in a backlog. */
@@ -362,6 +369,7 @@ rte_distributor_flush_v20(struct rte_distributor_v20 *d)
 
 	return flushed;
 }
+VERSION_SYMBOL(rte_distributor_flush, _v20, 2.0);
 
 /* clears the internal returns array in the distributor */
 void
@@ -372,6 +380,7 @@ rte_distributor_clear_returns_v20(struct rte_distributor_v20 *d)
 	memset(d->returns.mbufs, 0, sizeof(d->returns.mbufs));
 #endif
 }
+VERSION_SYMBOL(rte_distributor_clear_returns, _v20, 2.0);
 
 /* creates a distributor instance */
 struct rte_distributor_v20 *
@@ -415,3 +424,4 @@ rte_distributor_create_v20(const char *name,
 
 	return d;
 }
+VERSION_SYMBOL(rte_distributor_create, _v20, 2.0);
diff --git a/lib/librte_distributor/rte_distributor_version.map b/lib/librte_distributor/rte_distributor_version.map
index 73fdc43..3a285b3 100644
--- a/lib/librte_distributor/rte_distributor_version.map
+++ b/lib/librte_distributor/rte_distributor_version.map
@@ -13,3 +13,17 @@ DPDK_2.0 {
 
 	local: *;
 };
+
+DPDK_17.05 {
+	global:
+
+	rte_distributor_clear_returns;
+	rte_distributor_create;
+	rte_distributor_flush;
+	rte_distributor_get_pkt;
+	rte_distributor_poll_pkt;
+	rte_distributor_process;
+	rte_distributor_request_pkt;
+	rte_distributor_return_pkt;
+	rte_distributor_returned_pkts;
+} DPDK_2.0;
-- 
2.7.4
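
For reference, the versioning pattern applied above can be reduced to a
minimal single-file sketch. The function rte_foo_bar below is hypothetical;
only the rte_compat.h macros and the need for matching version-map entries
mirror the real patch:

#include <rte_compat.h>

int rte_foo_bar(int x); /* unversioned name from the public header */

/* Old behaviour, kept for binaries built against the DPDK_2.0 ABI. */
int rte_foo_bar_v20(int x);
int
rte_foo_bar_v20(int x)
{
	return x;
}
VERSION_SYMBOL(rte_foo_bar, _v20, 2.0);

/* New behaviour, the default binding from DPDK 17.05 onwards. */
int rte_foo_bar_v1705(int x);
int
rte_foo_bar_v1705(int x)
{
	return x + 1;
}
BIND_DEFAULT_SYMBOL(rte_foo_bar, _v1705, 17.05);
/* In static builds the two macros above are no-ops; this one instead
 * aliases the plain name to the default implementation. */
MAP_STATIC_SYMBOL(int rte_foo_bar(int x), rte_foo_bar_v1705);

For shared builds, rte_foo_bar must also appear in both the DPDK_2.0 and
DPDK_17.05 nodes of the library's version map, which is what the map-file
hunk above does for the distributor symbols.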

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v10 01/18] lib: rename legacy distributor lib files
  2017-03-15  6:19  2%             ` [dpdk-dev] [PATCH v10 0/18] distributor library performance enhancements David Hunt
@ 2017-03-15  6:19  1%               ` David Hunt
  2017-03-20 10:08  2%                 ` [dpdk-dev] [PATCH v11 0/18] distributor lib performance enhancements David Hunt
  2017-03-15  6:19  2%               ` [dpdk-dev] [PATCH v10 " David Hunt
  1 sibling, 1 reply; 200+ results
From: David Hunt @ 2017-03-15  6:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, David Hunt

Move files out of the way so that we can replace them with new
versions of the distributor library. Files are named in
such a way as to match the symbol versioning that we will
apply for backward ABI compatibility.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_distributor/Makefile                    |   3 +-
 lib/librte_distributor/rte_distributor.h           | 210 +-----------------
 .../{rte_distributor.c => rte_distributor_v20.c}   |   2 +-
 lib/librte_distributor/rte_distributor_v20.h       | 247 +++++++++++++++++++++
 4 files changed, 251 insertions(+), 211 deletions(-)
 rename lib/librte_distributor/{rte_distributor.c => rte_distributor_v20.c} (99%)
 create mode 100644 lib/librte_distributor/rte_distributor_v20.h

diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
index 4c9af17..b314ca6 100644
--- a/lib/librte_distributor/Makefile
+++ b/lib/librte_distributor/Makefile
@@ -42,10 +42,11 @@ EXPORT_MAP := rte_distributor_version.map
 LIBABIVER := 1
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor.c
+SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor_v20.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include := rte_distributor.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include += rte_distributor_v20.h
 
 # this lib needs eal
 DEPDIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += lib/librte_eal
diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
index 7d36bc8..e41d522 100644
--- a/lib/librte_distributor/rte_distributor.h
+++ b/lib/librte_distributor/rte_distributor.h
@@ -34,214 +34,6 @@
 #ifndef _RTE_DISTRIBUTE_H_
 #define _RTE_DISTRIBUTE_H_
 
-/**
- * @file
- * RTE distributor
- *
- * The distributor is a component which is designed to pass packets
- * one-at-a-time to workers, with dynamic load balancing.
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#define RTE_DISTRIBUTOR_NAMESIZE 32 /**< Length of name for instance */
-
-struct rte_distributor;
-struct rte_mbuf;
-
-/**
- * Function to create a new distributor instance
- *
- * Reserves the memory needed for the distributor operation and
- * initializes the distributor to work with the configured number of workers.
- *
- * @param name
- *   The name to be given to the distributor instance.
- * @param socket_id
- *   The NUMA node on which the memory is to be allocated
- * @param num_workers
- *   The maximum number of workers that will request packets from this
- *   distributor
- * @return
- *   The newly created distributor instance
- */
-struct rte_distributor *
-rte_distributor_create(const char *name, unsigned socket_id,
-		unsigned num_workers);
-
-/*  *** APIS to be called on the distributor lcore ***  */
-/*
- * The following APIs are the public APIs which are designed for use on a
- * single lcore which acts as the distributor lcore for a given distributor
- * instance. These functions cannot be called on multiple cores simultaneously
- * without using locking to protect access to the internals of the distributor.
- *
- * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
- * for the same distributor instance, otherwise deadlock will result.
- */
-
-/**
- * Process a set of packets by distributing them among workers that request
- * packets. The distributor will ensure that no two packets that have the
- * same flow id, or tag, in the mbuf will be procesed at the same time.
- *
- * The user is advocated to set tag for each mbuf before calling this function.
- * If user doesn't set the tag, the tag value can be various values depending on
- * driver implementation and configuration.
- *
- * This is not multi-thread safe and should only be called on a single lcore.
- *
- * @param d
- *   The distributor instance to be used
- * @param mbufs
- *   The mbufs to be distributed
- * @param num_mbufs
- *   The number of mbufs in the mbufs array
- * @return
- *   The number of mbufs processed.
- */
-int
-rte_distributor_process(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned num_mbufs);
-
-/**
- * Get a set of mbufs that have been returned to the distributor by workers
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- * @param mbufs
- *   The mbufs pointer array to be filled in
- * @param max_mbufs
- *   The size of the mbufs array
- * @return
- *   The number of mbufs returned in the mbufs array.
- */
-int
-rte_distributor_returned_pkts(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned max_mbufs);
-
-/**
- * Flush the distributor component, so that there are no in-flight or
- * backlogged packets awaiting processing
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- * @return
- *   The number of queued/in-flight packets that were completed by this call.
- */
-int
-rte_distributor_flush(struct rte_distributor *d);
-
-/**
- * Clears the array of returned packets used as the source for the
- * rte_distributor_returned_pkts() API call.
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- */
-void
-rte_distributor_clear_returns(struct rte_distributor *d);
-
-/*  *** APIS to be called on the worker lcores ***  */
-/*
- * The following APIs are the public APIs which are designed for use on
- * multiple lcores which act as workers for a distributor. Each lcore should use
- * a unique worker id when requesting packets.
- *
- * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
- * for the same distributor instance, otherwise deadlock will result.
- */
-
-/**
- * API called by a worker to get a new packet to process. Any previous packet
- * given to the worker is assumed to have completed processing, and may be
- * optionally returned to the distributor via the oldpkt parameter.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param oldpkt
- *   The previous packet, if any, being processed by the worker
- *
- * @return
- *   A new packet to be processed by the worker thread.
- */
-struct rte_mbuf *
-rte_distributor_get_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt);
-
-/**
- * API called by a worker to return a completed packet without requesting a
- * new packet, for example, because a worker thread is shutting down
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param mbuf
- *   The previous packet being processed by the worker
- */
-int
-rte_distributor_return_pkt(struct rte_distributor *d, unsigned worker_id,
-		struct rte_mbuf *mbuf);
-
-/**
- * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
- * processing, and may be optionally returned to the distributor via
- * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt(), this function does not wait for a new
- * packet to be provided by the distributor.
- *
- * NOTE: after calling this function, rte_distributor_poll_pkt() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt()
- * API should *not* be used to try and retrieve the new packet.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param oldpkt
- *   The previous packet, if any, being processed by the worker
- */
-void
-rte_distributor_request_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt);
-
-/**
- * API called by a worker to check for a new packet that was previously
- * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
- * not yet been fulfilled by the distributor.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- *
- * @return
- *   A new packet to be processed by the worker thread, or NULL if no
- *   packet is yet available.
- */
-struct rte_mbuf *
-rte_distributor_poll_pkt(struct rte_distributor *d,
-		unsigned worker_id);
-
-#ifdef __cplusplus
-}
-#endif
+#include <rte_distributor_v20.h>
 
 #endif
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor_v20.c
similarity index 99%
rename from lib/librte_distributor/rte_distributor.c
rename to lib/librte_distributor/rte_distributor_v20.c
index f3f778c..b890947 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor_v20.c
@@ -40,7 +40,7 @@
 #include <rte_errno.h>
 #include <rte_string_fns.h>
 #include <rte_eal_memconfig.h>
-#include "rte_distributor.h"
+#include "rte_distributor_v20.h"
 
 #define NO_FLAGS 0
 #define RTE_DISTRIB_PREFIX "DT_"
diff --git a/lib/librte_distributor/rte_distributor_v20.h b/lib/librte_distributor/rte_distributor_v20.h
new file mode 100644
index 0000000..b69aa27
--- /dev/null
+++ b/lib/librte_distributor/rte_distributor_v20.h
@@ -0,0 +1,247 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_DISTRIB_V20_H_
+#define _RTE_DISTRIB_V20_H_
+
+/**
+ * @file
+ * RTE distributor
+ *
+ * The distributor is a component which is designed to pass packets
+ * one-at-a-time to workers, with dynamic load balancing.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_DISTRIBUTOR_NAMESIZE 32 /**< Length of name for instance */
+
+struct rte_distributor;
+struct rte_mbuf;
+
+/**
+ * Function to create a new distributor instance
+ *
+ * Reserves the memory needed for the distributor operation and
+ * initializes the distributor to work with the configured number of workers.
+ *
+ * @param name
+ *   The name to be given to the distributor instance.
+ * @param socket_id
+ *   The NUMA node on which the memory is to be allocated
+ * @param num_workers
+ *   The maximum number of workers that will request packets from this
+ *   distributor
+ * @return
+ *   The newly created distributor instance
+ */
+struct rte_distributor *
+rte_distributor_create(const char *name, unsigned int socket_id,
+		unsigned int num_workers);
+
+/*  *** APIS to be called on the distributor lcore ***  */
+/*
+ * The following APIs are the public APIs which are designed for use on a
+ * single lcore which acts as the distributor lcore for a given distributor
+ * instance. These functions cannot be called on multiple cores simultaneously
+ * without using locking to protect access to the internals of the distributor.
+ *
+ * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
+ * for the same distributor instance, otherwise deadlock will result.
+ */
+
+/**
+ * Process a set of packets by distributing them among workers that request
+ * packets. The distributor will ensure that no two packets that have the
+ * same flow id, or tag, in the mbuf will be processed at the same time.
+ *
+ * The user is advised to set a tag for each mbuf before calling this
+ * function. If the user doesn't set the tag, the tag value can vary
+ * depending on the driver implementation and configuration.
+ *
+ * This is not multi-thread safe and should only be called on a single lcore.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param mbufs
+ *   The mbufs to be distributed
+ * @param num_mbufs
+ *   The number of mbufs in the mbufs array
+ * @return
+ *   The number of mbufs processed.
+ */
+int
+rte_distributor_process(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int num_mbufs);
+
+/**
+ * Get a set of mbufs that have been returned to the distributor by workers
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param mbufs
+ *   The mbufs pointer array to be filled in
+ * @param max_mbufs
+ *   The size of the mbufs array
+ * @return
+ *   The number of mbufs returned in the mbufs array.
+ */
+int
+rte_distributor_returned_pkts(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int max_mbufs);
+
+/**
+ * Flush the distributor component, so that there are no in-flight or
+ * backlogged packets awaiting processing
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @return
+ *   The number of queued/in-flight packets that were completed by this call.
+ */
+int
+rte_distributor_flush(struct rte_distributor *d);
+
+/**
+ * Clears the array of returned packets used as the source for the
+ * rte_distributor_returned_pkts() API call.
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ */
+void
+rte_distributor_clear_returns(struct rte_distributor *d);
+
+/*  *** APIS to be called on the worker lcores ***  */
+/*
+ * The following APIs are the public APIs which are designed for use on
+ * multiple lcores which act as workers for a distributor. Each lcore should use
+ * a unique worker id when requesting packets.
+ *
+ * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
+ * for the same distributor instance, otherwise deadlock will result.
+ */
+
+/**
+ * API called by a worker to get a new packet to process. Any previous packet
+ * given to the worker is assumed to have completed processing, and may be
+ * optionally returned to the distributor via the oldpkt parameter.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ * @param oldpkt
+ *   The previous packet, if any, being processed by the worker
+ *
+ * @return
+ *   A new packet to be processed by the worker thread.
+ */
+struct rte_mbuf *
+rte_distributor_get_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf *oldpkt);
+
+/**
+ * API called by a worker to return a completed packet without requesting a
+ * new packet, for example, because a worker thread is shutting down
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ * @param mbuf
+ *   The previous packet being processed by the worker
+ */
+int
+rte_distributor_return_pkt(struct rte_distributor *d, unsigned int worker_id,
+		struct rte_mbuf *mbuf);
+
+/**
+ * API called by a worker to request a new packet to process.
+ * Any previous packet given to the worker is assumed to have completed
+ * processing, and may be optionally returned to the distributor via
+ * the oldpkt parameter.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for a new
+ * packet to be provided by the distributor.
+ *
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packet requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packet.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ * @param oldpkt
+ *   The previous packet, if any, being processed by the worker
+ */
+void
+rte_distributor_request_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf *oldpkt);
+
+/**
+ * API called by a worker to check for a new packet that was previously
+ * requested by a call to rte_distributor_request_pkt(). It does not wait
+ * for the new packet to be available, but returns NULL if the request has
+ * not yet been fulfilled by the distributor.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ *
+ * @return
+ *   A new packet to be processed by the worker thread, or NULL if no
+ *   packet is yet available.
+ */
+struct rte_mbuf *
+rte_distributor_poll_pkt(struct rte_distributor *d,
+		unsigned int worker_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
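
As a usage illustration (a sketch, not part of the patch; BURST_SIZE,
quit_signal, compute_flow_tag() and do_work() are application-defined
placeholders, the tag is assumed to be carried in the mbuf user hash
field, and error handling is omitted):

    /* Distributor core: tag a received burst and hand it to workers. */
    static void
    distribute_burst(struct rte_distributor *d, uint8_t port)
    {
        struct rte_mbuf *bufs[BURST_SIZE];
        uint16_t i, nb_rx;

        nb_rx = rte_eth_rx_burst(port, 0, bufs, BURST_SIZE);
        for (i = 0; i < nb_rx; i++)
            bufs[i]->hash.usr = compute_flow_tag(bufs[i]);
        rte_distributor_process(d, bufs, nb_rx);
    }

    /* Worker core: worker_id must be less than the num_workers value
     * passed at distributor creation time.
     */
    static int
    lcore_worker(struct rte_distributor *d, unsigned int worker_id)
    {
        struct rte_mbuf *pkt = NULL;

        while (!quit_signal) {
            /* Hand back the previous packet (if any) and wait
             * for a new one.
             */
            pkt = rte_distributor_get_pkt(d, worker_id, pkt);
            do_work(pkt);
        }
        /* Return the final packet without requesting another. */
        if (pkt != NULL)
            rte_distributor_return_pkt(d, worker_id, pkt);
        return 0;
    }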
-- 
2.7.4

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v10 0/18] distributor library performance enhancements
  2017-03-06  9:10  1%           ` [dpdk-dev] [PATCH v9 01/18] lib: rename legacy distributor lib files David Hunt
@ 2017-03-15  6:19  2%             ` David Hunt
  2017-03-15  6:19  1%               ` [dpdk-dev] [PATCH v10 01/18] lib: rename legacy distributor lib files David Hunt
  2017-03-15  6:19  2%               ` [dpdk-dev] [PATCH v10 " David Hunt
  0 siblings, 2 replies; 200+ results
From: David Hunt @ 2017-03-15  6:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson

This patch aims to improve the throughput of the distributor library.

It uses a similar handshake mechanism to the previous version of
the library, in that bits are used to indicate when packets are ready
to be sent to a worker and ready to be returned from a worker. One main
difference is that instead of sending one packet in a cache line, it makes
use of the 7 free spaces in the same cache line in order to send up to
8 packets at a time to/from a worker.

The flow matching algorithm has had significant re-work, and now keeps an
array of inflight flows and an array of backlog flows, and matches incoming
flows to the inflight/backlog flows of all workers so that flow pinning to
workers can be maintained.

The flow match algorithm has both scalar and vector versions, and a
function pointer is used to select the most appropriate function at run
time, depending on the presence of the SSE2 CPU flag. On non-x86
platforms, the scalar match function is selected, which should still
give a good boost in performance over the non-burst API.
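
As an illustration of that run-time selection, the pattern boils down to
the sketch below (the names are illustrative, not the patch's own symbols;
rte_cpu_get_flag_enabled() is the standard EAL CPU-flag query):

    static void (*find_match)(struct rte_distributor *d);

    static void
    select_match_fn(void)
    {
    #ifdef RTE_ARCH_X86
        if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2)) {
            find_match = find_match_vec;    /* SSE2 implementation */
            return;
        }
    #endif
        find_match = find_match_scalar;     /* portable fallback */
    }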

v10 changes:
   * Addressed all review comments from v9 (thanks, Bruce)
   * Squashed the two patches containing distributor structs and code
   * Renamed confusing rte_distributor_v1705.h to rte_distributor_next.h
   * Added usleep in main so as to be a little more gentle with that core
   * Fixed some patch titles and improved some descriptions
   * Updated sample app guide documentation
   * Removed un-needed code limiting Tx rings and cleaned up patch
   * Inherited v9 series Ack by Bruce, except new suggested addition
     for example app documentation (17/18)

v9 changes:
   * fixed symbol versioning so it will compile on CentOS and RedHat

v8 changes:
   * Changed the patch set to have a more logical ordering of
     the changes, but the end result is basically the same.
   * Fixed broken shared library build.
   * Split down the updates to example app more
   * No longer changes the test app and sample app to use a temporary
     API.
   * No longer temporarily re-names the functions in the
     version.map file.

v7 changes:
   * Reorganised patch so there's a more natural progression in the
     changes, and divided them down into easier to review chunks.
   * Previous versions of this patch set were effectively two APIs.
     We now have a single API. Legacy functionality can
     be used by using the rte_distributor_create API call with the
     RTE_DISTRIBUTOR_SINGLE flag when creating a distributor instance.
   * Added symbol versioning for old API so that ABI is preserved.

v6 changes:
   * Fixed intermittent segfault where num pkts not divisible
     by BURST_SIZE
   * Cleanup due to review comments on mailing list
   * Renamed _priv.h to _private.h.

v5 changes:
   * Removed some un-needed code around retries in worker API calls
   * Cleanup due to review comments on mailing list
   * Cleanup of non-x86 platform compilation, fallback to scalar match

v4 changes:
   * fixed issue building shared libraries

v3 changes:
  * Addressed mailing list review comments
  * Test code removal
  * Split out SSE match into separate file to facilitate NEON addition
  * Cleaned up conditional compilation flags for SSE2
  * Addressed c99 style compilation errors
  * rebased on latest head (Jan 2 2017, Happy New Year to all)

v2 changes:
  * Created a common distributor_priv.h header file with common
    definitions and structures.
  * Added a scalar version so it can be built and used on machines without
     the SSE2 instruction set
  * Added unit autotests
  * Added perf autotest

Notes:
   Apps must now work in bursts, as up to 8 are given to a worker at a time
   For performance in matching, flow IDs are 15 bits.
   If 32-bit flow IDs are required, use the packet-at-a-time (SINGLE)
   mode.

Performance Gains
   2.2GHz Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
   2 x XL710 40GbE NICS to 2 x 40Gbps traffic generator channels 64b packets
   separate cores for rx, tx, distributor
    1 worker  - up to 4.8x
    4 workers - up to 2.9x
    8 workers - up to 1.8x
   12 workers - up to 2.1x
   16 workers - up to 1.8x

[01/18] lib: rename legacy distributor lib files
[02/18] lib: create private header file
[03/18] lib: add new distributor code
[04/18] lib: add SIMD flow matching to distributor
[05/18] test/distributor: extra params for autotests
[06/18] lib: switch distributor over to new API
[07/18] lib: make v20 header file private
[08/18] lib: add symbol versioning to distributor
[09/18] test: test single and burst distributor API
[10/18] test: add perf test for distributor burst mode
[11/18] examples/distributor: allow for extra stats
[12/18] examples/distributor: wait for ports to come up
[13/18] examples/distributor: add dedicated core for dist
[14/18] examples/distributor: tweaks for performance
[15/18] examples/distributor: give Rx thread a core
[16/18] doc: distributor library changes for new burst API
[17/18] doc: distributor app changes for new burst API
[18/18] maintainers: add to distributor lib maintainers

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v3] lpm: extend IPv6 next hop field
  @ 2017-03-14 17:17  4% ` Vladyslav Buslov
  0 siblings, 0 replies; 200+ results
From: Vladyslav Buslov @ 2017-03-14 17:17 UTC (permalink / raw)
  To: thomas.monjalon; +Cc: bruce.richardson, dev

This patch extends the next_hop field from 8 bits to 21 bits in the
LPM library for IPv6.

Added versioning symbols to the affected functions, and updated the
library and applications that depend on the LPM library.

Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---

Fixed compilation error in l3fwd_lpm.h
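
As a usage sketch (mirroring the new unit test; the lpm handle comes
from rte_lpm6_create() and error handling is omitted), the widened API
now accepts and returns 21-bit next hops:

    uint8_t ip[16] = { 0 };
    uint8_t depth = 16;
    uint32_t next_hop_add = 0x001FFFFF; /* maximum 21-bit value */
    uint32_t next_hop_ret = 0;

    rte_lpm6_add(lpm, ip, depth, next_hop_add);
    if (rte_lpm6_lookup(lpm, ip, &next_hop_ret) == 0)
        assert(next_hop_ret == next_hop_add); /* needs <assert.h> */
    rte_lpm6_delete(lpm, ip, depth);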

 doc/guides/prog_guide/lpm6_lib.rst              |   2 +-
 doc/guides/rel_notes/release_17_05.rst          |   5 +
 examples/ip_fragmentation/main.c                |  17 +--
 examples/ip_reassembly/main.c                   |  17 +--
 examples/ipsec-secgw/ipsec-secgw.c              |   2 +-
 examples/l3fwd/l3fwd_lpm.h                      |   2 +-
 examples/l3fwd/l3fwd_lpm_sse.h                  |  24 ++---
 examples/performance-thread/l3fwd-thread/main.c |  11 +-
 lib/librte_lpm/rte_lpm6.c                       | 134 +++++++++++++++++++++---
 lib/librte_lpm/rte_lpm6.h                       |  32 +++++-
 lib/librte_lpm/rte_lpm_version.map              |  10 ++
 lib/librte_table/rte_table_lpm_ipv6.c           |   9 +-
 test/test/test_lpm6.c                           | 115 ++++++++++++++------
 test/test/test_lpm6_perf.c                      |   4 +-
 14 files changed, 293 insertions(+), 91 deletions(-)

diff --git a/doc/guides/prog_guide/lpm6_lib.rst b/doc/guides/prog_guide/lpm6_lib.rst
index 0aea5c5..f791507 100644
--- a/doc/guides/prog_guide/lpm6_lib.rst
+++ b/doc/guides/prog_guide/lpm6_lib.rst
@@ -53,7 +53,7 @@ several thousand IPv6 rules, but the number can vary depending on the case.
 An LPM prefix is represented by a pair of parameters (128-bit key, depth), with depth in the range of 1 to 128.
 An LPM rule is represented by an LPM prefix and some user data associated with the prefix.
 The prefix serves as the unique identifier for the LPM rule.
-In this implementation, the user data is 1-byte long and is called "next hop",
+In this implementation, the user data is 21 bits long and is called "next hop",
 which corresponds to its main use of storing the ID of the next hop in a routing table entry.
 
 The main methods exported for the LPM component are:
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 4b90036..918f483 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -41,6 +41,9 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Increased number of next hops for LPM IPv6 to 2^21.**
+
+  The next_hop field is extended from 8 bits to 21 bits for IPv6.
 
 * **Added powerpc support in pci probing for vfio-pci devices.**
 
@@ -114,6 +117,8 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =========================================================
 
+* The LPM ``next_hop`` field is extended from 8 bits to 21 bits for IPv6
+  while keeping ABI compatibility.
 
 ABI Changes
 -----------
diff --git a/examples/ip_fragmentation/main.c b/examples/ip_fragmentation/main.c
index 9e9ecae..1b005b5 100644
--- a/examples/ip_fragmentation/main.c
+++ b/examples/ip_fragmentation/main.c
@@ -265,8 +265,8 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 		uint8_t queueid, uint8_t port_in)
 {
 	struct rx_queue *rxq;
-	uint32_t i, len, next_hop_ipv4;
-	uint8_t next_hop_ipv6, port_out, ipv6;
+	uint32_t i, len, next_hop;
+	uint8_t port_out, ipv6;
 	int32_t len2;
 
 	ipv6 = 0;
@@ -290,9 +290,9 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 		ip_dst = rte_be_to_cpu_32(ip_hdr->dst_addr);
 
 		/* Find destination port */
-		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop_ipv4) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv4) != 0) {
-			port_out = next_hop_ipv4;
+		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			port_out = next_hop;
 
 			/* Build transmission burst for new port */
 			len = qconf->tx_mbufs[port_out].len;
@@ -326,9 +326,10 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 		ip_hdr = rte_pktmbuf_mtod(m, struct ipv6_hdr *);
 
 		/* Find destination port */
-		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr, &next_hop_ipv6) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv6) != 0) {
-			port_out = next_hop_ipv6;
+		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr,
+						&next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			port_out = next_hop;
 
 			/* Build transmission burst for new port */
 			len = qconf->tx_mbufs[port_out].len;
diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
index e62674c..b641576 100644
--- a/examples/ip_reassembly/main.c
+++ b/examples/ip_reassembly/main.c
@@ -346,8 +346,8 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 	struct rte_ip_frag_death_row *dr;
 	struct rx_queue *rxq;
 	void *d_addr_bytes;
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6, dst_port;
+	uint32_t next_hop;
+	uint8_t dst_port;
 
 	rxq = &qconf->rx_queue_list[queue];
 
@@ -390,9 +390,9 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 		ip_dst = rte_be_to_cpu_32(ip_hdr->dst_addr);
 
 		/* Find destination port */
-		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop_ipv4) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv4) != 0) {
-			dst_port = next_hop_ipv4;
+		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			dst_port = next_hop;
 		}
 
 		eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv4);
@@ -427,9 +427,10 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 		}
 
 		/* Find destination port */
-		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr, &next_hop_ipv6) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv6) != 0) {
-			dst_port = next_hop_ipv6;
+		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr,
+						&next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			dst_port = next_hop;
 		}
 
 		eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv6);
diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c
index 685feec..d3c229a 100644
--- a/examples/ipsec-secgw/ipsec-secgw.c
+++ b/examples/ipsec-secgw/ipsec-secgw.c
@@ -618,7 +618,7 @@ route4_pkts(struct rt_ctx *rt_ctx, struct rte_mbuf *pkts[], uint8_t nb_pkts)
 static inline void
 route6_pkts(struct rt_ctx *rt_ctx, struct rte_mbuf *pkts[], uint8_t nb_pkts)
 {
-	int16_t hop[MAX_PKT_BURST * 2];
+	int32_t hop[MAX_PKT_BURST * 2];
 	uint8_t dst_ip[MAX_PKT_BURST * 2][16];
 	uint8_t *ip6_dst;
 	uint16_t i, offset;
diff --git a/examples/l3fwd/l3fwd_lpm.h b/examples/l3fwd/l3fwd_lpm.h
index a43c507..258a82f 100644
--- a/examples/l3fwd/l3fwd_lpm.h
+++ b/examples/l3fwd/l3fwd_lpm.h
@@ -49,7 +49,7 @@ lpm_get_ipv4_dst_port(void *ipv4_hdr,  uint8_t portid, void *lookup_struct)
 static inline uint8_t
 lpm_get_ipv6_dst_port(void *ipv6_hdr,  uint8_t portid, void *lookup_struct)
 {
-	uint8_t next_hop;
+	uint32_t next_hop;
 	struct rte_lpm6 *ipv6_l3fwd_lookup_struct =
 		(struct rte_lpm6 *)lookup_struct;
 
diff --git a/examples/l3fwd/l3fwd_lpm_sse.h b/examples/l3fwd/l3fwd_lpm_sse.h
index 538fe3d..aa06b6d 100644
--- a/examples/l3fwd/l3fwd_lpm_sse.h
+++ b/examples/l3fwd/l3fwd_lpm_sse.h
@@ -40,8 +40,7 @@ static inline __attribute__((always_inline)) uint16_t
 lpm_get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		uint8_t portid)
 {
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6;
+	uint32_t next_hop;
 	struct ipv6_hdr *ipv6_hdr;
 	struct ipv4_hdr *ipv4_hdr;
 	struct ether_hdr *eth_hdr;
@@ -51,9 +50,11 @@ lpm_get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
 		ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
 
-		return (uint16_t) ((rte_lpm_lookup(qconf->ipv4_lookup_struct,
-				rte_be_to_cpu_32(ipv4_hdr->dst_addr), &next_hop_ipv4) == 0) ?
-						next_hop_ipv4 : portid);
+		return (uint16_t) (
+			(rte_lpm_lookup(qconf->ipv4_lookup_struct,
+					rte_be_to_cpu_32(ipv4_hdr->dst_addr),
+					&next_hop) == 0) ?
+						next_hop : portid);
 
 	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
 
@@ -61,8 +62,8 @@ lpm_get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		ipv6_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
 
 		return (uint16_t) ((rte_lpm6_lookup(qconf->ipv6_lookup_struct,
-				ipv6_hdr->dst_addr, &next_hop_ipv6) == 0)
-				? next_hop_ipv6 : portid);
+				ipv6_hdr->dst_addr, &next_hop) == 0)
+				? next_hop : portid);
 
 	}
 
@@ -78,14 +79,13 @@ static inline __attribute__((always_inline)) uint16_t
 lpm_get_dst_port_with_ipv4(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 	uint32_t dst_ipv4, uint8_t portid)
 {
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6;
+	uint32_t next_hop;
 	struct ipv6_hdr *ipv6_hdr;
 	struct ether_hdr *eth_hdr;
 
 	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
 		return (uint16_t) ((rte_lpm_lookup(qconf->ipv4_lookup_struct, dst_ipv4,
-			&next_hop_ipv4) == 0) ? next_hop_ipv4 : portid);
+			&next_hop) == 0) ? next_hop : portid);
 
 	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
 
@@ -93,8 +93,8 @@ lpm_get_dst_port_with_ipv4(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		ipv6_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
 
 		return (uint16_t) ((rte_lpm6_lookup(qconf->ipv6_lookup_struct,
-				ipv6_hdr->dst_addr, &next_hop_ipv6) == 0)
-				? next_hop_ipv6 : portid);
+				ipv6_hdr->dst_addr, &next_hop) == 0)
+				? next_hop : portid);
 
 	}
 
diff --git a/examples/performance-thread/l3fwd-thread/main.c b/examples/performance-thread/l3fwd-thread/main.c
index 6845e28..bf92582 100644
--- a/examples/performance-thread/l3fwd-thread/main.c
+++ b/examples/performance-thread/l3fwd-thread/main.c
@@ -909,7 +909,7 @@ static inline uint8_t
 get_ipv6_dst_port(void *ipv6_hdr,  uint8_t portid,
 		lookup6_struct_t *ipv6_l3fwd_lookup_struct)
 {
-	uint8_t next_hop;
+	uint32_t next_hop;
 
 	return (uint8_t) ((rte_lpm6_lookup(ipv6_l3fwd_lookup_struct,
 			((struct ipv6_hdr *)ipv6_hdr)->dst_addr, &next_hop) == 0) ?
@@ -1396,15 +1396,14 @@ rfc1812_process(struct ipv4_hdr *ipv4_hdr, uint16_t *dp, uint32_t ptype)
 static inline __attribute__((always_inline)) uint16_t
 get_dst_port(struct rte_mbuf *pkt, uint32_t dst_ipv4, uint8_t portid)
 {
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6;
+	uint32_t next_hop;
 	struct ipv6_hdr *ipv6_hdr;
 	struct ether_hdr *eth_hdr;
 
 	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
 		return (uint16_t) ((rte_lpm_lookup(
 				RTE_PER_LCORE(lcore_conf)->ipv4_lookup_struct, dst_ipv4,
-				&next_hop_ipv4) == 0) ? next_hop_ipv4 : portid);
+				&next_hop) == 0) ? next_hop : portid);
 
 	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
 
@@ -1413,8 +1412,8 @@ get_dst_port(struct rte_mbuf *pkt, uint32_t dst_ipv4, uint8_t portid)
 
 		return (uint16_t) ((rte_lpm6_lookup(
 				RTE_PER_LCORE(lcore_conf)->ipv6_lookup_struct,
-				ipv6_hdr->dst_addr, &next_hop_ipv6) == 0) ? next_hop_ipv6 :
-						portid);
+				ipv6_hdr->dst_addr, &next_hop) == 0) ?
+				next_hop : portid);
 
 	}
 
diff --git a/lib/librte_lpm/rte_lpm6.c b/lib/librte_lpm/rte_lpm6.c
index 32fdba0..9cc7be7 100644
--- a/lib/librte_lpm/rte_lpm6.c
+++ b/lib/librte_lpm/rte_lpm6.c
@@ -97,7 +97,7 @@ struct rte_lpm6_tbl_entry {
 /** Rules tbl entry structure. */
 struct rte_lpm6_rule {
 	uint8_t ip[RTE_LPM6_IPV6_ADDR_SIZE]; /**< Rule IP address. */
-	uint8_t next_hop; /**< Rule next hop. */
+	uint32_t next_hop; /**< Rule next hop. */
 	uint8_t depth; /**< Rule depth. */
 };
 
@@ -297,7 +297,7 @@ rte_lpm6_free(struct rte_lpm6 *lpm)
  * the nexthop if so. Otherwise it adds a new rule if enough space is available.
  */
 static inline int32_t
-rule_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t next_hop, uint8_t depth)
+rule_add(struct rte_lpm6 *lpm, uint8_t *ip, uint32_t next_hop, uint8_t depth)
 {
 	uint32_t rule_index;
 
@@ -340,7 +340,7 @@ rule_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t next_hop, uint8_t depth)
  */
 static void
 expand_rule(struct rte_lpm6 *lpm, uint32_t tbl8_gindex, uint8_t depth,
-		uint8_t next_hop)
+		uint32_t next_hop)
 {
 	uint32_t tbl8_group_end, tbl8_gindex_next, j;
 
@@ -377,7 +377,7 @@ expand_rule(struct rte_lpm6 *lpm, uint32_t tbl8_gindex, uint8_t depth,
 static inline int
 add_step(struct rte_lpm6 *lpm, struct rte_lpm6_tbl_entry *tbl,
 		struct rte_lpm6_tbl_entry **tbl_next, uint8_t *ip, uint8_t bytes,
-		uint8_t first_byte, uint8_t depth, uint8_t next_hop)
+		uint8_t first_byte, uint8_t depth, uint32_t next_hop)
 {
 	uint32_t tbl_index, tbl_range, tbl8_group_start, tbl8_group_end, i;
 	int32_t tbl8_gindex;
@@ -507,9 +507,17 @@ add_step(struct rte_lpm6 *lpm, struct rte_lpm6_tbl_entry *tbl,
  * Add a route
  */
 int
-rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+rte_lpm6_add_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 		uint8_t next_hop)
 {
+	return rte_lpm6_add_v1705(lpm, ip, depth, next_hop);
+}
+VERSION_SYMBOL(rte_lpm6_add, _v20, 2.0);
+
+int
+rte_lpm6_add_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t next_hop)
+{
 	struct rte_lpm6_tbl_entry *tbl;
 	struct rte_lpm6_tbl_entry *tbl_next;
 	int32_t rule_index;
@@ -560,6 +568,10 @@ rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 
 	return status;
 }
+BIND_DEFAULT_SYMBOL(rte_lpm6_add, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip,
+				uint8_t depth, uint32_t next_hop),
+		rte_lpm6_add_v1705);
 
 /*
  * Takes a pointer to a table entry and inspect one level.
@@ -569,7 +581,7 @@ rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 static inline int
 lookup_step(const struct rte_lpm6 *lpm, const struct rte_lpm6_tbl_entry *tbl,
 		const struct rte_lpm6_tbl_entry **tbl_next, uint8_t *ip,
-		uint8_t first_byte, uint8_t *next_hop)
+		uint8_t first_byte, uint32_t *next_hop)
 {
 	uint32_t tbl8_index, tbl_entry;
 
@@ -589,7 +601,7 @@ lookup_step(const struct rte_lpm6 *lpm, const struct rte_lpm6_tbl_entry *tbl,
 		return 1;
 	} else {
 		/* If not extended then we can have a match. */
-		*next_hop = (uint8_t)tbl_entry;
+		*next_hop = ((uint32_t)tbl_entry & RTE_LPM6_TBL8_BITMASK);
 		return (tbl_entry & RTE_LPM6_LOOKUP_SUCCESS) ? 0 : -ENOENT;
 	}
 }
@@ -598,7 +610,26 @@ lookup_step(const struct rte_lpm6 *lpm, const struct rte_lpm6_tbl_entry *tbl,
  * Looks up an IP
  */
 int
-rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop)
+rte_lpm6_lookup_v20(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop)
+{
+	uint32_t next_hop32 = 0;
+	int32_t status;
+
+	/* DEBUG: Check user input arguments. */
+	if (next_hop == NULL)
+		return -EINVAL;
+
+	status = rte_lpm6_lookup_v1705(lpm, ip, &next_hop32);
+	if (status == 0)
+		*next_hop = (uint8_t)next_hop32;
+
+	return status;
+}
+VERSION_SYMBOL(rte_lpm6_lookup, _v20, 2.0);
+
+int
+rte_lpm6_lookup_v1705(const struct rte_lpm6 *lpm, uint8_t *ip,
+		uint32_t *next_hop)
 {
 	const struct rte_lpm6_tbl_entry *tbl;
 	const struct rte_lpm6_tbl_entry *tbl_next = NULL;
@@ -625,20 +656,23 @@ rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop)
 
 	return status;
 }
+BIND_DEFAULT_SYMBOL(rte_lpm6_lookup, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip,
+				uint32_t *next_hop), rte_lpm6_lookup_v1705);
 
 /*
  * Looks up a group of IP addresses
  */
 int
-rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
+rte_lpm6_lookup_bulk_func_v20(const struct rte_lpm6 *lpm,
 		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
 		int16_t * next_hops, unsigned n)
 {
 	unsigned i;
 	const struct rte_lpm6_tbl_entry *tbl;
 	const struct rte_lpm6_tbl_entry *tbl_next = NULL;
-	uint32_t tbl24_index;
-	uint8_t first_byte, next_hop;
+	uint32_t tbl24_index, next_hop;
+	uint8_t first_byte;
 	int status;
 
 	/* DEBUG: Check user input arguments. */
@@ -664,11 +698,59 @@ rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
 		if (status < 0)
 			next_hops[i] = -1;
 		else
-			next_hops[i] = next_hop;
+			next_hops[i] = (int16_t)next_hop;
+	}
+
+	return 0;
+}
+VERSION_SYMBOL(rte_lpm6_lookup_bulk_func, _v20, 2.0);
+
+int
+rte_lpm6_lookup_bulk_func_v1705(const struct rte_lpm6 *lpm,
+		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+		int32_t *next_hops, unsigned int n)
+{
+	unsigned int i;
+	const struct rte_lpm6_tbl_entry *tbl;
+	const struct rte_lpm6_tbl_entry *tbl_next = NULL;
+	uint32_t tbl24_index, next_hop;
+	uint8_t first_byte;
+	int status;
+
+	/* DEBUG: Check user input arguments. */
+	if ((lpm == NULL) || (ips == NULL) || (next_hops == NULL))
+		return -EINVAL;
+
+	for (i = 0; i < n; i++) {
+		first_byte = LOOKUP_FIRST_BYTE;
+		tbl24_index = (ips[i][0] << BYTES2_SIZE) |
+				(ips[i][1] << BYTE_SIZE) | ips[i][2];
+
+		/* Calculate pointer to the first entry to be inspected */
+		tbl = &lpm->tbl24[tbl24_index];
+
+		do {
+			/* Continue inspecting following levels
+			 * until success or failure
+			 */
+			status = lookup_step(lpm, tbl, &tbl_next, ips[i],
+					first_byte++, &next_hop);
+			tbl = tbl_next;
+		} while (status == 1);
+
+		if (status < 0)
+			next_hops[i] = -1;
+		else
+			next_hops[i] = (int32_t)next_hop;
 	}
 
 	return 0;
 }
+BIND_DEFAULT_SYMBOL(rte_lpm6_lookup_bulk_func, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
+				uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+				int32_t *next_hops, unsigned int n),
+		rte_lpm6_lookup_bulk_func_v1705);
 
 /*
  * Finds a rule in rule table.
@@ -698,8 +780,28 @@ rule_find(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth)
  * Look for a rule in the high-level rules table
  */
 int
-rte_lpm6_is_rule_present(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
-uint8_t *next_hop)
+rte_lpm6_is_rule_present_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint8_t *next_hop)
+{
+	uint32_t next_hop32 = 0;
+	int32_t status;
+
+	/* DEBUG: Check user input arguments. */
+	if (next_hop == NULL)
+		return -EINVAL;
+
+	status = rte_lpm6_is_rule_present_v1705(lpm, ip, depth, &next_hop32);
+	if (status > 0)
+		*next_hop = (uint8_t)next_hop32;
+
+	return status;
+
+}
+VERSION_SYMBOL(rte_lpm6_is_rule_present, _v20, 2.0);
+
+int
+rte_lpm6_is_rule_present_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t *next_hop)
 {
 	uint8_t ip_masked[RTE_LPM6_IPV6_ADDR_SIZE];
 	int32_t rule_index;
@@ -724,6 +826,10 @@ uint8_t *next_hop)
 	/* If rule is not found return 0. */
 	return 0;
 }
+BIND_DEFAULT_SYMBOL(rte_lpm6_is_rule_present, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_is_rule_present(struct rte_lpm6 *lpm,
+				uint8_t *ip, uint8_t depth, uint32_t *next_hop),
+		rte_lpm6_is_rule_present_v1705);
 
 /*
  * Delete a rule from the rule table.
diff --git a/lib/librte_lpm/rte_lpm6.h b/lib/librte_lpm/rte_lpm6.h
index 13d027f..3a3342d 100644
--- a/lib/librte_lpm/rte_lpm6.h
+++ b/lib/librte_lpm/rte_lpm6.h
@@ -39,6 +39,7 @@
  */
 
 #include <stdint.h>
+#include <rte_compat.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -123,7 +124,13 @@ rte_lpm6_free(struct rte_lpm6 *lpm);
  */
 int
 rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t next_hop);
+int
+rte_lpm6_add_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 		uint8_t next_hop);
+int
+rte_lpm6_add_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t next_hop);
 
 /**
  * Check if a rule is present in the LPM table,
@@ -142,7 +149,13 @@ rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
  */
 int
 rte_lpm6_is_rule_present(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
-uint8_t *next_hop);
+		uint32_t *next_hop);
+int
+rte_lpm6_is_rule_present_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint8_t *next_hop);
+int
+rte_lpm6_is_rule_present_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t *next_hop);
 
 /**
  * Delete a rule from the LPM table.
@@ -199,7 +212,12 @@ rte_lpm6_delete_all(struct rte_lpm6 *lpm);
  *   -EINVAL for incorrect arguments, -ENOENT on lookup miss, 0 on lookup hit
  */
 int
-rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop);
+rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint32_t *next_hop);
+int
+rte_lpm6_lookup_v20(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop);
+int
+rte_lpm6_lookup_v1705(const struct rte_lpm6 *lpm, uint8_t *ip,
+		uint32_t *next_hop);
 
 /**
  * Lookup multiple IP addresses in an LPM table.
@@ -220,7 +238,15 @@ rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop);
 int
 rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
 		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
-		int16_t * next_hops, unsigned n);
+		int32_t *next_hops, unsigned int n);
+int
+rte_lpm6_lookup_bulk_func_v20(const struct rte_lpm6 *lpm,
+		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+		int16_t *next_hops, unsigned int n);
+int
+rte_lpm6_lookup_bulk_func_v1705(const struct rte_lpm6 *lpm,
+		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+		int32_t *next_hops, unsigned int n);
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_lpm/rte_lpm_version.map b/lib/librte_lpm/rte_lpm_version.map
index 239b371..90beac8 100644
--- a/lib/librte_lpm/rte_lpm_version.map
+++ b/lib/librte_lpm/rte_lpm_version.map
@@ -34,3 +34,13 @@ DPDK_16.04 {
 	rte_lpm_delete_all;
 
 } DPDK_2.0;
+
+DPDK_17.05 {
+	global:
+
+	rte_lpm6_add;
+	rte_lpm6_is_rule_present;
+	rte_lpm6_lookup;
+	rte_lpm6_lookup_bulk_func;
+
+} DPDK_16.04;
diff --git a/lib/librte_table/rte_table_lpm_ipv6.c b/lib/librte_table/rte_table_lpm_ipv6.c
index 836f4cf..1e1a173 100644
--- a/lib/librte_table/rte_table_lpm_ipv6.c
+++ b/lib/librte_table/rte_table_lpm_ipv6.c
@@ -211,9 +211,8 @@ rte_table_lpm_ipv6_entry_add(
 	struct rte_table_lpm_ipv6 *lpm = (struct rte_table_lpm_ipv6 *) table;
 	struct rte_table_lpm_ipv6_key *ip_prefix =
 		(struct rte_table_lpm_ipv6_key *) key;
-	uint32_t nht_pos, nht_pos0_valid;
+	uint32_t nht_pos, nht_pos0, nht_pos0_valid;
 	int status;
-	uint8_t nht_pos0;
 
 	/* Check input parameters */
 	if (lpm == NULL) {
@@ -256,7 +255,7 @@ rte_table_lpm_ipv6_entry_add(
 
 	/* Add rule to low level LPM table */
 	if (rte_lpm6_add(lpm->lpm, ip_prefix->ip, ip_prefix->depth,
-		(uint8_t) nht_pos) < 0) {
+		nht_pos) < 0) {
 		RTE_LOG(ERR, TABLE, "%s: LPM IPv6 rule add failed\n", __func__);
 		return -1;
 	}
@@ -280,7 +279,7 @@ rte_table_lpm_ipv6_entry_delete(
 	struct rte_table_lpm_ipv6 *lpm = (struct rte_table_lpm_ipv6 *) table;
 	struct rte_table_lpm_ipv6_key *ip_prefix =
 		(struct rte_table_lpm_ipv6_key *) key;
-	uint8_t nht_pos;
+	uint32_t nht_pos;
 	int status;
 
 	/* Check input parameters */
@@ -356,7 +355,7 @@ rte_table_lpm_ipv6_lookup(
 			uint8_t *ip = RTE_MBUF_METADATA_UINT8_PTR(pkt,
 				lpm->offset);
 			int status;
-			uint8_t nht_pos;
+			uint32_t nht_pos;
 
 			status = rte_lpm6_lookup(lpm->lpm, ip, &nht_pos);
 			if (status == 0) {
diff --git a/test/test/test_lpm6.c b/test/test/test_lpm6.c
index 61134f7..e0e7bf0 100644
--- a/test/test/test_lpm6.c
+++ b/test/test/test_lpm6.c
@@ -79,6 +79,7 @@ static int32_t test24(void);
 static int32_t test25(void);
 static int32_t test26(void);
 static int32_t test27(void);
+static int32_t test28(void);
 
 rte_lpm6_test tests6[] = {
 /* Test Cases */
@@ -110,6 +111,7 @@ rte_lpm6_test tests6[] = {
 	test25,
 	test26,
 	test27,
+	test28,
 };
 
 #define NUM_LPM6_TESTS                (sizeof(tests6)/sizeof(tests6[0]))
@@ -354,7 +356,7 @@ test6(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t next_hop_return = 0;
+	uint32_t next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -392,7 +394,7 @@ test7(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[10][16];
-	int16_t next_hop_return[10];
+	int32_t next_hop_return[10];
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -469,7 +471,8 @@ test9(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 16, next_hop_add = 100, next_hop_return = 0;
+	uint8_t depth = 16;
+	uint32_t next_hop_add = 100, next_hop_return = 0;
 	int32_t status = 0;
 	uint8_t i;
 
@@ -513,7 +516,8 @@ test10(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 	int i;
 
@@ -557,7 +561,8 @@ test11(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -617,7 +622,8 @@ test12(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -655,7 +661,8 @@ test13(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 
 	config.max_rules = 2;
@@ -702,7 +709,8 @@ test14(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 25, next_hop_add = 100;
+	uint8_t depth = 25;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 	int i;
 
@@ -748,7 +756,8 @@ test15(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 24, next_hop_add = 100, next_hop_return = 0;
+	uint8_t depth = 24;
+	uint32_t next_hop_add = 100, next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -784,7 +793,8 @@ test16(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {12,12,1,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 128, next_hop_add = 100, next_hop_return = 0;
+	uint8_t depth = 128;
+	uint32_t next_hop_add = 100, next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -828,7 +838,8 @@ test17(void)
 	uint8_t ip1[] = {127,255,255,255,255,255,255,255,255,
 			255,255,255,255,255,255,255};
 	uint8_t ip2[] = {128,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -857,7 +868,7 @@ test17(void)
 
 	/* Loop with rte_lpm6_delete. */
 	for (depth = 16; depth >= 1; depth--) {
-		next_hop_add = (uint8_t) (depth - 1);
+		next_hop_add = (depth - 1);
 
 		status = rte_lpm6_delete(lpm, ip2, depth);
 		TEST_LPM_ASSERT(status == 0);
@@ -893,8 +904,9 @@ test18(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[16], ip_1[16], ip_2[16];
-	uint8_t depth, depth_1, depth_2, next_hop_add, next_hop_add_1,
-		next_hop_add_2, next_hop_return;
+	uint8_t depth, depth_1, depth_2;
+	uint32_t next_hop_add, next_hop_add_1,
+			next_hop_add_2, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1055,7 +1067,8 @@ test19(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[16];
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1253,7 +1266,8 @@ test20(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[16];
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1320,8 +1334,9 @@ test21(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip_batch[4][16];
-	uint8_t depth, next_hop_add;
-	int16_t next_hop_return[4];
+	uint8_t depth;
+	uint32_t next_hop_add;
+	int32_t next_hop_return[4];
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1378,8 +1393,9 @@ test22(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip_batch[5][16];
-	uint8_t depth[5], next_hop_add;
-	int16_t next_hop_return[5];
+	uint8_t depth[5];
+	uint32_t next_hop_add;
+	int32_t next_hop_return[5];
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1495,7 +1511,8 @@ test23(void)
 	struct rte_lpm6_config config;
 	uint32_t i;
 	uint8_t ip[16];
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1579,7 +1596,8 @@ test25(void)
 	struct rte_lpm6_config config;
 	uint8_t ip[16];
 	uint32_t i;
-	uint8_t depth, next_hop_add, next_hop_return, next_hop_expected;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return, next_hop_expected;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1632,10 +1650,10 @@ test26(void)
 	uint8_t d_ip_10_32 = 32;
 	uint8_t	d_ip_10_24 = 24;
 	uint8_t	d_ip_20_25 = 25;
-	uint8_t next_hop_ip_10_32 = 100;
-	uint8_t	next_hop_ip_10_24 = 105;
-	uint8_t	next_hop_ip_20_25 = 111;
-	uint8_t next_hop_return = 0;
+	uint32_t next_hop_ip_10_32 = 100;
+	uint32_t next_hop_ip_10_24 = 105;
+	uint32_t next_hop_ip_20_25 = 111;
+	uint32_t next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1650,7 +1668,7 @@ test26(void)
 		return -1;
 
 	status = rte_lpm6_lookup(lpm, ip_10_32, &next_hop_return);
-	uint8_t test_hop_10_32 = next_hop_return;
+	uint32_t test_hop_10_32 = next_hop_return;
 	TEST_LPM_ASSERT(status == 0);
 	TEST_LPM_ASSERT(next_hop_return == next_hop_ip_10_32);
 
@@ -1659,7 +1677,7 @@ test26(void)
 			return -1;
 
 	status = rte_lpm6_lookup(lpm, ip_10_24, &next_hop_return);
-	uint8_t test_hop_10_24 = next_hop_return;
+	uint32_t test_hop_10_24 = next_hop_return;
 	TEST_LPM_ASSERT(status == 0);
 	TEST_LPM_ASSERT(next_hop_return == next_hop_ip_10_24);
 
@@ -1668,7 +1686,7 @@ test26(void)
 		return -1;
 
 	status = rte_lpm6_lookup(lpm, ip_20_25, &next_hop_return);
-	uint8_t test_hop_20_25 = next_hop_return;
+	uint32_t test_hop_20_25 = next_hop_return;
 	TEST_LPM_ASSERT(status == 0);
 	TEST_LPM_ASSERT(next_hop_return == next_hop_ip_20_25);
 
@@ -1707,7 +1725,8 @@ test27(void)
 		struct rte_lpm6 *lpm = NULL;
 		struct rte_lpm6_config config;
 		uint8_t ip[] = {128,128,128,128,128,128,128,128,128,128,128,128,128,128,0,0};
-		uint8_t depth = 128, next_hop_add = 100, next_hop_return;
+		uint8_t depth = 128;
+		uint32_t next_hop_add = 100, next_hop_return;
 		int32_t status = 0;
 		int i, j;
 
@@ -1746,6 +1765,42 @@ test27(void)
 }
 
 /*
+ * Call add, lookup and delete for a single rule with the maximum 21-bit
+ * next_hop value.
+ * Check that the next_hop returned from lookup equals the provisioned value.
+ * Delete the rule and check that the same lookup returns a miss.
+ */
+int32_t
+test28(void)
+{
+	struct rte_lpm6 *lpm = NULL;
+	struct rte_lpm6_config config;
+	uint8_t ip[] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+	uint8_t depth = 16;
+	uint32_t next_hop_add = 0x001FFFFF, next_hop_return = 0;
+	int32_t status = 0;
+
+	config.max_rules = MAX_RULES;
+	config.number_tbl8s = NUMBER_TBL8S;
+	config.flags = 0;
+
+	lpm = rte_lpm6_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_LPM_ASSERT(lpm != NULL);
+
+	status = rte_lpm6_add(lpm, ip, depth, next_hop_add);
+	TEST_LPM_ASSERT(status == 0);
+
+	status = rte_lpm6_lookup(lpm, ip, &next_hop_return);
+	TEST_LPM_ASSERT((status == 0) && (next_hop_return == next_hop_add));
+
+	status = rte_lpm6_delete(lpm, ip, depth);
+	TEST_LPM_ASSERT(status == 0);
+	rte_lpm6_free(lpm);
+
+	return PASS;
+}
+
+/*
  * Do all unit tests.
  */
 static int
diff --git a/test/test/test_lpm6_perf.c b/test/test/test_lpm6_perf.c
index 0723081..30be430 100644
--- a/test/test/test_lpm6_perf.c
+++ b/test/test/test_lpm6_perf.c
@@ -86,7 +86,7 @@ test_lpm6_perf(void)
 	struct rte_lpm6_config config;
 	uint64_t begin, total_time;
 	unsigned i, j;
-	uint8_t next_hop_add = 0xAA, next_hop_return = 0;
+	uint32_t next_hop_add = 0xAA, next_hop_return = 0;
 	int status = 0;
 	int64_t count = 0;
 
@@ -148,7 +148,7 @@ test_lpm6_perf(void)
 	count = 0;
 
 	uint8_t ip_batch[NUM_IPS_ENTRIES][16];
-	int16_t next_hops[NUM_IPS_ENTRIES];
+	int32_t next_hops[NUM_IPS_ENTRIES];
 
 	for (i = 0; i < NUM_IPS_ENTRIES; i++)
 		memcpy(ip_batch[i], large_ips_table[i].ip, 16);
-- 
2.1.4

^ permalink raw reply	[relevance 4%]
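
For readers unfamiliar with the VERSION_SYMBOL, BIND_DEFAULT_SYMBOL and
MAP_STATIC_SYMBOL calls used in the patch above: they implement the
rte_compat.h symbol versioning scheme. A minimal sketch of the pattern
(rte_foo and its version suffixes are hypothetical):

    /* Old binary interface, kept for apps linked against DPDK 2.0. */
    int
    rte_foo_v20(uint8_t arg)
    {
        return rte_foo_v1705(arg);
    }
    VERSION_SYMBOL(rte_foo, _v20, 2.0);

    /* New implementation, made the default from DPDK 17.05 onwards. */
    int
    rte_foo_v1705(uint32_t arg)
    {
        /* ... */
        return 0;
    }
    BIND_DEFAULT_SYMBOL(rte_foo, _v1705, 17.05);
    /* Static builds link directly against the new implementation. */
    MAP_STATIC_SYMBOL(int rte_foo(uint32_t arg), rte_foo_v1705);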

* [dpdk-dev] [PATCH v4 04/17] net/avp: add PMD version map file
  @ 2017-03-13 19:16  3%       ` Allain Legacy
  2017-03-16 14:52  0%         ` Ferruh Yigit
    1 sibling, 1 reply; 200+ results
From: Allain Legacy @ 2017-03-13 19:16 UTC (permalink / raw)
  To: ferruh.yigit
  Cc: dev, ian.jolliffe, bruce.richardson, john.mcnamara, keith.wiles,
	thomas.monjalon, vincent.jardin, jerin.jacob, stephen, 3chas3

Adds a default ABI version file for the AVP PMD.

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
---
 drivers/net/avp/rte_pmd_avp_version.map | 4 ++++
 1 file changed, 4 insertions(+)
 create mode 100644 drivers/net/avp/rte_pmd_avp_version.map

diff --git a/drivers/net/avp/rte_pmd_avp_version.map b/drivers/net/avp/rte_pmd_avp_version.map
new file mode 100644
index 0000000..af8f3f4
--- /dev/null
+++ b/drivers/net/avp/rte_pmd_avp_version.map
@@ -0,0 +1,4 @@
+DPDK_17.05 {
+
+    local: *;
+};
-- 
1.8.3.1

^ permalink raw reply	[relevance 3%]
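
For context (an illustrative sketch, not part of the patch): once the
PMD starts exporting public symbols, the map file would list them in a
global section, following the same convention as the LPM map above:

    DPDK_17.05 {
        global:

        rte_pmd_avp_hypothetical_fn;

        local: *;
    };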

* Re: [dpdk-dev] [PATCH v9 09/18] lib: add symbol versioning to distributor
  2017-03-13 11:01  0%                 ` Van Haaren, Harry
@ 2017-03-13 11:02  0%                   ` Hunt, David
  0 siblings, 0 replies; 200+ results
From: Hunt, David @ 2017-03-13 11:02 UTC (permalink / raw)
  To: Van Haaren, Harry, Richardson, Bruce; +Cc: dev


On 13/3/2017 11:01 AM, Van Haaren, Harry wrote:
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Hunt, David
>> Subject: Re: [dpdk-dev] [PATCH v9 09/18] lib: add symbol versioning to distributor
>>
>> On 10/3/2017 4:22 PM, Bruce Richardson wrote:
>>> On Mon, Mar 06, 2017 at 09:10:24AM +0000, David Hunt wrote:
>>>> Also bumped up the ABI version number in the Makefile
>>>>
>>>> Signed-off-by: David Hunt <david.hunt@intel.com>
>>>> ---
>>>>    lib/librte_distributor/Makefile                    |  2 +-
>>>>    lib/librte_distributor/rte_distributor.c           | 57 +++++++++++---
>>>>    lib/librte_distributor/rte_distributor_v1705.h     | 89 ++++++++++++++++++++++
>>> A file named rte_distributor_v1705.h was added in patch 4, then deleted
>>> in patch 7, and now added again here. Seems a lot of churn.
>>>
>>> /Bruce
>>>
>> The first introduction of this file is what will become the public
>> header. For successful compilation,
>> this cannot be called rte_distributor.h until the symbol versioning
>> patch, at which stage I will
>> rename the file, and introduce the symbol versioned header at the same
>> time. In the next patch
>> I'll rename this version of the files as rte_distributor_public.h to
>> make this clearer.
>
> Suggestion to use rte_distributor_next.h instead of public?
> Public doesn't indicate if it's old or new, while next would make that clearer IMO :)

Good call, will use "_next". Its clearer.
Thanks,
Dave.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v9 09/18] lib: add symbol versioning to distributor
  2017-03-13 10:28  0%               ` Hunt, David
@ 2017-03-13 11:01  0%                 ` Van Haaren, Harry
  2017-03-13 11:02  0%                   ` Hunt, David
  0 siblings, 1 reply; 200+ results
From: Van Haaren, Harry @ 2017-03-13 11:01 UTC (permalink / raw)
  To: Hunt, David, Richardson, Bruce; +Cc: dev

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Hunt, David
> Subject: Re: [dpdk-dev] [PATCH v9 09/18] lib: add symbol versioning to distributor
> 
> On 10/3/2017 4:22 PM, Bruce Richardson wrote:
> > On Mon, Mar 06, 2017 at 09:10:24AM +0000, David Hunt wrote:
> >> Also bumped up the ABI version number in the Makefile
> >>
> >> Signed-off-by: David Hunt <david.hunt@intel.com>
> >> ---
> >>   lib/librte_distributor/Makefile                    |  2 +-
> >>   lib/librte_distributor/rte_distributor.c           | 57 +++++++++++---
> >>   lib/librte_distributor/rte_distributor_v1705.h     | 89 ++++++++++++++++++++++
> > A file named rte_distributor_v1705.h was added in patch 4, then deleted
> > in patch 7, and now added again here. Seems a lot of churn.
> >
> > /Bruce
> >
> 
> The first introduction of this file is what will become the public
> header. For successful compilation,
> this cannot be called rte_distributor.h until the symbol versioning
> patch, at which stage I will
> rename the file, and introduce the symbol versioned header at the same
> time. In the next patch
> I'll rename this version of the files as rte_distributor_public.h to
> make this clearer.


Suggestion to use rte_distributor_next.h instead of public?
Public doesn't indicate if it's old or new, while next would make that clearer IMO :)

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v9 09/18] lib: add symbol versioning to distributor
  2017-03-10 16:22  0%             ` Bruce Richardson
  2017-03-13 10:17  0%               ` Hunt, David
@ 2017-03-13 10:28  0%               ` Hunt, David
  2017-03-13 11:01  0%                 ` Van Haaren, Harry
  1 sibling, 1 reply; 200+ results
From: Hunt, David @ 2017-03-13 10:28 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev


On 10/3/2017 4:22 PM, Bruce Richardson wrote:
> On Mon, Mar 06, 2017 at 09:10:24AM +0000, David Hunt wrote:
>> Also bumped up the ABI version number in the Makefile
>>
>> Signed-off-by: David Hunt <david.hunt@intel.com>
>> ---
>>   lib/librte_distributor/Makefile                    |  2 +-
>>   lib/librte_distributor/rte_distributor.c           | 57 +++++++++++---
>>   lib/librte_distributor/rte_distributor_v1705.h     | 89 ++++++++++++++++++++++
> A file named rte_distributor_v1705.h was added in patch 4, then deleted
> in patch 7, and now added again here. Seems a lot of churn.
>
> /Bruce
>

The first introduction of this file is what will become the public 
header. For successful compilation,
this cannot be called rte_distributor.h until the symbol versioning 
patch, at which stage I will
rename the file, and introduce the symbol versioned header at the same 
time. In the next patch
I'll rename this version of the files as rte_distributor_public.h to 
make this clearer.

Regards,
Dave.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v9 09/18] lib: add symbol versioning to distributor
  2017-03-10 16:22  0%             ` Bruce Richardson
@ 2017-03-13 10:17  0%               ` Hunt, David
  2017-03-13 10:28  0%               ` Hunt, David
  1 sibling, 0 replies; 200+ results
From: Hunt, David @ 2017-03-13 10:17 UTC (permalink / raw)
  To: Richardson, Bruce; +Cc: dev



-----Original Message-----
From: Richardson, Bruce 
Sent: Friday, 10 March, 2017 4:22 PM
To: Hunt, David <david.hunt@intel.com>
Cc: dev@dpdk.org
Subject: Re: [PATCH v9 09/18] lib: add symbol versioning to distributor

On Mon, Mar 06, 2017 at 09:10:24AM +0000, David Hunt wrote:
> Also bumped up the ABI version number in the Makefile
> 
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>  lib/librte_distributor/Makefile                    |  2 +-
>  lib/librte_distributor/rte_distributor.c           | 57 +++++++++++---
>  lib/librte_distributor/rte_distributor_v1705.h     | 89 ++++++++++++++++++++++

A file named rte_distributor_v1705.h was added in patch 4, then deleted in patch 7, and now added again here. Seems a lot of churn.

/Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v9 09/18] lib: add symbol versioning to distributor
  2017-03-06  9:10  2%           ` [dpdk-dev] [PATCH v9 09/18] " David Hunt
@ 2017-03-10 16:22  0%             ` Bruce Richardson
  2017-03-13 10:17  0%               ` Hunt, David
  2017-03-13 10:28  0%               ` Hunt, David
  0 siblings, 2 replies; 200+ results
From: Bruce Richardson @ 2017-03-10 16:22 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

On Mon, Mar 06, 2017 at 09:10:24AM +0000, David Hunt wrote:
> Also bumped up the ABI version number in the Makefile
> 
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>  lib/librte_distributor/Makefile                    |  2 +-
>  lib/librte_distributor/rte_distributor.c           | 57 +++++++++++---
>  lib/librte_distributor/rte_distributor_v1705.h     | 89 ++++++++++++++++++++++

A file named rte_distributor_v1705.h was added in patch 4, then deleted
in patch 7, and now added again here. Seems a lot of churn.

/Bruce

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v11 3/7] lib: add bitrate statistics library
    2017-03-09 16:25  1% ` [dpdk-dev] [PATCH v11 1/7] lib: add information metrics library Remy Horton
@ 2017-03-09 16:25  2% ` Remy Horton
  1 sibling, 0 replies; 200+ results
From: Remy Horton @ 2017-03-09 16:25 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon

This patch adds a library that calculates peak and average data-rate
statistics for Ethernet devices. These statistics are reported using
the metrics library.

Signed-off-by: Remy Horton <remy.horton@intel.com>
---
 MAINTAINERS                                        |   4 +
 config/common_base                                 |   5 +
 doc/api/doxy-api-index.md                          |   1 +
 doc/api/doxy-api.conf                              |   1 +
 doc/guides/prog_guide/metrics_lib.rst              |  65 ++++++++++
 doc/guides/rel_notes/release_17_02.rst             |   1 +
 doc/guides/rel_notes/release_17_05.rst             |   5 +
 lib/Makefile                                       |   1 +
 lib/librte_bitratestats/Makefile                   |  53 ++++++++
 lib/librte_bitratestats/rte_bitrate.c              | 141 +++++++++++++++++++++
 lib/librte_bitratestats/rte_bitrate.h              |  80 ++++++++++++
 .../rte_bitratestats_version.map                   |   9 ++
 mk/rte.app.mk                                      |   1 +
 13 files changed, 367 insertions(+)
 create mode 100644 lib/librte_bitratestats/Makefile
 create mode 100644 lib/librte_bitratestats/rte_bitrate.c
 create mode 100644 lib/librte_bitratestats/rte_bitrate.h
 create mode 100644 lib/librte_bitratestats/rte_bitratestats_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 66478f3..8abf4fd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -639,6 +639,10 @@ Metrics
 M: Remy Horton <remy.horton@intel.com>
 F: lib/librte_metrics/
 
+Bit-rate statistics
+M: Remy Horton <remy.horton@intel.com>
+F: lib/librte_bitratestats/
+
 
 Test Applications
 -----------------
diff --git a/config/common_base b/config/common_base
index cea055f..d700ee0 100644
--- a/config/common_base
+++ b/config/common_base
@@ -630,3 +630,8 @@ CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
 # Compile the crypto performance application
 #
 CONFIG_RTE_APP_CRYPTO_PERF=y
+
+#
+# Compile the bitrate statistics library
+#
+CONFIG_RTE_LIBRTE_BITRATE=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 26a26b7..8492bce 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -157,4 +157,5 @@ There are many libraries, so their headers may be grouped by topics:
   [ABI compat]         (@ref rte_compat.h),
   [keepalive]          (@ref rte_keepalive.h),
   [device metrics]     (@ref rte_metrics.h),
+  [bitrate statistics] (@ref rte_bitrate.h),
   [version]            (@ref rte_version.h)
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index fbbcf8e..c4b3b68 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -36,6 +36,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_eal/common/include \
                           lib/librte_eal/common/include/generic \
                           lib/librte_acl \
+                          lib/librte_bitratestats \
                           lib/librte_cfgfile \
                           lib/librte_cmdline \
                           lib/librte_compat \
diff --git a/doc/guides/prog_guide/metrics_lib.rst b/doc/guides/prog_guide/metrics_lib.rst
index 87f806d..1c2a28f 100644
--- a/doc/guides/prog_guide/metrics_lib.rst
+++ b/doc/guides/prog_guide/metrics_lib.rst
@@ -178,3 +178,68 @@ print out all metrics for a given port:
         free(metrics);
         free(names);
     }
+
+
+Bit-rate statistics library
+---------------------------
+
+The bit-rate library calculates the mean, exponentially-weighted moving
+average (EWMA) and peak bit-rates for each active port (i.e. network device).
+These statistics are reported via the metrics library using the
+following names:
+
+    - ``mean_bits_in``: Average inbound bit-rate
+    - ``mean_bits_out``:  Average outbound bit-rate
+    - ``ewma_bits_in``: Average inbound bit-rate (EWMA smoothed)
+    - ``ewma_bits_out``:  Average outbound bit-rate (EWMA smoothed)
+    - ``peak_bits_in``:  Peak inbound bit-rate
+    - ``peak_bits_out``:  Peak outbound bit-rate
+
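+The EWMA figures use the standard exponentially-weighted moving-average
+recurrence, shown here for illustration (the smoothing factor applied by
+the library is an implementation detail)::
+
+    ewma = alpha * sample + (1 - alpha) * ewma
+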
+Once initialised and clocked at the appropriate frequency, these
+statistics can be obtained by querying the metrics library.
+
+Initialization
+~~~~~~~~~~~~~~
+
+Before it is used, the bit-rate statistics library has to be initialised
+by calling ``rte_stats_bitrate_create()``, which will return a bit-rate
+calculation object. Since the bit-rate library uses the metrics library
+to report the calculated statistics, the bit-rate library then needs to
+register the calculated statistics with the metrics library. This is
+done using the helper function ``rte_stats_bitrate_reg()``.
+
+.. code-block:: c
+
+    struct rte_stats_bitrates *bitrate_data;
+
+    bitrate_data = rte_stats_bitrate_create();
+    if (bitrate_data == NULL)
+        rte_exit(EXIT_FAILURE, "Could not allocate bit-rate data.\n");
+    rte_stats_bitrate_reg(bitrate_data);
+
+Controlling the sampling rate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Since the library works by periodic sampling but does not use an
+internal thread, the application has to periodically call
+``rte_stats_bitrate_calc()``. The frequency at which this function
+is called should be the intended sampling rate required for the
+calculated statistics. For instance, if per-second statistics are
+desired, this function should be called once a second.
+
+.. code-block:: c
+
+    tics_datum = rte_rdtsc();
+    tics_per_1sec = rte_get_timer_hz();
+
+    while (1) {
+        /* ... */
+        tics_current = rte_rdtsc();
+        if (tics_current - tics_datum >= tics_per_1sec) {
+            /* Periodic bit-rate calculation */
+            for (idx_port = 0; idx_port < cnt_ports; idx_port++)
+                rte_stats_bitrate_calc(bitrate_data, idx_port);
+            tics_datum = tics_current;
+        }
+        /* ... */
+    }
diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 8bd706f..63786df 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -353,6 +353,7 @@ The libraries prepended with a plus sign were incremented in this version.
 .. code-block:: diff
 
      librte_acl.so.2
+   + librte_bitratestats.so.1
      librte_cfgfile.so.2
      librte_cmdline.so.2
      librte_cryptodev.so.2
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 3ed809e..83c83b2 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -69,6 +69,11 @@ Resolved Issues
   reporting mechanism that is independent of other libraries such
   as ethdev.
 
+* **Added bit-rate calculation library.**
+
+  A library that can be used to calculate device bit-rates. Calculated
+  bit-rates are reported using the metrics library.
+
 
 EAL
 ~~~
diff --git a/lib/Makefile b/lib/Makefile
index 29f6a81..ecc54c0 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
 DIRS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += librte_ip_frag
 DIRS-$(CONFIG_RTE_LIBRTE_JOBSTATS) += librte_jobstats
 DIRS-$(CONFIG_RTE_LIBRTE_METRICS) += librte_metrics
+DIRS-$(CONFIG_RTE_LIBRTE_BITRATE) += librte_bitratestats
 DIRS-$(CONFIG_RTE_LIBRTE_POWER) += librte_power
 DIRS-$(CONFIG_RTE_LIBRTE_METER) += librte_meter
 DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += librte_sched
diff --git a/lib/librte_bitratestats/Makefile b/lib/librte_bitratestats/Makefile
new file mode 100644
index 0000000..743b62c
--- /dev/null
+++ b/lib/librte_bitratestats/Makefile
@@ -0,0 +1,53 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_bitratestats.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+EXPORT_MAP := rte_bitratestats_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_BITRATE) := rte_bitrate.c
+
+# Install header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_BITRATE)-include += rte_bitrate.h
+
+DEPDIRS-$(CONFIG_RTE_LIBRTE_BITRATE) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_BITRATE) += lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_BITRATE) += lib/librte_metrics
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_bitratestats/rte_bitrate.c b/lib/librte_bitratestats/rte_bitrate.c
new file mode 100644
index 0000000..3252598
--- /dev/null
+++ b/lib/librte_bitratestats/rte_bitrate.c
@@ -0,0 +1,141 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_common.h>
+#include <rte_ethdev.h>
+#include <rte_malloc.h>
+#include <rte_metrics.h>
+#include <rte_bitrate.h>
+
+/*
+ * Persistent bit-rate data.
+ * @internal
+ */
+struct rte_stats_bitrate {
+	uint64_t last_ibytes;
+	uint64_t last_obytes;
+	uint64_t peak_ibits;
+	uint64_t peak_obits;
+	uint64_t mean_ibits;
+	uint64_t mean_obits;
+	uint64_t ewma_ibits;
+	uint64_t ewma_obits;
+};
+
+struct rte_stats_bitrates {
+	struct rte_stats_bitrate port_stats[RTE_MAX_ETHPORTS];
+	uint16_t id_stats_set;
+};
+
+struct rte_stats_bitrates *
+rte_stats_bitrate_create(void)
+{
+	return rte_zmalloc(NULL, sizeof(struct rte_stats_bitrates),
+		RTE_CACHE_LINE_SIZE);
+}
+
+int
+rte_stats_bitrate_reg(struct rte_stats_bitrates *bitrate_data)
+{
+	const char * const names[] = {
+		"ewma_bits_in", "ewma_bits_out",
+		"mean_bits_in", "mean_bits_out",
+		"peak_bits_in", "peak_bits_out",
+	};
+	int return_value;
+
+	return_value = rte_metrics_reg_names(&names[0], 6);
+	if (return_value >= 0)
+		bitrate_data->id_stats_set = return_value;
+	return return_value;
+}
+
+int
+rte_stats_bitrate_calc(struct rte_stats_bitrates *bitrate_data,
+	uint8_t port_id)
+{
+	struct rte_stats_bitrate *port_data;
+	struct rte_eth_stats eth_stats;
+	int ret_code;
+	uint64_t cnt_bits;
+	int64_t delta;
+	const int64_t alpha_percent = 20;
+	uint64_t values[6];
+
+	ret_code = rte_eth_stats_get(port_id, &eth_stats);
+	if (ret_code != 0)
+		return ret_code;
+
+	port_data = &bitrate_data->port_stats[port_id];
+
+	/* Incoming bitrate. This is an iteratively calculated EWMA
+	 * (Exponentially Weighted Moving Average) that uses a
+	 * weighting factor of alpha_percent. An unsmoothed mean
+	 * for just the current time delta is also calculated for the
+	 * benefit of people who don't understand signal processing.
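+	 * The recurrence used below is ewma += (sample - ewma) * alpha,
+	 * with alpha = alpha_percent / 100.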
+	 */
+	cnt_bits = (eth_stats.ibytes - port_data->last_ibytes) << 3;
+	port_data->last_ibytes = eth_stats.ibytes;
+	if (cnt_bits > port_data->peak_ibits)
+		port_data->peak_ibits = cnt_bits;
+	delta = cnt_bits;
+	delta -= port_data->ewma_ibits;
+	/* The +-50 fixes integer rounding during division */
+	if (delta > 0)
+		delta = (delta * alpha_percent + 50) / 100;
+	else
+		delta = (delta * alpha_percent - 50) / 100;
+	port_data->ewma_ibits += delta;
+	port_data->mean_ibits = cnt_bits;
+
+	/* Outgoing bitrate (also EWMA) */
+	cnt_bits = (eth_stats.obytes - port_data->last_obytes) << 3;
+	port_data->last_obytes = eth_stats.obytes;
+	if (cnt_bits > port_data->peak_obits)
+		port_data->peak_obits = cnt_bits;
+	delta = cnt_bits;
+	delta -= port_data->ewma_obits;
+	/* The +-50 fixes integer rounding during division (as above) */
+	if (delta > 0)
+		delta = (delta * alpha_percent + 50) / 100;
+	else
+		delta = (delta * alpha_percent - 50) / 100;
+	port_data->ewma_obits += delta;
+	port_data->mean_obits = cnt_bits;
+
+	values[0] = port_data->ewma_ibits;
+	values[1] = port_data->ewma_obits;
+	values[2] = port_data->mean_ibits;
+	values[3] = port_data->mean_obits;
+	values[4] = port_data->peak_ibits;
+	values[5] = port_data->peak_obits;
+	rte_metrics_update_values(port_id, bitrate_data->id_stats_set,
+		values, 6);
+	return 0;
+}
diff --git a/lib/librte_bitratestats/rte_bitrate.h b/lib/librte_bitratestats/rte_bitrate.h
new file mode 100644
index 0000000..564e4f7
--- /dev/null
+++ b/lib/librte_bitratestats/rte_bitrate.h
@@ -0,0 +1,80 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BITRATE_H_
+#define _RTE_BITRATE_H_
+
+#include <stdint.h>
+
+/**
+ *  Bitrate statistics data structure.
+ *  This data structure is intentionally opaque.
+ */
+struct rte_stats_bitrates;
+
+
+/**
+ * Allocate a bitrate statistics structure
+ *
+ * @return
+ *   - Pointer to structure on success
+ *   - NULL on error (zmalloc failure)
+ */
+struct rte_stats_bitrates *rte_stats_bitrate_create(void);
+
+
+/**
+ * Register bitrate statistics with the metric library.
+ *
+ * @param bitrate_data
+ *   Pointer allocated by rte_stats_create()
+ *
+ * @return
+ *   Zero on success
+ *   Negative on error
+ */
+int rte_stats_bitrate_reg(struct rte_stats_bitrates *bitrate_data);
+
+
+/**
+ * Calculate statistics for current time window. The period with which
+ * this function is called should be the intended sampling window width.
+ *
+ * @param bitrate_data
+ *   Bitrate statistics data pointer
+ *
+ * @param port_id
+ *   Port id to calculate statistics for
+ *
+ * @return
+ *  - Zero on success
+ *  - Negative value on error
+ */
+int rte_stats_bitrate_calc(struct rte_stats_bitrates *bitrate_data,
+	uint8_t port_id);
+
+#endif /* _RTE_BITRATE_H_ */
diff --git a/lib/librte_bitratestats/rte_bitratestats_version.map b/lib/librte_bitratestats/rte_bitratestats_version.map
new file mode 100644
index 0000000..fe74544
--- /dev/null
+++ b/lib/librte_bitratestats/rte_bitratestats_version.map
@@ -0,0 +1,9 @@
+DPDK_17.05 {
+	global:
+
+	rte_stats_bitrate_calc;
+	rte_stats_bitrate_create;
+	rte_stats_bitrate_reg;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 98eb052..39c988a 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -100,6 +100,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
 _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METRICS)        += -lrte_metrics
+_LDLIBS-$(CONFIG_RTE_LIBRTE_BITRATE)        += -lrte_bitratestats
 
 
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
-- 
2.5.5

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v11 1/7] lib: add information metrics library
  @ 2017-03-09 16:25  1% ` Remy Horton
  2017-03-09 16:25  2% ` [dpdk-dev] [PATCH v11 3/7] lib: add bitrate statistics library Remy Horton
  1 sibling, 0 replies; 200+ results
From: Remy Horton @ 2017-03-09 16:25 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon

This patch adds a new information metrics library. This Metrics
library implements a mechanism by which producers can publish
numeric information for later querying by consumers. Metrics
themselves are statistics that are not generated by PMDs, and
hence are not reported via ethdev extended statistics.

Metric information is populated using a push model, where
producers update the values contained within the metric
library by calling an update function on the relevant metrics.
Consumers receive metric information by querying the central
metric data, which is held in shared memory.

Signed-off-by: Remy Horton <remy.horton@intel.com>
---
 MAINTAINERS                                |   4 +
 config/common_base                         |   5 +
 doc/api/doxy-api-index.md                  |   1 +
 doc/api/doxy-api.conf                      |   1 +
 doc/guides/prog_guide/index.rst            |   1 +
 doc/guides/prog_guide/metrics_lib.rst      | 180 +++++++++++++++++
 doc/guides/rel_notes/release_17_02.rst     |   1 +
 doc/guides/rel_notes/release_17_05.rst     |   8 +
 lib/Makefile                               |   1 +
 lib/librte_metrics/Makefile                |  51 +++++
 lib/librte_metrics/rte_metrics.c           | 299 +++++++++++++++++++++++++++++
 lib/librte_metrics/rte_metrics.h           | 240 +++++++++++++++++++++++
 lib/librte_metrics/rte_metrics_version.map |  13 ++
 mk/rte.app.mk                              |   2 +
 14 files changed, 807 insertions(+)
 create mode 100644 doc/guides/prog_guide/metrics_lib.rst
 create mode 100644 lib/librte_metrics/Makefile
 create mode 100644 lib/librte_metrics/rte_metrics.c
 create mode 100644 lib/librte_metrics/rte_metrics.h
 create mode 100644 lib/librte_metrics/rte_metrics_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 5030c1c..66478f3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -635,6 +635,10 @@ F: lib/librte_jobstats/
 F: examples/l2fwd-jobstats/
 F: doc/guides/sample_app_ug/l2_forward_job_stats.rst
 
+Metrics
+M: Remy Horton <remy.horton@intel.com>
+F: lib/librte_metrics/
+
 
 Test Applications
 -----------------
diff --git a/config/common_base b/config/common_base
index aeee13e..cea055f 100644
--- a/config/common_base
+++ b/config/common_base
@@ -501,6 +501,11 @@ CONFIG_RTE_LIBRTE_EFD=y
 CONFIG_RTE_LIBRTE_JOBSTATS=y
 
 #
+# Compile the device metrics library
+#
+CONFIG_RTE_LIBRTE_METRICS=y
+
+#
 # Compile librte_lpm
 #
 CONFIG_RTE_LIBRTE_LPM=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index eb39f69..26a26b7 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -156,4 +156,5 @@ There are many libraries, so their headers may be grouped by topics:
   [common]             (@ref rte_common.h),
   [ABI compat]         (@ref rte_compat.h),
   [keepalive]          (@ref rte_keepalive.h),
+  [device metrics]     (@ref rte_metrics.h),
   [version]            (@ref rte_version.h)
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index fdcf13c..fbbcf8e 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -52,6 +52,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_mbuf \
                           lib/librte_mempool \
                           lib/librte_meter \
+                          lib/librte_metrics \
                           lib/librte_net \
                           lib/librte_pdump \
                           lib/librte_pipeline \
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 77f427e..2a69844 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -62,6 +62,7 @@ Programmer's Guide
     packet_classif_access_ctrl
     packet_framework
     vhost_lib
+    metrics_lib
     port_hotplug_framework
     source_org
     dev_kit_build_system
diff --git a/doc/guides/prog_guide/metrics_lib.rst b/doc/guides/prog_guide/metrics_lib.rst
new file mode 100644
index 0000000..87f806d
--- /dev/null
+++ b/doc/guides/prog_guide/metrics_lib.rst
@@ -0,0 +1,180 @@
+..  BSD LICENSE
+    Copyright(c) 2017 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _Metrics_Library:
+
+Metrics Library
+===============
+
+The Metrics library implements a mechanism by which *producers* can
+publish numeric information for later querying by *consumers*. In
+practice producers will typically be other libraries or primary
+processes, whereas consumers will typically be applications.
+
+Metrics themselves are statistics that are not generated by PMDs. Metric
+information is populated using a push model, where producers update the
+values contained within the metric library by calling an update function
+on the relevant metrics. Consumers receive metric information by querying
+the central metric data, which is held in shared memory.
+
+For each metric, a separate value is maintained for each port id, and
+when publishing metric values the producers need to specify which port is
+being updated. In addition there is a special id ``RTE_METRICS_GLOBAL``
+that is intended for global statistics that are not associated with any
+individual device. Since the metrics library is self-contained, the only
+restriction on port numbers is that they are less than
+``RTE_MAX_ETHPORTS``; there is no requirement for the ports to
+actually exist.
+
+Initialising the library
+------------------------
+
+Before the library can be used, it has to be initialized by calling
+``rte_metrics_init()`` which sets up the metric store in shared memory.
+This is where producers will publish metric information to, and where
+consumers will query it from.
+
+.. code-block:: c
+
+    rte_metrics_init(rte_socket_id());
+
+This function **must** be called from a primary process, but otherwise
+producers and consumers can be in either primary or secondary processes.
+
+Registering metrics
+-------------------
+
+Metrics must first be *registered*, which is the way producers declare
+the names of the metrics they will be publishing. Registration can either
+be done individually, or a set of metrics can be registered as a group.
+Individual registration is done using ``rte_metrics_reg_name()``:
+
+.. code-block:: c
+
+    id_1 = rte_metrics_reg_name("mean_bits_in");
+    id_2 = rte_metrics_reg_name("mean_bits_out");
+    id_3 = rte_metrics_reg_name("peak_bits_in");
+    id_4 = rte_metrics_reg_name("peak_bits_out");
+
+or alternatively, a set of metrics can be registered together using
+``rte_metrics_reg_names()``:
+
+.. code-block:: c
+
+    const char * const names[] = {
+        "mean_bits_in", "mean_bits_out",
+        "peak_bits_in", "peak_bits_out",
+    };
+    id_set = rte_metrics_reg_names(&names[0], 4);
+
+If the return value is negative, it means registration failed. Otherwise
+the return value is the *key* for the metric, which is used when updating
+values. A table mapping together these key values and the metrics' names
+can be obtained using ``rte_metrics_get_names()``.
+
+Updating metric values
+----------------------
+
+Once registered, producers can update the metric for a given port using
+the ``rte_metrics_update_value()`` function. This uses the metric key
+that is returned when registering the metric, and can also be looked up
+using ``rte_metrics_get_names()``.
+
+.. code-block:: c
+
+    rte_metrics_update_value(port_id, id_1, values[0]);
+    rte_metrics_update_value(port_id, id_2, values[1]);
+    rte_metrics_update_value(port_id, id_3, values[2]);
+    rte_metrics_update_value(port_id, id_4, values[3]);
+
+If metrics were registered as a single set, they can either be updated
+individually using ``rte_metrics_update_value()``, or updated together
+using the ``rte_metrics_update_values()`` function:
+
+.. code-block:: c
+
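+    /* Update each metric in the set individually ... */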
+    rte_metrics_update_value(port_id, id_set, values[0]);
+    rte_metrics_update_value(port_id, id_set + 1, values[1]);
+    rte_metrics_update_value(port_id, id_set + 2, values[2]);
+    rte_metrics_update_value(port_id, id_set + 3, values[3]);
+
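+    /* ... or update the whole set with a single call */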
+    rte_metrics_update_values(port_id, id_set, values, 4);
+
+Note that ``rte_metrics_update_values()`` cannot be used to update
+metric values from *multiple* *sets*, as there is no guarantee two
+sets registered one after the other have contiguous id values.
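+
+For instance, assuming two sets registered one after the other (a
+hypothetical sketch; ``names_a``, ``names_b`` and ``values`` are
+placeholders):
+
+.. code-block:: c
+
+    id_set_a = rte_metrics_reg_names(names_a, 2);
+    id_set_b = rte_metrics_reg_names(names_b, 2);
+
+    /* Fine: the update stays within set A */
+    rte_metrics_update_values(port_id, id_set_a, values, 2);
+
+    /* Error: the update would cross from set A into set B (-ERANGE) */
+    rte_metrics_update_values(port_id, id_set_a, values, 4);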
+
+Querying metrics
+----------------
+
+Consumers can obtain metric values by querying the metrics library using
+the ``rte_metrics_get_values()`` function that return an array of
+``struct rte_metric_value``. Each entry within this array contains a metric
+value and its associated key. A key-name mapping can be obtained using the
+``rte_metrics_get_names()`` function that returns an array of
+``struct rte_metric_name`` that is indexed by the key. The following will
+print out all metrics for a given port:
+
+.. code-block:: c
+
+    void print_metrics(int port_id) {
+        struct rte_metric_value *metrics;
+        struct rte_metric_name *names;
+        int len, ret, i;
+
+        len = rte_metrics_get_names(NULL, 0);
+        if (len < 0) {
+            printf("Cannot get metrics count\n");
+            return;
+        }
+        if (len == 0) {
+            printf("No metrics to display (none have been registered)\n");
+            return;
+        }
+        metrics = malloc(sizeof(struct rte_metric_value) * len);
+        names = malloc(sizeof(struct rte_metric_name) * len);
+        if (metrics == NULL || names == NULL) {
+            printf("Cannot allocate memory\n");
+            free(metrics);
+            free(names);
+            return;
+        }
+        ret = rte_metrics_get_values(port_id, metrics, len);
+        if (ret < 0 || ret > len) {
+            printf("Cannot get metrics values\n");
+            free(metrics);
+            free(names);
+            return;
+        }
+        printf("Metrics for port %i:\n", port_id);
+        for (i = 0; i < len; i++)
+            printf("  %s: %"PRIu64"\n",
+                names[metrics[i].key].name, metrics[i].value);
+        free(metrics);
+        free(names);
+    }
diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 357965a..8bd706f 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -368,6 +368,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_mbuf.so.2
      librte_mempool.so.2
      librte_meter.so.1
+   + librte_metrics.so.1
      librte_net.so.1
      librte_pdump.so.1
      librte_pipeline.so.3
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index e25ea9f..3ed809e 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -61,6 +61,14 @@ Resolved Issues
    Also, make sure to start the actual text at the margin.
    =========================================================
 
+* **Added information metric library.**
+
+  A library that allows information metrics to be added and updated
+  by producers, typically other libraries, for later retrieval by
+  consumers such as applications. It is intended to provide a
+  reporting mechanism that is independent of other libraries such
+  as ethdev.
+
 
 EAL
 ~~~
diff --git a/lib/Makefile b/lib/Makefile
index 4178325..29f6a81 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -49,6 +49,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
 DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
 DIRS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += librte_ip_frag
 DIRS-$(CONFIG_RTE_LIBRTE_JOBSTATS) += librte_jobstats
+DIRS-$(CONFIG_RTE_LIBRTE_METRICS) += librte_metrics
 DIRS-$(CONFIG_RTE_LIBRTE_POWER) += librte_power
 DIRS-$(CONFIG_RTE_LIBRTE_METER) += librte_meter
 DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += librte_sched
diff --git a/lib/librte_metrics/Makefile b/lib/librte_metrics/Makefile
new file mode 100644
index 0000000..8d6e23a
--- /dev/null
+++ b/lib/librte_metrics/Makefile
@@ -0,0 +1,51 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_metrics.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+EXPORT_MAP := rte_metrics_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_METRICS) := rte_metrics.c
+
+# Install header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_METRICS)-include += rte_metrics.h
+
+DEPDIRS-$(CONFIG_RTE_LIBRTE_METRICS) += lib/librte_eal
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_metrics/rte_metrics.c b/lib/librte_metrics/rte_metrics.c
new file mode 100644
index 0000000..aa9ec50
--- /dev/null
+++ b/lib/librte_metrics/rte_metrics.c
@@ -0,0 +1,299 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <string.h>
+#include <sys/queue.h>
+
+#include <rte_common.h>
+#include <rte_malloc.h>
+#include <rte_metrics.h>
+#include <rte_lcore.h>
+#include <rte_memzone.h>
+#include <rte_spinlock.h>
+
+#define RTE_METRICS_MAX_METRICS 256
+#define RTE_METRICS_MEMZONE_NAME "RTE_METRICS"
+
+/**
+ * Internal stats metadata and value entry.
+ *
+ * @internal
+ */
+struct rte_metrics_meta_s {
+	/** Name of metric */
+	char name[RTE_METRICS_MAX_NAME_LEN];
+	/** Current value for metric */
+	uint64_t value[RTE_MAX_ETHPORTS];
+	/** Used for global metrics */
+	uint64_t global_value;
+	/** Index of next root element (zero for none) */
+	uint16_t idx_next_set;
+	/** Index of next metric in set (zero for none) */
+	uint16_t idx_next_stat;
+};
+
+/**
+ * Internal stats info structure.
+ *
+ * @internal
+ * Offsets into metadata are used instead of pointers because ASLR
+ * means that having the same physical addresses in different
+ * processes is not guaranteed.
+ */
+struct rte_metrics_data_s {
+	/**   Index of last metadata entry with valid data.
+	 * This value is not valid if cnt_stats is zero.
+	 */
+	uint16_t idx_last_set;
+	/**   Number of metrics. */
+	uint16_t cnt_stats;
+	/** Metric data memory block. */
+	struct rte_metrics_meta_s metadata[RTE_METRICS_MAX_METRICS];
+	/** Metric data access lock */
+	rte_spinlock_t lock;
+};
+
+void
+rte_metrics_init(int socket_id)
+{
+	struct rte_metrics_data_s *stats;
+	const struct rte_memzone *memzone;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return;
+
+	memzone = rte_memzone_lookup(RTE_METRICS_MEMZONE_NAME);
+	if (memzone != NULL)
+		return;
+	memzone = rte_memzone_reserve(RTE_METRICS_MEMZONE_NAME,
+		sizeof(struct rte_metrics_data_s), socket_id, 0);
+	if (memzone == NULL)
+		rte_exit(EXIT_FAILURE, "Unable to allocate stats memzone\n");
+	stats = memzone->addr;
+	memset(stats, 0, sizeof(struct rte_metrics_data_s));
+	rte_spinlock_init(&stats->lock);
+}
+
+int
+rte_metrics_reg_name(const char *name)
+{
+	const char * const list_names[] = {name};
+
+	return rte_metrics_reg_names(list_names, 1);
+}
+
+int
+rte_metrics_reg_names(const char * const *names, uint16_t cnt_names)
+{
+	struct rte_metrics_meta_s *entry;
+	struct rte_metrics_data_s *stats;
+	const struct rte_memzone *memzone;
+	uint16_t idx_name;
+	uint16_t idx_base;
+
+	/* Some sanity checks */
+	if (cnt_names < 1 || names == NULL)
+		return -EINVAL;
+
+	memzone = rte_memzone_lookup(RTE_METRICS_MEMZONE_NAME);
+	if (memzone == NULL)
+		return -EIO;
+	stats = memzone->addr;
+
+	if (stats->cnt_stats + cnt_names >= RTE_METRICS_MAX_METRICS)
+		return -ENOMEM;
+
+	rte_spinlock_lock(&stats->lock);
+
+	/* Overwritten later if this is actually the first set */
+	stats->metadata[stats->idx_last_set].idx_next_set = stats->cnt_stats;
+
+	stats->idx_last_set = idx_base = stats->cnt_stats;
+
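+	/* Chain each entry to the next; the final entry of the set is
+	 * terminated (zeroed) just after this loop.
+	 */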
+	for (idx_name = 0; idx_name < cnt_names; idx_name++) {
+		entry = &stats->metadata[idx_name + stats->cnt_stats];
+		strncpy(entry->name, names[idx_name],
+			RTE_METRICS_MAX_NAME_LEN);
+		memset(entry->value, 0, sizeof(entry->value));
+		entry->idx_next_stat = idx_name + stats->cnt_stats + 1;
+	}
+	entry->idx_next_stat = 0;
+	entry->idx_next_set = 0;
+	stats->cnt_stats += cnt_names;
+
+	rte_spinlock_unlock(&stats->lock);
+
+	return idx_base;
+}
+
+int
+rte_metrics_update_value(int port_id, uint16_t key, const uint64_t value)
+{
+	return rte_metrics_update_values(port_id, key, &value, 1);
+}
+
+int
+rte_metrics_update_values(int port_id,
+	uint16_t key,
+	const uint64_t *values,
+	uint32_t count)
+{
+	struct rte_metrics_meta_s *entry;
+	struct rte_metrics_data_s *stats;
+	const struct rte_memzone *memzone;
+	uint16_t idx_metric;
+	uint16_t idx_value;
+	uint16_t cnt_setsize;
+
+	if (port_id != RTE_METRICS_GLOBAL &&
+			(port_id < 0 || port_id >= RTE_MAX_ETHPORTS))
+		return -EINVAL;
+
+	memzone = rte_memzone_lookup(RTE_METRICS_MEMZONE_NAME);
+	if (memzone == NULL)
+		return -EIO;
+	stats = memzone->addr;
+
+	rte_spinlock_lock(&stats->lock);
+	idx_metric = key;
+	cnt_setsize = 1;
+	while (idx_metric < stats->cnt_stats) {
+		entry = &stats->metadata[idx_metric];
+		if (entry->idx_next_stat == 0)
+			break;
+		cnt_setsize++;
+		idx_metric++;
+	}
+	/* Check update does not cross set border */
+	if (count > cnt_setsize) {
+		rte_spinlock_unlock(&stats->lock);
+		return -ERANGE;
+	}
+
+	if (port_id == RTE_METRICS_GLOBAL)
+		for (idx_value = 0; idx_value < count; idx_value++) {
+			idx_metric = key + idx_value;
+			stats->metadata[idx_metric].global_value =
+				values[idx_value];
+		}
+	else
+		for (idx_value = 0; idx_value < count; idx_value++) {
+			idx_metric = key + idx_value;
+			stats->metadata[idx_metric].value[port_id] =
+				values[idx_value];
+		}
+	rte_spinlock_unlock(&stats->lock);
+	return 0;
+}
+
+int
+rte_metrics_get_names(struct rte_metric_name *names,
+	uint16_t capacity)
+{
+	struct rte_metrics_data_s *stats;
+	const struct rte_memzone *memzone;
+	uint16_t idx_name;
+	int return_value;
+
+	memzone = rte_memzone_lookup(RTE_METRICS_MEMZONE_NAME);
+	/* If not allocated, fail silently */
+	if (memzone == NULL)
+		return 0;
+
+	stats = memzone->addr;
+	rte_spinlock_lock(&stats->lock);
+	if (names != NULL) {
+		if (capacity < stats->cnt_stats) {
+			return_value = stats->cnt_stats;
+			rte_spinlock_unlock(&stats->lock);
+			return return_value;
+		}
+		for (idx_name = 0; idx_name < stats->cnt_stats; idx_name++)
+			strncpy(names[idx_name].name,
+				stats->metadata[idx_name].name,
+				RTE_METRICS_MAX_NAME_LEN);
+	}
+	return_value = stats->cnt_stats;
+	rte_spinlock_unlock(&stats->lock);
+	return return_value;
+}
+
+int
+rte_metrics_get_values(int port_id,
+	struct rte_metric_value *values,
+	uint16_t capacity)
+{
+	struct rte_metrics_meta_s *entry;
+	struct rte_metrics_data_s *stats;
+	const struct rte_memzone *memzone;
+	uint16_t idx_name;
+	int return_value;
+
+	if (port_id != RTE_METRICS_GLOBAL &&
+			(port_id < 0 || port_id >= RTE_MAX_ETHPORTS))
+		return -EINVAL;
+
+	memzone = rte_memzone_lookup(RTE_METRICS_MEMZONE_NAME);
+	/* If not allocated, fail silently */
+	if (memzone == NULL)
+		return 0;
+	stats = memzone->addr;
+	rte_spinlock_lock(&stats->lock);
+
+	if (values != NULL) {
+		if (capacity < stats->cnt_stats) {
+			return_value = stats->cnt_stats;
+			rte_spinlock_unlock(&stats->lock);
+			return return_value;
+		}
+		if (port_id == RTE_METRICS_GLOBAL)
+			for (idx_name = 0;
+					idx_name < stats->cnt_stats;
+					idx_name++) {
+				entry = &stats->metadata[idx_name];
+				values[idx_name].key = idx_name;
+				values[idx_name].value = entry->global_value;
+			}
+		else
+			for (idx_name = 0;
+					idx_name < stats->cnt_stats;
+					idx_name++) {
+				entry = &stats->metadata[idx_name];
+				values[idx_name].key = idx_name;
+				values[idx_name].value = entry->value[port_id];
+			}
+	}
+	return_value = stats->cnt_stats;
+	rte_spinlock_unlock(&stats->lock);
+	return return_value;
+}
diff --git a/lib/librte_metrics/rte_metrics.h b/lib/librte_metrics/rte_metrics.h
new file mode 100644
index 0000000..7458328
--- /dev/null
+++ b/lib/librte_metrics/rte_metrics.h
@@ -0,0 +1,240 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/**
+ * @file
+ *
+ * DPDK Metrics module
+ *
+ * Metrics are statistics that are not generated by PMDs, and hence
+ * are better reported through a mechanism that is independent from
+ * the ethdev-based extended statistics. Providers will typically
+ * be other libraries and consumers will typically be applications.
+ *
+ * Metric information is populated using a push model, where producers
+ * update the values contained within the metric library by calling
+ * an update function on the relevant metrics. Consumers receive
+ * metric information by querying the central metric data, which is
+ * held in shared memory. Currently only bulk querying of metrics
+ * by consumers is supported.
+ */
+
+#ifndef _RTE_METRICS_H_
+#define _RTE_METRICS_H_
+
+#include <stdint.h>
+
+/** Maximum length of metric name (including null-terminator) */
+#define RTE_METRICS_MAX_NAME_LEN 64
+
+/**
+ * Global metric special id.
+ *
+ * When used as the port_id parameter when calling
+ * rte_metrics_update_value() or rte_metrics_update_values(),
+ * the global metrics, which are not associated with any specific
+ * port (i.e. device), are updated.
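+ *
+ * For example (sketch), rte_metrics_update_value(RTE_METRICS_GLOBAL,
+ * key, value) updates the global copy of the metric identified by key.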
+ */
+#define RTE_METRICS_GLOBAL -1
+
+
+/**
+ * A name-key lookup for metrics.
+ *
+ * An array of this structure is returned by rte_metrics_get_names().
+ * The struct rte_metric_value references these names via their array index.
+ */
+struct rte_metric_name {
+	/** String describing metric */
+	char name[RTE_METRICS_MAX_NAME_LEN];
+};
+
+
+/**
+ * Metric value structure.
+ *
+ * This structure is used by rte_metrics_get_values() to return metrics,
+ * which are statistics that are not generated by PMDs. It maps a name key,
+ * which corresponds to an index in the array returned by
+ * rte_metrics_get_names().
+ */
+struct rte_metric_value {
+	/** Numeric identifier of metric. */
+	uint16_t key;
+	/** Value for metric */
+	uint64_t value;
+};
+
+
+/**
+ * Initializes metric module. This function must be called from
+ * a primary process before metrics are used.
+ *
+ * @param socket_id
+ *   Socket to use for shared memory allocation.
+ */
+void rte_metrics_init(int socket_id);
+
+/**
+ * Register a metric, making it available as a reporting parameter.
+ *
+ * Registering a metric is the way producers declare a parameter
+ * that they wish to be reported. Once registered, the associated
+ * numeric key can be obtained via rte_metrics_get_names(), which
+ * is required for updating said metric's value.
+ *
+ * @param name
+ *   Metric name
+ *
+ * @return
+ *  - Zero or positive: Success (index key of new metric)
+ *  - -EIO: Error, unable to access metrics shared memory
+ *    (rte_metrics_init() not called)
+ *  - -EINVAL: Error, invalid parameters
+ *  - -ENOMEM: Error, maximum metrics reached
+ */
+int rte_metrics_reg_name(const char *name);
+
+/**
+ * Register a set of metrics.
+ *
+ * This is a bulk version of rte_metrics_reg_name() and aside from
+ * handling multiple keys at once is functionally identical.
+ *
+ * @param names
+ *   List of metric names
+ *
+ * @param cnt_names
+ *   Number of metrics in set
+ *
+ * @return
+ *  - Zero or positive: Success (index key of start of set)
+ *  - -EIO: Error, unable to access metrics shared memory
+ *    (rte_metrics_init() not called)
+ *  - -EINVAL: Error, invalid parameters
+ *  - -ENOMEM: Error, maximum metrics reached
+ */
+int rte_metrics_reg_names(const char * const *names, uint16_t cnt_names);
+
+/**
+ * Get metric name-key lookup table.
+ *
+ * @param names
+ *   A struct rte_metric_name array of at least *capacity* in size to
+ *   receive key names. If this is NULL, function returns the required
+ *   number of elements for this array.
+ *
+ * @param capacity
+ *   Size (number of elements) of struct rte_metric_name array.
+ *   Disregarded if names is NULL.
+ *
+ * @return
+ *   - Positive value above capacity: error, *names* is too small.
+ *     Return value is required size.
+ *   - Positive value equal or less than capacity: Success. Return
+ *     value is number of elements filled in.
+ *   - Negative value: error.
+ */
+int rte_metrics_get_names(
+	struct rte_metric_name *names,
+	uint16_t capacity);
+
+/**
+ * Get metric value table.
+ *
+ * @param port_id
+ *   Port id to query
+ *
+ * @param values
+ *   A struct rte_metric_value array of at least *capacity* in size to
+ *   receive metric ids and values. If this is NULL, function returns
+ *   the required number of elements for this array.
+ *
+ * @param capacity
+ *   Size (number of elements) of struct rte_metric_value array.
+ *   Disregarded if values is NULL.
+ *
+ * @return
+ *   - Positive value above capacity: error, *values* is too small.
+ *     Return value is required size.
+ *   - Positive value equal or less than capacity: Success. Return
+ *     value is number of elements filled in.
+ *   - Negative value: error.
+ */
+int rte_metrics_get_values(
+	int port_id,
+	struct rte_metric_value *values,
+	uint16_t capacity);
+
+/**
+ * Updates a metric
+ *
+ * @param port_id
+ *   Port to update metrics for
+ * @param key
+ *   Id of metric to update
+ * @param value
+ *   New value
+ *
+ * @return
+ *   - -EIO if unable to access shared metrics memory
+ *   - Zero on success
+ */
+int rte_metrics_update_value(
+	int port_id,
+	uint16_t key,
+	const uint64_t value);
+
+/**
+ * Updates a metric set. Note that it is an error to try to
+ * update across a set boundary.
+ *
+ * @param port_id
+ *   Port to update metrics for
+ * @param key
+ *   Base id of metrics set to update
+ * @param values
+ *   Set of new values
+ * @param count
+ *   Number of new values
+ *
+ * @return
+ *   - -ERANGE if count exceeds metric set size
+ *   - -EIO if unable to access shared metrics memory
+ *   - Zero on success
+ */
+int rte_metrics_update_values(
+	int port_id,
+	uint16_t key,
+	const uint64_t *values,
+	uint32_t count);
+
+#endif
diff --git a/lib/librte_metrics/rte_metrics_version.map b/lib/librte_metrics/rte_metrics_version.map
new file mode 100644
index 0000000..4c5234c
--- /dev/null
+++ b/lib/librte_metrics/rte_metrics_version.map
@@ -0,0 +1,13 @@
+DPDK_17.05 {
+	global:
+
+	rte_metrics_get_names;
+	rte_metrics_get_values;
+	rte_metrics_init;
+	rte_metrics_reg_name;
+	rte_metrics_reg_names;
+	rte_metrics_update_value;
+	rte_metrics_update_values;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index d46a33e..98eb052 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -99,6 +99,8 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
 _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
 _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_METRICS)        += -lrte_metrics
+
 
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
 # plugins (link only if static libraries)
-- 
2.5.5

^ permalink raw reply	[relevance 1%]

* Re: [dpdk-dev] [PATCH 1/4] net/i40e: support replace filter type
  2017-03-09 10:01  0%       ` Ferruh Yigit
@ 2017-03-09 10:43  0%         ` Xing, Beilei
  0 siblings, 0 replies; 200+ results
From: Xing, Beilei @ 2017-03-09 10:43 UTC (permalink / raw)
  To: Yigit, Ferruh, Wu, Jingjing
  Cc: Zhang, Helin, dev, Iremonger, Bernard, Stroe, Laura



> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Thursday, March 9, 2017 6:02 PM
> To: Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org; Iremonger,
> Bernard <bernard.iremonger@intel.com>; Stroe, Laura
> <laura.stroe@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 1/4] net/i40e: support replace filter type
> 
> On 3/9/2017 5:59 AM, Xing, Beilei wrote:
> >
> >
> >> -----Original Message-----
> >> From: Yigit, Ferruh
> >> Sent: Wednesday, March 8, 2017 11:50 PM
> >> To: Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> >> <jingjing.wu@intel.com>
> >> Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org; Iremonger,
> >> Bernard <bernard.iremonger@intel.com>; Stroe, Laura
> >> <laura.stroe@intel.com>
> >> Subject: Re: [dpdk-dev] [PATCH 1/4] net/i40e: support replace filter
> >> type
> >>
> >> On 3/3/2017 9:31 AM, Beilei Xing wrote:
> >>> Add new admin queue function and extended fields in DCR 288:
> >>>  - Add admin queue function for Replace filter
> >>>    command (Opcode: 0x025F)
> >>>  - Add General fields for Add/Remove Cloud filters
> >>>    command
> >>>
> >>> This patch will be removed to base driver in future.
> >>>
> >>> Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
> >>> Signed-off-by: Stroe Laura <laura.stroe@intel.com>
> >>> Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
> >>> Signed-off-by: Beilei Xing <beilei.xing@intel.com>
> >>> ---
> >>>  drivers/net/i40e/i40e_ethdev.h | 106
> ++++++++++++++++++++++++++++
> >>>  drivers/net/i40e/i40e_flow.c   | 152
> >> +++++++++++++++++++++++++++++++++++++++++
> >>>  2 files changed, 258 insertions(+)
> >>>
> >>> diff --git a/drivers/net/i40e/i40e_ethdev.h
> >>> b/drivers/net/i40e/i40e_ethdev.h index f545850..3a49865 100644
> >>> --- a/drivers/net/i40e/i40e_ethdev.h
> >>> +++ b/drivers/net/i40e/i40e_ethdev.h
> >>> @@ -729,6 +729,100 @@ struct i40e_valid_pattern {
> >>>  	parse_filter_t parse_filter;
> >>>  };
> >>>
> >>> +/* Support replace filter */
> >>> +
> >>> +/* i40e_aqc_add_remove_cloud_filters_element_big_data is used
> when
> >>> + * I40E_AQC_ADD_REM_CLOUD_CMD_BIG_BUFFER flag is set. refer to
> >>> + * DCR288
> >>
> >> Please do not refer to DCR, unless you can provide a public link for it.
> > OK, got it.
> >
> >>
> >>> + */
> >>> +struct i40e_aqc_add_remove_cloud_filters_element_big_data {
> >>> +	struct i40e_aqc_add_remove_cloud_filters_element_data element;
> >>
> >> What is the difference between
> >> "i40e_aqc_add_remove_cloud_filters_element_big_data" and
> >> "i40e_aqc_add_remove_cloud_filters_element_data", why need
> big_data
> >> one?
> >
> > As ' Add/Remove Cloud filters -command buffer ' is changed in the DCR288,
> 'general fields' exists only when big_buffer is set.
> 
> What does it mean having "big_buffer" set? What changes functionally being
> big_buffer set or not?

According to DCR288, "Add/Remove Cloud Filter Command" should add 'Big Buffer' in byte20, but we can't change ' struct i40e_aqc_add_remove_cloud_filters ' in base code,
struct i40e_aqc_add_remove_cloud_filters {
        u8      num_filters;
        u8      reserved;
        __le16  seid;
#define I40E_AQC_ADD_CLOUD_CMD_SEID_NUM_SHIFT   0
#define I40E_AQC_ADD_CLOUD_CMD_SEID_NUM_MASK    (0x3FF << \
                                        I40E_AQC_ADD_CLOUD_CMD_SEID_NUM_SHIFT)
        u8      reserved2[4];
        __le32  addr_high;
        __le32  addr_low;
};

So we use reserverd[0] for 'Big Buffer' here, in the patch for ND, we changed above structure with following:

struct i40e_aqc_add_remove_cloud_filters {
        u8      num_filters;
        u8      reserved;
        __le16  seid;
#define I40E_AQC_ADD_CLOUD_CMD_SEID_NUM_SHIFT   0
#define I40E_AQC_ADD_CLOUD_CMD_SEID_NUM_MASK    (0x3FF << \
                                        I40E_AQC_ADD_CLOUD_CMD_SEID_NUM_SHIFT)
        u8      big_buffer;
        u8      reserved2[3];
        __le32  addr_high;
        __le32  addr_low;
};


> 
> > But we don't want to change the  "
> i40e_aqc_add_remove_cloud_filters_element_data " as it will cause ABI/API
> change in kernel driver.
> >
> <...>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 1/4] net/i40e: support replace filter type
  2017-03-09  5:59  3%     ` Xing, Beilei
@ 2017-03-09 10:01  0%       ` Ferruh Yigit
  2017-03-09 10:43  0%         ` Xing, Beilei
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2017-03-09 10:01 UTC (permalink / raw)
  To: Xing, Beilei, Wu, Jingjing
  Cc: Zhang, Helin, dev, Iremonger, Bernard, Stroe, Laura

On 3/9/2017 5:59 AM, Xing, Beilei wrote:
> 
> 
>> -----Original Message-----
>> From: Yigit, Ferruh
>> Sent: Wednesday, March 8, 2017 11:50 PM
>> To: Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
>> Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org; Iremonger,
>> Bernard <bernard.iremonger@intel.com>; Stroe, Laura
>> <laura.stroe@intel.com>
>> Subject: Re: [dpdk-dev] [PATCH 1/4] net/i40e: support replace filter type
>>
>> On 3/3/2017 9:31 AM, Beilei Xing wrote:
>>> Add new admin queue function and extended fields in DCR 288:
>>>  - Add admin queue function for Replace filter
>>>    command (Opcode: 0x025F)
>>>  - Add General fields for Add/Remove Cloud filters
>>>    command
>>>
>>> This patch will be removed to base driver in future.
>>>
>>> Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
>>> Signed-off-by: Stroe Laura <laura.stroe@intel.com>
>>> Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
>>> Signed-off-by: Beilei Xing <beilei.xing@intel.com>
>>> ---
>>>  drivers/net/i40e/i40e_ethdev.h | 106 ++++++++++++++++++++++++++++
>>>  drivers/net/i40e/i40e_flow.c   | 152
>> +++++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 258 insertions(+)
>>>
>>> diff --git a/drivers/net/i40e/i40e_ethdev.h
>>> b/drivers/net/i40e/i40e_ethdev.h index f545850..3a49865 100644
>>> --- a/drivers/net/i40e/i40e_ethdev.h
>>> +++ b/drivers/net/i40e/i40e_ethdev.h
>>> @@ -729,6 +729,100 @@ struct i40e_valid_pattern {
>>>  	parse_filter_t parse_filter;
>>>  };
>>>
>>> +/* Support replace filter */
>>> +
>>> +/* i40e_aqc_add_remove_cloud_filters_element_big_data is used when
>>> + * I40E_AQC_ADD_REM_CLOUD_CMD_BIG_BUFFER flag is set. refer to
>>> + * DCR288
>>
>> Please do not refer to DCR, unless you can provide a public link for it.
> OK, got it.
> 
>>
>>> + */
>>> +struct i40e_aqc_add_remove_cloud_filters_element_big_data {
>>> +	struct i40e_aqc_add_remove_cloud_filters_element_data element;
>>
>> What is the difference between
>> "i40e_aqc_add_remove_cloud_filters_element_big_data" and
>> "i40e_aqc_add_remove_cloud_filters_element_data", why need big_data
>> one?
> 
> As ' Add/Remove Cloud filters -command buffer ' is changed in the DCR288, 'general fields' exists only when big_buffer is set.

What does it mean having "big_buffer" set? What changes functionally
being big_buffer set or not?

> But we don't want to change the  " i40e_aqc_add_remove_cloud_filters_element_data " as it will cause ABI/API change in kernel driver.
> 
<...>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 4/4] net/i40e: refine consistent tunnel filter
  @ 2017-03-09  6:11  3%     ` Xing, Beilei
  0 siblings, 0 replies; 200+ results
From: Xing, Beilei @ 2017-03-09  6:11 UTC (permalink / raw)
  To: Yigit, Ferruh, Wu, Jingjing; +Cc: Zhang, Helin, dev



> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Wednesday, March 8, 2017 11:51 PM
> To: Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 4/4] net/i40e: refine consistent tunnel filter
> 
> On 3/3/2017 9:31 AM, Beilei Xing wrote:
> > Add i40e_tunnel_type enumeration type to refine consistent tunnel
> > filter, it will be esay to add new tunnel type for
> 
> s/esay/easy
> 
> > i40e.
> >
> > Signed-off-by: Beilei Xing <beilei.xing@intel.com>
> 
> <...>
> 
> >  /**
> > + * Tunnel type.
> > + */
> > +enum i40e_tunnel_type {
> > +	I40E_TUNNEL_TYPE_NONE = 0,
> > +	I40E_TUNNEL_TYPE_VXLAN,
> > +	I40E_TUNNEL_TYPE_GENEVE,
> > +	I40E_TUNNEL_TYPE_TEREDO,
> > +	I40E_TUNNEL_TYPE_NVGRE,
> > +	I40E_TUNNEL_TYPE_IP_IN_GRE,
> > +	I40E_L2_TUNNEL_TYPE_E_TAG,
> > +	I40E_TUNNEL_TYPE_MAX,
> > +};
> 
> Same question here, there is already "rte_eth_tunnel_type", why driver is
> duplicating the structure?
> 

Same with "struct i40e_tunnel_filter_conf": to avoid an ABI change, we create it in the PMD so that new tunnel types, like MPLS, can be added easily.

> <...>

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH 1/4] net/i40e: support replace filter type
  @ 2017-03-09  5:59  3%     ` Xing, Beilei
  2017-03-09 10:01  0%       ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Xing, Beilei @ 2017-03-09  5:59 UTC (permalink / raw)
  To: Yigit, Ferruh, Wu, Jingjing
  Cc: Zhang, Helin, dev, Iremonger, Bernard, Stroe, Laura



> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Wednesday, March 8, 2017 11:50 PM
> To: Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: Zhang, Helin <helin.zhang@intel.com>; dev@dpdk.org; Iremonger,
> Bernard <bernard.iremonger@intel.com>; Stroe, Laura
> <laura.stroe@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 1/4] net/i40e: support replace filter type
> 
> On 3/3/2017 9:31 AM, Beilei Xing wrote:
> > Add new admin queue function and extended fields in DCR 288:
> >  - Add admin queue function for Replace filter
> >    command (Opcode: 0x025F)
> >  - Add General fields for Add/Remove Cloud filters
> >    command
> >
> > This patch will be removed to base driver in future.
> >
> > Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
> > Signed-off-by: Stroe Laura <laura.stroe@intel.com>
> > Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
> > Signed-off-by: Beilei Xing <beilei.xing@intel.com>
> > ---
> >  drivers/net/i40e/i40e_ethdev.h | 106 ++++++++++++++++++++++++++++
> >  drivers/net/i40e/i40e_flow.c   | 152
> +++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 258 insertions(+)
> >
> > diff --git a/drivers/net/i40e/i40e_ethdev.h
> > b/drivers/net/i40e/i40e_ethdev.h index f545850..3a49865 100644
> > --- a/drivers/net/i40e/i40e_ethdev.h
> > +++ b/drivers/net/i40e/i40e_ethdev.h
> > @@ -729,6 +729,100 @@ struct i40e_valid_pattern {
> >  	parse_filter_t parse_filter;
> >  };
> >
> > +/* Support replace filter */
> > +
> > +/* i40e_aqc_add_remove_cloud_filters_element_big_data is used when
> > + * I40E_AQC_ADD_REM_CLOUD_CMD_BIG_BUFFER flag is set. refer to
> > + * DCR288
> 
> Please do not refer to DCR, unless you can provide a public link for it.
OK, got it.

> 
> > + */
> > +struct i40e_aqc_add_remove_cloud_filters_element_big_data {
> > +	struct i40e_aqc_add_remove_cloud_filters_element_data element;
> 
> What is the difference between
> "i40e_aqc_add_remove_cloud_filters_element_big_data" and
> "i40e_aqc_add_remove_cloud_filters_element_data", why need big_data
> one?

As ' Add/Remove Cloud filters -command buffer ' is changed in the DCR288, 'general fields' exists only when big_buffer is set.
But we don't want to change the  " i40e_aqc_add_remove_cloud_filters_element_data " as it will cause ABI/API change in kernel driver.

> 
> > +	uint16_t     general_fields[32];
> 
> Not very useful variable name.

It's the name from DCR.

> 
> <...>
> 
> > +/* Replace filter Command 0x025F
> > + * uses the i40e_aqc_replace_cloud_filters,
> > + * and the generic indirect completion structure  */ struct
> > +i40e_filter_data {
> > +	uint8_t filter_type;
> > +	uint8_t input[3];
> > +};
> > +
> > +struct i40e_aqc_replace_cloud_filters_cmd {
> 
> Is replace does something different than remove old and add new cloud
> filter?

It's just like removing an old filter and adding a new one.
It can replace both l1 filter and cloud filter.

> 
> <...>
> 
> > +enum i40e_status_code i40e_aq_add_cloud_filters_big_buffer(struct
> i40e_hw *hw,
> > +	   uint16_t seid,
> > +	   struct i40e_aqc_add_remove_cloud_filters_element_big_data
> *filters,
> > +	   uint8_t filter_count);
> > +enum i40e_status_code i40e_aq_remove_cloud_filters_big_buffer(
> > +	struct i40e_hw *hw, uint16_t seid,
> > +	struct i40e_aqc_add_remove_cloud_filters_element_big_data
> *filters,
> > +	uint8_t filter_count);
> > +enum i40e_status_code i40e_aq_replace_cloud_filters(struct i40e_hw
> *hw,
> > +		    struct i40e_aqc_replace_cloud_filters_cmd *filters,
> > +		    struct i40e_aqc_replace_cloud_filters_cmd_buf
> *cmd_buf);
> > +
> 
> Do you need these function declarations?
We can remove the declarations if we define the functions as "static".

> 
> >  #define I40E_DEV_TO_PCI(eth_dev) \
> >  	RTE_DEV_TO_PCI((eth_dev)->device)
> >
> > diff --git a/drivers/net/i40e/i40e_flow.c
> > b/drivers/net/i40e/i40e_flow.c index f163ce5..3c49228 100644
> > --- a/drivers/net/i40e/i40e_flow.c
> > +++ b/drivers/net/i40e/i40e_flow.c
> > @@ -1874,3 +1874,155 @@ i40e_flow_flush_tunnel_filter(struct i40e_pf
> > *pf)
> >
> >  	return ret;
> >  }
> > +
> > +#define i40e_aqc_opc_replace_cloud_filters 0x025F #define
> > +I40E_AQC_ADD_REM_CLOUD_CMD_BIG_BUFFER 1
> > +/**
> > + * i40e_aq_add_cloud_filters_big_buffer
> > + * @hw: pointer to the hardware structure
> > + * @seid: VSI seid to add cloud filters from
> > + * @filters: Buffer which contains the filters in big buffer to be
> > +added
> > + * @filter_count: number of filters contained in the buffer
> > + *
> > + * Set the cloud filters for a given VSI.  The contents of the
> > + * i40e_aqc_add_remove_cloud_filters_element_big_data are filled
> > + * in by the caller of the function.
> > + *
> > + **/
> > +enum i40e_status_code i40e_aq_add_cloud_filters_big_buffer(
> 
> There are already non-big_buffer versions of these functions, like
> "i40e_aq_add_cloud_filters()"; why is a big_data version required, and what
> does it do differently?

The parameters are different.
We add i40e_aq_add_cloud_filters_big_buffer to handle the structure
"i40e_aqc_add_remove_cloud_filters_element_big_data", which includes
general_fields.

> 
> And is there a reason that these functions are not static? (For this patch
> they are not used at all and will cause a build error, but my question
> stands for after they start to be used.)

No. Same as with the patch for Pipeline Personalization Profile: these are designed according to the base code style.

> 
> <...>

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] Issues with ixgbe  and rte_flow
  @ 2017-03-08 15:41  3%     ` Adrien Mazarguil
  0 siblings, 0 replies; 200+ results
From: Adrien Mazarguil @ 2017-03-08 15:41 UTC (permalink / raw)
  To: Le Scouarnec Nicolas
  Cc: Lu, Wenzhuo, dev, users, Jan Medala, Evgeny Schemeilin,
	Stephen Hurd, Jerin Jacob, Rahul Lakkireddy, John Daley,
	Matej Vido, Helin Zhang, Konstantin Ananyev, Jingjing Wu,
	Jing Chen, Alejandro Lucero, Harish Patil, Rasesh Mody,
	Andrew Rybchenko, Nelio Laranjeiro, Vasily Philipov,
	Pascal Mazon, Thomas Monjalon

CC'ing users@dpdk.org since this issue primarily affects rte_flow users, and
several PMD maintainers to get their opinion on the matter, see below.

On Wed, Mar 08, 2017 at 09:24:26AM +0000, Le Scouarnec Nicolas wrote:
> My response is inline bellow, and further comment on the code excerpt also
> 
> 
> From: Lu, Wenzhuo <wenzhuo.lu@intel.com>
> Sent: Wednesday, March 8, 2017 4:16 AM
> To: Le Scouarnec Nicolas; dev@dpdk.org; Adrien Mazarguil (adrien.mazarguil@6wind.com)
> Cc: Yigit, Ferruh
> Subject: RE: Issues with ixgbe and rte_flow
>     
> >> I have been using the new API rte_flow to program filtering on an X540 (ixgbe)
> >> NIC. My goal is to send packets from different VLANs to different queues
> >> (filtering which should be supported by flow director as far as I understand). I
> >> enclosed the setup code at the bottom of this email.
> >> For reference, here is the setup code I use
> >>
> >>       vlan_spec.tci = vlan_be;
> >>       vlan_spec.tpid = 0;
> >>
> >>       vlan_mask.tci = rte_cpu_to_be_16(0x0fff);
> >>       vlan_mask.tpid =  0;
> 
> >To my opinion, this setting is not right. As we know, vlan tag is inserted between MAC source address and Ether type.
> >So if we have a MAC+VLAN+IPv4 packet, the vlan_spec.tpid should be 0x8100, the eth_spec.type should be 0x0800.
> >+ Adrien, the author. He can correct me if I'm wrong.

That's right; however, the confusion is understandable, and perhaps the
documentation should be clearer. It currently states the following without
describing the reason:

 /**
  * RTE_FLOW_ITEM_TYPE_VLAN
  *
  * Matches an 802.1Q/ad VLAN tag.
  *
  * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
  * RTE_FLOW_ITEM_TYPE_VLAN.
  */

> Ok, I apologize, you're right. Being more used to the software side than to the hardware side, I misunderstood struct rte_flow_item_vlan and thought it was the "equivalent" of struct vlan_hdr, in which case it would contain the type of the encapsulated frame.
> 
> (  /**
>  * Ethernet VLAN Header.
>  * Contains the 16-bit VLAN Tag Control Identifier and the Ethernet type
>  * of the encapsulated frame.
>  */
> struct vlan_hdr {
> 	uint16_t vlan_tci; /**< Priority (3) + CFI (1) + Identifier Code (12) */
> 	uint16_t eth_proto;/**< Ethernet type of encapsulated frame. */
> } __attribute__((__packed__));        )

Indeed, struct vlan_hdr and struct rte_flow_item_vlan are not mapped at the
same offset; the former includes EtherType of the inner packet (eth_proto),
while the latter describes the inserted VLAN header itself starting with
TPID.

This approach was chosen for rte_flow for consistency with the fact that
each pattern item describes exactly one protocol header, even though in the
case of VLAN and other layer 2.5 protocols, some headers happen to be
embedded within another.
IPv4/IPv6 options will be provided as separate items in a similar fashion.

It also allows adding/removing VLAN tags to an existing rule without
modifying the EtherType of the inner frame.
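
To make the offset difference concrete, here is a side-by-side sketch
(simplified; comments are mine and field types are as I recall them at the
time of writing, so treat this as illustrative only):

 /* Wire format: | dst MAC | src MAC | TPID | TCI | inner EtherType | ... */

 struct vlan_hdr {                 /* starts after the TPID */
         uint16_t vlan_tci;        /* PCP(3) + CFI(1) + VID(12) */
         uint16_t eth_proto;       /* EtherType of the encapsulated frame */
 };

 struct rte_flow_item_vlan {       /* starts at the TPID itself */
         uint16_t tpid;            /* tag protocol identifier, e.g. 0x8100 */
         uint16_t tci;             /* tag control information */
 };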

Now assuming you're not the only one facing that issue, if the current
definition does not make sense, perhaps we can update the API before it's
too late. I'll attempt to summarize it with an example below.

In any case, matching nonspecific VLAN-tagged and QinQ UDPv4 packets in
testpmd is written as:

 flow create 0 pattern eth / vlan / ipv4 / udp / end actions queue 1 / end
 flow create 0 pattern eth / vlan / vlan / ipv4 / udp / end actions queue 1 / end

However, with the current API described above, specifying inner/outer
EtherTypes for the above packets yields (as a reminder, 0x8100 stands for
VLAN, 0x0800 for IPv4 and 0x88A8 for QinQ):

#1

 flow create 0 pattern eth type is 0x0800 / vlan tpid is 0x8100 / ipv4 / udp / end actions queue 1 / end
 flow create 0 pattern eth type is 0x0800 / vlan tpid is 0x88A8 / vlan tpid is 0x8100 / ipv4 / udp / end actions queue 1 / end

Instead of the arguably more accurate (renaming "tpid" to "inner_type" for
clarity):

#2

 flow create 0 pattern eth type is 0x8100 / vlan inner_type is 0x0800 / ipv4 / udp / end actions queue 1 / end
 flow create 0 pattern eth type is 0x88A8 / vlan inner_type is 0x8100 / vlan inner_type is 0x0800 / ipv4 / udp / end actions queue 1 / end

So, should the VLAN item be updated to behave as described in #2?
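
For reference, option #2 could translate into something like the following
(a hypothetical sketch, not a committed definition):

 struct rte_flow_item_vlan {
         uint16_t tci;        /* tag control information */
         uint16_t inner_type; /* EtherType of the encapsulated frame */
 };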

Note: doing so will cause a serious API/ABI breakage. I know that was not
supposed to happen according to the rte_flow sales pitch, but hey.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2 09/14] ring: allow dequeue fns to return remaining entry count
                       ` (5 preceding siblings ...)
  2017-03-07 11:32  2%   ` [dpdk-dev] [PATCH v2 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
@ 2017-03-07 11:32  2%   ` Bruce Richardson
    7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-07 11:32 UTC (permalink / raw)
  To: olivier.matz; +Cc: jerin.jacob, dev, Bruce Richardson

Add an extra parameter to the ring dequeue burst/bulk functions so that
those functions can optionally return the number of objects remaining in
the ring. This information can be used by applications in a number of
ways; for instance, with single-consumer queues, it provides a maximum
dequeue size which is guaranteed to work.
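
A minimal usage sketch of the new parameter (hypothetical application
code, not part of this patch; 'r' and the burst size are illustrative):

 unsigned int remaining;
 void *objs[32];
 unsigned int n = rte_ring_sc_dequeue_burst(r, objs, 32, &remaining);
 /* 'n' objects were dequeued; 'remaining' entries are still in the
  * ring, so with a single consumer a follow-up dequeue of up to
  * 'remaining' objects is guaranteed to succeed. Pass NULL when the
  * count is not needed. */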

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/pdump/main.c                                   |  2 +-
 doc/guides/rel_notes/release_17_05.rst             |  8 ++
 drivers/crypto/null/null_crypto_pmd.c              |  2 +-
 drivers/net/bonding/rte_eth_bond_pmd.c             |  3 +-
 drivers/net/ring/rte_eth_ring.c                    |  2 +-
 examples/distributor/main.c                        |  2 +-
 examples/load_balancer/runtime.c                   |  6 +-
 .../client_server_mp/mp_client/client.c            |  3 +-
 examples/packet_ordering/main.c                    |  6 +-
 examples/qos_sched/app_thread.c                    |  6 +-
 examples/quota_watermark/qw/main.c                 |  5 +-
 examples/server_node_efd/node/node.c               |  2 +-
 lib/librte_hash/rte_cuckoo_hash.c                  |  3 +-
 lib/librte_mempool/rte_mempool_ring.c              |  4 +-
 lib/librte_port/rte_port_frag.c                    |  3 +-
 lib/librte_port/rte_port_ring.c                    |  6 +-
 lib/librte_ring/rte_ring.h                         | 90 +++++++++++-----------
 test/test-pipeline/runtime.c                       |  6 +-
 test/test/test_link_bonding_mode4.c                |  3 +-
 test/test/test_pmd_ring_perf.c                     |  7 +-
 test/test/test_ring.c                              | 54 ++++++-------
 test/test/test_ring_perf.c                         | 20 +++--
 test/test/test_table_acl.c                         |  2 +-
 test/test/test_table_pipeline.c                    |  2 +-
 test/test/test_table_ports.c                       |  8 +-
 test/test/virtual_pmd.c                            |  4 +-
 26 files changed, 145 insertions(+), 114 deletions(-)

diff --git a/app/pdump/main.c b/app/pdump/main.c
index b88090d..3b13753 100644
--- a/app/pdump/main.c
+++ b/app/pdump/main.c
@@ -496,7 +496,7 @@ pdump_rxtx(struct rte_ring *ring, uint8_t vdev_id, struct pdump_stats *stats)
 
 	/* first dequeue packets from ring of primary process */
 	const uint16_t nb_in_deq = rte_ring_dequeue_burst(ring,
-			(void *)rxtx_bufs, BURST_SIZE);
+			(void *)rxtx_bufs, BURST_SIZE, NULL);
 	stats->dequeue_pkts += nb_in_deq;
 
 	if (nb_in_deq) {
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 249ad6e..563a74c 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -123,6 +123,8 @@ API Changes
   * added an extra parameter to the burst/bulk enqueue functions to
     return the number of free spaces in the ring after enqueue. This can
     be used by an application to implement its own watermark functionality.
+  * added an extra parameter to the burst/bulk dequeue functions to return
+    the number of elements remaining in the ring after dequeue.
   * changed the return value of the enqueue and dequeue bulk functions to
     match that of the burst equivalents. In all cases, ring functions which
     operate on multiple packets now return the number of elements enqueued
@@ -135,6 +137,12 @@ API Changes
     - ``rte_ring_sc_dequeue_bulk``
     - ``rte_ring_dequeue_bulk``
 
+    NOTE: the above functions all have different parameters as well as
+    different return values, due to the other listed changes above. This
+    means that all instances of the functions in existing code will be
+    flagged by the compiler. The return value usage should be checked
+    while fixing the compiler error due to the extra parameter.
+
 ABI Changes
 -----------
 
diff --git a/drivers/crypto/null/null_crypto_pmd.c b/drivers/crypto/null/null_crypto_pmd.c
index ed5a9fc..f68ec8d 100644
--- a/drivers/crypto/null/null_crypto_pmd.c
+++ b/drivers/crypto/null/null_crypto_pmd.c
@@ -155,7 +155,7 @@ null_crypto_pmd_dequeue_burst(void *queue_pair, struct rte_crypto_op **ops,
 	unsigned nb_dequeued;
 
 	nb_dequeued = rte_ring_dequeue_burst(qp->processed_pkts,
-			(void **)ops, nb_ops);
+			(void **)ops, nb_ops, NULL);
 	qp->qp_stats.dequeued_count += nb_dequeued;
 
 	return nb_dequeued;
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index f3ac9e2..96638af 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1008,7 +1008,8 @@ bond_ethdev_tx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 		struct port *port = &mode_8023ad_ports[slaves[i]];
 
 		slave_slow_nb_pkts[i] = rte_ring_dequeue_burst(port->tx_ring,
-				slow_pkts, BOND_MODE_8023AX_SLAVE_TX_PKTS);
+				slow_pkts, BOND_MODE_8023AX_SLAVE_TX_PKTS,
+				NULL);
 		slave_nb_pkts[i] = slave_slow_nb_pkts[i];
 
 		for (j = 0; j < slave_slow_nb_pkts[i]; j++)
diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index adbf478..77ef3a1 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -88,7 +88,7 @@ eth_ring_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
 	void **ptrs = (void *)&bufs[0];
 	struct ring_queue *r = q;
 	const uint16_t nb_rx = (uint16_t)rte_ring_dequeue_burst(r->rng,
-			ptrs, nb_bufs);
+			ptrs, nb_bufs, NULL);
 	if (r->rng->flags & RING_F_SC_DEQ)
 		r->rx_pkts.cnt += nb_rx;
 	else
diff --git a/examples/distributor/main.c b/examples/distributor/main.c
index cfd360b..5cb6185 100644
--- a/examples/distributor/main.c
+++ b/examples/distributor/main.c
@@ -330,7 +330,7 @@ lcore_tx(struct rte_ring *in_r)
 
 			struct rte_mbuf *bufs[BURST_SIZE];
 			const uint16_t nb_rx = rte_ring_dequeue_burst(in_r,
-					(void *)bufs, BURST_SIZE);
+					(void *)bufs, BURST_SIZE, NULL);
 			app_stats.tx.dequeue_pkts += nb_rx;
 
 			/* if we get no traffic, flush anything we have */
diff --git a/examples/load_balancer/runtime.c b/examples/load_balancer/runtime.c
index 1645994..8192c08 100644
--- a/examples/load_balancer/runtime.c
+++ b/examples/load_balancer/runtime.c
@@ -349,7 +349,8 @@ app_lcore_io_tx(
 			ret = rte_ring_sc_dequeue_bulk(
 				ring,
 				(void **) &lp->tx.mbuf_out[port].array[n_mbufs],
-				bsz_rd);
+				bsz_rd,
+				NULL);
 
 			if (unlikely(ret == 0))
 				continue;
@@ -504,7 +505,8 @@ app_lcore_worker(
 		ret = rte_ring_sc_dequeue_bulk(
 			ring_in,
 			(void **) lp->mbuf_in.array,
-			bsz_rd);
+			bsz_rd,
+			NULL);
 
 		if (unlikely(ret == 0))
 			continue;
diff --git a/examples/multi_process/client_server_mp/mp_client/client.c b/examples/multi_process/client_server_mp/mp_client/client.c
index dca9eb9..01b535c 100644
--- a/examples/multi_process/client_server_mp/mp_client/client.c
+++ b/examples/multi_process/client_server_mp/mp_client/client.c
@@ -279,7 +279,8 @@ main(int argc, char *argv[])
 		uint16_t i, rx_pkts;
 		uint8_t port;
 
-		rx_pkts = rte_ring_dequeue_burst(rx_ring, pkts, PKT_READ_SIZE);
+		rx_pkts = rte_ring_dequeue_burst(rx_ring, pkts,
+				PKT_READ_SIZE, NULL);
 
 		if (unlikely(rx_pkts == 0)){
 			if (need_flush)
diff --git a/examples/packet_ordering/main.c b/examples/packet_ordering/main.c
index d268350..7719dad 100644
--- a/examples/packet_ordering/main.c
+++ b/examples/packet_ordering/main.c
@@ -462,7 +462,7 @@ worker_thread(void *args_ptr)
 
 		/* dequeue the mbufs from rx_to_workers ring */
 		burst_size = rte_ring_dequeue_burst(ring_in,
-				(void *)burst_buffer, MAX_PKTS_BURST);
+				(void *)burst_buffer, MAX_PKTS_BURST, NULL);
 		if (unlikely(burst_size == 0))
 			continue;
 
@@ -510,7 +510,7 @@ send_thread(struct send_thread_args *args)
 
 		/* deque the mbufs from workers_to_tx ring */
 		nb_dq_mbufs = rte_ring_dequeue_burst(args->ring_in,
-				(void *)mbufs, MAX_PKTS_BURST);
+				(void *)mbufs, MAX_PKTS_BURST, NULL);
 
 		if (unlikely(nb_dq_mbufs == 0))
 			continue;
@@ -595,7 +595,7 @@ tx_thread(struct rte_ring *ring_in)
 
 		/* deque the mbufs from workers_to_tx ring */
 		dqnum = rte_ring_dequeue_burst(ring_in,
-				(void *)mbufs, MAX_PKTS_BURST);
+				(void *)mbufs, MAX_PKTS_BURST, NULL);
 
 		if (unlikely(dqnum == 0))
 			continue;
diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index 0c81a15..15f117f 100644
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -179,7 +179,7 @@ app_tx_thread(struct thread_conf **confs)
 
 	while ((conf = confs[conf_idx])) {
 		retval = rte_ring_sc_dequeue_bulk(conf->tx_ring, (void **)mbufs,
-					burst_conf.qos_dequeue);
+					burst_conf.qos_dequeue, NULL);
 		if (likely(retval != 0)) {
 			app_send_packets(conf, mbufs, burst_conf.qos_dequeue);
 
@@ -218,7 +218,7 @@ app_worker_thread(struct thread_conf **confs)
 
 		/* Read packet from the ring */
 		nb_pkt = rte_ring_sc_dequeue_burst(conf->rx_ring, (void **)mbufs,
-					burst_conf.ring_burst);
+					burst_conf.ring_burst, NULL);
 		if (likely(nb_pkt)) {
 			int nb_sent = rte_sched_port_enqueue(conf->sched_port, mbufs,
 					nb_pkt);
@@ -254,7 +254,7 @@ app_mixed_thread(struct thread_conf **confs)
 
 		/* Read packet from the ring */
 		nb_pkt = rte_ring_sc_dequeue_burst(conf->rx_ring, (void **)mbufs,
-					burst_conf.ring_burst);
+					burst_conf.ring_burst, NULL);
 		if (likely(nb_pkt)) {
 			int nb_sent = rte_sched_port_enqueue(conf->sched_port, mbufs,
 					nb_pkt);
diff --git a/examples/quota_watermark/qw/main.c b/examples/quota_watermark/qw/main.c
index 57df8ef..2dcddea 100644
--- a/examples/quota_watermark/qw/main.c
+++ b/examples/quota_watermark/qw/main.c
@@ -247,7 +247,8 @@ pipeline_stage(__attribute__((unused)) void *args)
 			}
 
 			/* Dequeue up to quota mbuf from rx */
-			nb_dq_pkts = rte_ring_dequeue_burst(rx, pkts, *quota);
+			nb_dq_pkts = rte_ring_dequeue_burst(rx, pkts,
+					*quota, NULL);
 			if (unlikely(nb_dq_pkts < 0))
 				continue;
 
@@ -305,7 +306,7 @@ send_stage(__attribute__((unused)) void *args)
 
 			/* Dequeue packets from tx and send them */
 			nb_dq_pkts = (uint16_t) rte_ring_dequeue_burst(tx,
-					(void *) tx_pkts, *quota);
+					(void *) tx_pkts, *quota, NULL);
 			rte_eth_tx_burst(dest_port_id, 0, tx_pkts, nb_dq_pkts);
 
 			/* TODO: Check if nb_dq_pkts == nb_tx_pkts? */
diff --git a/examples/server_node_efd/node/node.c b/examples/server_node_efd/node/node.c
index 9ec6a05..f780b92 100644
--- a/examples/server_node_efd/node/node.c
+++ b/examples/server_node_efd/node/node.c
@@ -392,7 +392,7 @@ main(int argc, char *argv[])
 		 */
 		while (rx_pkts > 0 &&
 				unlikely(rte_ring_dequeue_bulk(rx_ring, pkts,
-					rx_pkts) == 0))
+					rx_pkts, NULL) == 0))
 			rx_pkts = (uint16_t)RTE_MIN(rte_ring_count(rx_ring),
 					PKT_READ_SIZE);
 
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 6552199..645c0cf 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -536,7 +536,8 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
 		if (cached_free_slots->len == 0) {
 			/* Need to get another burst of free slots from global ring */
 			n_slots = rte_ring_mc_dequeue_burst(h->free_slots,
-					cached_free_slots->objs, LCORE_CACHE_SIZE);
+					cached_free_slots->objs,
+					LCORE_CACHE_SIZE, NULL);
 			if (n_slots == 0)
 				return -ENOSPC;
 
diff --git a/lib/librte_mempool/rte_mempool_ring.c b/lib/librte_mempool/rte_mempool_ring.c
index 9b8fd2b..5c132bf 100644
--- a/lib/librte_mempool/rte_mempool_ring.c
+++ b/lib/librte_mempool/rte_mempool_ring.c
@@ -58,14 +58,14 @@ static int
 common_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
 	return rte_ring_mc_dequeue_bulk(mp->pool_data,
-			obj_table, n) == 0 ? -ENOBUFS : 0;
+			obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_sc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
 	return rte_ring_sc_dequeue_bulk(mp->pool_data,
-			obj_table, n) == 0 ? -ENOBUFS : 0;
+			obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
 }
 
 static unsigned
diff --git a/lib/librte_port/rte_port_frag.c b/lib/librte_port/rte_port_frag.c
index 0fcace9..320407e 100644
--- a/lib/librte_port/rte_port_frag.c
+++ b/lib/librte_port/rte_port_frag.c
@@ -186,7 +186,8 @@ rte_port_ring_reader_frag_rx(void *port,
 		/* If "pkts" buffer is empty, read packet burst from ring */
 		if (p->n_pkts == 0) {
 			p->n_pkts = rte_ring_sc_dequeue_burst(p->ring,
-				(void **) p->pkts, RTE_PORT_IN_BURST_SIZE_MAX);
+				(void **) p->pkts, RTE_PORT_IN_BURST_SIZE_MAX,
+				NULL);
 			RTE_PORT_RING_READER_FRAG_STATS_PKTS_IN_ADD(p, p->n_pkts);
 			if (p->n_pkts == 0)
 				return n_pkts_out;
diff --git a/lib/librte_port/rte_port_ring.c b/lib/librte_port/rte_port_ring.c
index 9fadac7..492b0e7 100644
--- a/lib/librte_port/rte_port_ring.c
+++ b/lib/librte_port/rte_port_ring.c
@@ -111,7 +111,8 @@ rte_port_ring_reader_rx(void *port, struct rte_mbuf **pkts, uint32_t n_pkts)
 	struct rte_port_ring_reader *p = (struct rte_port_ring_reader *) port;
 	uint32_t nb_rx;
 
-	nb_rx = rte_ring_sc_dequeue_burst(p->ring, (void **) pkts, n_pkts);
+	nb_rx = rte_ring_sc_dequeue_burst(p->ring, (void **) pkts,
+			n_pkts, NULL);
 	RTE_PORT_RING_READER_STATS_PKTS_IN_ADD(p, nb_rx);
 
 	return nb_rx;
@@ -124,7 +125,8 @@ rte_port_ring_multi_reader_rx(void *port, struct rte_mbuf **pkts,
 	struct rte_port_ring_reader *p = (struct rte_port_ring_reader *) port;
 	uint32_t nb_rx;
 
-	nb_rx = rte_ring_mc_dequeue_burst(p->ring, (void **) pkts, n_pkts);
+	nb_rx = rte_ring_mc_dequeue_burst(p->ring, (void **) pkts,
+			n_pkts, NULL);
 	RTE_PORT_RING_READER_STATS_PKTS_IN_ADD(p, nb_rx);
 
 	return nb_rx;
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 73b1c26..ca25dd7 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -491,7 +491,8 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 
 static inline unsigned int __attribute__((always_inline))
 __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
-		 unsigned n, enum rte_ring_queue_behavior behavior)
+		 unsigned int n, enum rte_ring_queue_behavior behavior,
+		 unsigned int *available)
 {
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
@@ -500,11 +501,6 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	unsigned int i;
 	uint32_t mask = r->mask;
 
-	/* Avoid the unnecessary cmpset operation below, which is also
-	 * potentially harmful when n equals 0. */
-	if (n == 0)
-		return 0;
-
 	/* move cons.head atomically */
 	do {
 		/* Restore n as it may change every loop */
@@ -519,15 +515,11 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		entries = (prod_tail - cons_head);
 
 		/* Set the actual entries for dequeue */
-		if (n > entries) {
-			if (behavior == RTE_RING_QUEUE_FIXED)
-				return 0;
-			else {
-				if (unlikely(entries == 0))
-					return 0;
-				n = entries;
-			}
-		}
+		if (n > entries)
+			n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : entries;
+
+		if (unlikely(n == 0))
+			goto end;
 
 		cons_next = cons_head + n;
 		success = rte_atomic32_cmpset(&r->cons.head, cons_head,
@@ -546,7 +538,9 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		rte_pause();
 
 	r->cons.tail = cons_next;
-
+end:
+	if (available != NULL)
+		*available = entries - n;
 	return n;
 }
 
@@ -570,7 +564,8 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
  */
 static inline unsigned int __attribute__((always_inline))
 __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
-		 unsigned n, enum rte_ring_queue_behavior behavior)
+		 unsigned int n, enum rte_ring_queue_behavior behavior,
+		 unsigned int *available)
 {
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
@@ -585,15 +580,11 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	 * and size(ring)-1. */
 	entries = prod_tail - cons_head;
 
-	if (n > entries) {
-		if (behavior == RTE_RING_QUEUE_FIXED)
-			return 0;
-		else {
-			if (unlikely(entries == 0))
-				return 0;
-			n = entries;
-		}
-	}
+	if (n > entries)
+		n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : entries;
+
+	if (unlikely(n == 0))
+		goto end;
 
 	cons_next = cons_head + n;
 	r->cons.head = cons_next;
@@ -603,6 +594,9 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	rte_smp_rmb();
 
 	r->cons.tail = cons_next;
+end:
+	if (available != NULL)
+		*available = entries - n;
 	return n;
 }
 
@@ -749,9 +743,11 @@ rte_ring_enqueue(struct rte_ring *r, void *obj)
  *   The number of objects dequeued, either 0 or n
  */
 static inline unsigned int __attribute__((always_inline))
-rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
+	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED,
+			available);
 }
 
 /**
@@ -768,9 +764,11 @@ rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  *   The number of objects dequeued, either 0 or n
  */
 static inline unsigned int __attribute__((always_inline))
-rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
+	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED,
+			available);
 }
 
 /**
@@ -790,12 +788,13 @@ rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  *   The number of objects dequeued, either 0 or n
  */
 static inline unsigned int __attribute__((always_inline))
-rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned int n,
+		unsigned int *available)
 {
 	if (r->cons.sc_dequeue)
-		return rte_ring_sc_dequeue_bulk(r, obj_table, n);
+		return rte_ring_sc_dequeue_bulk(r, obj_table, n, available);
 	else
-		return rte_ring_mc_dequeue_bulk(r, obj_table, n);
+		return rte_ring_mc_dequeue_bulk(r, obj_table, n, available);
 }
 
 /**
@@ -816,7 +815,7 @@ rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 static inline int __attribute__((always_inline))
 rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_mc_dequeue_bulk(r, obj_p, 1)  ? 0 : -ENOBUFS;
+	return rte_ring_mc_dequeue_bulk(r, obj_p, 1, NULL)  ? 0 : -ENOBUFS;
 }
 
 /**
@@ -834,7 +833,7 @@ rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_sc_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOBUFS;
+	return rte_ring_sc_dequeue_bulk(r, obj_p, 1, NULL) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -856,7 +855,7 @@ rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOBUFS;
+	return rte_ring_dequeue_bulk(r, obj_p, 1, NULL) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -1046,9 +1045,11 @@ rte_ring_enqueue_burst(struct rte_ring *r, void * const *obj_table,
  *   - n: Actual number of objects dequeued, 0 if ring is empty
  */
 static inline unsigned __attribute__((always_inline))
-rte_ring_mc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_mc_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_VARIABLE);
+	return __rte_ring_mc_do_dequeue(r, obj_table, n,
+			RTE_RING_QUEUE_VARIABLE, available);
 }
 
 /**
@@ -1066,9 +1067,11 @@ rte_ring_mc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
  *   - n: Actual number of objects dequeued, 0 if ring is empty
  */
 static inline unsigned __attribute__((always_inline))
-rte_ring_sc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_sc_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_VARIABLE);
+	return __rte_ring_sc_do_dequeue(r, obj_table, n,
+			RTE_RING_QUEUE_VARIABLE, available);
 }
 
 /**
@@ -1088,12 +1091,13 @@ rte_ring_sc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
  *   - Number of objects dequeued
  */
 static inline unsigned __attribute__((always_inline))
-rte_ring_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
 	if (r->cons.sc_dequeue)
-		return rte_ring_sc_dequeue_burst(r, obj_table, n);
+		return rte_ring_sc_dequeue_burst(r, obj_table, n, available);
 	else
-		return rte_ring_mc_dequeue_burst(r, obj_table, n);
+		return rte_ring_mc_dequeue_burst(r, obj_table, n, available);
 }
 
 #ifdef __cplusplus
diff --git a/test/test-pipeline/runtime.c b/test/test-pipeline/runtime.c
index c06ff54..8970e1c 100644
--- a/test/test-pipeline/runtime.c
+++ b/test/test-pipeline/runtime.c
@@ -121,7 +121,8 @@ app_main_loop_worker(void) {
 		ret = rte_ring_sc_dequeue_bulk(
 			app.rings_rx[i],
 			(void **) worker_mbuf->array,
-			app.burst_size_worker_read);
+			app.burst_size_worker_read,
+			NULL);
 
 		if (ret == 0)
 			continue;
@@ -151,7 +152,8 @@ app_main_loop_tx(void) {
 		ret = rte_ring_sc_dequeue_bulk(
 			app.rings_tx[i],
 			(void **) &app.mbuf_tx[i].array[n_mbufs],
-			app.burst_size_tx_read);
+			app.burst_size_tx_read,
+			NULL);
 
 		if (ret == 0)
 			continue;
diff --git a/test/test/test_link_bonding_mode4.c b/test/test/test_link_bonding_mode4.c
index 8df28b4..15091b1 100644
--- a/test/test/test_link_bonding_mode4.c
+++ b/test/test/test_link_bonding_mode4.c
@@ -193,7 +193,8 @@ static uint8_t lacpdu_rx_count[RTE_MAX_ETHPORTS] = {0, };
 static int
 slave_get_pkts(struct slave_conf *slave, struct rte_mbuf **buf, uint16_t size)
 {
-	return rte_ring_dequeue_burst(slave->tx_queue, (void **)buf, size);
+	return rte_ring_dequeue_burst(slave->tx_queue, (void **)buf,
+			size, NULL);
 }
 
 /*
diff --git a/test/test/test_pmd_ring_perf.c b/test/test/test_pmd_ring_perf.c
index 045a7f2..004882a 100644
--- a/test/test/test_pmd_ring_perf.c
+++ b/test/test/test_pmd_ring_perf.c
@@ -67,7 +67,7 @@ test_empty_dequeue(void)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0]);
+		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0], NULL);
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t eth_start = rte_rdtsc();
@@ -99,7 +99,7 @@ test_single_enqueue_dequeue(void)
 	rte_compiler_barrier();
 	for (i = 0; i < iterations; i++) {
 		rte_ring_enqueue_bulk(r, &burst, 1, NULL);
-		rte_ring_dequeue_bulk(r, &burst, 1);
+		rte_ring_dequeue_bulk(r, &burst, 1, NULL);
 	}
 	const uint64_t sc_end = rte_rdtsc_precise();
 	rte_compiler_barrier();
@@ -133,7 +133,8 @@ test_bulk_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_sp_enqueue_bulk(r, (void *)burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_sc_dequeue_bulk(r, (void *)burst, bulk_sizes[sz]);
+			rte_ring_sc_dequeue_bulk(r, (void *)burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t sc_end = rte_rdtsc();
 
diff --git a/test/test/test_ring.c b/test/test/test_ring.c
index b0ca88b..858ebc1 100644
--- a/test/test/test_ring.c
+++ b/test/test/test_ring.c
@@ -119,7 +119,8 @@ test_ring_basic_full_empty(void * const src[], void *dst[])
 		    __func__, i, rand);
 		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rand,
 				NULL) != 0);
-		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rand) == rand);
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rand,
+				NULL) == rand);
 
 		/* fill the ring */
 		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rsz, NULL) != 0);
@@ -129,7 +130,8 @@ test_ring_basic_full_empty(void * const src[], void *dst[])
 		TEST_RING_VERIFY(0 == rte_ring_empty(r));
 
 		/* empty the ring */
-		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rsz) == rsz);
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rsz,
+				NULL) == rsz);
 		TEST_RING_VERIFY(rsz == rte_ring_free_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_full(r));
@@ -186,19 +188,19 @@ test_ring_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 1);
+	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 2);
+	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, MAX_BULK);
+	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if (ret == 0)
 		goto fail;
@@ -232,19 +234,19 @@ test_ring_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 1);
+	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 2);
+	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
+	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if (ret == 0)
 		goto fail;
@@ -265,7 +267,7 @@ test_ring_basic(void)
 		cur_src += MAX_BULK;
 		if (ret == 0)
 			goto fail;
-		ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
+		ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if (ret == 0)
 			goto fail;
@@ -303,13 +305,13 @@ test_ring_basic(void)
 		printf("Cannot enqueue\n");
 		goto fail;
 	}
-	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
+	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems, NULL);
 	cur_dst += num_elems;
 	if (ret == 0) {
 		printf("Cannot dequeue\n");
 		goto fail;
 	}
-	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
+	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems, NULL);
 	cur_dst += num_elems;
 	if (ret == 0) {
 		printf("Cannot dequeue2\n");
@@ -390,19 +392,19 @@ test_ring_burst_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 1) ;
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if ((ret & RTE_RING_SZ_MASK) != 1)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 		goto fail;
@@ -451,19 +453,19 @@ test_ring_burst_basic(void)
 
 	printf("Test dequeue without enough objects \n");
 	for (i = 0; i<RING_SIZE/MAX_BULK - 1; i++) {
-		ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
+		ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
 	}
 
 	/* Available memory space for the exact MAX_BULK entries */
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK - 3;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK - 3)
 		goto fail;
@@ -505,19 +507,19 @@ test_ring_burst_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 1);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if ((ret & RTE_RING_SZ_MASK) != 1)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 		goto fail;
@@ -539,7 +541,7 @@ test_ring_burst_basic(void)
 		cur_src += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
-		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
@@ -578,19 +580,19 @@ test_ring_burst_basic(void)
 
 	printf("Test dequeue without enough objects \n");
 	for (i = 0; i<RING_SIZE/MAX_BULK - 1; i++) {
-		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
 	}
 
 	/* Available objects - the exact MAX_BULK */
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK - 3;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK - 3)
 		goto fail;
@@ -613,7 +615,7 @@ test_ring_burst_basic(void)
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
-	ret = rte_ring_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if (ret != 2)
 		goto fail;
@@ -753,7 +755,7 @@ test_ring_basic_ex(void)
 		goto fail_test;
 	}
 
-	ret = rte_ring_dequeue_burst(rp, obj, 2);
+	ret = rte_ring_dequeue_burst(rp, obj, 2, NULL);
 	if (ret != 2) {
 		printf("test_ring_basic_ex: rte_ring_dequeue_burst fails \n");
 		goto fail_test;
diff --git a/test/test/test_ring_perf.c b/test/test/test_ring_perf.c
index f95a8e9..ed89896 100644
--- a/test/test/test_ring_perf.c
+++ b/test/test/test_ring_perf.c
@@ -152,12 +152,12 @@ test_empty_dequeue(void)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0]);
+		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0], NULL);
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t mc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		rte_ring_mc_dequeue_bulk(r, burst, bulk_sizes[0]);
+		rte_ring_mc_dequeue_bulk(r, burst, bulk_sizes[0], NULL);
 	const uint64_t mc_end = rte_rdtsc();
 
 	printf("SC empty dequeue: %.2F\n",
@@ -230,13 +230,13 @@ dequeue_bulk(void *p)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_sc_dequeue_bulk(r, burst, size) == 0)
+		while (rte_ring_sc_dequeue_bulk(r, burst, size, NULL) == 0)
 			rte_pause();
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t mc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_mc_dequeue_bulk(r, burst, size) == 0)
+		while (rte_ring_mc_dequeue_bulk(r, burst, size, NULL) == 0)
 			rte_pause();
 	const uint64_t mc_end = rte_rdtsc();
 
@@ -325,7 +325,8 @@ test_burst_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_sp_enqueue_burst(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_sc_dequeue_burst(r, burst, bulk_sizes[sz]);
+			rte_ring_sc_dequeue_burst(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t sc_end = rte_rdtsc();
 
@@ -333,7 +334,8 @@ test_burst_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_mp_enqueue_burst(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_mc_dequeue_burst(r, burst, bulk_sizes[sz]);
+			rte_ring_mc_dequeue_burst(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t mc_end = rte_rdtsc();
 
@@ -361,7 +363,8 @@ test_bulk_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_sp_enqueue_bulk(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[sz]);
+			rte_ring_sc_dequeue_bulk(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t sc_end = rte_rdtsc();
 
@@ -369,7 +372,8 @@ test_bulk_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_mp_enqueue_bulk(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_mc_dequeue_bulk(r, burst, bulk_sizes[sz]);
+			rte_ring_mc_dequeue_bulk(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t mc_end = rte_rdtsc();
 
diff --git a/test/test/test_table_acl.c b/test/test/test_table_acl.c
index b3bfda4..4d43be7 100644
--- a/test/test/test_table_acl.c
+++ b/test/test/test_table_acl.c
@@ -713,7 +713,7 @@ test_pipeline_single_filter(int expected_count)
 		void *objs[RING_TX_SIZE];
 		struct rte_mbuf *mbuf;
 
-		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10);
+		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10, NULL);
 		if (ret <= 0) {
 			printf("Got no objects from ring %d - error code %d\n",
 				i, ret);
diff --git a/test/test/test_table_pipeline.c b/test/test/test_table_pipeline.c
index 36bfeda..b58aa5d 100644
--- a/test/test/test_table_pipeline.c
+++ b/test/test/test_table_pipeline.c
@@ -494,7 +494,7 @@ test_pipeline_single_filter(int test_type, int expected_count)
 		void *objs[RING_TX_SIZE];
 		struct rte_mbuf *mbuf;
 
-		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10);
+		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10, NULL);
 		if (ret <= 0)
 			printf("Got no objects from ring %d - error code %d\n",
 				i, ret);
diff --git a/test/test/test_table_ports.c b/test/test/test_table_ports.c
index 395f4f3..39592ce 100644
--- a/test/test/test_table_ports.c
+++ b/test/test/test_table_ports.c
@@ -163,7 +163,7 @@ test_port_ring_writer(void)
 	rte_port_ring_writer_ops.f_flush(port);
 	expected_pkts = 1;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -7;
@@ -178,7 +178,7 @@ test_port_ring_writer(void)
 
 	expected_pkts = RTE_PORT_IN_BURST_SIZE_MAX;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -8;
@@ -193,7 +193,7 @@ test_port_ring_writer(void)
 
 	expected_pkts = RTE_PORT_IN_BURST_SIZE_MAX;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -8;
@@ -208,7 +208,7 @@ test_port_ring_writer(void)
 
 	expected_pkts = RTE_PORT_IN_BURST_SIZE_MAX;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -9;
diff --git a/test/test/virtual_pmd.c b/test/test/virtual_pmd.c
index 39e070c..b209355 100644
--- a/test/test/virtual_pmd.c
+++ b/test/test/virtual_pmd.c
@@ -342,7 +342,7 @@ virtual_ethdev_rx_burst_success(void *queue __rte_unused,
 	dev_private = vrtl_eth_dev->data->dev_private;
 
 	rx_count = rte_ring_dequeue_burst(dev_private->rx_queue, (void **) bufs,
-			nb_pkts);
+			nb_pkts, NULL);
 
 	/* increments ipackets count */
 	dev_private->eth_stats.ipackets += rx_count;
@@ -508,7 +508,7 @@ virtual_ethdev_get_mbufs_from_tx_queue(uint8_t port_id,
 
 	dev_private = vrtl_eth_dev->data->dev_private;
 	return rte_ring_dequeue_burst(dev_private->tx_queue, (void **)pkt_burst,
-		burst_length);
+		burst_length, NULL);
 }
 
 static uint8_t
-- 
2.9.3

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v2 07/14] ring: make bulk and burst fn return vals consistent
                       ` (4 preceding siblings ...)
  2017-03-07 11:32  2%   ` [dpdk-dev] [PATCH v2 06/14] ring: remove watermark support Bruce Richardson
@ 2017-03-07 11:32  2%   ` Bruce Richardson
  2017-03-07 11:32  2%   ` [dpdk-dev] [PATCH v2 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
    7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-07 11:32 UTC (permalink / raw)
  To: olivier.matz; +Cc: jerin.jacob, dev, Bruce Richardson

The bulk functions for rings return 0 when all elements are enqueued and
a negative errno when there is no space. Change that to make them
consistent with the burst functions, returning the number of elements
enqueued/dequeued, i.e. 0 or N. This change also allows the return value
from enqueue/dequeue to be used directly, without a branch for error
checking.
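
A sketch of caller code before and after this patch (hypothetical
application code; drop_pkts() and 'stats' are illustrative):

 /* Before: 0 on success, -ENOBUFS when the ring is full. */
 if (rte_ring_enqueue_bulk(r, objs, n) != 0)
         drop_pkts(objs, n);

 /* After: number of objects enqueued (0 or n), usable directly. */
 unsigned int sent = rte_ring_enqueue_bulk(r, objs, n);
 stats.tx += sent;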

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/rel_notes/release_17_05.rst             |  11 +++
 doc/guides/sample_app_ug/server_node_efd.rst       |   2 +-
 examples/load_balancer/runtime.c                   |  16 ++-
 .../client_server_mp/mp_client/client.c            |   8 +-
 .../client_server_mp/mp_server/main.c              |   2 +-
 examples/qos_sched/app_thread.c                    |   8 +-
 examples/server_node_efd/node/node.c               |   2 +-
 examples/server_node_efd/server/main.c             |   2 +-
 lib/librte_mempool/rte_mempool_ring.c              |  12 ++-
 lib/librte_ring/rte_ring.h                         | 109 +++++++--------------
 test/test-pipeline/pipeline_hash.c                 |   2 +-
 test/test-pipeline/runtime.c                       |   8 +-
 test/test/test_ring.c                              |  46 +++++----
 test/test/test_ring_perf.c                         |   8 +-
 14 files changed, 106 insertions(+), 130 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 4e748dc..2b11765 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -120,6 +120,17 @@ API Changes
   * removed the build-time setting ``CONFIG_RTE_RING_PAUSE_REP_COUNT``
   * removed the function ``rte_ring_set_water_mark`` as part of a general
     removal of watermarks support in the library.
+  * changed the return value of the enqueue and dequeue bulk functions to
+    match that of the burst equivalents. In all cases, ring functions which
+    operate on multiple packets now return the number of elements enqueued
+    or dequeued, as appropriate. The updated functions are:
+
+    - ``rte_ring_mp_enqueue_bulk``
+    - ``rte_ring_sp_enqueue_bulk``
+    - ``rte_ring_enqueue_bulk``
+    - ``rte_ring_mc_dequeue_bulk``
+    - ``rte_ring_sc_dequeue_bulk``
+    - ``rte_ring_dequeue_bulk``
 
 ABI Changes
 -----------
diff --git a/doc/guides/sample_app_ug/server_node_efd.rst b/doc/guides/sample_app_ug/server_node_efd.rst
index 9b69cfe..e3a63c8 100644
--- a/doc/guides/sample_app_ug/server_node_efd.rst
+++ b/doc/guides/sample_app_ug/server_node_efd.rst
@@ -286,7 +286,7 @@ repeated infinitely.
 
         cl = &nodes[node];
         if (rte_ring_enqueue_bulk(cl->rx_q, (void **)cl_rx_buf[node].buffer,
-                cl_rx_buf[node].count) != 0){
+                cl_rx_buf[node].count) != cl_rx_buf[node].count){
             for (j = 0; j < cl_rx_buf[node].count; j++)
                 rte_pktmbuf_free(cl_rx_buf[node].buffer[j]);
             cl->stats.rx_drop += cl_rx_buf[node].count;
diff --git a/examples/load_balancer/runtime.c b/examples/load_balancer/runtime.c
index 6944325..82b10bc 100644
--- a/examples/load_balancer/runtime.c
+++ b/examples/load_balancer/runtime.c
@@ -146,7 +146,7 @@ app_lcore_io_rx_buffer_to_send (
 		(void **) lp->rx.mbuf_out[worker].array,
 		bsz);
 
-	if (unlikely(ret == -ENOBUFS)) {
+	if (unlikely(ret == 0)) {
 		uint32_t k;
 		for (k = 0; k < bsz; k ++) {
 			struct rte_mbuf *m = lp->rx.mbuf_out[worker].array[k];
@@ -312,7 +312,7 @@ app_lcore_io_rx_flush(struct app_lcore_params_io *lp, uint32_t n_workers)
 			(void **) lp->rx.mbuf_out[worker].array,
 			lp->rx.mbuf_out[worker].n_mbufs);
 
-		if (unlikely(ret < 0)) {
+		if (unlikely(ret == 0)) {
 			uint32_t k;
 			for (k = 0; k < lp->rx.mbuf_out[worker].n_mbufs; k ++) {
 				struct rte_mbuf *pkt_to_free = lp->rx.mbuf_out[worker].array[k];
@@ -349,9 +349,8 @@ app_lcore_io_tx(
 				(void **) &lp->tx.mbuf_out[port].array[n_mbufs],
 				bsz_rd);
 
-			if (unlikely(ret == -ENOENT)) {
+			if (unlikely(ret == 0))
 				continue;
-			}
 
 			n_mbufs += bsz_rd;
 
@@ -505,9 +504,8 @@ app_lcore_worker(
 			(void **) lp->mbuf_in.array,
 			bsz_rd);
 
-		if (unlikely(ret == -ENOENT)) {
+		if (unlikely(ret == 0))
 			continue;
-		}
 
 #if APP_WORKER_DROP_ALL_PACKETS
 		for (j = 0; j < bsz_rd; j ++) {
@@ -559,7 +557,7 @@ app_lcore_worker(
 
 #if APP_STATS
 			lp->rings_out_iters[port] ++;
-			if (ret == 0) {
+			if (ret > 0) {
 				lp->rings_out_count[port] += 1;
 			}
 			if (lp->rings_out_iters[port] == APP_STATS){
@@ -572,7 +570,7 @@ app_lcore_worker(
 			}
 #endif
 
-			if (unlikely(ret == -ENOBUFS)) {
+			if (unlikely(ret == 0)) {
 				uint32_t k;
 				for (k = 0; k < bsz_wr; k ++) {
 					struct rte_mbuf *pkt_to_free = lp->mbuf_out[port].array[k];
@@ -609,7 +607,7 @@ app_lcore_worker_flush(struct app_lcore_params_worker *lp)
 			(void **) lp->mbuf_out[port].array,
 			lp->mbuf_out[port].n_mbufs);
 
-		if (unlikely(ret < 0)) {
+		if (unlikely(ret == 0)) {
 			uint32_t k;
 			for (k = 0; k < lp->mbuf_out[port].n_mbufs; k ++) {
 				struct rte_mbuf *pkt_to_free = lp->mbuf_out[port].array[k];
diff --git a/examples/multi_process/client_server_mp/mp_client/client.c b/examples/multi_process/client_server_mp/mp_client/client.c
index d4f9ca3..dca9eb9 100644
--- a/examples/multi_process/client_server_mp/mp_client/client.c
+++ b/examples/multi_process/client_server_mp/mp_client/client.c
@@ -276,14 +276,10 @@ main(int argc, char *argv[])
 	printf("[Press Ctrl-C to quit ...]\n");
 
 	for (;;) {
-		uint16_t i, rx_pkts = PKT_READ_SIZE;
+		uint16_t i, rx_pkts;
 		uint8_t port;
 
-		/* try dequeuing max possible packets first, if that fails, get the
-		 * most we can. Loop body should only execute once, maximum */
-		while (rx_pkts > 0 &&
-				unlikely(rte_ring_dequeue_bulk(rx_ring, pkts, rx_pkts) != 0))
-			rx_pkts = (uint16_t)RTE_MIN(rte_ring_count(rx_ring), PKT_READ_SIZE);
+		rx_pkts = rte_ring_dequeue_burst(rx_ring, pkts, PKT_READ_SIZE);
 
 		if (unlikely(rx_pkts == 0)){
 			if (need_flush)
diff --git a/examples/multi_process/client_server_mp/mp_server/main.c b/examples/multi_process/client_server_mp/mp_server/main.c
index a6dc12d..19c95b2 100644
--- a/examples/multi_process/client_server_mp/mp_server/main.c
+++ b/examples/multi_process/client_server_mp/mp_server/main.c
@@ -227,7 +227,7 @@ flush_rx_queue(uint16_t client)
 
 	cl = &clients[client];
 	if (rte_ring_enqueue_bulk(cl->rx_q, (void **)cl_rx_buf[client].buffer,
-			cl_rx_buf[client].count) != 0){
+			cl_rx_buf[client].count) == 0){
 		for (j = 0; j < cl_rx_buf[client].count; j++)
 			rte_pktmbuf_free(cl_rx_buf[client].buffer[j]);
 		cl->stats.rx_drop += cl_rx_buf[client].count;
diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index 70fdcdb..dab4594 100644
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -107,7 +107,7 @@ app_rx_thread(struct thread_conf **confs)
 			}
 
 			if (unlikely(rte_ring_sp_enqueue_bulk(conf->rx_ring,
-								(void **)rx_mbufs, nb_rx) != 0)) {
+					(void **)rx_mbufs, nb_rx) == 0)) {
 				for(i = 0; i < nb_rx; i++) {
 					rte_pktmbuf_free(rx_mbufs[i]);
 
@@ -180,7 +180,7 @@ app_tx_thread(struct thread_conf **confs)
 	while ((conf = confs[conf_idx])) {
 		retval = rte_ring_sc_dequeue_bulk(conf->tx_ring, (void **)mbufs,
 					burst_conf.qos_dequeue);
-		if (likely(retval == 0)) {
+		if (likely(retval != 0)) {
 			app_send_packets(conf, mbufs, burst_conf.qos_dequeue);
 
 			conf->counter = 0; /* reset empty read loop counter */
@@ -230,7 +230,9 @@ app_worker_thread(struct thread_conf **confs)
 		nb_pkt = rte_sched_port_dequeue(conf->sched_port, mbufs,
 					burst_conf.qos_dequeue);
 		if (likely(nb_pkt > 0))
-			while (rte_ring_sp_enqueue_bulk(conf->tx_ring, (void **)mbufs, nb_pkt) != 0);
+			while (rte_ring_sp_enqueue_bulk(conf->tx_ring,
+					(void **)mbufs, nb_pkt) == 0)
+				; /* empty body */
 
 		conf_idx++;
 		if (confs[conf_idx] == NULL)
diff --git a/examples/server_node_efd/node/node.c b/examples/server_node_efd/node/node.c
index a6c0c70..9ec6a05 100644
--- a/examples/server_node_efd/node/node.c
+++ b/examples/server_node_efd/node/node.c
@@ -392,7 +392,7 @@ main(int argc, char *argv[])
 		 */
 		while (rx_pkts > 0 &&
 				unlikely(rte_ring_dequeue_bulk(rx_ring, pkts,
-					rx_pkts) != 0))
+					rx_pkts) == 0))
 			rx_pkts = (uint16_t)RTE_MIN(rte_ring_count(rx_ring),
 					PKT_READ_SIZE);
 
diff --git a/examples/server_node_efd/server/main.c b/examples/server_node_efd/server/main.c
index 1a54d1b..3eb7fac 100644
--- a/examples/server_node_efd/server/main.c
+++ b/examples/server_node_efd/server/main.c
@@ -247,7 +247,7 @@ flush_rx_queue(uint16_t node)
 
 	cl = &nodes[node];
 	if (rte_ring_enqueue_bulk(cl->rx_q, (void **)cl_rx_buf[node].buffer,
-			cl_rx_buf[node].count) != 0){
+			cl_rx_buf[node].count) != cl_rx_buf[node].count){
 		for (j = 0; j < cl_rx_buf[node].count; j++)
 			rte_pktmbuf_free(cl_rx_buf[node].buffer[j]);
 		cl->stats.rx_drop += cl_rx_buf[node].count;
diff --git a/lib/librte_mempool/rte_mempool_ring.c b/lib/librte_mempool/rte_mempool_ring.c
index b9aa64d..409b860 100644
--- a/lib/librte_mempool/rte_mempool_ring.c
+++ b/lib/librte_mempool/rte_mempool_ring.c
@@ -42,26 +42,30 @@ static int
 common_ring_mp_enqueue(struct rte_mempool *mp, void * const *obj_table,
 		unsigned n)
 {
-	return rte_ring_mp_enqueue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_mp_enqueue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_sp_enqueue(struct rte_mempool *mp, void * const *obj_table,
 		unsigned n)
 {
-	return rte_ring_sp_enqueue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_sp_enqueue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
-	return rte_ring_mc_dequeue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_mc_dequeue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_sc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
-	return rte_ring_sc_dequeue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_sc_dequeue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static unsigned
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index e7061be..5f6589f 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -352,14 +352,10 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
  *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items a possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects enqueue.
- *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects enqueued.
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 			 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -391,7 +387,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 		/* check that we have enough room in ring */
 		if (unlikely(n > free_entries)) {
 			if (behavior == RTE_RING_QUEUE_FIXED)
-				return -ENOBUFS;
+				return 0;
 			else {
 				/* No free entry available */
 				if (unlikely(free_entries == 0))
@@ -417,7 +413,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 		rte_pause();
 
 	r->prod.tail = prod_next;
-	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
+	return n;
 }
 
 /**
@@ -433,14 +429,10 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
  *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
  *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items a possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects enqueue.
- *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects enqueued.
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 			 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -460,7 +452,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	/* check that we have enough room in ring */
 	if (unlikely(n > free_entries)) {
 		if (behavior == RTE_RING_QUEUE_FIXED)
-			return -ENOBUFS;
+			return 0;
 		else {
 			/* No free entry available */
 			if (unlikely(free_entries == 0))
@@ -477,7 +469,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	r->prod.tail = prod_next;
-	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
+	return n;
 }
 
 /**
@@ -498,16 +490,11 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
  *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
  *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items a possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects dequeued.
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
 
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -539,7 +526,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		/* Set the actual entries for dequeue */
 		if (n > entries) {
 			if (behavior == RTE_RING_QUEUE_FIXED)
-				return -ENOENT;
+				return 0;
 			else {
 				if (unlikely(entries == 0))
 					return 0;
@@ -565,7 +552,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 
 	r->cons.tail = cons_next;
 
-	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
+	return n;
 }
 
 /**
@@ -583,15 +570,10 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
  *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
  *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items a possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects dequeued.
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 		 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -610,7 +592,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 
 	if (n > entries) {
 		if (behavior == RTE_RING_QUEUE_FIXED)
-			return -ENOENT;
+			return 0;
 		else {
 			if (unlikely(entries == 0))
 				return 0;
@@ -626,7 +608,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	rte_smp_rmb();
 
 	r->cons.tail = cons_next;
-	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
+	return n;
 }
 
 /**
@@ -642,10 +624,9 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
  * @param n
  *   The number of objects to add in the ring from the obj_table.
  * @return
- *   - 0: Success; objects enqueue.
- *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
+ *   The number of objects enqueued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_mp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 			 unsigned n)
 {
@@ -662,10 +643,9 @@ rte_ring_mp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  * @param n
  *   The number of objects to add in the ring from the obj_table.
  * @return
- *   - 0: Success; objects enqueued.
- *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
+ *   The number of objects enqueued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_sp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 			 unsigned n)
 {
@@ -686,10 +666,9 @@ rte_ring_sp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  * @param n
  *   The number of objects to add in the ring from the obj_table.
  * @return
- *   - 0: Success; objects enqueued.
- *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
+ *   The number of objects enqueued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 		      unsigned n)
 {
@@ -716,7 +695,7 @@ rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 static inline int __attribute__((always_inline))
 rte_ring_mp_enqueue(struct rte_ring *r, void *obj)
 {
-	return rte_ring_mp_enqueue_bulk(r, &obj, 1);
+	return rte_ring_mp_enqueue_bulk(r, &obj, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -733,7 +712,7 @@ rte_ring_mp_enqueue(struct rte_ring *r, void *obj)
 static inline int __attribute__((always_inline))
 rte_ring_sp_enqueue(struct rte_ring *r, void *obj)
 {
-	return rte_ring_sp_enqueue_bulk(r, &obj, 1);
+	return rte_ring_sp_enqueue_bulk(r, &obj, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -754,10 +733,7 @@ rte_ring_sp_enqueue(struct rte_ring *r, void *obj)
 static inline int __attribute__((always_inline))
 rte_ring_enqueue(struct rte_ring *r, void *obj)
 {
-	if (r->prod.sp_enqueue)
-		return rte_ring_sp_enqueue(r, obj);
-	else
-		return rte_ring_mp_enqueue(r, obj);
+	return rte_ring_enqueue_bulk(r, &obj, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -773,11 +749,9 @@ rte_ring_enqueue(struct rte_ring *r, void *obj)
  * @param n
  *   The number of objects to dequeue from the ring to the obj_table.
  * @return
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
+ *   The number of objects dequeued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 {
 	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
@@ -794,11 +768,9 @@ rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  *   The number of objects to dequeue from the ring to the obj_table,
  *   must be strictly positive.
  * @return
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
+ *   The number of objects dequeued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 {
 	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
@@ -818,11 +790,9 @@ rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  * @param n
  *   The number of objects to dequeue from the ring to the obj_table.
  * @return
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue, no object is
- *     dequeued.
+ *   The number of objects dequeued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 {
 	if (r->cons.sc_dequeue)
@@ -849,7 +819,7 @@ rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 static inline int __attribute__((always_inline))
 rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_mc_dequeue_bulk(r, obj_p, 1);
+	return rte_ring_mc_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOENT;
 }
 
 /**
@@ -867,7 +837,7 @@ rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_sc_dequeue_bulk(r, obj_p, 1);
+	return rte_ring_sc_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOENT;
 }
 
 /**
@@ -889,10 +859,7 @@ rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_dequeue(struct rte_ring *r, void **obj_p)
 {
-	if (r->cons.sc_dequeue)
-		return rte_ring_sc_dequeue(r, obj_p);
-	else
-		return rte_ring_mc_dequeue(r, obj_p);
+	return rte_ring_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOENT;
 }
 
 /**
diff --git a/test/test-pipeline/pipeline_hash.c b/test/test-pipeline/pipeline_hash.c
index 10d2869..1ac0aa8 100644
--- a/test/test-pipeline/pipeline_hash.c
+++ b/test/test-pipeline/pipeline_hash.c
@@ -547,6 +547,6 @@ app_main_loop_rx_metadata(void) {
 				app.rings_rx[i],
 				(void **) app.mbuf_rx.array,
 				n_mbufs);
-		} while (ret < 0);
+		} while (ret == 0);
 	}
 }
diff --git a/test/test-pipeline/runtime.c b/test/test-pipeline/runtime.c
index 42a6142..4e20669 100644
--- a/test/test-pipeline/runtime.c
+++ b/test/test-pipeline/runtime.c
@@ -98,7 +98,7 @@ app_main_loop_rx(void) {
 				app.rings_rx[i],
 				(void **) app.mbuf_rx.array,
 				n_mbufs);
-		} while (ret < 0);
+		} while (ret == 0);
 	}
 }
 
@@ -123,7 +123,7 @@ app_main_loop_worker(void) {
 			(void **) worker_mbuf->array,
 			app.burst_size_worker_read);
 
-		if (ret == -ENOENT)
+		if (ret == 0)
 			continue;
 
 		do {
@@ -131,7 +131,7 @@ app_main_loop_worker(void) {
 				app.rings_tx[i ^ 1],
 				(void **) worker_mbuf->array,
 				app.burst_size_worker_write);
-		} while (ret < 0);
+		} while (ret == 0);
 	}
 }
 
@@ -152,7 +152,7 @@ app_main_loop_tx(void) {
 			(void **) &app.mbuf_tx[i].array[n_mbufs],
 			app.burst_size_tx_read);
 
-		if (ret == -ENOENT)
+		if (ret == 0)
 			continue;
 
 		n_mbufs += app.burst_size_tx_read;
diff --git a/test/test/test_ring.c b/test/test/test_ring.c
index 666a451..112433b 100644
--- a/test/test/test_ring.c
+++ b/test/test/test_ring.c
@@ -117,20 +117,18 @@ test_ring_basic_full_empty(void * const src[], void *dst[])
 		rand = RTE_MAX(rte_rand() % RING_SIZE, 1UL);
 		printf("%s: iteration %u, random shift: %u;\n",
 		    __func__, i, rand);
-		TEST_RING_VERIFY(-ENOBUFS != rte_ring_enqueue_bulk(r, src,
-		    rand));
-		TEST_RING_VERIFY(0 == rte_ring_dequeue_bulk(r, dst, rand));
+		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rand) != 0);
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rand) == rand);
 
 		/* fill the ring */
-		TEST_RING_VERIFY(-ENOBUFS != rte_ring_enqueue_bulk(r, src,
-		    rsz));
+		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rsz) != 0);
 		TEST_RING_VERIFY(0 == rte_ring_free_count(r));
 		TEST_RING_VERIFY(rsz == rte_ring_count(r));
 		TEST_RING_VERIFY(rte_ring_full(r));
 		TEST_RING_VERIFY(0 == rte_ring_empty(r));
 
 		/* empty the ring */
-		TEST_RING_VERIFY(0 == rte_ring_dequeue_bulk(r, dst, rsz));
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rsz) == rsz);
 		TEST_RING_VERIFY(rsz == rte_ring_free_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_full(r));
@@ -171,37 +169,37 @@ test_ring_basic(void)
 	printf("enqueue 1 obj\n");
 	ret = rte_ring_sp_enqueue_bulk(r, cur_src, 1);
 	cur_src += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue 2 objs\n");
 	ret = rte_ring_sp_enqueue_bulk(r, cur_src, 2);
 	cur_src += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue MAX_BULK objs\n");
 	ret = rte_ring_sp_enqueue_bulk(r, cur_src, MAX_BULK);
 	cur_src += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
 	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 1);
 	cur_dst += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
 	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 2);
 	cur_dst += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
 	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, MAX_BULK);
 	cur_dst += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	/* check data */
@@ -217,37 +215,37 @@ test_ring_basic(void)
 	printf("enqueue 1 obj\n");
 	ret = rte_ring_mp_enqueue_bulk(r, cur_src, 1);
 	cur_src += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue 2 objs\n");
 	ret = rte_ring_mp_enqueue_bulk(r, cur_src, 2);
 	cur_src += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue MAX_BULK objs\n");
 	ret = rte_ring_mp_enqueue_bulk(r, cur_src, MAX_BULK);
 	cur_src += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
 	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 1);
 	cur_dst += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
 	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 2);
 	cur_dst += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
 	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
 	cur_dst += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	/* check data */
@@ -264,11 +262,11 @@ test_ring_basic(void)
 	for (i = 0; i<RING_SIZE/MAX_BULK; i++) {
 		ret = rte_ring_mp_enqueue_bulk(r, cur_src, MAX_BULK);
 		cur_src += MAX_BULK;
-		if (ret != 0)
+		if (ret == 0)
 			goto fail;
 		ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
 		cur_dst += MAX_BULK;
-		if (ret != 0)
+		if (ret == 0)
 			goto fail;
 	}
 
@@ -294,25 +292,25 @@ test_ring_basic(void)
 
 	ret = rte_ring_enqueue_bulk(r, cur_src, num_elems);
 	cur_src += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot enqueue\n");
 		goto fail;
 	}
 	ret = rte_ring_enqueue_bulk(r, cur_src, num_elems);
 	cur_src += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot enqueue\n");
 		goto fail;
 	}
 	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
 	cur_dst += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot dequeue\n");
 		goto fail;
 	}
 	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
 	cur_dst += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot dequeue2\n");
 		goto fail;
 	}
diff --git a/test/test/test_ring_perf.c b/test/test/test_ring_perf.c
index 320c20c..8ccbdef 100644
--- a/test/test/test_ring_perf.c
+++ b/test/test/test_ring_perf.c
@@ -195,13 +195,13 @@ enqueue_bulk(void *p)
 
 	const uint64_t sp_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_sp_enqueue_bulk(r, burst, size) != 0)
+		while (rte_ring_sp_enqueue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t sp_end = rte_rdtsc();
 
 	const uint64_t mp_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_mp_enqueue_bulk(r, burst, size) != 0)
+		while (rte_ring_mp_enqueue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t mp_end = rte_rdtsc();
 
@@ -230,13 +230,13 @@ dequeue_bulk(void *p)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_sc_dequeue_bulk(r, burst, size) != 0)
+		while (rte_ring_sc_dequeue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t mc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_mc_dequeue_bulk(r, burst, size) != 0)
+		while (rte_ring_mc_dequeue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t mc_end = rte_rdtsc();
 
-- 
2.9.3

^ permalink raw reply	[relevance 2%]
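
A caller-side sketch of the convention change made in the patch above,
for the fixed-size bulk API (the helper name is assumed for
illustration; only the failure test changes, from a negative errno to
a zero count):

    #include <rte_ring.h>

    /* All-or-nothing enqueue retry loop. Before this patch the failure
     * test was "ret < 0"; with the count-based return (0 or n for the
     * fixed bulk calls) it becomes "ret == 0". */
    static void
    enqueue_retry(struct rte_ring *r, void * const *objs, unsigned int n)
    {
            while (rte_ring_enqueue_bulk(r, objs, n) == 0)
                    ;       /* ring full: spin; a real app might back off */
    }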

* [dpdk-dev] [PATCH v2 03/14] ring: eliminate duplication of size and mask fields
    2017-03-07 11:32  4%   ` [dpdk-dev] [PATCH v2 01/14] ring: remove split cacheline build setting Bruce Richardson
@ 2017-03-07 11:32  3%   ` Bruce Richardson
  2017-03-07 11:32  2%   ` [dpdk-dev] [PATCH v2 04/14] ring: remove debug setting Bruce Richardson
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-07 11:32 UTC (permalink / raw)
  To: olivier.matz; +Cc: jerin.jacob, dev, Bruce Richardson

The size and mask fields are duplicated in both the producer and
consumer data structures. Move them out of those structures and into
the top-level rte_ring structure so that they are not duplicated.
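
For out-of-tree code that read these fields directly, the change is a
mechanical rename; a minimal sketch of the after-state (the helper name
is illustrative only — the public accessors such as rte_ring_count()
are updated by this patch itself):

    #include <rte_ring.h>

    /* Equivalent of rte_ring_count() against the new layout: size and
     * mask now live once in struct rte_ring rather than in both the
     * prod and cons sub-structures (r->prod.mask becomes r->mask). */
    static inline unsigned int
    ring_used_entries(const struct rte_ring *r)
    {
            return (r->prod.tail - r->cons.tail) & r->mask;
    }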

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_ring/rte_ring.c | 20 ++++++++++----------
 lib/librte_ring/rte_ring.h | 32 ++++++++++++++++----------------
 test/test/test_ring.c      |  6 +++---
 3 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index 4bc6da1..80fc356 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -144,11 +144,11 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	if (ret < 0 || ret >= (int)sizeof(r->name))
 		return -ENAMETOOLONG;
 	r->flags = flags;
-	r->prod.watermark = count;
+	r->watermark = count;
 	r->prod.sp_enqueue = !!(flags & RING_F_SP_ENQ);
 	r->cons.sc_dequeue = !!(flags & RING_F_SC_DEQ);
-	r->prod.size = r->cons.size = count;
-	r->prod.mask = r->cons.mask = count-1;
+	r->size = count;
+	r->mask = count - 1;
 	r->prod.head = r->cons.head = 0;
 	r->prod.tail = r->cons.tail = 0;
 
@@ -269,14 +269,14 @@ rte_ring_free(struct rte_ring *r)
 int
 rte_ring_set_water_mark(struct rte_ring *r, unsigned count)
 {
-	if (count >= r->prod.size)
+	if (count >= r->size)
 		return -EINVAL;
 
 	/* if count is 0, disable the watermarking */
 	if (count == 0)
-		count = r->prod.size;
+		count = r->size;
 
-	r->prod.watermark = count;
+	r->watermark = count;
 	return 0;
 }
 
@@ -291,17 +291,17 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 
 	fprintf(f, "ring <%s>@%p\n", r->name, r);
 	fprintf(f, "  flags=%x\n", r->flags);
-	fprintf(f, "  size=%"PRIu32"\n", r->prod.size);
+	fprintf(f, "  size=%"PRIu32"\n", r->size);
 	fprintf(f, "  ct=%"PRIu32"\n", r->cons.tail);
 	fprintf(f, "  ch=%"PRIu32"\n", r->cons.head);
 	fprintf(f, "  pt=%"PRIu32"\n", r->prod.tail);
 	fprintf(f, "  ph=%"PRIu32"\n", r->prod.head);
 	fprintf(f, "  used=%u\n", rte_ring_count(r));
 	fprintf(f, "  avail=%u\n", rte_ring_free_count(r));
-	if (r->prod.watermark == r->prod.size)
+	if (r->watermark == r->size)
 		fprintf(f, "  watermark=0\n");
 	else
-		fprintf(f, "  watermark=%"PRIu32"\n", r->prod.watermark);
+		fprintf(f, "  watermark=%"PRIu32"\n", r->watermark);
 
 	/* sum and dump statistics */
 #ifdef RTE_LIBRTE_RING_DEBUG
@@ -318,7 +318,7 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 		sum.deq_fail_bulk += r->stats[lcore_id].deq_fail_bulk;
 		sum.deq_fail_objs += r->stats[lcore_id].deq_fail_objs;
 	}
-	fprintf(f, "  size=%"PRIu32"\n", r->prod.size);
+	fprintf(f, "  size=%"PRIu32"\n", r->size);
 	fprintf(f, "  enq_success_bulk=%"PRIu64"\n", sum.enq_success_bulk);
 	fprintf(f, "  enq_success_objs=%"PRIu64"\n", sum.enq_success_objs);
 	fprintf(f, "  enq_quota_bulk=%"PRIu64"\n", sum.enq_quota_bulk);
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 659c6d0..61c0982 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -151,13 +151,10 @@ struct rte_memzone; /* forward declaration, so as not to require memzone.h */
 struct rte_ring_headtail {
 	volatile uint32_t head;  /**< Prod/consumer head. */
 	volatile uint32_t tail;  /**< Prod/consumer tail. */
-	uint32_t size;           /**< Size of ring. */
-	uint32_t mask;           /**< Mask (size-1) of ring. */
 	union {
 		uint32_t sp_enqueue; /**< True, if single producer. */
 		uint32_t sc_dequeue; /**< True, if single consumer. */
 	};
-	uint32_t watermark;      /**< Max items before EDQUOT in producer. */
 };
 
 /**
@@ -177,9 +174,12 @@ struct rte_ring {
 	 * next time the ABI changes
 	 */
 	char name[RTE_MEMZONE_NAMESIZE];    /**< Name of the ring. */
-	int flags;                       /**< Flags supplied at creation. */
+	int flags;               /**< Flags supplied at creation. */
 	const struct rte_memzone *memzone;
 			/**< Memzone, if any, containing the rte_ring */
+	uint32_t size;           /**< Size of ring. */
+	uint32_t mask;           /**< Mask (size-1) of ring. */
+	uint32_t watermark;      /**< Max items before EDQUOT in producer. */
 
 	/** Ring producer status. */
 	struct rte_ring_headtail prod __rte_aligned(PROD_ALIGN);
@@ -358,7 +358,7 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * Placed here since identical code needed in both
  * single and multi producer enqueue functions */
 #define ENQUEUE_PTRS() do { \
-	const uint32_t size = r->prod.size; \
+	const uint32_t size = r->size; \
 	uint32_t idx = prod_head & mask; \
 	if (likely(idx + n < size)) { \
 		for (i = 0; i < (n & ((~(unsigned)0x3))); i+=4, idx+=4) { \
@@ -385,7 +385,7 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * single and multi consumer dequeue functions */
 #define DEQUEUE_PTRS() do { \
 	uint32_t idx = cons_head & mask; \
-	const uint32_t size = r->cons.size; \
+	const uint32_t size = r->size; \
 	if (likely(idx + n < size)) { \
 		for (i = 0; i < (n & (~(unsigned)0x3)); i+=4, idx+=4) {\
 			obj_table[i] = r->ring[idx]; \
@@ -440,7 +440,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	const unsigned max = n;
 	int success;
 	unsigned i, rep = 0;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 	int ret;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
@@ -488,7 +488,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->prod.watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 				(int)(n | RTE_RING_QUOT_EXCEED);
 		__RING_STAT_ADD(r, enq_quota, n);
@@ -547,7 +547,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t prod_head, cons_tail;
 	uint32_t prod_next, free_entries;
 	unsigned i;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 	int ret;
 
 	prod_head = r->prod.head;
@@ -583,7 +583,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->prod.watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 			(int)(n | RTE_RING_QUOT_EXCEED);
 		__RING_STAT_ADD(r, enq_quota, n);
@@ -633,7 +633,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	const unsigned max = n;
 	int success;
 	unsigned i, rep = 0;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
 	 * potentially harmful when n equals 0. */
@@ -730,7 +730,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
 	unsigned i;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 
 	cons_head = r->cons.head;
 	prod_tail = r->prod.tail;
@@ -1059,7 +1059,7 @@ rte_ring_full(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return ((cons_tail - prod_tail - 1) & r->prod.mask) == 0;
+	return ((cons_tail - prod_tail - 1) & r->mask) == 0;
 }
 
 /**
@@ -1092,7 +1092,7 @@ rte_ring_count(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return (prod_tail - cons_tail) & r->prod.mask;
+	return (prod_tail - cons_tail) & r->mask;
 }
 
 /**
@@ -1108,7 +1108,7 @@ rte_ring_free_count(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return (cons_tail - prod_tail - 1) & r->prod.mask;
+	return (cons_tail - prod_tail - 1) & r->mask;
 }
 
 /**
@@ -1122,7 +1122,7 @@ rte_ring_free_count(const struct rte_ring *r)
 static inline unsigned int
 rte_ring_get_size(const struct rte_ring *r)
 {
-	return r->prod.size;
+	return r->size;
 }
 
 /**
diff --git a/test/test/test_ring.c b/test/test/test_ring.c
index ebcb896..5f09097 100644
--- a/test/test/test_ring.c
+++ b/test/test/test_ring.c
@@ -148,7 +148,7 @@ check_live_watermark_change(__attribute__((unused)) void *dummy)
 		}
 
 		/* read watermark, the only change allowed is from 16 to 32 */
-		watermark = r->prod.watermark;
+		watermark = r->watermark;
 		if (watermark != watermark_old &&
 		    (watermark_old != 16 || watermark != 32)) {
 			printf("Bad watermark change %u -> %u\n", watermark_old,
@@ -213,7 +213,7 @@ test_set_watermark( void ){
 		printf( " ring lookup failed\n" );
 		goto error;
 	}
-	count = r->prod.size*2;
+	count = r->size * 2;
 	setwm = rte_ring_set_water_mark(r, count);
 	if (setwm != -EINVAL){
 		printf("Test failed to detect invalid watermark count value\n");
@@ -222,7 +222,7 @@ test_set_watermark( void ){
 
 	count = 0;
 	rte_ring_set_water_mark(r, count);
-	if (r->prod.watermark != r->prod.size) {
+	if (r->watermark != r->size) {
 		printf("Test failed to detect invalid watermark count value\n");
 		goto error;
 	}
-- 
2.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2 06/14] ring: remove watermark support
                       ` (3 preceding siblings ...)
  2017-03-07 11:32  4%   ` [dpdk-dev] [PATCH v2 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
@ 2017-03-07 11:32  2%   ` Bruce Richardson
  2017-03-07 11:32  2%   ` [dpdk-dev] [PATCH v2 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-07 11:32 UTC (permalink / raw)
  To: olivier.matz; +Cc: jerin.jacob, dev, Bruce Richardson

Remove the watermark support. A future commit will add support for
having the enqueue functions return the amount of free space in the
ring, which will allow applications to implement their own watermark
checks while also being more generally useful to the application.
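
Until that lands, an application can keep an equivalent check at the
call site; a minimal sketch using the existing occupancy API (the
threshold value and helper name are assumptions for illustration):

    #include <rte_ring.h>

    #define APP_RING_HIGH_WM 96     /* illustrative threshold */

    /* Returns -1 if nothing was enqueued, 1 if the enqueue succeeded
     * but occupancy is above the application's watermark, 0 otherwise.
     * Note: at this point in the series the bulk enqueue still returns
     * 0 on success and -ENOBUFS on failure. */
    static int
    app_enqueue_with_wm(struct rte_ring *r, void * const *objs,
                    unsigned int n)
    {
            if (rte_ring_enqueue_bulk(r, objs, n) != 0)
                    return -1;
            return rte_ring_count(r) > APP_RING_HIGH_WM;
    }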

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

---
V2: fix missed references to watermarks in v1

---
 doc/guides/prog_guide/ring_lib.rst     |   8 --
 doc/guides/rel_notes/release_17_05.rst |   2 +
 examples/Makefile                      |   2 +-
 lib/librte_ring/rte_ring.c             |  23 -----
 lib/librte_ring/rte_ring.h             |  58 +------------
 test/test/autotest_test_funcs.py       |   7 --
 test/test/commands.c                   |  52 ------------
 test/test/test_ring.c                  | 149 +--------------------------------
 8 files changed, 8 insertions(+), 293 deletions(-)

diff --git a/doc/guides/prog_guide/ring_lib.rst b/doc/guides/prog_guide/ring_lib.rst
index d4ab502..b31ab7a 100644
--- a/doc/guides/prog_guide/ring_lib.rst
+++ b/doc/guides/prog_guide/ring_lib.rst
@@ -102,14 +102,6 @@ Name
 A ring is identified by a unique name.
 It is not possible to create two rings with the same name (rte_ring_create() returns NULL if this is attempted).
 
-Water Marking
-~~~~~~~~~~~~~
-
-The ring can have a high water mark (threshold).
-Once an enqueue operation reaches the high water mark, the producer is notified, if the water mark is configured.
-
-This mechanism can be used, for example, to exert a back pressure on I/O to inform the LAN to PAUSE.
-
 Use Cases
 ---------
 
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index c69ca8f..4e748dc 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -118,6 +118,8 @@ API Changes
   * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
   * removed the build-time setting ``CONFIG_RTE_LIBRTE_RING_DEBUG``
   * removed the build-time setting ``CONFIG_RTE_RING_PAUSE_REP_COUNT``
+  * removed the function ``rte_ring_set_water_mark`` as part of a general
+    removal of watermarks support in the library.
 
 ABI Changes
 -----------
diff --git a/examples/Makefile b/examples/Makefile
index da2bfdd..19cd5ad 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -81,7 +81,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += packet_ordering
 DIRS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ptpclient
 DIRS-$(CONFIG_RTE_LIBRTE_METER) += qos_meter
 DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += qos_sched
-DIRS-y += quota_watermark
+#DIRS-y += quota_watermark
 DIRS-$(CONFIG_RTE_ETHDEV_RXTX_CALLBACKS) += rxtx_callbacks
 DIRS-y += skeleton
 ifeq ($(CONFIG_RTE_LIBRTE_HASH),y)
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index 90ee63f..18fb644 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -138,7 +138,6 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	if (ret < 0 || ret >= (int)sizeof(r->name))
 		return -ENAMETOOLONG;
 	r->flags = flags;
-	r->watermark = count;
 	r->prod.sp_enqueue = !!(flags & RING_F_SP_ENQ);
 	r->cons.sc_dequeue = !!(flags & RING_F_SC_DEQ);
 	r->size = count;
@@ -256,24 +255,6 @@ rte_ring_free(struct rte_ring *r)
 	rte_free(te);
 }
 
-/*
- * change the high water mark. If *count* is 0, water marking is
- * disabled
- */
-int
-rte_ring_set_water_mark(struct rte_ring *r, unsigned count)
-{
-	if (count >= r->size)
-		return -EINVAL;
-
-	/* if count is 0, disable the watermarking */
-	if (count == 0)
-		count = r->size;
-
-	r->watermark = count;
-	return 0;
-}
-
 /* dump the status of the ring on the console */
 void
 rte_ring_dump(FILE *f, const struct rte_ring *r)
@@ -287,10 +268,6 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 	fprintf(f, "  ph=%"PRIu32"\n", r->prod.head);
 	fprintf(f, "  used=%u\n", rte_ring_count(r));
 	fprintf(f, "  avail=%u\n", rte_ring_free_count(r));
-	if (r->watermark == r->size)
-		fprintf(f, "  watermark=0\n");
-	else
-		fprintf(f, "  watermark=%"PRIu32"\n", r->watermark);
 }
 
 /* dump the status of all rings on the console */
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 2177954..e7061be 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -156,7 +156,6 @@ struct rte_ring {
 			/**< Memzone, if any, containing the rte_ring */
 	uint32_t size;           /**< Size of ring. */
 	uint32_t mask;           /**< Mask (size-1) of ring. */
-	uint32_t watermark;      /**< Max items before EDQUOT in producer. */
 
 	/** Ring producer status. */
 	struct rte_ring_headtail prod __rte_aligned(PROD_ALIGN);
@@ -171,7 +170,6 @@ struct rte_ring {
 
 #define RING_F_SP_ENQ 0x0001 /**< The default enqueue is "single-producer". */
 #define RING_F_SC_DEQ 0x0002 /**< The default dequeue is "single-consumer". */
-#define RTE_RING_QUOT_EXCEED (1 << 31)  /**< Quota exceed for burst ops */
 #define RTE_RING_SZ_MASK  (unsigned)(0x0fffffff) /**< Ring size mask */
 
 /**
@@ -277,26 +275,6 @@ struct rte_ring *rte_ring_create(const char *name, unsigned count,
 void rte_ring_free(struct rte_ring *r);
 
 /**
- * Change the high water mark.
- *
- * If *count* is 0, water marking is disabled. Otherwise, it is set to the
- * *count* value. The *count* value must be greater than 0 and less
- * than the ring size.
- *
- * This function can be called at any time (not necessarily at
- * initialization).
- *
- * @param r
- *   A pointer to the ring structure.
- * @param count
- *   The new water mark value.
- * @return
- *   - 0: Success; water mark changed.
- *   - -EINVAL: Invalid water mark value.
- */
-int rte_ring_set_water_mark(struct rte_ring *r, unsigned count);
-
-/**
  * Dump the status of the ring to a file.
  *
  * @param f
@@ -377,8 +355,6 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  *   Depend on the behavior value
  *   if behavior = RTE_RING_QUEUE_FIXED
  *   - 0: Success; objects enqueue.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
  *   if behavior = RTE_RING_QUEUE_VARIABLE
  *   - n: Actual number of objects enqueued.
@@ -393,7 +369,6 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	int success;
 	unsigned int i;
 	uint32_t mask = r->mask;
-	int ret;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
 	 * potentially harmful when n equals 0. */
@@ -434,13 +409,6 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	ENQUEUE_PTRS();
 	rte_smp_wmb();
 
-	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
-				(int)(n | RTE_RING_QUOT_EXCEED);
-	else
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-
 	/*
 	 * If there are other enqueues in progress that preceded us,
 	 * we need to wait for them to complete
@@ -449,7 +417,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 		rte_pause();
 
 	r->prod.tail = prod_next;
-	return ret;
+	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
 }
 
 /**
@@ -468,8 +436,6 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
  *   Depend on the behavior value
  *   if behavior = RTE_RING_QUEUE_FIXED
  *   - 0: Success; objects enqueue.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
  *   if behavior = RTE_RING_QUEUE_VARIABLE
  *   - n: Actual number of objects enqueued.
@@ -482,7 +448,6 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t prod_next, free_entries;
 	unsigned int i;
 	uint32_t mask = r->mask;
-	int ret;
 
 	prod_head = r->prod.head;
 	cons_tail = r->cons.tail;
@@ -511,15 +476,8 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	ENQUEUE_PTRS();
 	rte_smp_wmb();
 
-	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
-			(int)(n | RTE_RING_QUOT_EXCEED);
-	else
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-
 	r->prod.tail = prod_next;
-	return ret;
+	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
 }
 
 /**
@@ -685,8 +643,6 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
  *   The number of objects to add in the ring from the obj_table.
  * @return
  *   - 0: Success; objects enqueue.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -707,8 +663,6 @@ rte_ring_mp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  *   The number of objects to add in the ring from the obj_table.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -733,8 +687,6 @@ rte_ring_sp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  *   The number of objects to add in the ring from the obj_table.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -759,8 +711,6 @@ rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  *   A pointer to the object to be added.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -778,8 +728,6 @@ rte_ring_mp_enqueue(struct rte_ring *r, void *obj)
  *   A pointer to the object to be added.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -801,8 +749,6 @@ rte_ring_sp_enqueue(struct rte_ring *r, void *obj)
  *   A pointer to the object to be added.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
diff --git a/test/test/autotest_test_funcs.py b/test/test/autotest_test_funcs.py
index 1c5f390..8da8fcd 100644
--- a/test/test/autotest_test_funcs.py
+++ b/test/test/autotest_test_funcs.py
@@ -292,11 +292,4 @@ def ring_autotest(child, test_name):
     elif index == 2:
         return -1, "Fail [Timeout]"
 
-    child.sendline("set_watermark test 100")
-    child.sendline("dump_ring test")
-    index = child.expect(["  watermark=100",
-                          pexpect.TIMEOUT], timeout=1)
-    if index != 0:
-        return -1, "Fail [Bad watermark]"
-
     return 0, "Success"
diff --git a/test/test/commands.c b/test/test/commands.c
index 2df46b0..551c81d 100644
--- a/test/test/commands.c
+++ b/test/test/commands.c
@@ -228,57 +228,6 @@ cmdline_parse_inst_t cmd_dump_one = {
 
 /****************/
 
-struct cmd_set_ring_result {
-	cmdline_fixed_string_t set;
-	cmdline_fixed_string_t name;
-	uint32_t value;
-};
-
-static void cmd_set_ring_parsed(void *parsed_result, struct cmdline *cl,
-				__attribute__((unused)) void *data)
-{
-	struct cmd_set_ring_result *res = parsed_result;
-	struct rte_ring *r;
-	int ret;
-
-	r = rte_ring_lookup(res->name);
-	if (r == NULL) {
-		cmdline_printf(cl, "Cannot find ring\n");
-		return;
-	}
-
-	if (!strcmp(res->set, "set_watermark")) {
-		ret = rte_ring_set_water_mark(r, res->value);
-		if (ret != 0)
-			cmdline_printf(cl, "Cannot set water mark\n");
-	}
-}
-
-cmdline_parse_token_string_t cmd_set_ring_set =
-	TOKEN_STRING_INITIALIZER(struct cmd_set_ring_result, set,
-				 "set_watermark");
-
-cmdline_parse_token_string_t cmd_set_ring_name =
-	TOKEN_STRING_INITIALIZER(struct cmd_set_ring_result, name, NULL);
-
-cmdline_parse_token_num_t cmd_set_ring_value =
-	TOKEN_NUM_INITIALIZER(struct cmd_set_ring_result, value, UINT32);
-
-cmdline_parse_inst_t cmd_set_ring = {
-	.f = cmd_set_ring_parsed,  /* function to call */
-	.data = NULL,      /* 2nd arg of func */
-	.help_str = "set watermark: "
-			"set_watermark <ring_name> <value>",
-	.tokens = {        /* token list, NULL terminated */
-		(void *)&cmd_set_ring_set,
-		(void *)&cmd_set_ring_name,
-		(void *)&cmd_set_ring_value,
-		NULL,
-	},
-};
-
-/****************/
-
 struct cmd_quit_result {
 	cmdline_fixed_string_t quit;
 };
@@ -419,7 +368,6 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_autotest,
 	(cmdline_parse_inst_t *)&cmd_dump,
 	(cmdline_parse_inst_t *)&cmd_dump_one,
-	(cmdline_parse_inst_t *)&cmd_set_ring,
 	(cmdline_parse_inst_t *)&cmd_quit,
 	(cmdline_parse_inst_t *)&cmd_set_rxtx,
 	(cmdline_parse_inst_t *)&cmd_set_rxtx_anchor,
diff --git a/test/test/test_ring.c b/test/test/test_ring.c
index 3891f5d..666a451 100644
--- a/test/test/test_ring.c
+++ b/test/test/test_ring.c
@@ -78,21 +78,6 @@
  *      - Dequeue one object, two objects, MAX_BULK objects
  *      - Check that dequeued pointers are correct
  *
- *    - Test watermark and default bulk enqueue/dequeue:
- *
- *      - Set watermark
- *      - Set default bulk value
- *      - Enqueue objects, check that -EDQUOT is returned when
- *        watermark is exceeded
- *      - Check that dequeued pointers are correct
- *
- * #. Check live watermark change
- *
- *    - Start a loop on another lcore that will enqueue and dequeue
- *      objects in a ring. It will monitor the value of watermark.
- *    - At the same time, change the watermark on the master lcore.
- *    - The slave lcore will check that watermark changes from 16 to 32.
- *
  * #. Performance tests.
  *
  * Tests done in test_ring_perf.c
@@ -115,123 +100,6 @@ static struct rte_ring *r;
 
 #define	TEST_RING_FULL_EMTPY_ITER	8
 
-static int
-check_live_watermark_change(__attribute__((unused)) void *dummy)
-{
-	uint64_t hz = rte_get_timer_hz();
-	void *obj_table[MAX_BULK];
-	unsigned watermark, watermark_old = 16;
-	uint64_t cur_time, end_time;
-	int64_t diff = 0;
-	int i, ret;
-	unsigned count = 4;
-
-	/* init the object table */
-	memset(obj_table, 0, sizeof(obj_table));
-	end_time = rte_get_timer_cycles() + (hz / 4);
-
-	/* check that bulk and watermark are 4 and 32 (respectively) */
-	while (diff >= 0) {
-
-		/* add in ring until we reach watermark */
-		ret = 0;
-		for (i = 0; i < 16; i ++) {
-			if (ret != 0)
-				break;
-			ret = rte_ring_enqueue_bulk(r, obj_table, count);
-		}
-
-		if (ret != -EDQUOT) {
-			printf("Cannot enqueue objects, or watermark not "
-			       "reached (ret=%d)\n", ret);
-			return -1;
-		}
-
-		/* read watermark, the only change allowed is from 16 to 32 */
-		watermark = r->watermark;
-		if (watermark != watermark_old &&
-		    (watermark_old != 16 || watermark != 32)) {
-			printf("Bad watermark change %u -> %u\n", watermark_old,
-			       watermark);
-			return -1;
-		}
-		watermark_old = watermark;
-
-		/* dequeue objects from ring */
-		while (i--) {
-			ret = rte_ring_dequeue_bulk(r, obj_table, count);
-			if (ret != 0) {
-				printf("Cannot dequeue (ret=%d)\n", ret);
-				return -1;
-			}
-		}
-
-		cur_time = rte_get_timer_cycles();
-		diff = end_time - cur_time;
-	}
-
-	if (watermark_old != 32 ) {
-		printf(" watermark was not updated (wm=%u)\n",
-		       watermark_old);
-		return -1;
-	}
-
-	return 0;
-}
-
-static int
-test_live_watermark_change(void)
-{
-	unsigned lcore_id = rte_lcore_id();
-	unsigned lcore_id2 = rte_get_next_lcore(lcore_id, 0, 1);
-
-	printf("Test watermark live modification\n");
-	rte_ring_set_water_mark(r, 16);
-
-	/* launch a thread that will enqueue and dequeue, checking
-	 * watermark and quota */
-	rte_eal_remote_launch(check_live_watermark_change, NULL, lcore_id2);
-
-	rte_delay_ms(100);
-	rte_ring_set_water_mark(r, 32);
-	rte_delay_ms(100);
-
-	if (rte_eal_wait_lcore(lcore_id2) < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Test for catch on invalid watermark values */
-static int
-test_set_watermark( void ){
-	unsigned count;
-	int setwm;
-
-	struct rte_ring *r = rte_ring_lookup("test_ring_basic_ex");
-	if(r == NULL){
-		printf( " ring lookup failed\n" );
-		goto error;
-	}
-	count = r->size * 2;
-	setwm = rte_ring_set_water_mark(r, count);
-	if (setwm != -EINVAL){
-		printf("Test failed to detect invalid watermark count value\n");
-		goto error;
-	}
-
-	count = 0;
-	rte_ring_set_water_mark(r, count);
-	if (r->watermark != r->size) {
-		printf("Test failed to detect invalid watermark count value\n");
-		goto error;
-	}
-	return 0;
-
-error:
-	return -1;
-}
-
 /*
  * helper routine for test_ring_basic
  */
@@ -418,8 +286,7 @@ test_ring_basic(void)
 	cur_src = src;
 	cur_dst = dst;
 
-	printf("test watermark and default bulk enqueue / dequeue\n");
-	rte_ring_set_water_mark(r, 20);
+	printf("test default bulk enqueue / dequeue\n");
 	num_elems = 16;
 
 	cur_src = src;
@@ -433,8 +300,8 @@ test_ring_basic(void)
 	}
 	ret = rte_ring_enqueue_bulk(r, cur_src, num_elems);
 	cur_src += num_elems;
-	if (ret != -EDQUOT) {
-		printf("Watermark not exceeded\n");
+	if (ret != 0) {
+		printf("Cannot enqueue\n");
 		goto fail;
 	}
 	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
@@ -930,16 +797,6 @@ test_ring(void)
 		return -1;
 
 	/* basic operations */
-	if (test_live_watermark_change() < 0)
-		return -1;
-
-	if ( test_set_watermark() < 0){
-		printf ("Test failed to detect invalid parameter\n");
-		return -1;
-	}
-	else
-		printf ( "Test detected forced bad watermark values\n");
-
 	if ( test_create_count_odd() < 0){
 			printf ("Test failed to detect odd count\n");
 			return -1;
-- 
2.9.3

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v2 01/14] ring: remove split cacheline build setting
  @ 2017-03-07 11:32  4%   ` Bruce Richardson
  2017-03-07 11:32  3%   ` [dpdk-dev] [PATCH v2 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-07 11:32 UTC (permalink / raw)
  To: olivier.matz; +Cc: jerin.jacob, dev, Bruce Richardson

Users compiling DPDK should not need to know or care about the
arrangement of cachelines in the rte_ring structure. Therefore just
remove the build option and make the structures always split. On
platforms with 64B cachelines, use 128B rather than 64B alignment for
improved performance, since it prevents the producer and consumer data
from sitting on adjacent cachelines.
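
As a standalone illustration of the alignment rule (the demo structure
and names are hypothetical, not part of the patch):

    #include <stdint.h>
    #include <rte_common.h>    /* __rte_aligned */
    #include <rte_memory.h>    /* RTE_CACHE_LINE_SIZE */

    /* On targets with cachelines narrower than 128B, pad the producer
     * and consumer blocks to two cachelines so adjacent-line hardware
     * prefetchers cannot cause false sharing between them. */
    #if RTE_CACHE_LINE_SIZE < 128
    #define DEMO_ALIGN (RTE_CACHE_LINE_SIZE * 2)
    #else
    #define DEMO_ALIGN RTE_CACHE_LINE_SIZE
    #endif

    struct demo_ring {
            struct { volatile uint32_t head, tail; } prod
                    __rte_aligned(DEMO_ALIGN);
            struct { volatile uint32_t head, tail; } cons
                    __rte_aligned(DEMO_ALIGN);
    };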

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

---

V2: Limit the cacheline * 2 alignment to platforms with < 128B line size
---
 config/common_base                     |  1 -
 doc/guides/rel_notes/release_17_05.rst |  6 ++++++
 lib/librte_ring/rte_ring.c             |  2 --
 lib/librte_ring/rte_ring.h             | 16 ++++++++++------
 4 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/config/common_base b/config/common_base
index aeee13e..099ffda 100644
--- a/config/common_base
+++ b/config/common_base
@@ -448,7 +448,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
 #
 CONFIG_RTE_LIBRTE_RING=y
 CONFIG_RTE_LIBRTE_RING_DEBUG=n
-CONFIG_RTE_RING_SPLIT_PROD_CONS=n
 CONFIG_RTE_RING_PAUSE_REP_COUNT=0
 
 #
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index e25ea9f..ea45e0c 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -110,6 +110,12 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =========================================================
 
+* **Reworked rte_ring library**
+
+  The rte_ring library has been reworked and updated. The following changes
+  have been made to it:
+
+  * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
 
 ABI Changes
 -----------
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index ca0a108..4bc6da1 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -127,10 +127,8 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
 			  RTE_CACHE_LINE_MASK) != 0);
-#ifdef RTE_RING_SPLIT_PROD_CONS
 	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, cons) &
 			  RTE_CACHE_LINE_MASK) != 0);
-#endif
 	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
 			  RTE_CACHE_LINE_MASK) != 0);
 #ifdef RTE_LIBRTE_RING_DEBUG
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 72ccca5..399ae3b 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -139,6 +139,14 @@ struct rte_ring_debug_stats {
 
 struct rte_memzone; /* forward declaration, so as not to require memzone.h */
 
+#if RTE_CACHE_LINE_SIZE < 128
+#define PROD_ALIGN (RTE_CACHE_LINE_SIZE * 2)
+#define CONS_ALIGN (RTE_CACHE_LINE_SIZE * 2)
+#else
+#define PROD_ALIGN RTE_CACHE_LINE_SIZE
+#define CONS_ALIGN RTE_CACHE_LINE_SIZE
+#endif
+
 /**
  * An RTE ring structure.
  *
@@ -168,7 +176,7 @@ struct rte_ring {
 		uint32_t mask;           /**< Mask (size-1) of ring. */
 		volatile uint32_t head;  /**< Producer head. */
 		volatile uint32_t tail;  /**< Producer tail. */
-	} prod __rte_cache_aligned;
+	} prod __rte_aligned(PROD_ALIGN);
 
 	/** Ring consumer status. */
 	struct cons {
@@ -177,11 +185,7 @@ struct rte_ring {
 		uint32_t mask;           /**< Mask (size-1) of ring. */
 		volatile uint32_t head;  /**< Consumer head. */
 		volatile uint32_t tail;  /**< Consumer tail. */
-#ifdef RTE_RING_SPLIT_PROD_CONS
-	} cons __rte_cache_aligned;
-#else
-	} cons;
-#endif
+	} cons __rte_aligned(CONS_ALIGN);
 
 #ifdef RTE_LIBRTE_RING_DEBUG
 	struct rte_ring_debug_stats stats[RTE_MAX_LCORE];
-- 
2.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 05/14] ring: remove the yield when waiting for tail update
                       ` (2 preceding siblings ...)
  2017-03-07 11:32  2%   ` [dpdk-dev] [PATCH v2 04/14] ring: remove debug setting Bruce Richardson
@ 2017-03-07 11:32  4%   ` Bruce Richardson
  2017-03-07 11:32  2%   ` [dpdk-dev] [PATCH v2 06/14] ring: remove watermark support Bruce Richardson
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-07 11:32 UTC (permalink / raw)
  To: olivier.matz; +Cc: jerin.jacob, dev, Bruce Richardson

There was a compile-time setting to make a ring yield when it entered
a loop in mp or mc mode waiting for the tail pointer update. Build-time
settings are not recommended for enabling/disabling features, and since
this was off by default, remove it completely. If needed, a
runtime-enabled equivalent can be implemented by the application.
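
A sketch of such a runtime equivalent at the call site (the retry
budget is an assumption; this is not an exact replacement for the
in-library yield, but it addresses the same preemption concern):

    #include <sched.h>
    #include <rte_ring.h>

    /* Spin on an empty ring, yielding the CPU after a bounded number of
     * failed attempts so a preempted peer thread can make progress. */
    static void
    dequeue_yielding(struct rte_ring *r, void **objs, unsigned int n)
    {
            unsigned int rep = 0;

            while (rte_ring_dequeue_bulk(r, objs, n) != 0) {
                    if (++rep == 1024) {
                            rep = 0;
                            sched_yield();
                    }
            }
    }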

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 config/common_base                              |  1 -
 doc/guides/prog_guide/env_abstraction_layer.rst |  5 ----
 doc/guides/rel_notes/release_17_05.rst          |  1 +
 lib/librte_ring/rte_ring.h                      | 35 +++++--------------------
 4 files changed, 7 insertions(+), 35 deletions(-)

diff --git a/config/common_base b/config/common_base
index b3d8272..d5beadd 100644
--- a/config/common_base
+++ b/config/common_base
@@ -447,7 +447,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
-CONFIG_RTE_RING_PAUSE_REP_COUNT=0
 
 #
 # Compile librte_mempool
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 10a10a8..7c39cd2 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -352,11 +352,6 @@ Known Issues
 
   3. It MUST not be used by multi-producer/consumer pthreads, whose scheduling policies are SCHED_FIFO or SCHED_RR.
 
-  ``RTE_RING_PAUSE_REP_COUNT`` is defined for rte_ring to reduce contention. It's mainly for case 2, a yield is issued after number of times pause repeat.
-
-  It adds a sched_yield() syscall if the thread spins for too long while waiting on the other thread to finish its operations on the ring.
-  This gives the preempted thread a chance to proceed and finish with the ring enqueue/dequeue operation.
-
 + rte_timer
 
   Running  ``rte_timer_manager()`` on a non-EAL pthread is not allowed. However, resetting/stopping the timer from a non-EAL pthread is allowed.
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index e0ebd71..c69ca8f 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -117,6 +117,7 @@ API Changes
 
   * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
   * removed the build-time setting ``CONFIG_RTE_LIBRTE_RING_DEBUG``
+  * removed the build-time setting ``CONFIG_RTE_RING_PAUSE_REP_COUNT``
 
 ABI Changes
 -----------
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index af7b7d4..2177954 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -114,11 +114,6 @@ enum rte_ring_queue_behavior {
 #define RTE_RING_NAMESIZE (RTE_MEMZONE_NAMESIZE - \
 			   sizeof(RTE_RING_MZ_PREFIX) + 1)
 
-#ifndef RTE_RING_PAUSE_REP_COUNT
-#define RTE_RING_PAUSE_REP_COUNT 0 /**< Yield after pause num of times, no yield
-                                    *   if RTE_RING_PAUSE_REP not defined. */
-#endif
-
 struct rte_memzone; /* forward declaration, so as not to require memzone.h */
 
 #if RTE_CACHE_LINE_SIZE < 128
@@ -396,7 +391,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t cons_tail, free_entries;
 	const unsigned max = n;
 	int success;
-	unsigned i, rep = 0;
+	unsigned int i;
 	uint32_t mask = r->mask;
 	int ret;
 
@@ -450,18 +445,9 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	 * If there are other enqueues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->prod.tail != prod_head)) {
+	while (unlikely(r->prod.tail != prod_head))
 		rte_pause();
 
-		/* Set RTE_RING_PAUSE_REP_COUNT to avoid spin too long waiting
-		 * for other thread finish. It gives pre-empted thread a chance
-		 * to proceed and finish with ring dequeue operation. */
-		if (RTE_RING_PAUSE_REP_COUNT &&
-		    ++rep == RTE_RING_PAUSE_REP_COUNT) {
-			rep = 0;
-			sched_yield();
-		}
-	}
 	r->prod.tail = prod_next;
 	return ret;
 }
@@ -494,7 +480,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 {
 	uint32_t prod_head, cons_tail;
 	uint32_t prod_next, free_entries;
-	unsigned i;
+	unsigned int i;
 	uint32_t mask = r->mask;
 	int ret;
 
@@ -571,7 +557,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_next, entries;
 	const unsigned max = n;
 	int success;
-	unsigned i, rep = 0;
+	unsigned int i;
 	uint32_t mask = r->mask;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
@@ -616,18 +602,9 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	 * If there are other dequeues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->cons.tail != cons_head)) {
+	while (unlikely(r->cons.tail != cons_head))
 		rte_pause();
 
-		/* Set RTE_RING_PAUSE_REP_COUNT to avoid spin too long waiting
-		 * for other thread finish. It gives pre-empted thread a chance
-		 * to proceed and finish with ring dequeue operation. */
-		if (RTE_RING_PAUSE_REP_COUNT &&
-		    ++rep == RTE_RING_PAUSE_REP_COUNT) {
-			rep = 0;
-			sched_yield();
-		}
-	}
 	r->cons.tail = cons_next;
 
 	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
@@ -662,7 +639,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 {
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
-	unsigned i;
+	unsigned int i;
 	uint32_t mask = r->mask;
 
 	cons_head = r->cons.head;
-- 
2.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 04/14] ring: remove debug setting
    2017-03-07 11:32  4%   ` [dpdk-dev] [PATCH v2 01/14] ring: remove split cacheline build setting Bruce Richardson
  2017-03-07 11:32  3%   ` [dpdk-dev] [PATCH v2 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
@ 2017-03-07 11:32  2%   ` Bruce Richardson
  2017-03-07 11:32  4%   ` [dpdk-dev] [PATCH v2 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-07 11:32 UTC (permalink / raw)
  To: olivier.matz; +Cc: jerin.jacob, dev, Bruce Richardson

The debug option only provided statistics to the user, most of
which could be tracked by the application itself. Remove it, both as
a compile-time option and as a feature, simplifying the code.
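
For applications that relied on these counters, a per-lcore equivalent
is straightforward to keep in the application itself; a minimal sketch
(the structure, wrapper and counter set are illustrative assumptions):

    #include <stdint.h>
    #include <rte_ring.h>
    #include <rte_lcore.h>
    #include <rte_memory.h>

    struct app_ring_stats {
            uint64_t enq_success_bulk;
            uint64_t enq_success_objs;
            uint64_t enq_fail_bulk;
    } __rte_cache_aligned;

    static struct app_ring_stats app_stats[RTE_MAX_LCORE];

    /* Wrapper counting outcomes per lcore; at this point in the series
     * the bulk enqueue still returns 0 on success. */
    static int
    app_enqueue_bulk(struct rte_ring *r, void * const *objs,
                    unsigned int n)
    {
            int ret = rte_ring_enqueue_bulk(r, objs, n);
            struct app_ring_stats *s = &app_stats[rte_lcore_id()];

            if (ret == 0) {
                    s->enq_success_bulk++;
                    s->enq_success_objs += n;
            } else {
                    s->enq_fail_bulk++;
            }
            return ret;
    }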

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 config/common_base                     |   1 -
 doc/guides/prog_guide/ring_lib.rst     |   7 -
 doc/guides/rel_notes/release_17_05.rst |   1 +
 lib/librte_ring/rte_ring.c             |  41 ----
 lib/librte_ring/rte_ring.h             |  97 +-------
 test/test/test_ring.c                  | 410 ---------------------------------
 6 files changed, 13 insertions(+), 544 deletions(-)

diff --git a/config/common_base b/config/common_base
index 099ffda..b3d8272 100644
--- a/config/common_base
+++ b/config/common_base
@@ -447,7 +447,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
-CONFIG_RTE_LIBRTE_RING_DEBUG=n
 CONFIG_RTE_RING_PAUSE_REP_COUNT=0
 
 #
diff --git a/doc/guides/prog_guide/ring_lib.rst b/doc/guides/prog_guide/ring_lib.rst
index 9f69753..d4ab502 100644
--- a/doc/guides/prog_guide/ring_lib.rst
+++ b/doc/guides/prog_guide/ring_lib.rst
@@ -110,13 +110,6 @@ Once an enqueue operation reaches the high water mark, the producer is notified,
 
 This mechanism can be used, for example, to exert a back pressure on I/O to inform the LAN to PAUSE.
 
-Debug
-~~~~~
-
-When debug is enabled (CONFIG_RTE_LIBRTE_RING_DEBUG is set),
-the library stores some per-ring statistic counters about the number of enqueues/dequeues.
-These statistics are per-core to avoid concurrent accesses or atomic operations.
-
 Use Cases
 ---------
 
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index ea45e0c..e0ebd71 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -116,6 +116,7 @@ API Changes
   have been made to it:
 
   * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
+  * removed the build-time setting ``CONFIG_RTE_LIBRTE_RING_DEBUG``
 
 ABI Changes
 -----------
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index 80fc356..90ee63f 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -131,12 +131,6 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 			  RTE_CACHE_LINE_MASK) != 0);
 	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
 			  RTE_CACHE_LINE_MASK) != 0);
-#ifdef RTE_LIBRTE_RING_DEBUG
-	RTE_BUILD_BUG_ON((sizeof(struct rte_ring_debug_stats) &
-			  RTE_CACHE_LINE_MASK) != 0);
-	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, stats) &
-			  RTE_CACHE_LINE_MASK) != 0);
-#endif
 
 	/* init the ring structure */
 	memset(r, 0, sizeof(*r));
@@ -284,11 +278,6 @@ rte_ring_set_water_mark(struct rte_ring *r, unsigned count)
 void
 rte_ring_dump(FILE *f, const struct rte_ring *r)
 {
-#ifdef RTE_LIBRTE_RING_DEBUG
-	struct rte_ring_debug_stats sum;
-	unsigned lcore_id;
-#endif
-
 	fprintf(f, "ring <%s>@%p\n", r->name, r);
 	fprintf(f, "  flags=%x\n", r->flags);
 	fprintf(f, "  size=%"PRIu32"\n", r->size);
@@ -302,36 +291,6 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 		fprintf(f, "  watermark=0\n");
 	else
 		fprintf(f, "  watermark=%"PRIu32"\n", r->watermark);
-
-	/* sum and dump statistics */
-#ifdef RTE_LIBRTE_RING_DEBUG
-	memset(&sum, 0, sizeof(sum));
-	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
-		sum.enq_success_bulk += r->stats[lcore_id].enq_success_bulk;
-		sum.enq_success_objs += r->stats[lcore_id].enq_success_objs;
-		sum.enq_quota_bulk += r->stats[lcore_id].enq_quota_bulk;
-		sum.enq_quota_objs += r->stats[lcore_id].enq_quota_objs;
-		sum.enq_fail_bulk += r->stats[lcore_id].enq_fail_bulk;
-		sum.enq_fail_objs += r->stats[lcore_id].enq_fail_objs;
-		sum.deq_success_bulk += r->stats[lcore_id].deq_success_bulk;
-		sum.deq_success_objs += r->stats[lcore_id].deq_success_objs;
-		sum.deq_fail_bulk += r->stats[lcore_id].deq_fail_bulk;
-		sum.deq_fail_objs += r->stats[lcore_id].deq_fail_objs;
-	}
-	fprintf(f, "  size=%"PRIu32"\n", r->size);
-	fprintf(f, "  enq_success_bulk=%"PRIu64"\n", sum.enq_success_bulk);
-	fprintf(f, "  enq_success_objs=%"PRIu64"\n", sum.enq_success_objs);
-	fprintf(f, "  enq_quota_bulk=%"PRIu64"\n", sum.enq_quota_bulk);
-	fprintf(f, "  enq_quota_objs=%"PRIu64"\n", sum.enq_quota_objs);
-	fprintf(f, "  enq_fail_bulk=%"PRIu64"\n", sum.enq_fail_bulk);
-	fprintf(f, "  enq_fail_objs=%"PRIu64"\n", sum.enq_fail_objs);
-	fprintf(f, "  deq_success_bulk=%"PRIu64"\n", sum.deq_success_bulk);
-	fprintf(f, "  deq_success_objs=%"PRIu64"\n", sum.deq_success_objs);
-	fprintf(f, "  deq_fail_bulk=%"PRIu64"\n", sum.deq_fail_bulk);
-	fprintf(f, "  deq_fail_objs=%"PRIu64"\n", sum.deq_fail_objs);
-#else
-	fprintf(f, "  no statistics available\n");
-#endif
 }
 
 /* dump the status of all rings on the console */
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 61c0982..af7b7d4 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -109,24 +109,6 @@ enum rte_ring_queue_behavior {
 	RTE_RING_QUEUE_VARIABLE   /* Enq/Deq as many items as possible from ring */
 };
 
-#ifdef RTE_LIBRTE_RING_DEBUG
-/**
- * A structure that stores the ring statistics (per-lcore).
- */
-struct rte_ring_debug_stats {
-	uint64_t enq_success_bulk; /**< Successful enqueues number. */
-	uint64_t enq_success_objs; /**< Objects successfully enqueued. */
-	uint64_t enq_quota_bulk;   /**< Successful enqueues above watermark. */
-	uint64_t enq_quota_objs;   /**< Objects enqueued above watermark. */
-	uint64_t enq_fail_bulk;    /**< Failed enqueues number. */
-	uint64_t enq_fail_objs;    /**< Objects that failed to be enqueued. */
-	uint64_t deq_success_bulk; /**< Successful dequeues number. */
-	uint64_t deq_success_objs; /**< Objects successfully dequeued. */
-	uint64_t deq_fail_bulk;    /**< Failed dequeues number. */
-	uint64_t deq_fail_objs;    /**< Objects that failed to be dequeued. */
-} __rte_cache_aligned;
-#endif
-
 #define RTE_RING_MZ_PREFIX "RG_"
 /**< The maximum length of a ring name. */
 #define RTE_RING_NAMESIZE (RTE_MEMZONE_NAMESIZE - \
@@ -187,10 +169,6 @@ struct rte_ring {
 	/** Ring consumer status. */
 	struct rte_ring_headtail cons __rte_aligned(CONS_ALIGN);
 
-#ifdef RTE_LIBRTE_RING_DEBUG
-	struct rte_ring_debug_stats stats[RTE_MAX_LCORE];
-#endif
-
 	void *ring[] __rte_cache_aligned;   /**< Memory space of ring starts here.
 	                                     * not volatile so need to be careful
 	                                     * about compiler re-ordering */
@@ -202,27 +180,6 @@ struct rte_ring {
 #define RTE_RING_SZ_MASK  (unsigned)(0x0fffffff) /**< Ring size mask */
 
 /**
- * @internal When debug is enabled, store ring statistics.
- * @param r
- *   A pointer to the ring.
- * @param name
- *   The name of the statistics field to increment in the ring.
- * @param n
- *   The number to add to the object-oriented statistics.
- */
-#ifdef RTE_LIBRTE_RING_DEBUG
-#define __RING_STAT_ADD(r, name, n) do {                        \
-		unsigned __lcore_id = rte_lcore_id();           \
-		if (__lcore_id < RTE_MAX_LCORE) {               \
-			r->stats[__lcore_id].name##_objs += n;  \
-			r->stats[__lcore_id].name##_bulk += 1;  \
-		}                                               \
-	} while(0)
-#else
-#define __RING_STAT_ADD(r, name, n) do {} while(0)
-#endif
-
-/**
  * Calculate the memory size needed for a ring
  *
  * This function returns the number of bytes needed for a ring, given
@@ -463,17 +420,12 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 
 		/* check that we have enough room in ring */
 		if (unlikely(n > free_entries)) {
-			if (behavior == RTE_RING_QUEUE_FIXED) {
-				__RING_STAT_ADD(r, enq_fail, n);
+			if (behavior == RTE_RING_QUEUE_FIXED)
 				return -ENOBUFS;
-			}
 			else {
 				/* No free entry available */
-				if (unlikely(free_entries == 0)) {
-					__RING_STAT_ADD(r, enq_fail, n);
+				if (unlikely(free_entries == 0))
 					return 0;
-				}
-
 				n = free_entries;
 			}
 		}
@@ -488,15 +440,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 				(int)(n | RTE_RING_QUOT_EXCEED);
-		__RING_STAT_ADD(r, enq_quota, n);
-	}
-	else {
+	else
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-		__RING_STAT_ADD(r, enq_success, n);
-	}
 
 	/*
 	 * If there are other enqueues in progress that preceded us,
@@ -560,17 +508,12 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 
 	/* check that we have enough room in ring */
 	if (unlikely(n > free_entries)) {
-		if (behavior == RTE_RING_QUEUE_FIXED) {
-			__RING_STAT_ADD(r, enq_fail, n);
+		if (behavior == RTE_RING_QUEUE_FIXED)
 			return -ENOBUFS;
-		}
 		else {
 			/* No free entry available */
-			if (unlikely(free_entries == 0)) {
-				__RING_STAT_ADD(r, enq_fail, n);
+			if (unlikely(free_entries == 0))
 				return 0;
-			}
-
 			n = free_entries;
 		}
 	}
@@ -583,15 +526,11 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 			(int)(n | RTE_RING_QUOT_EXCEED);
-		__RING_STAT_ADD(r, enq_quota, n);
-	}
-	else {
+	else
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-		__RING_STAT_ADD(r, enq_success, n);
-	}
 
 	r->prod.tail = prod_next;
 	return ret;
@@ -655,16 +594,11 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 
 		/* Set the actual entries for dequeue */
 		if (n > entries) {
-			if (behavior == RTE_RING_QUEUE_FIXED) {
-				__RING_STAT_ADD(r, deq_fail, n);
+			if (behavior == RTE_RING_QUEUE_FIXED)
 				return -ENOENT;
-			}
 			else {
-				if (unlikely(entries == 0)){
-					__RING_STAT_ADD(r, deq_fail, n);
+				if (unlikely(entries == 0))
 					return 0;
-				}
-
 				n = entries;
 			}
 		}
@@ -694,7 +628,6 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 			sched_yield();
 		}
 	}
-	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;
 
 	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
@@ -741,16 +674,11 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	entries = prod_tail - cons_head;
 
 	if (n > entries) {
-		if (behavior == RTE_RING_QUEUE_FIXED) {
-			__RING_STAT_ADD(r, deq_fail, n);
+		if (behavior == RTE_RING_QUEUE_FIXED)
 			return -ENOENT;
-		}
 		else {
-			if (unlikely(entries == 0)){
-				__RING_STAT_ADD(r, deq_fail, n);
+			if (unlikely(entries == 0))
 				return 0;
-			}
-
 			n = entries;
 		}
 	}
@@ -762,7 +690,6 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	DEQUEUE_PTRS();
 	rte_smp_rmb();
 
-	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;
 	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
 }
diff --git a/test/test/test_ring.c b/test/test/test_ring.c
index 5f09097..3891f5d 100644
--- a/test/test/test_ring.c
+++ b/test/test/test_ring.c
@@ -763,412 +763,6 @@ test_ring_burst_basic(void)
 	return -1;
 }
 
-static int
-test_ring_stats(void)
-{
-
-#ifndef RTE_LIBRTE_RING_DEBUG
-	printf("Enable RTE_LIBRTE_RING_DEBUG to test ring stats.\n");
-	return 0;
-#else
-	void **src = NULL, **cur_src = NULL, **dst = NULL, **cur_dst = NULL;
-	int ret;
-	unsigned i;
-	unsigned num_items            = 0;
-	unsigned failed_enqueue_ops   = 0;
-	unsigned failed_enqueue_items = 0;
-	unsigned failed_dequeue_ops   = 0;
-	unsigned failed_dequeue_items = 0;
-	unsigned last_enqueue_ops     = 0;
-	unsigned last_enqueue_items   = 0;
-	unsigned last_quota_ops       = 0;
-	unsigned last_quota_items     = 0;
-	unsigned lcore_id = rte_lcore_id();
-	struct rte_ring_debug_stats *ring_stats = &r->stats[lcore_id];
-
-	printf("Test the ring stats.\n");
-
-	/* Reset the watermark in case it was set in another test. */
-	rte_ring_set_water_mark(r, 0);
-
-	/* Reset the ring stats. */
-	memset(&r->stats[lcore_id], 0, sizeof(r->stats[lcore_id]));
-
-	/* Allocate some dummy object pointers. */
-	src = malloc(RING_SIZE*2*sizeof(void *));
-	if (src == NULL)
-		goto fail;
-
-	for (i = 0; i < RING_SIZE*2 ; i++) {
-		src[i] = (void *)(unsigned long)i;
-	}
-
-	/* Allocate some memory for copied objects. */
-	dst = malloc(RING_SIZE*2*sizeof(void *));
-	if (dst == NULL)
-		goto fail;
-
-	memset(dst, 0, RING_SIZE*2*sizeof(void *));
-
-	/* Set the head and tail pointers. */
-	cur_src = src;
-	cur_dst = dst;
-
-	/* Do Enqueue tests. */
-	printf("Test the dequeue stats.\n");
-
-	/* Fill the ring up to RING_SIZE -1. */
-	printf("Fill the ring.\n");
-	for (i = 0; i< (RING_SIZE/MAX_BULK); i++) {
-		rte_ring_sp_enqueue_burst(r, cur_src, MAX_BULK);
-		cur_src += MAX_BULK;
-	}
-
-	/* Adjust for final enqueue = MAX_BULK -1. */
-	cur_src--;
-
-	printf("Verify that the ring is full.\n");
-	if (rte_ring_full(r) != 1)
-		goto fail;
-
-
-	printf("Verify the enqueue success stats.\n");
-	/* Stats should match above enqueue operations to fill the ring. */
-	if (ring_stats->enq_success_bulk != (RING_SIZE/MAX_BULK))
-		goto fail;
-
-	/* Current max objects is RING_SIZE -1. */
-	if (ring_stats->enq_success_objs != RING_SIZE -1)
-		goto fail;
-
-	/* Shouldn't have any failures yet. */
-	if (ring_stats->enq_fail_bulk != 0)
-		goto fail;
-	if (ring_stats->enq_fail_objs != 0)
-		goto fail;
-
-
-	printf("Test stats for SP burst enqueue to a full ring.\n");
-	num_items = 2;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	printf("Test stats for SP bulk enqueue to a full ring.\n");
-	num_items = 4;
-	ret = rte_ring_sp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -ENOBUFS)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	printf("Test stats for MP burst enqueue to a full ring.\n");
-	num_items = 8;
-	ret = rte_ring_mp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	printf("Test stats for MP bulk enqueue to a full ring.\n");
-	num_items = 16;
-	ret = rte_ring_mp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -ENOBUFS)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	/* Do Dequeue tests. */
-	printf("Test the dequeue stats.\n");
-
-	printf("Empty the ring.\n");
-	for (i = 0; i<RING_SIZE/MAX_BULK; i++) {
-		rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
-		cur_dst += MAX_BULK;
-	}
-
-	/* There was only RING_SIZE -1 objects to dequeue. */
-	cur_dst++;
-
-	printf("Verify ring is empty.\n");
-	if (1 != rte_ring_empty(r))
-		goto fail;
-
-	printf("Verify the dequeue success stats.\n");
-	/* Stats should match above dequeue operations. */
-	if (ring_stats->deq_success_bulk != (RING_SIZE/MAX_BULK))
-		goto fail;
-
-	/* Objects dequeued is RING_SIZE -1. */
-	if (ring_stats->deq_success_objs != RING_SIZE -1)
-		goto fail;
-
-	/* Shouldn't have any dequeue failure stats yet. */
-	if (ring_stats->deq_fail_bulk != 0)
-		goto fail;
-
-	printf("Test stats for SC burst dequeue with an empty ring.\n");
-	num_items = 2;
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test stats for SC bulk dequeue with an empty ring.\n");
-	num_items = 4;
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, num_items);
-	if (ret != -ENOENT)
-		goto fail;
-
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test stats for MC burst dequeue with an empty ring.\n");
-	num_items = 8;
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test stats for MC bulk dequeue with an empty ring.\n");
-	num_items = 16;
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, num_items);
-	if (ret != -ENOENT)
-		goto fail;
-
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test total enqueue/dequeue stats.\n");
-	/* At this point the enqueue and dequeue stats should be the same. */
-	if (ring_stats->enq_success_bulk != ring_stats->deq_success_bulk)
-		goto fail;
-	if (ring_stats->enq_success_objs != ring_stats->deq_success_objs)
-		goto fail;
-	if (ring_stats->enq_fail_bulk    != ring_stats->deq_fail_bulk)
-		goto fail;
-	if (ring_stats->enq_fail_objs    != ring_stats->deq_fail_objs)
-		goto fail;
-
-
-	/* Watermark Tests. */
-	printf("Test the watermark/quota stats.\n");
-
-	printf("Verify the initial watermark stats.\n");
-	/* Watermark stats should be 0 since there is no watermark. */
-	if (ring_stats->enq_quota_bulk != 0)
-		goto fail;
-	if (ring_stats->enq_quota_objs != 0)
-		goto fail;
-
-	/* Set a watermark. */
-	rte_ring_set_water_mark(r, 16);
-
-	/* Reset pointers. */
-	cur_src = src;
-	cur_dst = dst;
-
-	last_enqueue_ops   = ring_stats->enq_success_bulk;
-	last_enqueue_items = ring_stats->enq_success_objs;
-
-
-	printf("Test stats for SP burst enqueue below watermark.\n");
-	num_items = 8;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should still be 0. */
-	if (ring_stats->enq_quota_bulk != 0)
-		goto fail;
-	if (ring_stats->enq_quota_objs != 0)
-		goto fail;
-
-	/* Success stats should have increased. */
-	if (ring_stats->enq_success_bulk != last_enqueue_ops + 1)
-		goto fail;
-	if (ring_stats->enq_success_objs != last_enqueue_items + num_items)
-		goto fail;
-
-	last_enqueue_ops   = ring_stats->enq_success_bulk;
-	last_enqueue_items = ring_stats->enq_success_objs;
-
-
-	printf("Test stats for SP burst enqueue at watermark.\n");
-	num_items = 8;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != 1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for SP burst enqueue above watermark.\n");
-	num_items = 1;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for MP burst enqueue above watermark.\n");
-	num_items = 2;
-	ret = rte_ring_mp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for SP bulk enqueue above watermark.\n");
-	num_items = 4;
-	ret = rte_ring_sp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -EDQUOT)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for MP bulk enqueue above watermark.\n");
-	num_items = 8;
-	ret = rte_ring_mp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -EDQUOT)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	printf("Test watermark success stats.\n");
-	/* Success stats should be same as last non-watermarked enqueue. */
-	if (ring_stats->enq_success_bulk != last_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_success_objs != last_enqueue_items)
-		goto fail;
-
-
-	/* Cleanup. */
-
-	/* Empty the ring. */
-	for (i = 0; i<RING_SIZE/MAX_BULK; i++) {
-		rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
-		cur_dst += MAX_BULK;
-	}
-
-	/* Reset the watermark. */
-	rte_ring_set_water_mark(r, 0);
-
-	/* Reset the ring stats. */
-	memset(&r->stats[lcore_id], 0, sizeof(r->stats[lcore_id]));
-
-	/* Free memory before test completed */
-	free(src);
-	free(dst);
-	return 0;
-
-fail:
-	free(src);
-	free(dst);
-	return -1;
-#endif
-}
-
 /*
  * it will always fail to create ring with a wrong ring size number in this function
  */
@@ -1335,10 +929,6 @@ test_ring(void)
 	if (test_ring_basic() < 0)
 		return -1;
 
-	/* ring stats */
-	if (test_ring_stats() < 0)
-		return -1;
-
 	/* basic operations */
 	if (test_live_watermark_change() < 0)
 		return -1;
-- 
2.9.3

^ permalink raw reply	[relevance 2%]

* Re: [dpdk-dev] [PATCH v3 1/2] ethdev: add capability control API
  @ 2017-03-06 20:41  3%       ` Wiles, Keith
  0 siblings, 0 replies; 200+ results
From: Wiles, Keith @ 2017-03-06 20:41 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Dumitrescu, Cristian, DPDK, jerin.jacob,
	balasubramanian.manoharan, hemant.agrawal, shreyansh.jain,
	Richardson, Bruce


> On Mar 6, 2017, at 2:21 PM, Thomas Monjalon <thomas.monjalon@6wind.com> wrote:
> 
>> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
>>> 2017-03-06 16:35, Dumitrescu, Cristian:
>>>>>> +int rte_eth_dev_capability_ops_get(uint8_t port_id,
>>>>>> +	enum rte_eth_capability cap, void *arg);
>>>>> 
>>>>> What is the benefit of getting different kind of capabilities with
>>>>> the same function?
>>>>> enum + void* = ioctl
>>>>> A self-explanatory API should have a dedicated function for each kind
>>>>> of features with different argument types.
>>>> 
>>>> The advantage is providing a standard interface to query the capabilities of
>>> the device rather than having each capability provide its own mechanism in a
>>> slightly different way.
>>>> 
>>>> IMO this mechanism is of great help to guide the developers of future
>>> ethdev features on the clean path to add new features in a modular way,
>>> extending the ethdev functionality while doing so in a separate name space
>>> and file (that's why I tend to call this a plugin-like mechanism), as opposed to
>>> the current monolithic approach for ethdev, where we have 100+ API
>>> functions in a single name space that are split into functional groups just
>>> by blank lines in the header file. It is simply the generalization of the
>>> mechanism introduced by rte_flow in release 17.02 (so all the credit should
>>> go to Adrien and not me).
>>>> 
>>>> IMO, having a standard function as above is cleaner than having a separate
>>> and slightly different function per feature. People can quickly see the set of
>>> standard ethdev capabilities and which ones are supported by a specific
>>> device. Between A) and B) below, I definitely prefer A):
>>>> A) status = rte_eth_dev_capability_ops_get(port_id,
>>> RTE_ETH_CABABILITY_TM, &tm_ops);
>>>> B) status = rte_eth_dev_tm_ops_get(port_id, &tm_ops);
>>> 
>>> I prefer B because instead of tm_ops, you can use some specific tm
>>> arguments,
>>> show their types and properly document each parameter.
>> 
>> Note that rte_flow already returns the flow ops as a void * with no strong argument type checking (approach A from above). Are you saying this is wrong?
>> 
>> 	rte_eth_dev_filter_ctrl(port_id, RTE_ETH_FILTER_GENERIC, RTE_ETH_FILTER_GET, void *eth_flow_ops);
>> 
>> Personally, I am in favour of allowing the standard interface at the expense of strong build-time type checking, especially since this API function is between ethdev and the drivers, as opposed to between app and ethdev.
> 
> rte_eth_dev_filter_ctrl is going to be specialized for rte_flow operations.
> I agree with you on having independent API blocks in ethdev like rte_flow.
> But this function rte_eth_dev_capability_ops_get that you propose would be
> cross-blocks. I don't see the benefit.
> I especially don't think there is a sense in the enum
> 	enum rte_eth_capability {
> 		RTE_ETH_CAPABILITY_FLOW = 0, /**< Flow */
> 		RTE_ETH_CAPABILITY_TM, /**< Traffic Manager */
> 		RTE_ETH_CAPABILITY_MAX
> 	}
> 
> I won't debate more on this. We have to read opinions of other reviewers.

The benefit is providing a generic API that we do not need to alter in the future (altering it would cause ABI breakage). The PMD can add a capability to the list if it is not already present and then provide an API structure for the feature.

Being able to add features without having to change DPDK may be a strong feature for companies that have special needs for their applications. They just need to add an rte_eth_capability enum value in a range that they want to control (which does not mean they need to change the above structure), and they can provide private features to the application, especially features that are very specific to some hardware. I do not like private features, but I also do not want to stick just any old API in DPDK for any given special feature.

Today the structure is just APIs, but it could also provide some special or specific information to the application in that structure or via an API call.
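
To make the contrast concrete, here is a rough sketch of the two styles under discussion; the signatures are illustrative only, not committed ethdev API:

#include <stdint.h>

enum rte_eth_capability {
	RTE_ETH_CAPABILITY_FLOW = 0, /**< Flow */
	RTE_ETH_CAPABILITY_TM,       /**< Traffic Manager */
	RTE_ETH_CAPABILITY_MAX
};

/* A) one generic entry point; the ops argument is weakly typed */
int rte_eth_dev_capability_ops_get(uint8_t port_id,
		enum rte_eth_capability cap, void *ops);

/* B) one entry point per feature; the ops argument is strongly typed */
struct rte_eth_tm_ops; /* hypothetical per-feature ops table */
int rte_eth_dev_tm_ops_get(uint8_t port_id,
		const struct rte_eth_tm_ops **ops);

With A), a new capability only adds an enum value and an ops structure; with B), each new feature adds a new exported function to the library.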

Regards,
Keith

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v9 09/18] lib: add symbol versioning to distributor
  2017-03-06  9:10  2%         ` [dpdk-dev] [PATCH v9 00/18] distributor lib performance enhancements David Hunt
  2017-03-06  9:10  1%           ` [dpdk-dev] [PATCH v9 01/18] lib: rename legacy distributor lib files David Hunt
@ 2017-03-06  9:10  2%           ` David Hunt
  2017-03-10 16:22  0%             ` Bruce Richardson
  1 sibling, 1 reply; 200+ results
From: David Hunt @ 2017-03-06  9:10 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, David Hunt

Also bumped up the ABI version number in the Makefile
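
For reference, the versioning pattern applied throughout this patch is
roughly the following; rte_foo is an illustrative name, not a real
distributor symbol:

#include <rte_compat.h>

/* old behaviour, still bound to binaries linked against DPDK 2.0 */
int
rte_foo_v20(int x)
{
	return x;
}
VERSION_SYMBOL(rte_foo, _v20, 2.0);

/* new behaviour, the default for newly linked binaries */
int
rte_foo_v1705(int x)
{
	return 2 * x;
}
BIND_DEFAULT_SYMBOL(rte_foo, _v1705, 17.05);
MAP_STATIC_SYMBOL(int rte_foo(int x), rte_foo_v1705);

Each versioned symbol also needs an entry in the library's version.map
file so that the linker exposes both versions.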

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/Makefile                    |  2 +-
 lib/librte_distributor/rte_distributor.c           | 57 +++++++++++---
 lib/librte_distributor/rte_distributor_v1705.h     | 89 ++++++++++++++++++++++
 lib/librte_distributor/rte_distributor_v20.c       | 10 +++
 lib/librte_distributor/rte_distributor_version.map | 14 ++++
 5 files changed, 162 insertions(+), 10 deletions(-)
 create mode 100644 lib/librte_distributor/rte_distributor_v1705.h

diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
index 2b28eff..2f05cf3 100644
--- a/lib/librte_distributor/Makefile
+++ b/lib/librte_distributor/Makefile
@@ -39,7 +39,7 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
 
 EXPORT_MAP := rte_distributor_version.map
 
-LIBABIVER := 1
+LIBABIVER := 2
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor_v20.c
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 6e1debf..c4128a0 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -36,6 +36,7 @@
 #include <rte_mbuf.h>
 #include <rte_memory.h>
 #include <rte_cycles.h>
+#include <rte_compat.h>
 #include <rte_memzone.h>
 #include <rte_errno.h>
 #include <rte_string_fns.h>
@@ -44,6 +45,7 @@
 #include "rte_distributor_private.h"
 #include "rte_distributor.h"
 #include "rte_distributor_v20.h"
+#include "rte_distributor_v1705.h"
 
 TAILQ_HEAD(rte_dist_burst_list, rte_distributor);
 
@@ -57,7 +59,7 @@ EAL_REGISTER_TAILQ(rte_dist_burst_tailq)
 /**** Burst Packet APIs called by workers ****/
 
 void
-rte_distributor_request_pkt(struct rte_distributor *d,
+rte_distributor_request_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **oldpkt,
 		unsigned int count)
 {
@@ -102,9 +104,14 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 	 */
 	*retptr64 |= RTE_DISTRIB_GET_BUF;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_request_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(void rte_distributor_request_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **oldpkt,
+		unsigned int count),
+		rte_distributor_request_pkt_v1705);
 
 int
-rte_distributor_poll_pkt(struct rte_distributor *d,
+rte_distributor_poll_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **pkts)
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
@@ -138,9 +145,13 @@ rte_distributor_poll_pkt(struct rte_distributor *d,
 
 	return count;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_poll_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_poll_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **pkts),
+		rte_distributor_poll_pkt_v1705);
 
 int
-rte_distributor_get_pkt(struct rte_distributor *d,
+rte_distributor_get_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **pkts,
 		struct rte_mbuf **oldpkt, unsigned int return_count)
 {
@@ -168,9 +179,14 @@ rte_distributor_get_pkt(struct rte_distributor *d,
 	}
 	return count;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_get_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_get_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **pkts,
+		struct rte_mbuf **oldpkt, unsigned int return_count),
+		rte_distributor_get_pkt_v1705);
 
 int
-rte_distributor_return_pkt(struct rte_distributor *d,
+rte_distributor_return_pkt_v1705(struct rte_distributor *d,
 		unsigned int worker_id, struct rte_mbuf **oldpkt, int num)
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
@@ -197,6 +213,10 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 
 	return 0;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_return_pkt, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_return_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **oldpkt, int num),
+		rte_distributor_return_pkt_v1705);
 
 /**** APIs called on distributor core ***/
 
@@ -342,7 +362,7 @@ release(struct rte_distributor *d, unsigned int wkr)
 
 /* process a set of packets to distribute them to workers */
 int
-rte_distributor_process(struct rte_distributor *d,
+rte_distributor_process_v1705(struct rte_distributor *d,
 		struct rte_mbuf **mbufs, unsigned int num_mbufs)
 {
 	unsigned int next_idx = 0;
@@ -476,10 +496,14 @@ rte_distributor_process(struct rte_distributor *d,
 
 	return num_mbufs;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_process, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_process(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int num_mbufs),
+		rte_distributor_process_v1705);
 
 /* return to the caller, packets returned from workers */
 int
-rte_distributor_returned_pkts(struct rte_distributor *d,
+rte_distributor_returned_pkts_v1705(struct rte_distributor *d,
 		struct rte_mbuf **mbufs, unsigned int max_mbufs)
 {
 	struct rte_distributor_returned_pkts *returns = &d->returns;
@@ -504,6 +528,10 @@ rte_distributor_returned_pkts(struct rte_distributor *d,
 
 	return retval;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_returned_pkts, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_returned_pkts(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int max_mbufs),
+		rte_distributor_returned_pkts_v1705);
 
 /*
  * Return the number of packets in-flight in a distributor, i.e. packets
@@ -525,7 +553,7 @@ total_outstanding(const struct rte_distributor *d)
  * queued up.
  */
 int
-rte_distributor_flush(struct rte_distributor *d)
+rte_distributor_flush_v1705(struct rte_distributor *d)
 {
 	const unsigned int flushed = total_outstanding(d);
 	unsigned int wkr;
@@ -549,10 +577,13 @@ rte_distributor_flush(struct rte_distributor *d)
 
 	return flushed;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_flush, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_distributor_flush(struct rte_distributor *d),
+		rte_distributor_flush_v1705);
 
 /* clears the internal returns array in the distributor */
 void
-rte_distributor_clear_returns(struct rte_distributor *d)
+rte_distributor_clear_returns_v1705(struct rte_distributor *d)
 {
 	unsigned int wkr;
 
@@ -565,10 +596,13 @@ rte_distributor_clear_returns(struct rte_distributor *d)
 	for (wkr = 0; wkr < d->num_workers; wkr++)
 		d->bufs[wkr].retptr64[0] = 0;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_clear_returns, _v1705, 17.05);
+MAP_STATIC_SYMBOL(void rte_distributor_clear_returns(struct rte_distributor *d),
+		rte_distributor_clear_returns_v1705);
 
 /* creates a distributor instance */
 struct rte_distributor *
-rte_distributor_create(const char *name,
+rte_distributor_create_v1705(const char *name,
 		unsigned int socket_id,
 		unsigned int num_workers,
 		unsigned int alg_type)
@@ -638,3 +672,8 @@ rte_distributor_create(const char *name,
 
 	return d;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_create, _v1705, 17.05);
+MAP_STATIC_SYMBOL(struct rte_distributor *rte_distributor_create(
+		const char *name, unsigned int socket_id,
+		unsigned int num_workers, unsigned int alg_type),
+		rte_distributor_create_v1705);
diff --git a/lib/librte_distributor/rte_distributor_v1705.h b/lib/librte_distributor/rte_distributor_v1705.h
new file mode 100644
index 0000000..81b2691
--- /dev/null
+++ b/lib/librte_distributor/rte_distributor_v1705.h
@@ -0,0 +1,89 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_DISTRIB_V1705_H_
+#define _RTE_DISTRIB_V1705_H_
+
+/**
+ * @file
+ * RTE distributor
+ *
+ * The distributor is a component which is designed to pass packets
+ * one-at-a-time to workers, with dynamic load balancing.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct rte_distributor *
+rte_distributor_create_v1705(const char *name, unsigned int socket_id,
+		unsigned int num_workers,
+		unsigned int alg_type);
+
+int
+rte_distributor_process_v1705(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int num_mbufs);
+
+int
+rte_distributor_returned_pkts_v1705(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int max_mbufs);
+
+int
+rte_distributor_flush_v1705(struct rte_distributor *d);
+
+void
+rte_distributor_clear_returns_v1705(struct rte_distributor *d);
+
+int
+rte_distributor_get_pkt_v1705(struct rte_distributor *d,
+	unsigned int worker_id, struct rte_mbuf **pkts,
+	struct rte_mbuf **oldpkt, unsigned int retcount);
+
+int
+rte_distributor_return_pkt_v1705(struct rte_distributor *d,
+	unsigned int worker_id, struct rte_mbuf **oldpkt, int num);
+
+void
+rte_distributor_request_pkt_v1705(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **oldpkt,
+		unsigned int count);
+
+int
+rte_distributor_poll_pkt_v1705(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf **mbufs);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/librte_distributor/rte_distributor_v20.c b/lib/librte_distributor/rte_distributor_v20.c
index 1f406c5..bb6c5d7 100644
--- a/lib/librte_distributor/rte_distributor_v20.c
+++ b/lib/librte_distributor/rte_distributor_v20.c
@@ -38,6 +38,7 @@
 #include <rte_memory.h>
 #include <rte_memzone.h>
 #include <rte_errno.h>
+#include <rte_compat.h>
 #include <rte_string_fns.h>
 #include <rte_eal_memconfig.h>
 #include "rte_distributor_v20.h"
@@ -63,6 +64,7 @@ rte_distributor_request_pkt_v20(struct rte_distributor_v20 *d,
 		rte_pause();
 	buf->bufptr64 = req;
 }
+VERSION_SYMBOL(rte_distributor_request_pkt, _v20, 2.0);
 
 struct rte_mbuf *
 rte_distributor_poll_pkt_v20(struct rte_distributor_v20 *d,
@@ -76,6 +78,7 @@ rte_distributor_poll_pkt_v20(struct rte_distributor_v20 *d,
 	int64_t ret = buf->bufptr64 >> RTE_DISTRIB_FLAG_BITS;
 	return (struct rte_mbuf *)((uintptr_t)ret);
 }
+VERSION_SYMBOL(rte_distributor_poll_pkt, _v20, 2.0);
 
 struct rte_mbuf *
 rte_distributor_get_pkt_v20(struct rte_distributor_v20 *d,
@@ -87,6 +90,7 @@ rte_distributor_get_pkt_v20(struct rte_distributor_v20 *d,
 		rte_pause();
 	return ret;
 }
+VERSION_SYMBOL(rte_distributor_get_pkt, _v20, 2.0);
 
 int
 rte_distributor_return_pkt_v20(struct rte_distributor_v20 *d,
@@ -98,6 +102,7 @@ rte_distributor_return_pkt_v20(struct rte_distributor_v20 *d,
 	buf->bufptr64 = req;
 	return 0;
 }
+VERSION_SYMBOL(rte_distributor_return_pkt, _v20, 2.0);
 
 /**** APIs called on distributor core ***/
 
@@ -314,6 +319,7 @@ rte_distributor_process_v20(struct rte_distributor_v20 *d,
 	d->returns.count = ret_count;
 	return num_mbufs;
 }
+VERSION_SYMBOL(rte_distributor_process, _v20, 2.0);
 
 /* return to the caller, packets returned from workers */
 int
@@ -334,6 +340,7 @@ rte_distributor_returned_pkts_v20(struct rte_distributor_v20 *d,
 
 	return retval;
 }
+VERSION_SYMBOL(rte_distributor_returned_pkts, _v20, 2.0);
 
 /* return the number of packets in-flight in a distributor, i.e. packets
  * being workered on or queued up in a backlog. */
@@ -362,6 +369,7 @@ rte_distributor_flush_v20(struct rte_distributor_v20 *d)
 
 	return flushed;
 }
+VERSION_SYMBOL(rte_distributor_flush, _v20, 2.0);
 
 /* clears the internal returns array in the distributor */
 void
@@ -372,6 +380,7 @@ rte_distributor_clear_returns_v20(struct rte_distributor_v20 *d)
 	memset(d->returns.mbufs, 0, sizeof(d->returns.mbufs));
 #endif
 }
+VERSION_SYMBOL(rte_distributor_clear_returns, _v20, 2.0);
 
 /* creates a distributor instance */
 struct rte_distributor_v20 *
@@ -415,3 +424,4 @@ rte_distributor_create_v20(const char *name,
 
 	return d;
 }
+VERSION_SYMBOL(rte_distributor_create, _v20, 2.0);
diff --git a/lib/librte_distributor/rte_distributor_version.map b/lib/librte_distributor/rte_distributor_version.map
index 73fdc43..3a285b3 100644
--- a/lib/librte_distributor/rte_distributor_version.map
+++ b/lib/librte_distributor/rte_distributor_version.map
@@ -13,3 +13,17 @@ DPDK_2.0 {
 
 	local: *;
 };
+
+DPDK_17.05 {
+	global:
+
+	rte_distributor_clear_returns;
+	rte_distributor_create;
+	rte_distributor_flush;
+	rte_distributor_get_pkt;
+	rte_distributor_poll_pkt;
+	rte_distributor_process;
+	rte_distributor_request_pkt;
+	rte_distributor_return_pkt;
+	rte_distributor_returned_pkts;
+} DPDK_2.0;
-- 
2.7.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v9 01/18] lib: rename legacy distributor lib files
  2017-03-06  9:10  2%         ` [dpdk-dev] [PATCH v9 00/18] distributor lib performance enhancements David Hunt
@ 2017-03-06  9:10  1%           ` David Hunt
  2017-03-15  6:19  2%             ` [dpdk-dev] [PATCH v10 0/18] distributor library performance enhancements David Hunt
  2017-03-06  9:10  2%           ` [dpdk-dev] [PATCH v9 09/18] " David Hunt
  1 sibling, 1 reply; 200+ results
From: David Hunt @ 2017-03-06  9:10 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, David Hunt

Move the files out of the way so that we can replace them with
new versions of the distributor library. The files are named in
such a way as to match the symbol versioning that we will
apply for backward ABI compatibility.
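
After the rename, the original rte_distributor.h is reduced to a thin
wrapper so that existing applications keep compiling unchanged:

#ifndef _RTE_DISTRIBUTE_H_
#define _RTE_DISTRIBUTE_H_

#include <rte_distributor_v20.h>

#endif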

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/Makefile                    |   3 +-
 lib/librte_distributor/rte_distributor.h           | 210 +-----------------
 .../{rte_distributor.c => rte_distributor_v20.c}   |   2 +-
 lib/librte_distributor/rte_distributor_v20.h       | 247 +++++++++++++++++++++
 4 files changed, 251 insertions(+), 211 deletions(-)
 rename lib/librte_distributor/{rte_distributor.c => rte_distributor_v20.c} (99%)
 create mode 100644 lib/librte_distributor/rte_distributor_v20.h

diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
index 4c9af17..b314ca6 100644
--- a/lib/librte_distributor/Makefile
+++ b/lib/librte_distributor/Makefile
@@ -42,10 +42,11 @@ EXPORT_MAP := rte_distributor_version.map
 LIBABIVER := 1
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor.c
+SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor_v20.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include := rte_distributor.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include += rte_distributor_v20.h
 
 # this lib needs eal
 DEPDIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += lib/librte_eal
diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
index 7d36bc8..e41d522 100644
--- a/lib/librte_distributor/rte_distributor.h
+++ b/lib/librte_distributor/rte_distributor.h
@@ -34,214 +34,6 @@
 #ifndef _RTE_DISTRIBUTE_H_
 #define _RTE_DISTRIBUTE_H_
 
-/**
- * @file
- * RTE distributor
- *
- * The distributor is a component which is designed to pass packets
- * one-at-a-time to workers, with dynamic load balancing.
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#define RTE_DISTRIBUTOR_NAMESIZE 32 /**< Length of name for instance */
-
-struct rte_distributor;
-struct rte_mbuf;
-
-/**
- * Function to create a new distributor instance
- *
- * Reserves the memory needed for the distributor operation and
- * initializes the distributor to work with the configured number of workers.
- *
- * @param name
- *   The name to be given to the distributor instance.
- * @param socket_id
- *   The NUMA node on which the memory is to be allocated
- * @param num_workers
- *   The maximum number of workers that will request packets from this
- *   distributor
- * @return
- *   The newly created distributor instance
- */
-struct rte_distributor *
-rte_distributor_create(const char *name, unsigned socket_id,
-		unsigned num_workers);
-
-/*  *** APIS to be called on the distributor lcore ***  */
-/*
- * The following APIs are the public APIs which are designed for use on a
- * single lcore which acts as the distributor lcore for a given distributor
- * instance. These functions cannot be called on multiple cores simultaneously
- * without using locking to protect access to the internals of the distributor.
- *
- * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
- * for the same distributor instance, otherwise deadlock will result.
- */
-
-/**
- * Process a set of packets by distributing them among workers that request
- * packets. The distributor will ensure that no two packets that have the
- * same flow id, or tag, in the mbuf will be procesed at the same time.
- *
- * The user is advocated to set tag for each mbuf before calling this function.
- * If user doesn't set the tag, the tag value can be various values depending on
- * driver implementation and configuration.
- *
- * This is not multi-thread safe and should only be called on a single lcore.
- *
- * @param d
- *   The distributor instance to be used
- * @param mbufs
- *   The mbufs to be distributed
- * @param num_mbufs
- *   The number of mbufs in the mbufs array
- * @return
- *   The number of mbufs processed.
- */
-int
-rte_distributor_process(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned num_mbufs);
-
-/**
- * Get a set of mbufs that have been returned to the distributor by workers
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- * @param mbufs
- *   The mbufs pointer array to be filled in
- * @param max_mbufs
- *   The size of the mbufs array
- * @return
- *   The number of mbufs returned in the mbufs array.
- */
-int
-rte_distributor_returned_pkts(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned max_mbufs);
-
-/**
- * Flush the distributor component, so that there are no in-flight or
- * backlogged packets awaiting processing
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- * @return
- *   The number of queued/in-flight packets that were completed by this call.
- */
-int
-rte_distributor_flush(struct rte_distributor *d);
-
-/**
- * Clears the array of returned packets used as the source for the
- * rte_distributor_returned_pkts() API call.
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- */
-void
-rte_distributor_clear_returns(struct rte_distributor *d);
-
-/*  *** APIS to be called on the worker lcores ***  */
-/*
- * The following APIs are the public APIs which are designed for use on
- * multiple lcores which act as workers for a distributor. Each lcore should use
- * a unique worker id when requesting packets.
- *
- * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
- * for the same distributor instance, otherwise deadlock will result.
- */
-
-/**
- * API called by a worker to get a new packet to process. Any previous packet
- * given to the worker is assumed to have completed processing, and may be
- * optionally returned to the distributor via the oldpkt parameter.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param oldpkt
- *   The previous packet, if any, being processed by the worker
- *
- * @return
- *   A new packet to be processed by the worker thread.
- */
-struct rte_mbuf *
-rte_distributor_get_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt);
-
-/**
- * API called by a worker to return a completed packet without requesting a
- * new packet, for example, because a worker thread is shutting down
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param mbuf
- *   The previous packet being processed by the worker
- */
-int
-rte_distributor_return_pkt(struct rte_distributor *d, unsigned worker_id,
-		struct rte_mbuf *mbuf);
-
-/**
- * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
- * processing, and may be optionally returned to the distributor via
- * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt(), this function does not wait for a new
- * packet to be provided by the distributor.
- *
- * NOTE: after calling this function, rte_distributor_poll_pkt() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt()
- * API should *not* be used to try and retrieve the new packet.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param oldpkt
- *   The previous packet, if any, being processed by the worker
- */
-void
-rte_distributor_request_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt);
-
-/**
- * API called by a worker to check for a new packet that was previously
- * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
- * not yet been fulfilled by the distributor.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- *
- * @return
- *   A new packet to be processed by the worker thread, or NULL if no
- *   packet is yet available.
- */
-struct rte_mbuf *
-rte_distributor_poll_pkt(struct rte_distributor *d,
-		unsigned worker_id);
-
-#ifdef __cplusplus
-}
-#endif
+#include <rte_distributor_v20.h>
 
 #endif
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor_v20.c
similarity index 99%
rename from lib/librte_distributor/rte_distributor.c
rename to lib/librte_distributor/rte_distributor_v20.c
index f3f778c..b890947 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor_v20.c
@@ -40,7 +40,7 @@
 #include <rte_errno.h>
 #include <rte_string_fns.h>
 #include <rte_eal_memconfig.h>
-#include "rte_distributor.h"
+#include "rte_distributor_v20.h"
 
 #define NO_FLAGS 0
 #define RTE_DISTRIB_PREFIX "DT_"
diff --git a/lib/librte_distributor/rte_distributor_v20.h b/lib/librte_distributor/rte_distributor_v20.h
new file mode 100644
index 0000000..b69aa27
--- /dev/null
+++ b/lib/librte_distributor/rte_distributor_v20.h
@@ -0,0 +1,247 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_DISTRIB_V20_H_
+#define _RTE_DISTRIB_V20_H_
+
+/**
+ * @file
+ * RTE distributor
+ *
+ * The distributor is a component which is designed to pass packets
+ * one-at-a-time to workers, with dynamic load balancing.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_DISTRIBUTOR_NAMESIZE 32 /**< Length of name for instance */
+
+struct rte_distributor;
+struct rte_mbuf;
+
+/**
+ * Function to create a new distributor instance
+ *
+ * Reserves the memory needed for the distributor operation and
+ * initializes the distributor to work with the configured number of workers.
+ *
+ * @param name
+ *   The name to be given to the distributor instance.
+ * @param socket_id
+ *   The NUMA node on which the memory is to be allocated
+ * @param num_workers
+ *   The maximum number of workers that will request packets from this
+ *   distributor
+ * @return
+ *   The newly created distributor instance
+ */
+struct rte_distributor *
+rte_distributor_create(const char *name, unsigned int socket_id,
+		unsigned int num_workers);
+
+/*  *** APIS to be called on the distributor lcore ***  */
+/*
+ * The following APIs are the public APIs which are designed for use on a
+ * single lcore which acts as the distributor lcore for a given distributor
+ * instance. These functions cannot be called on multiple cores simultaneously
+ * without using locking to protect access to the internals of the distributor.
+ *
+ * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
+ * for the same distributor instance, otherwise deadlock will result.
+ */
+
+/**
+ * Process a set of packets by distributing them among workers that request
+ * packets. The distributor will ensure that no two packets that have the
+ * same flow id, or tag, in the mbuf will be processed at the same time.
+ *
+ * The user is advised to set a tag for each mbuf before calling this function.
+ * If the user does not set the tag, its value can vary depending on the
+ * driver implementation and configuration.
+ *
+ * This is not multi-thread safe and should only be called on a single lcore.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param mbufs
+ *   The mbufs to be distributed
+ * @param num_mbufs
+ *   The number of mbufs in the mbufs array
+ * @return
+ *   The number of mbufs processed.
+ */
+int
+rte_distributor_process(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int num_mbufs);
+
+/**
+ * Get a set of mbufs that have been returned to the distributor by workers
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param mbufs
+ *   The mbufs pointer array to be filled in
+ * @param max_mbufs
+ *   The size of the mbufs array
+ * @return
+ *   The number of mbufs returned in the mbufs array.
+ */
+int
+rte_distributor_returned_pkts(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int max_mbufs);
+
+/**
+ * Flush the distributor component, so that there are no in-flight or
+ * backlogged packets awaiting processing
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @return
+ *   The number of queued/in-flight packets that were completed by this call.
+ */
+int
+rte_distributor_flush(struct rte_distributor *d);
+
+/**
+ * Clears the array of returned packets used as the source for the
+ * rte_distributor_returned_pkts() API call.
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ */
+void
+rte_distributor_clear_returns(struct rte_distributor *d);
+
+/*  *** APIS to be called on the worker lcores ***  */
+/*
+ * The following APIs are the public APIs which are designed for use on
+ * multiple lcores which act as workers for a distributor. Each lcore should use
+ * a unique worker id when requesting packets.
+ *
+ * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
+ * for the same distributor instance, otherwise deadlock will result.
+ */
+
+/**
+ * API called by a worker to get a new packet to process. Any previous packet
+ * given to the worker is assumed to have completed processing, and may be
+ * optionally returned to the distributor via the oldpkt parameter.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ * @param oldpkt
+ *   The previous packet, if any, being processed by the worker
+ *
+ * @return
+ *   A new packet to be processed by the worker thread.
+ */
+struct rte_mbuf *
+rte_distributor_get_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf *oldpkt);
+
+/**
+ * API called by a worker to return a completed packet without requesting a
+ * new packet, for example, because a worker thread is shutting down.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ * @param mbuf
+ *   The previous packet being processed by the worker
+ */
+int
+rte_distributor_return_pkt(struct rte_distributor *d, unsigned int worker_id,
+		struct rte_mbuf *mbuf);
+
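
Putting rte_distributor_get_pkt() and rte_distributor_return_pkt()
together, a typical blocking worker loop could be sketched as below;
quit_signal and do_work() are hypothetical application names.

	struct rte_mbuf *pkt = NULL;

	while (!quit_signal) {
		/* Hands back the previous packet (if any) and blocks
		 * until a new one is available. */
		pkt = rte_distributor_get_pkt(d, worker_id, pkt);
		do_work(pkt);
	}
	/* On shutdown, return the last packet without asking for more. */
	rte_distributor_return_pkt(d, worker_id, pkt);
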
+/**
+ * API called by a worker to request a new packet to process.
+ * Any previous packet given to the worker is assumed to have completed
+ * processing, and may be optionally returned to the distributor via
+ * the oldpkt parameter.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for a new
+ * packet to be provided by the distributor.
+ *
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packet requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packet.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ * @param oldpkt
+ *   The previous packet, if any, being processed by the worker
+ */
+void
+rte_distributor_request_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf *oldpkt);
+
+/**
+ * API called by a worker to check for a new packet that was previously
+ * requested by a call to rte_distributor_request_pkt(). It does not wait
+ * for the new packet to be available, but returns NULL if the request has
+ * not yet been fulfilled by the distributor.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ *
+ * @return
+ *   A new packet to be processed by the worker thread, or NULL if no
+ *   packet is yet available.
+ */
+struct rte_mbuf *
+rte_distributor_poll_pkt(struct rte_distributor *d,
+		unsigned int worker_id);
+
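
For reference, the non-blocking request/poll pair above might be used as
sketched below, with other_work() as a hypothetical placeholder for useful
work done while waiting.

	struct rte_mbuf *pkt;

	rte_distributor_request_pkt(d, worker_id, oldpkt);
	while ((pkt = rte_distributor_poll_pkt(d, worker_id)) == NULL)
		other_work();
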
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
2.7.4

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v9 00/18] distributor lib performance enhancements
  2017-03-01  7:47  1%       ` [dpdk-dev] [PATCH v8 01/18] lib: rename legacy distributor lib files David Hunt
@ 2017-03-06  9:10  2%         ` David Hunt
  2017-03-06  9:10  1%           ` [dpdk-dev] [PATCH v9 01/18] lib: rename legacy distributor lib files David Hunt
  2017-03-06  9:10  2%           ` [dpdk-dev] [PATCH v9 09/18] " David Hunt
  0 siblings, 2 replies; 200+ results
From: David Hunt @ 2017-03-06  9:10 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson

This patch set aims to improve the throughput of the distributor library.

It uses a similar handshake mechanism to the previous version of
the library, in that bits are used to indicate when packets are ready
to be sent to a worker and ready to be returned from a worker. One main
difference is that instead of sending one packet in a cache line, it makes
use of the 7 free spaces in the same cache line in order to send up to
8 packets at a time to/from a worker.

The flow matching algorithm has had significant re-work, and now keeps an
array of inflight flows and an array of backlog flows, and matches incoming
flows to the inflight/backlog flows of all workers so that flow pinning to
workers can be maintained.

The Flow Match algorithm has both scalar and vector versions, and a
function pointer is used to select the most appropriate function at run time,
depending on the presence of the SSE2 cpu flag. On non-x86 platforms, the
scalar match function is selected, which should still give a good boost
in performance over the non-burst API.

v9 changes:
   * fixed symbol versioning so it will compile on CentOS and RedHat

v8 changes:
   * Changed the patch set to have a more logical order of
     the changes, but the end result is basically the same.
   * Fixed broken shared library build.
   * Split down the updates to example app more
   * No longer changes the test app and sample app to use a temporary
     API.
   * No longer temporarily re-names the functions in the
     version.map file.

v7 changes:
   * Reorganised patch so there's a more natural progression in the
     changes, and divided them down into easier to review chunks.
   * Previous versions of this patch set were effectively two APIs.
     We now have a single API. Legacy functionality can
     be used by by using the rte_distributor_create API call with the
     RTE_DISTRIBUTOR_SINGLE flag when creating a distributor instance.
   * Added symbol versioning for old API so that ABI is preserved.

v6 changes:
   * Fixed intermittent segfault where num pkts not divisible
     by BURST_SIZE
   * Cleanup due to review comments on mailing list
   * Renamed _priv.h to _private.h.

v5 changes:
   * Removed some un-needed code around retries in worker API calls
   * Cleanup due to review comments on mailing list
   * Cleanup of non-x86 platform compilation, fallback to scalar match

v4 changes:
   * fixed issue building shared libraries

v3 changes:
  * Addressed mailing list review comments
  * Test code removal
  * Split out SSE match into separate file to facilitate NEON addition
  * Cleaned up conditional compilation flags for SSE2
  * Addressed c99 style compilation errors
  * rebased on latest head (Jan 2 2017, Happy New Year to all)

v2 changes:
  * Created a common distributor_priv.h header file with common
    definitions and structures.
  * Added a scalar version so it can be built and used on machines without
    sse2 instruction set
  * Added unit autotests
  * Added perf autotest

Notes:
   Apps must now work in bursts, as up to 8 are given to a worker at a time
   For performance in matching, Flow IDs are 15 bits.
   If 32-bit Flow IDs are required, use the packet-at-a-time (SINGLE)
   mode.

Performance Gains
   2.2GHz Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
   2 x XL710 40GbE NICS to 2 x 40Gbps traffic generator channels 64b packets
   separate cores for rx, tx, distributor
    1 worker  - up to 4.8x
    4 workers - up to 2.9x
    8 workers - up to 1.8x
   12 workers - up to 2.1x
   16 workers - up to 1.8x

[01/18] lib: rename legacy distributor lib files
[02/18] lib: create private header file
[03/18] lib: add new burst oriented distributor structs
[04/18] lib: add new distributor code
[05/18] lib: add SIMD flow matching to distributor
[06/18] test/distributor: extra params for autotests
[07/18] lib: switch distributor over to new API
[08/18] lib: make v20 header file private
[09/18] lib: add symbol versioning to distributor
[10/18] test: test single and burst distributor API
[11/18] test: add perf test for distributor burst mode
[12/18] examples/distributor: allow for extra stats
[13/18] sample: distributor: wait for ports to come up
[14/18] examples/distributor: give distributor a core
[15/18] examples/distributor: limit number of Tx rings
[16/18] examples/distributor: give Rx thread a core
[17/18] doc: distributor library changes for new burst API
[18/18] maintainers: add to distributor lib maintainers

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v3 2/2] ethdev: add hierarchical scheduler API
  @ 2017-03-04  1:10  1% ` Cristian Dumitrescu
    1 sibling, 0 replies; 200+ results
From: Cristian Dumitrescu @ 2017-03-04  1:10 UTC (permalink / raw)
  To: dev
  Cc: thomas.monjalon, jerin.jacob, balasubramanian.manoharan,
	hemant.agrawal, shreyansh.jain

This patch introduces the generic ethdev API for the traffic manager
capability, which includes: hierarchical scheduling, traffic shaping,
congestion management, packet marking.

Main features:
- Exposed as ethdev plugin capability (similar to rte_flow approach)
- Capability query API per port, per hierarchy level and per hierarchy node
- Scheduling algorithms: Strict Priority (SP), Weighted Fair Queuing (WFQ),
  Weighted Round Robin (WRR)
- Traffic shaping: single/dual rate, private (per node) and shared (by multiple
  nodes) shapers
- Congestion management for hierarchy leaf nodes: algorithms of tail drop,
  head drop, WRED; private (per node) and shared (by multiple nodes) WRED
  contexts
- Packet marking: IEEE 802.1q (VLAN DEI), IETF RFC 3168 (IPv4/IPv6 ECN for
  TCP and SCTP), IETF RFC 2597 (IPv4 / IPv6 DSCP)

Changes in v3:
- Implemented feedback from Jerin [5]
- Changed naming convention: scheddev -> tm
- Improvements on the capability API:
	- Specification of marking capabilities per color
	- WFQ/WRR groups: sp_n_children_max -> wfq_wrr_n_children_per_group_max,
	  added wfq_wrr_n_groups_max, improved description of both, improved
	  description of wfq_wrr_weight_max
	- Dynamic updates: added KEEP_LEVEL and CHANGE_LEVEL for parent update
- Enforced/documented restrictions for root node (node_add() and update())
- Enforced/documented shaper profile restrictions on PIR: PIR != 0, PIR >= CIR
- Turned repetitive code in rte_tm.c into macro
- Removed dependency on rte_red.h file (added RED params to rte_tm.h)
- Color: removed "e_" from color names enum
- Fixed small Doxygen style issues

Changes in v2:
- Implemented feedback from Hemant [4]
- Improvements on the capability API
	- Added capability API for hierarchy level
	- Merged stats capability into the capability API
	- Added dynamic updates
	- Added non-leaf/leaf union to the node capability structure
	- Renamed sp_priority_min to sp_n_priorities_max, added clarifications
	- Fixed description for sp_n_children_max
- Clarified and enforced rule on node ID range for leaf and non-leaf nodes
	- Added API functions to get node type (i.e. leaf/non-leaf):
	  get_leaf_nodes(), node_type_get()
- Added clarification for the root node: its creation, its parent, its role
	- Macro NODE_ID_NULL as root node's parent
	- Description of the node_add() and node_parent_update() API functions
- Added clarification for the first time add vs. subsequent updates rule
	- Cleaned up the description for the node_add() function
- Statistics API improvements
	- Merged stats capability into the capability API
	- Added API function node_stats_update()
	- Added more stats per packet color
- Added more error types
- Fixed small Doxygen style issues

Changes in v1 (since RFC [1]):
- Implemented as ethdev plugin (similar to rte_flow) as opposed to more
  monolithic additions to ethdev itself
- Implemented feedback from Jerin [2] and Hemant [3]. Implemented all the
  suggested items with only one exception, see the long list below, hopefully
  nothing was forgotten.
    - The item not done (hopefully for a good reason): driver-generated object
      IDs. IMO the choice to have application-generated object IDs adds marginal
      complexity to the driver (search ID function required), but it provides
      huge simplification for the application. The app does not need to worry
      about building & managing tree-like structure for storing driver-generated
      object IDs, the app can use its own convention for node IDs depending on
      the specific hierarchy that it needs. Trivial example: identify all
      level-2 nodes with IDs like 100, 200, 300, … and the level-3 nodes based
      on their level-2 parents: 110, 120, 130, 140, …, 210, 220, 230, 240, …,
      310, 320, 330, … and level-4 nodes based on their level-3 parents: 111,
      112, 113, 114, …, 121, 122, 123, 124, …). Moreover, see the change log for
      the other related simplification that was implemented: leaf nodes now have
      predefined IDs that are the same as their Ethernet TX queue ID (
      therefore no translation is required for leaf nodes).
- Capability API. Done per port and per node as well.
- Dual rate shapers
- Added configuration of private shaper (per node) directly from the shaper
  profile as part of node API (no shaper ID needed for private shapers), while
  the shared shapers are configured outside of the node API using shaper profile
  and communicated to the node using shared shaper ID. So there is no
  configuration overhead for shared shapers if the app does not use any of them.
- Leaf nodes now have predefined IDs that are the same as their Ethernet TX
  queue ID (therefore no translation is required for leaf nodes). This is also
  used to differentiate between a leaf node and a non-leaf node.
- Domain-specific errors to give a precise indication of the error cause (same
  as done by rte_flow)
- Packet marking API
- Packet length optional adjustment for shapers, positive (e.g. for adding
  Ethernet framing overhead of 20 bytes) or negative (e.g. for rate limiting
  based on IP packet bytes)

[1] RFC: http://dpdk.org/ml/archives/dev/2016-November/050956.html
[2] Jerin’s feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054484.html
[3] Hemant’s feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054866.html
[4] Hemant's feedback on v1: http://www.dpdk.org/ml/archives/dev/2017-February/058033.html
[5] Jerin's feedback on v1: http://www.dpdk.org/ml/archives/dev/2017-March/058895.html

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 MAINTAINERS                            |    4 +
 lib/librte_ether/Makefile              |    5 +-
 lib/librte_ether/rte_ether_version.map |   30 +
 lib/librte_ether/rte_tm.c              |  436 ++++++++++
 lib/librte_ether/rte_tm.h              | 1466 ++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_tm_driver.h       |  365 ++++++++
 6 files changed, 2305 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_ether/rte_tm.c
 create mode 100644 lib/librte_ether/rte_tm.h
 create mode 100644 lib/librte_ether/rte_tm_driver.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 5030c1c..7893ac6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -247,6 +247,10 @@ Flow API
 M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
 F: lib/librte_ether/rte_flow*
 
+Traffic Manager API
+M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
+F: lib/librte_ether/rte_tm*
+
 Crypto API
 M: Declan Doherty <declan.doherty@intel.com>
 F: lib/librte_cryptodev/
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index 1d095a9..82faa67 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -45,6 +45,7 @@ LIBABIVER := 6
 
 SRCS-y += rte_ethdev.c
 SRCS-y += rte_flow.c
+SRCS-y += rte_tm.c
 
 #
 # Export include files
@@ -54,6 +55,8 @@ SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h
 SYMLINK-y-include += rte_flow.h
 SYMLINK-y-include += rte_flow_driver.h
+SYMLINK-y-include += rte_tm.h
+SYMLINK-y-include += rte_tm_driver.h
 
 # this lib depends upon:
 DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 637317c..42ad3fb 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -159,5 +159,35 @@ DPDK_17.05 {
 	global:
 
 	rte_eth_dev_capability_ops_get;
+	rte_tm_get_leaf_nodes;
+	rte_tm_node_type_get;
+	rte_tm_capabilities_get;
+	rte_tm_level_capabilities_get;
+	rte_tm_node_capabilities_get;
+	rte_tm_wred_profile_add;
+	rte_tm_wred_profile_delete;
+	rte_tm_shared_wred_context_add_update;
+	rte_tm_shared_wred_context_delete;
+	rte_tm_shaper_profile_add;
+	rte_tm_shaper_profile_delete;
+	rte_tm_shared_shaper_add_update;
+	rte_tm_shared_shaper_delete;
+	rte_tm_node_add;
+	rte_tm_node_delete;
+	rte_tm_node_suspend;
+	rte_tm_node_resume;
+	rte_tm_hierarchy_set;
+	rte_tm_node_parent_update;
+	rte_tm_node_shaper_update;
+	rte_tm_node_shared_shaper_update;
+	rte_tm_node_stats_update;
+	rte_tm_node_scheduling_mode_update;
+	rte_tm_node_cman_update;
+	rte_tm_node_wred_context_update;
+	rte_tm_node_shared_wred_context_update;
+	rte_tm_node_stats_read;
+	rte_tm_mark_vlan_dei;
+	rte_tm_mark_ip_ecn;
+	rte_tm_mark_ip_dscp;
 
 } DPDK_17.02;
diff --git a/lib/librte_ether/rte_tm.c b/lib/librte_ether/rte_tm.c
new file mode 100644
index 0000000..f8bd491
--- /dev/null
+++ b/lib/librte_ether/rte_tm.c
@@ -0,0 +1,436 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include "rte_ethdev.h"
+#include "rte_tm_driver.h"
+#include "rte_tm.h"
+
+/* Get generic traffic manager operations structure from a port. */
+const struct rte_tm_ops *
+rte_tm_ops_get(uint8_t port_id, struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_tm_ops *ops;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		rte_tm_error_set(error,
+			ENODEV,
+			RTE_TM_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENODEV));
+		return NULL;
+	}
+
+	if ((dev->dev_ops->cap_ops_get == NULL) ||
+		(dev->dev_ops->cap_ops_get(dev, RTE_ETH_CAPABILITY_TM,
+		&ops) != 0) || (ops == NULL)) {
+		rte_tm_error_set(error,
+			ENOSYS,
+			RTE_TM_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+		return NULL;
+	}
+
+	return ops;
+}
+
+#define RTE_TM_FUNC(port_id, func)				\
+({								\
+	const struct rte_tm_ops *ops =			\
+		rte_tm_ops_get(port_id, error);		\
+	if (ops == NULL)						\
+		return -rte_errno;				\
+								\
+	if (ops->func == NULL)					\
+		return -rte_tm_error_set(error,		\
+			ENOSYS,					\
+			RTE_TM_ERROR_TYPE_UNSPECIFIED,	\
+			NULL,					\
+			rte_strerror(ENOSYS));			\
+								\
+	ops->func;						\
+})
+
+/* Get number of leaf nodes */
+int
+rte_tm_get_leaf_nodes(uint8_t port_id,
+	uint32_t *n_leaf_nodes,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_tm_ops *ops =
+		rte_tm_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (n_leaf_nodes == NULL) {
+		rte_tm_error_set(error,
+			EINVAL,
+			RTE_TM_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(EINVAL));
+		return -rte_errno;
+	}
+
+	*n_leaf_nodes = dev->data->nb_tx_queues;
+	return 0;
+}
+
+/* Check node ID type (leaf or non-leaf) */
+int
+rte_tm_node_type_get(uint8_t port_id,
+	uint32_t node_id,
+	int *is_leaf,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_type_get)(dev,
+		node_id, is_leaf, error);
+}
+
+/* Get capabilities */
+int rte_tm_capabilities_get(uint8_t port_id,
+	struct rte_tm_capabilities *cap,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, capabilities_get)(dev,
+		cap, error);
+}
+
+/* Get level capabilities */
+int rte_tm_level_capabilities_get(uint8_t port_id,
+	uint32_t level_id,
+	struct rte_tm_level_capabilities *cap,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, level_capabilities_get)(dev,
+		level_id, cap, error);
+}
+
+/* Get node capabilities */
+int rte_tm_node_capabilities_get(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_tm_node_capabilities *cap,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_capabilities_get)(dev,
+		node_id, cap, error);
+}
+
+/* Add WRED profile */
+int rte_tm_wred_profile_add(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_tm_wred_params *profile,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, wred_profile_add)(dev,
+		wred_profile_id, profile, error);
+}
+
+/* Delete WRED profile */
+int rte_tm_wred_profile_delete(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, wred_profile_delete)(dev,
+		wred_profile_id, error);
+}
+
+/* Add/update shared WRED context */
+int rte_tm_shared_wred_context_add_update(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	uint32_t wred_profile_id,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, shared_wred_context_add_update)(dev,
+		shared_wred_context_id, wred_profile_id, error);
+}
+
+/* Delete shared WRED context */
+int rte_tm_shared_wred_context_delete(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, shared_wred_context_delete)(dev,
+		shared_wred_context_id, error);
+}
+
+/* Add shaper profile */
+int rte_tm_shaper_profile_add(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_tm_shaper_params *profile,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, shaper_profile_add)(dev,
+		shaper_profile_id, profile, error);
+}
+
+/* Delete shaper profile */
+int rte_tm_shaper_profile_delete(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, shaper_profile_delete)(dev,
+		shaper_profile_id, error);
+}
+
+/* Add/update shared shaper */
+int rte_tm_shared_shaper_add_update(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	uint32_t shaper_profile_id,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, shared_shaper_add_update)(dev,
+		shared_shaper_id, shaper_profile_id, error);
+}
+
+/* Delete shared shaper */
+int rte_tm_shared_shaper_delete(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, shared_shaper_delete)(dev,
+		shared_shaper_id, error);
+}
+
+/* Add node to port traffic manager hierarchy */
+int rte_tm_node_add(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_tm_node_params *params,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_add)(dev,
+		node_id, parent_node_id, priority, weight, params, error);
+}
+
+/* Delete node from traffic manager hierarchy */
+int rte_tm_node_delete(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_delete)(dev,
+		node_id, error);
+}
+
+/* Suspend node */
+int rte_tm_node_suspend(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_suspend)(dev,
+		node_id, error);
+}
+
+/* Resume node */
+int rte_tm_node_resume(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_resume)(dev,
+		node_id, error);
+}
+
+/* Set the initial port traffic manager hierarchy */
+int rte_tm_hierarchy_set(uint8_t port_id,
+	int clear_on_fail,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, hierarchy_set)(dev,
+		clear_on_fail, error);
+}
+
+/* Update node parent  */
+int rte_tm_node_parent_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_parent_update)(dev,
+		node_id, parent_node_id, priority, weight, error);
+}
+
+/* Update node private shaper */
+int rte_tm_node_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shaper_profile_id,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_shaper_update)(dev,
+		node_id, shaper_profile_id, error);
+}
+
+/* Update node shared shapers */
+int rte_tm_node_shared_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_shaper_id,
+	int add,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_shared_shaper_update)(dev,
+		node_id, shared_shaper_id, add, error);
+}
+
+/* Update node stats */
+int rte_tm_node_stats_update(uint8_t port_id,
+	uint32_t node_id,
+	uint64_t stats_mask,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_stats_update)(dev,
+		node_id, stats_mask, error);
+}
+
+/* Update scheduling mode */
+int rte_tm_node_scheduling_mode_update(uint8_t port_id,
+	uint32_t node_id,
+	int *scheduling_mode_per_priority,
+	uint32_t n_priorities,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_scheduling_mode_update)(dev,
+		node_id, scheduling_mode_per_priority, n_priorities, error);
+}
+
+/* Update node congestion management mode */
+int rte_tm_node_cman_update(uint8_t port_id,
+	uint32_t node_id,
+	enum rte_tm_cman_mode cman,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_cman_update)(dev,
+		node_id, cman, error);
+}
+
+/* Update node private WRED context */
+int rte_tm_node_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t wred_profile_id,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_wred_context_update)(dev,
+		node_id, wred_profile_id, error);
+}
+
+/* Update node shared WRED context */
+int rte_tm_node_shared_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_wred_context_id,
+	int add,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_shared_wred_context_update)(dev,
+		node_id, shared_wred_context_id, add, error);
+}
+
+/* Read and/or clear stats counters for specific node */
+int rte_tm_node_stats_read(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_tm_node_stats *stats,
+	uint64_t *stats_mask,
+	int clear,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, node_stats_read)(dev,
+		node_id, stats, stats_mask, clear, error);
+}
+
+/* Packet marking - VLAN DEI */
+int rte_tm_mark_vlan_dei(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, mark_vlan_dei)(dev,
+		mark_green, mark_yellow, mark_red, error);
+}
+
+/* Packet marking - IPv4/IPv6 ECN */
+int rte_tm_mark_ip_ecn(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, mark_ip_ecn)(dev,
+		mark_green, mark_yellow, mark_red, error);
+}
+
+/* Packet marking - IPv4/IPv6 DSCP */
+int rte_tm_mark_ip_dscp(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_tm_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	return RTE_TM_FUNC(port_id, mark_ip_dscp)(dev,
+		mark_green, mark_yellow, mark_red, error);
+}
diff --git a/lib/librte_ether/rte_tm.h b/lib/librte_ether/rte_tm.h
new file mode 100644
index 0000000..64ef5dd
--- /dev/null
+++ b/lib/librte_ether/rte_tm.h
@@ -0,0 +1,1466 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_TM_H__
+#define __INCLUDE_RTE_TM_H__
+
+/**
+ * @file
+ * RTE Generic Traffic Manager API
+ *
+ * This interface provides the ability to configure the traffic manager in a
+ * generic way. It includes features such as: hierarchical scheduling,
+ * traffic shaping, congestion management, packet marking, etc.
+ */
+
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/** Ethernet framing overhead
+ *
+ * Overhead fields per Ethernet frame:
+ * 1. Preamble:                                            7 bytes;
+ * 2. Start of Frame Delimiter (SFD):                      1 byte;
+ * 3. Inter-Frame Gap (IFG):                              12 bytes.
+ */
+#define RTE_TM_ETH_FRAMING_OVERHEAD                  20
+
+/**
+ * Ethernet framing overhead plus Frame Check Sequence (FCS). Useful when FCS
+ * is generated and added at the end of the Ethernet frame on TX side without
+ * any SW intervention.
+ */
+#define RTE_TM_ETH_FRAMING_OVERHEAD_FCS              24
+
+/**< Invalid WRED profile ID */
+#define RTE_TM_WRED_PROFILE_ID_NONE                  UINT32_MAX
+
+/**< Invalid shaper profile ID */
+#define RTE_TM_SHAPER_PROFILE_ID_NONE                UINT32_MAX
+
+/**< Node ID for the parent of the root node */
+#define RTE_TM_NODE_ID_NULL                          UINT32_MAX
+
+/**
+ * Color
+ */
+enum rte_tm_color {
+	RTE_TM_GREEN = 0, /**< Green */
+	RTE_TM_YELLOW, /**< Yellow */
+	RTE_TM_RED, /**< Red */
+	RTE_TM_COLORS /**< Number of colors */
+};
+
+/**
+ * Node statistics counter type
+ */
+enum rte_tm_stats_type {
+	/**< Number of packets scheduled from current node. */
+	RTE_TM_STATS_N_PKTS = 1 << 0,
+
+	/**< Number of bytes scheduled from current node. */
+	RTE_TM_STATS_N_BYTES = 1 << 1,
+
+	/**< Number of green packets dropped by current leaf node.  */
+	RTE_TM_STATS_N_PKTS_GREEN_DROPPED = 1 << 2,
+
+	/**< Number of yellow packets dropped by current leaf node.  */
+	RTE_TM_STATS_N_PKTS_YELLOW_DROPPED = 1 << 3,
+
+	/**< Number of red packets dropped by current leaf node.  */
+	RTE_TM_STATS_N_PKTS_RED_DROPPED = 1 << 4,
+
+	/**< Number of green bytes dropped by current leaf node.  */
+	RTE_TM_STATS_N_BYTES_GREEN_DROPPED = 1 << 5,
+
+	/**< Number of yellow bytes dropped by current leaf node.  */
+	RTE_TM_STATS_N_BYTES_YELLOW_DROPPED = 1 << 6,
+
+	/**< Number of red bytes dropped by current leaf node.  */
+	RTE_TM_STATS_N_BYTES_RED_DROPPED = 1 << 7,
+
+	/**< Number of packets currently waiting in the packet queue of current
+	 * leaf node.
+	 */
+	RTE_TM_STATS_N_PKTS_QUEUED = 1 << 8,
+
+	/**< Number of bytes currently waiting in the packet queue of current
+	 * leaf node.
+	 */
+	RTE_TM_STATS_N_BYTES_QUEUED = 1 << 9,
+};
+
+/**
+ * Node statistics counters
+ */
+struct rte_tm_node_stats {
+	/**< Number of packets scheduled from current node. */
+	uint64_t n_pkts;
+
+	/**< Number of bytes scheduled from current node. */
+	uint64_t n_bytes;
+
+	/**< Statistics counters for leaf nodes only. */
+	struct {
+		/**< Number of packets dropped by current leaf node per each
+		 * color.
+		 */
+		uint64_t n_pkts_dropped[RTE_TM_COLORS];
+
+		/**< Number of bytes dropped by current leaf node per each
+		 * color.
+		 */
+		uint64_t n_bytes_dropped[RTE_TM_COLORS];
+
+		/**< Number of packets currently waiting in the packet queue of
+		 * current leaf node.
+		 */
+		uint64_t n_pkts_queued;
+
+		/**< Number of bytes currently waiting in the packet queue of
+		 * current leaf node.
+		 */
+		uint64_t n_bytes_queued;
+	} leaf;
+};
+
+/**
+ * Traffic manager dynamic updates
+ */
+enum rte_tm_dynamic_update_type {
+	/**< Dynamic parent node update. The new parent node is located on the
+	 * same hierarchy level as the former parent node. Consequently, the
+	 * node whose parent is changed preserves its hierarchy level.
+	 */
+	RTE_TM_UPDATE_NODE_PARENT_KEEP_LEVEL = 1 << 0,
+
+	/**< Dynamic parent node update. The new parent node is located on a
+	 * different hierarchy level than the former parent node. Consequently,
+	 * the node whose parent is changed also changes its hierarchy level.
+	 */
+	RTE_TM_UPDATE_NODE_PARENT_CHANGE_LEVEL = 1 << 1,
+
+	/**< Dynamic node add/delete. */
+	RTE_TM_UPDATE_NODE_ADD_DELETE = 1 << 2,
+
+	/**< Suspend/resume nodes. */
+	RTE_TM_UPDATE_NODE_SUSPEND_RESUME = 1 << 3,
+
+	/**< Dynamic switch between WFQ and WRR per node SP priority level. */
+	RTE_TM_UPDATE_NODE_SCHEDULING_MODE = 1 << 4,
+
+	/**< Dynamic update of the set of enabled stats counter types. */
+	RTE_TM_UPDATE_NODE_STATS = 1 << 5,
+
+	/**< Dynamic update of congestion management mode for leaf nodes. */
+	RTE_TM_UPDATE_NODE_CMAN = 1 << 6,
+};
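
For example, an application could probe this mask before attempting run-time
topology changes. A minimal sketch, assuming port_id is a valid configured
port and using the capability API declared later in this file:

	struct rte_tm_capabilities cap;
	struct rte_tm_error err;
	int dyn_add_del = 0;

	if (rte_tm_capabilities_get(port_id, &cap, &err) == 0)
		dyn_add_del = !!(cap.dynamic_update_mask &
				RTE_TM_UPDATE_NODE_ADD_DELETE);
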
+
+/**
+ * Traffic manager node capabilities
+ */
+struct rte_tm_node_capabilities {
+	/**< Private shaper support. */
+	int shaper_private_supported;
+
+	/**< Dual rate shaping support for private shaper. Valid only when
+	 * private shaper is supported.
+	 */
+	int shaper_private_dual_rate_supported;
+
+	/**< Minimum committed/peak rate (bytes per second) for private
+	 * shaper. Valid only when private shaper is supported.
+	 */
+	uint64_t shaper_private_rate_min;
+
+	/**< Maximum committed/peak rate (bytes per second) for private
+	 * shaper. Valid only when private shaper is supported.
+	 */
+	uint64_t shaper_private_rate_max;
+
+	/**< Maximum number of supported shared shapers. The value of zero
+	 * indicates that shared shapers are not supported.
+	 */
+	uint32_t shaper_shared_n_max;
+
+	/**< Mask of supported statistics counter types. */
+	uint64_t stats_mask;
+
+	union {
+		/**< Items valid only for non-leaf nodes. */
+		struct {
+			/**< Maximum number of children nodes. */
+			uint32_t n_children_max;
+
+			/**< Maximum number of supported priority levels. The
+			 * value of zero is invalid. The value of 1 indicates
+			 * that only priority 0 is supported, which essentially
+			 * means that Strict Priority (SP) algorithm is not
+			 * supported.
+			 */
+			uint32_t sp_n_priorities_max;
+
+			/**< Maximum number of sibling nodes that can have the
+			 * same priority at any given time, i.e. maximum size
+			 * of the WFQ/WRR sibling node group. The value of zero
+			 * is invalid. The value of 1 indicates that WFQ/WRR
+			 * algorithms are not supported. The maximum value is
+			 * *n_children_max*.
+			 */
+			uint32_t wfq_wrr_n_children_per_group_max;
+
+			/**< Maximum number of priority levels that can have
+			 * more than one child node at any given time, i.e.
+			 * maximum number of WFQ/WRR sibling node groups that
+			 * have two or more members. The value of zero states
+			 * that WFQ/WRR algorithms are not supported. The value
+			 * of 1 indicates that (*sp_n_priorities_max* - 1)
+			 * priority levels have at most one child node, so
+			 * there can be only one priority level with two or
+			 * more sibling nodes making up a WFQ/WRR group. The
+			 * maximum value is: min(floor(*n_children_max* / 2),
+			 * *sp_n_priorities_max*).
+			 */
+			uint32_t wfq_wrr_n_groups_max;
+
+			/**< WFQ algorithm support. */
+			int wfq_supported;
+
+			/**< WRR algorithm support. */
+			int wrr_supported;
+
+			/**< Maximum WFQ/WRR weight. The value of 1 indicates
+			 * that all sibling nodes with same priority have the
+			 * same WFQ/WRR weight, so WFQ/WRR is reduced to FQ/RR.
+			 */
+			uint32_t wfq_wrr_weight_max;
+		} nonleaf;
+
+		/**< Items valid only for leaf nodes. */
+		struct {
+			/**< Head drop algorithm support. */
+			int cman_head_drop_supported;
+
+			/**< Private WRED context support. */
+			int cman_wred_context_private_supported;
+
+			/**< Maximum number of shared WRED contexts supported.
+			 * The value of zero indicates that shared WRED
+			 * contexts are not supported.
+			 */
+			uint32_t cman_wred_context_shared_n_max;
+		} leaf;
+	};
+};
+
+/**
+ * Traffic manager level capabilities
+ */
+struct rte_tm_level_capabilities {
+	/**< Maximum number of nodes for the current hierarchy level. */
+	uint32_t n_nodes_max;
+
+	/**< Maximum number of non-leaf nodes for the current hierarchy level.
+	 * The value of 0 indicates that current level only supports leaf
+	 * nodes. The maximum value is *n_nodes_max*.
+	 */
+	uint32_t n_nodes_nonleaf_max;
+
+	/**< Maximum number of leaf nodes for the current hierarchy level. The
+	 * value of 0 indicates that current level only supports non-leaf
+	 * nodes. The maximum value is *n_nodes_max*.
+	 */
+	uint32_t n_nodes_leaf_max;
+
+	/**< Summary of node-level capabilities across all the non-leaf nodes
+	 * of the current hierarchy level. Valid only when
+	 * *n_nodes_nonleaf_max* is greater than 0.
+	 */
+	struct rte_tm_node_capabilities nonleaf;
+
+	/**< Summary of node-level capabilities across all the leaf nodes of
+	 * the current hierarchy level. Valid only when *n_nodes_leaf_max* is
+	 * greater than 0.
+	 */
+	struct rte_tm_node_capabilities leaf;
+};
+
+/**
+ * Traffic manager capabilities
+ */
+struct rte_tm_capabilities {
+	/**< Maximum number of nodes. */
+	uint32_t n_nodes_max;
+
+	/**< Maximum number of levels (i.e. number of nodes connecting the root
+	 * node with any leaf node, including the root and the leaf).
+	 */
+	uint32_t n_levels_max;
+
+	/**< Maximum number of shapers, either private or shared. In case the
+	 * implementation does not share any resource between private and
+	 * shared shapers, it is typically equal to the sum between
+	 * *shaper_private_n_max* and *shaper_shared_n_max*.
+	 */
+	uint32_t shaper_n_max;
+
+	/**< Maximum number of private shapers. Indicates the maximum number of
+	 * nodes that can concurrently have the private shaper enabled.
+	 */
+	uint32_t shaper_private_n_max;
+
+	/**< Maximum number of shared shapers. The value of zero indicates that
+	 * shared shapers are not supported.
+	 */
+	uint32_t shaper_shared_n_max;
+
+	/**< Maximum number of nodes that can share the same shared shaper.
+	 * Only valid when shared shapers are supported.
+	 */
+	uint32_t shaper_shared_n_nodes_max;
+
+	/**< Maximum number of shared shapers that can be configured with dual
+	 * rate shaping. The value of zero indicates that dual rate shaping
+	 * support is not available for shared shapers.
+	 */
+	uint32_t shaper_shared_dual_rate_n_max;
+
+	/**< Minimum committed/peak rate (bytes per second) for shared shapers.
+	 * Only valid when shared shapers are supported.
+	 */
+	uint64_t shaper_shared_rate_min;
+
+	/**< Maximum committed/peak rate (bytes per second) for shared shaper.
+	 * Only valid when shared shapers are supported.
+	 */
+	uint64_t shaper_shared_rate_max;
+
+	/**< Minimum value allowed for packet length adjustment for
+	 * private/shared shapers.
+	 */
+	int shaper_pkt_length_adjust_min;
+
+	/**< Maximum value allowed for packet length adjustment for
+	 * private/shared shapers.
+	 */
+	int shaper_pkt_length_adjust_max;
+
+	/**< Maximum number of WRED contexts. */
+	uint32_t cman_wred_context_n_max;
+
+	/**< Maximum number of private WRED contexts. Indicates the maximum
+	 * number of leaf nodes that can concurrently have the private WRED
+	 * context enabled.
+	 */
+	uint32_t cman_wred_context_private_n_max;
+
+	/**< Maximum number of shared WRED contexts. The value of zero
+	 * indicates that shared WRED contexts are not supported.
+	 */
+	uint32_t cman_wred_context_shared_n_max;
+
+	/**< Maximum number of leaf nodes that can share the same WRED context.
+	 * Only valid when shared WRED contexts are supported.
+	 */
+	uint32_t cman_wred_context_shared_n_nodes_max;
+
+	/**< Support for VLAN DEI packet marking (per color). */
+	int mark_vlan_dei_supported[RTE_TM_COLORS];
+
+	/**< Support for IPv4/IPv6 ECN marking of TCP packets (per color). */
+	int mark_ip_ecn_tcp_supported[RTE_TM_COLORS];
+
+	/**< Support for IPv4/IPv6 ECN marking of SCTP packets (per color). */
+	int mark_ip_ecn_sctp_supported[RTE_TM_COLORS];
+
+	/**< Support for IPv4/IPv6 DSCP packet marking (per color). */
+	int mark_ip_dscp_supported[RTE_TM_COLORS];
+
+	/**< Set of supported dynamic update operations
+	 * (see enum rte_tm_dynamic_update_type).
+	 */
+	uint64_t dynamic_update_mask;
+
+	/**< Summary of node-level capabilities across all non-leaf nodes. */
+	struct rte_tm_node_capabilities nonleaf;
+
+	/**< Summary of node-level capabilities across all leaf nodes. */
+	struct rte_tm_node_capabilities leaf;
+};
+
+/**
+ * Congestion management (CMAN) mode
+ *
+ * This is used for controlling the admission of packets into a packet queue or
+ * group of packet queues on congestion. On request of writing a new packet
+ * into the current queue while the queue is full, the *tail drop* algorithm
+ * drops the new packet while leaving the queue unmodified, as opposed to the
+ * *head drop* algorithm, which drops the packet at the head of the queue (the
+ * oldest packet waiting in the queue) and admits the new packet at the tail
+ * of the queue.
+ *
+ * The *Random Early Detection (RED)* algorithm works by proactively dropping
+ * more and more input packets as the queue occupancy builds up. When the queue
+ * is full or almost full, RED effectively works as *tail drop*. The *Weighted
+ * RED* algorithm uses a separate set of RED thresholds for each packet color.
+ */
+enum rte_tm_cman_mode {
+	RTE_TM_CMAN_TAIL_DROP = 0, /**< Tail drop */
+	RTE_TM_CMAN_HEAD_DROP, /**< Head drop */
+	RTE_TM_CMAN_WRED, /**< Weighted Random Early Detection (WRED) */
+};
+
+/**
+ * Random Early Detection (RED) profile
+ */
+struct rte_tm_red_params {
+	/**< Minimum queue threshold */
+	uint16_t min_th;
+
+	/**< Maximum queue threshold */
+	uint16_t max_th;
+
+	/**< Inverse of packet marking probability maximum value (maxp), i.e.
+	 * maxp_inv = 1 / maxp
+	 */
+	uint16_t maxp_inv;
+
+	/**< Negated log2 of queue weight (wq), i.e. wq = 1 / (2 ^ wq_log2) */
+	uint16_t wq_log2;
+};
+
+/**
+ * Weighted RED (WRED) profile
+ *
+ * Multiple WRED contexts can share the same WRED profile. Each leaf node with
+ * WRED enabled as its congestion management mode has zero or one private WRED
+ * context (only one leaf node using it) and/or zero, one or several shared
+ * WRED contexts (multiple leaf nodes use the same WRED context). A private
+ * WRED context is used to perform congestion management for a single leaf
+ * node, while a shared WRED context is used to perform congestion management
+ * for a group of leaf nodes.
+ */
+struct rte_tm_wred_params {
+	/**< One set of RED parameters per packet color */
+	struct rte_tm_red_params red_params[RTE_TM_COLORS];
+};
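
A WRED profile could be filled in as sketched below and then registered with
rte_tm_wred_profile_add() (declared later in this file); the threshold values
are arbitrary illustrations, not recommendations.

	struct rte_tm_wred_params wp;
	enum rte_tm_color c;

	for (c = RTE_TM_GREEN; c < RTE_TM_COLORS; c++) {
		wp.red_params[c].min_th = 32;
		wp.red_params[c].max_th = 128;
		wp.red_params[c].maxp_inv = 10; /* max drop probability 1/10 */
		wp.red_params[c].wq_log2 = 9;   /* queue weight 1/512 */
	}
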
+
+/**
+ * Token bucket
+ */
+struct rte_tm_token_bucket {
+	/**< Token bucket rate (bytes per second) */
+	uint64_t rate;
+
+	/**< Token bucket size (bytes), a.k.a. max burst size */
+	uint64_t size;
+};
+
+/**
+ * Shaper (rate limiter) profile
+ *
+ * Multiple shaper instances can share the same shaper profile. Each node has
+ * zero or one private shaper (only one node using it) and/or zero, one or
+ * several shared shapers (multiple nodes use the same shaper instance).
+ * A private shaper is used to perform traffic shaping for a single node, while
+ * a shared shaper is used to perform traffic shaping for a group of nodes.
+ *
+ * Single rate shapers use a single token bucket. A single rate shaper can be
+ * configured by setting the rate of the committed bucket to zero, which
+ * effectively disables this bucket. The peak bucket is used to limit the rate
+ * and the burst size for the current shaper.
+ *
+ * Dual rate shapers use both the committed and the peak token buckets. The
+ * rate of the peak bucket has to be bigger than zero, as well as greater than
+ * or equal to the rate of the committed bucket.
+ */
+struct rte_tm_shaper_params {
+	/**< Committed token bucket */
+	struct rte_tm_token_bucket committed;
+
+	/**< Peak token bucket */
+	struct rte_tm_token_bucket peak;
+
+	/**< Signed value to be added to the length of each packet for the
+	 * purpose of shaping. Can be used to correct the packet length with
+	 * the framing overhead bytes that are also consumed on the wire (e.g.
+	 * RTE_TM_ETH_FRAMING_OVERHEAD_FCS).
+	 */
+	int32_t pkt_length_adjust;
+};
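
For instance, a single rate shaper limiting a node to roughly 1 Gbps could be
described as sketched below; the committed bucket is disabled via a zero rate
and the values shown are illustrative assumptions only.

	struct rte_tm_shaper_params sp = {
		.committed = { .rate = 0, .size = 0 },
		.peak = { .rate = 1000000000 / 8, /* bytes per second */
			  .size = 16 * 1024 },    /* 16 KB max burst */
		.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
	};
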
+
+/**
+ * Node parameters
+ *
+ * Each hierarchy node has multiple inputs (children nodes of the current
+ * parent node) and a single output (which is input to its parent node). The
+ * current node arbitrates its inputs using Strict Priority (SP), Weighted Fair
+ * Queuing (WFQ) and Weighted Round Robin (WRR) algorithms to schedule input
+ * packets on its output while observing its shaping (rate limiting)
+ * constraints.
+ *
+ * Algorithms such as byte-level WRR, Deficit WRR (DWRR), etc. are considered
+ * approximations of ideal WFQ and are assimilated to WFQ, although an
+ * associated implementation-dependent trade-off on accuracy, performance and
+ * resource usage might exist.
+ *
+ * Children nodes with different priorities are scheduled using the SP
+ * algorithm, based on their priority, with zero (0) as the highest priority.
+ * Children with same priority are scheduled using the WFQ or WRR algorithm,
+ * based on their weight, which is relative to the sum of the weights of all
+ * siblings with same priority, with one (1) as the lowest weight.
+ *
+ * Each leaf node sits on top of a TX queue of the current Ethernet port.
+ * Therefore, the leaf nodes are predefined with the node IDs of 0 .. (N-1),
+ * where N is the number of TX queues configured for the current Ethernet port.
+ * The non-leaf nodes have their IDs generated by the application.
+ */
+struct rte_tm_node_params {
+	/**< Shaper profile for the private shaper. The absence of the private
+	 * shaper for the current node is indicated by setting this parameter
+	 * to RTE_TM_SHAPER_PROFILE_ID_NONE.
+	 */
+	uint32_t shaper_profile_id;
+
+	/**< User allocated array of valid shared shaper IDs. */
+	uint32_t *shared_shaper_id;
+
+	/**< Number of shared shaper IDs in the *shared_shaper_id* array. */
+	uint32_t n_shared_shapers;
+
+	/**< Mask of statistics counter types to be enabled for this node. This
+	 * needs to be a subset of the statistics counter types available for
+	 * the current node. Any statistics counter type not included in this
+	 * set is to be disabled for the current node.
+	 */
+	uint64_t stats_mask;
+
+	union {
+		/**< Parameters only valid for non-leaf nodes. */
+		struct {
+			/**< For each priority, indicates whether the children
+			 * nodes sharing the same priority are to be scheduled
+			 * by WFQ or by WRR. When NULL, it indicates that WFQ
+			 * is to be used for all priorities. When non-NULL, it
+			 * points to a pre-allocated array of *n_priorities*
+			 * elements, with a non-zero value element indicating
+			 * WFQ and a zero value element for WRR.
+			 */
+			int *scheduling_mode_per_priority;
+
+			/**< Number of priorities. */
+			uint32_t n_priorities;
+		} nonleaf;
+
+		/**< Parameters only valid for leaf nodes. */
+		struct {
+			/**< Congestion management mode */
+			enum rte_tm_cman_mode cman;
+
+			/**< WRED parameters (valid when *cman* is WRED). */
+			struct {
+				/**< WRED profile for private WRED context. */
+				uint32_t wred_profile_id;
+
+				/**< User allocated array of shared WRED
+				 * context IDs. The absence of a private WRED
+				 * context for current leaf node is indicated
+				 * by value RTE_TM_WRED_PROFILE_ID_NONE.
+				 */
+				uint32_t *shared_wred_context_id;
+
+				/**< Number of shared WRED context IDs in the
+				 * *shared_wred_context_id* array.
+				 */
+				uint32_t n_shared_wred_contexts;
+			} wred;
+		} leaf;
+	};
+};
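
Based on the conventions above, a two-level hierarchy (one root node feeding
the leaf node that sits on TX queue 0) could be sketched as below. Root node
ID 100 is purely an application convention, fields not shown are left
zero-initialized, and error checking is omitted.

	struct rte_tm_node_params np = {
		.shaper_profile_id = RTE_TM_SHAPER_PROFILE_ID_NONE,
		.stats_mask = RTE_TM_STATS_N_PKTS | RTE_TM_STATS_N_BYTES,
	};
	struct rte_tm_error err;

	/* Root node: its parent is RTE_TM_NODE_ID_NULL. */
	rte_tm_node_add(port_id, 100, RTE_TM_NODE_ID_NULL, 0, 1, &np, &err);
	/* Leaf node 0 (TX queue 0): priority 0, weight 1. */
	rte_tm_node_add(port_id, 0, 100, 0, 1, &np, &err);
	/* Commit, clearing the hierarchy on failure. */
	rte_tm_hierarchy_set(port_id, 1, &err);
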
+
+/**
+ * Verbose error types.
+ *
+ * Most of them provide the type of the object referenced by struct
+ * rte_tm_error::cause.
+ */
+enum rte_tm_error_type {
+	RTE_TM_ERROR_TYPE_NONE, /**< No error. */
+	RTE_TM_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+	RTE_TM_ERROR_TYPE_CAPABILITIES,
+	RTE_TM_ERROR_TYPE_LEVEL_ID,
+	RTE_TM_ERROR_TYPE_WRED_PROFILE,
+	RTE_TM_ERROR_TYPE_WRED_PROFILE_GREEN,
+	RTE_TM_ERROR_TYPE_WRED_PROFILE_YELLOW,
+	RTE_TM_ERROR_TYPE_WRED_PROFILE_RED,
+	RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+	RTE_TM_ERROR_TYPE_SHARED_WRED_CONTEXT_ID,
+	RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+	RTE_TM_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_RATE,
+	RTE_TM_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_SIZE,
+	RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_RATE,
+	RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_SIZE,
+	RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PKT_ADJUST_LEN,
+	RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+	RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+	RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+	RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+	RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+	RTE_TM_ERROR_TYPE_NODE_PARAMS,
+	RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+	RTE_TM_ERROR_TYPE_NODE_PARAMS_SHARED_SHAPER_ID,
+	RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+	RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+	RTE_TM_ERROR_TYPE_NODE_PARAMS_SCHEDULING_MODE,
+	RTE_TM_ERROR_TYPE_NODE_PARAMS_N_PRIORITIES,
+	RTE_TM_ERROR_TYPE_NODE_PARAMS_CMAN,
+	RTE_TM_ERROR_TYPE_NODE_PARAMS_WRED_PROFILE_ID,
+	RTE_TM_ERROR_TYPE_NODE_PARAMS_SHARED_WRED_CONTEXT_ID,
+	RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_WRED_CONTEXTS,
+	RTE_TM_ERROR_TYPE_NODE_ID,
+};
+
+/**
+ * Verbose error structure definition.
+ *
+ * This object is normally allocated by applications and set by PMDs, the
+ * message points to a constant string which does not need to be freed by
+ * the application, however its pointer can be considered valid only as long
+ * as its associated DPDK port remains configured. Closing the underlying
+ * device or unloading the PMD invalidates it.
+ *
+ * Both cause and message may be NULL regardless of the error type.
+ */
+struct rte_tm_error {
+	enum rte_tm_error_type type; /**< Cause field and error type. */
+	const void *cause; /**< Object responsible for the error. */
+	const char *message; /**< Human-readable error message. */
+};
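
A typical caller pattern might be sketched as follows (the node_delete call
is an arbitrary example):

	struct rte_tm_error err = { 0 };

	if (rte_tm_node_delete(port_id, node_id, &err) != 0)
		printf("TM error %d (%s)\n", (int)err.type,
			err.message ? err.message : "no details");
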
+
+/**
+ * Traffic manager get number of leaf nodes
+ *
+ * Each leaf node sits on top of a TX queue of the current Ethernet port.
+ * Therefore, the set of leaf nodes is predefined, their number is always equal
+ * to N (where N is the number of TX queues configured for the current port)
+ * and their IDs are 0 .. (N-1).
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param n_leaf_nodes
+ *   Number of leaf nodes for the current port.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_get_leaf_nodes(uint8_t port_id,
+	uint32_t *n_leaf_nodes,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node type (i.e. leaf or non-leaf) get
+ *
+ * The leaf nodes have predefined IDs in the range of 0 .. (N-1), where N is
+ * the number of TX queues of the current Ethernet port. The non-leaf nodes
+ * have their IDs generated by the application outside of the above range,
+ * which is reserved for leaf nodes.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID value. Needs to be valid.
+ * @param is_leaf
+ *   Set to non-zero value when node is leaf and to zero otherwise (non-leaf).
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_type_get(uint8_t port_id,
+	uint32_t node_id,
+	int *is_leaf,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager capabilities get
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param cap
+ *   Traffic manager capabilities. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_capabilities_get(uint8_t port_id,
+	struct rte_tm_capabilities *cap,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager level capabilities get
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param level_id
+ *   The hierarchy level identifier. The value of 0 identifies the level of the
+ *   root node.
+ * @param cap
+ *   Traffic manager level capabilities. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_level_capabilities_get(uint8_t port_id,
+	uint32_t level_id,
+	struct rte_tm_level_capabilities *cap,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node capabilities get
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param cap
+ *   Traffic manager node capabilities. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_capabilities_get(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_tm_node_capabilities *cap,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager WRED profile add
+ *
+ * Create a new WRED profile with ID set to *wred_profile_id*. The new profile
+ * is used to create one or several WRED contexts.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param wred_profile_id
+ *   WRED profile ID for the new profile. Needs to be unused.
+ * @param profile
+ *   WRED profile parameters. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_wred_profile_add(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_tm_wred_params *profile,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager WRED profile delete
+ *
+ * Delete an existing WRED profile. This operation fails when there is
+ * currently at least one user (i.e. WRED context) of this WRED profile.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param wred_profile_id
+ *   WRED profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_wred_profile_delete(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager shared WRED context add or update
+ *
+ * When *shared_wred_context_id* is invalid, a new WRED context with this ID is
+ * created by using the WRED profile identified by *wred_profile_id*.
+ *
+ * When *shared_wred_context_id* is valid, this WRED context is no longer using
+ * the profile previously assigned to it and is updated to use the profile
+ * identified by *wred_profile_id*.
+ *
+ * A valid shared WRED context can be assigned to several hierarchy leaf nodes
+ * configured to use WRED as the congestion management mode.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_wred_context_id
+ *   Shared WRED context ID
+ * @param wred_profile_id
+ *   WRED profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_shared_wred_context_add_update(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	uint32_t wred_profile_id,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager shared WRED context delete
+ *
+ * Delete an existing shared WRED context. This operation fails when there is
+ * currently at least one user (i.e. hierarchy leaf node) of this shared WRED
+ * context.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_wred_context_id
+ *   Shared WRED context ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_shared_wred_context_delete(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager shaper profile add
+ *
+ * Create a new shaper profile with ID set to *shaper_profile_id*. The new
+ * shaper profile is used to create one or several shapers.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shaper_profile_id
+ *   Shaper profile ID for the new profile. Needs to be unused.
+ * @param profile
+ *   Shaper profile parameters. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_shaper_profile_add(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_tm_shaper_params *profile,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager shaper profile delete
+ *
+ * Delete an existing shaper profile. This operation fails when there is
+ * currently at least one user (i.e. shaper) of this shaper profile.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shaper_profile_id
+ *   Shaper profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_shaper_profile_delete(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager shared shaper add or update
+ *
+ * When *shared_shaper_id* is not a valid shared shaper ID, a new shared shaper
+ * with this ID is created using the shaper profile identified by
+ * *shaper_profile_id*.
+ *
+ * When *shared_shaper_id* is a valid shared shaper ID, this shared shaper is
+ * no longer using the shaper profile previously assigned to it and is updated
+ * to use the shaper profile identified by *shaper_profile_id*.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_shaper_id
+ *   Shared shaper ID
+ * @param shaper_profile_id
+ *   Shaper profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_shared_shaper_add_update(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	uint32_t shaper_profile_id,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager shared shaper delete
+ *
+ * Delete an existing shared shaper. This operation fails when there is
+ * currently at least one user (i.e. hierarchy node) of this shared shaper.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_shaper_id
+ *   Shared shaper ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_shared_shaper_delete(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node add
+ *
+ * Create new node and connect it as child of an existing node. The new node is
+ * further identified by *node_id*, which needs to be unused by any of the
+ * existing nodes. The parent node is identified by *parent_node_id*, which
+ * needs to be the valid ID of an existing non-leaf node. The parent node is
+ * going to use the provided SP *priority* and WFQ/WRR *weight* to schedule its
+ * new child node.
+ *
+ * This function has to be called for both leaf and non-leaf nodes. In the case
+ * of leaf nodes (i.e. *node_id* is within the range of 0 .. (N-1), with N as
+ * the number of configured TX queues of the current port), the leaf node is
+ * configured rather than created (as the set of leaf nodes is predefined) and
+ * it is also connected as child of an existing node.
+ *
+ * The first node that is added becomes the root node and all the nodes that
+ * are subsequently added have to be added as descendants of the root node. The
+ * parent of the root node has to be specified as RTE_TM_NODE_ID_NULL and there
+ * can only be one node with this parent ID (i.e. the root node). Further
+ * restrictions for root node: needs to be non-leaf, its private shaper profile
+ * needs to be valid and single rate, cannot use any shared shapers.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be unused by any of the existing nodes.
+ * @param parent_node_id
+ *   Parent node ID. Needs to be valid.
+ * @param priority
+ *   Node priority. The highest node priority is zero. Used by the SP algorithm
+ *   running on the parent of the current node for scheduling this child node.
+ * @param weight
+ *   Node weight. The node weight is relative to the weight sum of all siblings
+ *   that have the same priority. The lowest weight is one. Used by the WFQ/WRR
+ *   algorithm running on the parent of the current node for scheduling this
+ *   child node.
+ * @param params
+ *   Node parameters. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_add(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_tm_node_params *params,
+	struct rte_tm_error *error);
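
A minimal usage sketch (node IDs, priorities and weights are
illustrative; the rte_tm_node_params fields are defined earlier in this
header and are simply zeroed here):

    struct rte_tm_error err;
    struct rte_tm_node_params np;

    memset(&np, 0, sizeof(np));
    /* First node added becomes the root: parent is RTE_TM_NODE_ID_NULL
     * and its ID must lie outside the 0 .. N-1 leaf range. */
    rte_tm_node_add(port_id, 1000, RTE_TM_NODE_ID_NULL, 0, 1, &np, &err);
    /* Leaf node 0 (TX queue 0), attached under the root. */
    rte_tm_node_add(port_id, 0, 1000, 0, 1, &np, &err);

Note that a real root node would also need a valid single-rate private
shaper profile set in its params, per the restrictions above.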
+
+/**
+ * Traffic manager node delete
+ *
+ * Delete an existing node. This operation fails when this node currently has
+ * at least one user (i.e. child node).
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_delete(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node suspend
+ *
+ * Suspend an existing node.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_suspend(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node resume
+ *
+ * Resume an existing node that was previously suspended.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_resume(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager hierarchy set
+ *
+ * This function is called during the port initialization phase (before the
+ * Ethernet port is started) to freeze the start-up hierarchy.
+ *
+ * This function fails when the currently configured hierarchy is not supported
+ * by the Ethernet port, in which case the user can abort or try out another
+ * hierarchy configuration (e.g. a hierarchy with fewer leaf nodes), which can
+ * be built from scratch (when *clear_on_fail* is enabled) or by modifying the
+ * existing hierarchy configuration (when *clear_on_fail* is disabled).
+ *
+ * Note that, even when the configured hierarchy is supported (so this function
+ * is successful), the Ethernet port start might still fail due to e.g. not
+ * enough memory being available in the system, etc.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param clear_on_fail
+ *   On function call failure, hierarchy is cleared when this parameter is
+ *   non-zero and preserved when this parameter is equal to zero.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_hierarchy_set(uint8_t port_id,
+	int clear_on_fail,
+	struct rte_tm_error *error);
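
A short sketch of the intended call order at port initialization, with
*clear_on_fail* enabled so that a rejected hierarchy is wiped before a
simpler one is attempted:

    struct rte_tm_error err;

    if (rte_tm_hierarchy_set(port_id, 1 /* clear_on_fail */, &err) != 0) {
        /* Hierarchy rejected and cleared: rebuild it with fewer
         * leaf nodes and call rte_tm_hierarchy_set() again. */
    }
    rte_eth_dev_start(port_id);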
+
+/**
+ * Traffic manager node parent update
+ *
+ * Restriction for root node: its parent cannot be changed.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param parent_node_id
+ *   Node ID for the new parent. Needs to be valid.
+ * @param priority
+ *   Node priority. The highest node priority is zero. Used by the SP algorithm
+ *   running on the parent of the current node for scheduling this child node.
+ * @param weight
+ *   Node weight. The node weight is relative to the weight sum of all siblings
+ *   that have the same priority. The lowest weight is one. Used by the
+ *   WFQ/WRR algorithm running on the parent of the current node for scheduling
+ *   this child node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_parent_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node private shaper update
+ *
+ * Restriction for root node: its private shaper profile needs to be valid and
+ * single rate.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param shaper_profile_id
+ *   Shaper profile ID for the private shaper of the current node. Needs to be
+ *   either a valid shaper profile ID or RTE_TM_SHAPER_PROFILE_ID_NONE, with
+ *   the latter disabling the private shaper of the current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shaper_profile_id,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node shared shapers update
+ *
+ * Restriction for root node: cannot use any shared shapers.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param shared_shaper_id
+ *   Shared shaper ID. Needs to be valid.
+ * @param add
+ *   Set to non-zero value to add this shared shaper to current node or to zero
+ *   to delete this shared shaper from current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_shared_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_shaper_id,
+	int add,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node enabled statistics counters update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param stats_mask
+ *   Mask of statistics counter types to be enabled for the current node. This
+ *   needs to be a subset of the statistics counter types available for the
+ *   current node. Any statistics counter type not included in this set is to
+ *   be disabled for the current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_stats_update(uint8_t port_id,
+	uint32_t node_id,
+	uint64_t stats_mask,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node scheduling mode update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid leaf node ID.
+ * @param scheduling_mode_per_priority
+ *   For each priority, indicates whether the child nodes sharing the same
+ *   priority are to be scheduled by WFQ or by WRR. When NULL, it indicates
+ *   that WFQ is to be used for all priorities. When non-NULL, it points to a
+ *   pre-allocated array of *n_priorities* elements, with a non-zero value
+ *   element indicating WFQ and a zero value element for WRR.
+ * @param n_priorities
+ *   Number of priorities.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_scheduling_mode_update(uint8_t port_id,
+	uint32_t node_id,
+	int *scheduling_mode_per_priority,
+	uint32_t n_priorities,
+	struct rte_tm_error *error);
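
For illustration, assuming the children of *node_id* use two priority
levels, the following selects WFQ for priority 0 and WRR for priority 1:

    struct rte_tm_error err;
    int mode[2] = { 1 /* priority 0: WFQ */, 0 /* priority 1: WRR */ };

    rte_tm_node_scheduling_mode_update(port_id, node_id, mode, 2, &err);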
+
+/**
+ * Traffic manager node congestion management mode update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid leaf node ID.
+ * @param cman
+ *   Congestion management mode.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_cman_update(uint8_t port_id,
+	uint32_t node_id,
+	enum rte_tm_cman_mode cman,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node private WRED context update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid leaf node ID.
+ * @param wred_profile_id
+ *   WRED profile ID for the private WRED context of the current node. Needs to
+ *   be either a valid WRED profile ID or RTE_TM_WRED_PROFILE_ID_NONE, with
+ *   the latter disabling the private WRED context of the current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t wred_profile_id,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node shared WRED context update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid leaf node ID.
+ * @param shared_wred_context_id
+ *   Shared WRED context ID. Needs to be valid.
+ * @param add
+ *   Set to non-zero value to add this shared WRED context to current node or
+ *   to zero to delete this shared WRED context from current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_shared_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_wred_context_id,
+	int add,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager node statistics counters read
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param stats
+ *   When non-NULL, it contains the current value for the statistics counters
+ *   enabled for the current node.
+ * @param stats_mask
+ *   When non-NULL, it contains the mask of statistics counter types that are
+ *   currently enabled for this node, indicating which of the counters
+ *   retrieved with the *stats* structure are valid.
+ * @param clear
+ *   When this parameter has a non-zero value, the statistics counters are
+ *   cleared (i.e. set to zero) immediately after they have been read,
+ *   otherwise the statistics counters are left untouched.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_node_stats_read(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_tm_node_stats *stats,
+	uint64_t *stats_mask,
+	int clear,
+	struct rte_tm_error *error);
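
A sketch of a read-and-clear polling step for one node (the
rte_tm_node_stats layout is defined earlier in this header):

    struct rte_tm_error err;
    struct rte_tm_node_stats stats;
    uint64_t mask;

    if (rte_tm_node_stats_read(port_id, node_id, &stats, &mask,
                               1 /* clear */, &err) == 0) {
        /* Only the counters whose type bits are set in mask carry
         * valid values in stats. */
    }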
+
+/**
+ * Traffic manager packet marking - VLAN DEI (IEEE 802.1Q)
+ *
+ * IEEE 802.1p maps the traffic class to the VLAN Priority Code Point (PCP)
+ * field (3 bits), while IEEE 802.1q maps the drop priority to the VLAN Drop
+ * Eligible Indicator (DEI) field (1 bit), which was previously named Canonical
+ * Format Indicator (CFI).
+ *
+ * All VLAN frames of a given color get their DEI bit set if marking is enabled
+ * for this color; otherwise, their DEI bit is left as is (either set or not).
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mark_green
+ *   Set to non-zero value to enable marking of green packets and to zero to
+ *   disable it.
+ * @param mark_yellow
+ *   Set to non-zero value to enable marking of yellow packets and to zero to
+ *   disable it.
+ * @param mark_red
+ *   Set to non-zero value to enable marking of red packets and to zero to
+ *   disable it.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_mark_vlan_dei(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager packet marking - IPv4 / IPv6 ECN (IETF RFC 3168)
+ *
+ * IETF RFCs 2474 and 3168 reorganize the IPv4 Type of Service (TOS) field
+ * (8 bits) and the IPv6 Traffic Class (TC) field (8 bits) into Differentiated
+ * Services Codepoint (DSCP) field (6 bits) and Explicit Congestion
+ * Notification (ECN) field (2 bits). The DSCP field is typically used to
+ * encode the traffic class and/or drop priority (RFC 2597), while the ECN
+ * field is used by RFC 3168 to implement a congestion notification mechanism
+ * to be leveraged by transport layer protocols such as TCP and SCTP that have
+ * congestion control mechanisms.
+ *
+ * When congestion is experienced, as alternative to dropping the packet,
+ * routers can change the ECN field of input packets from 2'b01 or 2'b10
+ * (values indicating that source endpoint is ECN-capable) to 2'b11 (meaning
+ * that congestion is experienced). The destination endpoint can use the
+ * ECN-Echo (ECE) TCP flag to relay the congestion indication back to the
+ * source endpoint, which acknowledges it back to the destination endpoint with
+ * the Congestion Window Reduced (CWR) TCP flag.
+ *
+ * All IPv4/IPv6 packets of a given color with ECN set to 2'b01 or 2'b10
+ * carrying TCP or SCTP have their ECN set to 2'b11 if the marking feature is
+ * enabled for the current color, otherwise the ECN field is left as is.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mark_green
+ *   Set to non-zero value to enable marking of green packets and to zero to
+ *   disable it.
+ * @param mark_yellow
+ *   Set to non-zero value to enable marking of yellow packets and to zero to
+ *   disable it.
+ * @param mark_red
+ *   Set to non-zero value to enable marking of red packets and to zero to
+ *   disable it.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_mark_ip_ecn(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_tm_error *error);
+
+/**
+ * Traffic manager packet marking - IPv4 / IPv6 DSCP (IETF RFC 2597)
+ *
+ * IETF RFC 2597 maps the traffic class and the drop priority to the IPv4/IPv6
+ * Differentiated Services Codepoint (DSCP) field (6 bits). Here are the DSCP
+ * values proposed by this RFC:
+ *
+ *                       Class 1    Class 2    Class 3    Class 4
+ *                     +----------+----------+----------+----------+
+ *    Low Drop Prec    |  001010  |  010010  |  011010  |  100010  |
+ *    Medium Drop Prec |  001100  |  010100  |  011100  |  100100  |
+ *    High Drop Prec   |  001110  |  010110  |  011110  |  100110  |
+ *                     +----------+----------+----------+----------+
+ *
+ * There are 4 traffic classes (classes 1 .. 4) encoded by DSCP bits 1 and 2,
+ * as well as 3 drop priorities (low/medium/high) encoded by DSCP bits 3 and 4.
+ *
+ * All IPv4/IPv6 packets have their color marked into DSCP bits 3 and 4 as
+ * follows: green mapped to Low Drop Precedence (2'b01), yellow to Medium
+ * (2'b10) and red to High (2'b11). Marking needs to be explicitly enabled
+ * for each color; when not enabled for a given color, the DSCP field of all
+ * packets with that color is left as is.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mark_green
+ *   Set to non-zero value to enable marking of green packets and to zero to
+ *   disable it.
+ * @param mark_yellow
+ *   Set to non-zero value to enable marking of yellow packets and to zero to
+ *   disable it.
+ * @param mark_red
+ *   Set to non-zero value to enable marking of red packets and to zero to
+ *   disable it.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_tm_mark_ip_dscp(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_tm_error *error);
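
As an illustration, enabling both DSCP and VLAN DEI marking for yellow
and red packets only, leaving green packets untouched:

    struct rte_tm_error err;

    rte_tm_mark_ip_dscp(port_id, 0, 1, 1, &err);
    rte_tm_mark_vlan_dei(port_id, 0, 1, 1, &err);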
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_TM_H__ */
diff --git a/lib/librte_ether/rte_tm_driver.h b/lib/librte_ether/rte_tm_driver.h
new file mode 100644
index 0000000..b3c9c15
--- /dev/null
+++ b/lib/librte_ether/rte_tm_driver.h
@@ -0,0 +1,365 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_TM_DRIVER_H__
+#define __INCLUDE_RTE_TM_DRIVER_H__
+
+/**
+ * @file
+ * RTE Generic Traffic Manager API (Driver Side)
+ *
+ * This file provides implementation helpers for internal use by PMDs; they
+ * are not intended to be exposed to applications and are not subject to ABI
+ * versioning.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include "rte_ethdev.h"
+#include "rte_tm.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+typedef int (*rte_tm_node_type_get_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	int *is_leaf,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node type get */
+
+typedef int (*rte_tm_capabilities_get_t)(struct rte_eth_dev *dev,
+	struct rte_tm_capabilities *cap,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager capabilities get */
+
+typedef int (*rte_tm_level_capabilities_get_t)(struct rte_eth_dev *dev,
+	uint32_t level_id,
+	struct rte_tm_level_capabilities *cap,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager level capabilities get */
+
+typedef int (*rte_tm_node_capabilities_get_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_tm_node_capabilities *cap,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node capabilities get */
+
+typedef int (*rte_tm_wred_profile_add_t)(struct rte_eth_dev *dev,
+	uint32_t wred_profile_id,
+	struct rte_tm_wred_params *profile,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager WRED profile add */
+
+typedef int (*rte_tm_wred_profile_delete_t)(struct rte_eth_dev *dev,
+	uint32_t wred_profile_id,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager WRED profile delete */
+
+typedef int (*rte_tm_shared_wred_context_add_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t shared_wred_context_id,
+	uint32_t wred_profile_id,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager shared WRED context add/update */
+
+typedef int (*rte_tm_shared_wred_context_delete_t)(
+	struct rte_eth_dev *dev,
+	uint32_t shared_wred_context_id,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager shared WRED context delete */
+
+typedef int (*rte_tm_shaper_profile_add_t)(struct rte_eth_dev *dev,
+	uint32_t shaper_profile_id,
+	struct rte_tm_shaper_params *profile,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager shaper profile add */
+
+typedef int (*rte_tm_shaper_profile_delete_t)(struct rte_eth_dev *dev,
+	uint32_t shaper_profile_id,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager shaper profile delete */
+
+typedef int (*rte_tm_shared_shaper_add_update_t)(struct rte_eth_dev *dev,
+	uint32_t shared_shaper_id,
+	uint32_t shaper_profile_id,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager shared shaper add/update */
+
+typedef int (*rte_tm_shared_shaper_delete_t)(struct rte_eth_dev *dev,
+	uint32_t shared_shaper_id,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager shared shaper delete */
+
+typedef int (*rte_tm_node_add_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_tm_node_params *params,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node add */
+
+typedef int (*rte_tm_node_delete_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node delete */
+
+typedef int (*rte_tm_node_suspend_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node suspend */
+
+typedef int (*rte_tm_node_resume_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node resume */
+
+typedef int (*rte_tm_hierarchy_set_t)(struct rte_eth_dev *dev,
+	int clear_on_fail,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager hierarchy set */
+
+typedef int (*rte_tm_node_parent_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node parent update */
+
+typedef int (*rte_tm_node_shaper_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t shaper_profile_id,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node shaper update */
+
+typedef int (*rte_tm_node_shared_shaper_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t shared_shaper_id,
+	int32_t add,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node shared shaper update */
+
+typedef int (*rte_tm_node_stats_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint64_t stats_mask,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node stats update */
+
+typedef int (*rte_tm_node_scheduling_mode_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t node_id,
+	int *scheduling_mode_per_priority,
+	uint32_t n_priorities,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node scheduling mode update */
+
+typedef int (*rte_tm_node_cman_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	enum rte_tm_cman_mode cman,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node congestion management mode update */
+
+typedef int (*rte_tm_node_wred_context_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t wred_profile_id,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node WRED context update */
+
+typedef int (*rte_tm_node_shared_wred_context_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t shared_wred_context_id,
+	int add,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager node shared WRED context update */
+
+typedef int (*rte_tm_node_stats_read_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_tm_node_stats *stats,
+	uint64_t *stats_mask,
+	int clear,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager read stats counters for specific node */
+
+typedef int (*rte_tm_mark_vlan_dei_t)(struct rte_eth_dev *dev,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager packet marking - VLAN DEI */
+
+typedef int (*rte_tm_mark_ip_ecn_t)(struct rte_eth_dev *dev,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager packet marking - IPv4/IPv6 ECN */
+
+typedef int (*rte_tm_mark_ip_dscp_t)(struct rte_eth_dev *dev,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_tm_error *error);
+/**< @internal Traffic manager packet marking - IPv4/IPv6 DSCP */
+
+struct rte_tm_ops {
+	/** Traffic manager node type get */
+	rte_tm_node_type_get_t node_type_get;
+
+	/** Traffic manager capabilities get */
+	rte_tm_capabilities_get_t capabilities_get;
+	/** Traffic manager level capabilities get */
+	rte_tm_level_capabilities_get_t level_capabilities_get;
+	/** Traffic manager node capabilities get */
+	rte_tm_node_capabilities_get_t node_capabilities_get;
+
+	/** Traffic manager WRED profile add */
+	rte_tm_wred_profile_add_t wred_profile_add;
+	/** Traffic manager WRED profile delete */
+	rte_tm_wred_profile_delete_t wred_profile_delete;
+	/** Traffic manager shared WRED context add/update */
+	rte_tm_shared_wred_context_add_update_t
+		shared_wred_context_add_update;
+	/** Traffic manager shared WRED context delete */
+	rte_tm_shared_wred_context_delete_t
+		shared_wred_context_delete;
+
+	/** Traffic manager shaper profile add */
+	rte_tm_shaper_profile_add_t shaper_profile_add;
+	/** Traffic manager shaper profile delete */
+	rte_tm_shaper_profile_delete_t shaper_profile_delete;
+	/** Traffic manager shared shaper add/update */
+	rte_tm_shared_shaper_add_update_t shared_shaper_add_update;
+	/** Traffic manager shared shaper delete */
+	rte_tm_shared_shaper_delete_t shared_shaper_delete;
+
+	/** Traffic manager node add */
+	rte_tm_node_add_t node_add;
+	/** Traffic manager node delete */
+	rte_tm_node_delete_t node_delete;
+	/** Traffic manager node suspend */
+	rte_tm_node_suspend_t node_suspend;
+	/** Traffic manager node resume */
+	rte_tm_node_resume_t node_resume;
+	/** Traffic manager hierarchy set */
+	rte_tm_hierarchy_set_t hierarchy_set;
+
+	/** Traffic manager node parent update */
+	rte_tm_node_parent_update_t node_parent_update;
+	/** Traffic manager node shaper update */
+	rte_tm_node_shaper_update_t node_shaper_update;
+	/** Traffic manager node shared shaper update */
+	rte_tm_node_shared_shaper_update_t node_shared_shaper_update;
+	/** Traffic manager node stats update */
+	rte_tm_node_stats_update_t node_stats_update;
+	/** Traffic manager node scheduling mode update */
+	rte_tm_node_scheduling_mode_update_t node_scheduling_mode_update;
+	/** Traffic manager node congestion management mode update */
+	rte_tm_node_cman_update_t node_cman_update;
+	/** Traffic manager node WRED context update */
+	rte_tm_node_wred_context_update_t node_wred_context_update;
+	/** Traffic manager node shared WRED context update */
+	rte_tm_node_shared_wred_context_update_t
+		node_shared_wred_context_update;
+	/** Traffic manager read statistics counters for current node */
+	rte_tm_node_stats_read_t node_stats_read;
+
+	/** Traffic manager packet marking - VLAN DEI */
+	rte_tm_mark_vlan_dei_t mark_vlan_dei;
+	/** Traffic manager packet marking - IPv4/IPv6 ECN */
+	rte_tm_mark_ip_ecn_t mark_ip_ecn;
+	/** Traffic manager packet marking - IPv4/IPv6 DSCP */
+	rte_tm_mark_ip_dscp_t mark_ip_dscp;
+};
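
A PMD would typically expose a static instance of this structure,
leaving unimplemented callbacks NULL; a sketch with hypothetical driver
functions:

    static const struct rte_tm_ops my_pmd_tm_ops = {
        .node_type_get = my_pmd_node_type_get,
        .capabilities_get = my_pmd_capabilities_get,
        .node_add = my_pmd_node_add,
        .node_delete = my_pmd_node_delete,
        .hierarchy_set = my_pmd_hierarchy_set,
        /* all other callbacks default to NULL (not supported) */
    };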
+
+/**
+ * Initialize generic error structure.
+ *
+ * This function also sets rte_errno to a given value.
+ *
+ * @param error
+ *   Pointer to error structure (may be NULL).
+ * @param code
+ *   Related error code (rte_errno).
+ * @param type
+ *   Cause field and error type.
+ * @param cause
+ *   Object responsible for the error.
+ * @param message
+ *   Human-readable error message.
+ *
+ * @return
+ *   Error code.
+ */
+static inline int
+rte_tm_error_set(struct rte_tm_error *error,
+		   int code,
+		   enum rte_tm_error_type type,
+		   const void *cause,
+		   const char *message)
+{
+	if (error) {
+		*error = (struct rte_tm_error){
+			.type = type,
+			.cause = cause,
+			.message = message,
+		};
+	}
+	rte_errno = code;
+	return code;
+}
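
Inside a driver callback, an unsupported operation could then be
reported as follows (a sketch: the function name is hypothetical and
RTE_TM_ERROR_TYPE_UNSPECIFIED is assumed from the error-type enum in
rte_tm.h):

    static int
    my_pmd_node_suspend(struct rte_eth_dev *dev __rte_unused,
                        uint32_t node_id __rte_unused,
                        struct rte_tm_error *error)
    {
        return rte_tm_error_set(error, ENOTSUP,
                                RTE_TM_ERROR_TYPE_UNSPECIFIED,
                                NULL, "node suspend not supported");
    }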
+
+/**
+ * Get generic traffic manager operations structure from a port
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param error
+ *   Error details
+ *
+ * @return
+ *   The traffic manager operations structure associated with port_id on
+ *   success, NULL otherwise.
+ */
+const struct rte_tm_ops *
+rte_tm_ops_get(uint8_t port_id, struct rte_tm_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_TM_DRIVER_H__ */
-- 
2.5.0

^ permalink raw reply	[relevance 1%]

* Re: [dpdk-dev] [PATCH 1/5] cfgfile: configurable comment character
  2017-03-03 12:17  0%             ` Legacy, Allain
@ 2017-03-03 13:10  0%               ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-03 13:10 UTC (permalink / raw)
  To: Legacy, Allain
  Cc: Dumitrescu, Cristian, Yuanhan Liu, dev, Jolliffe, Ian (Wind River)

On Fri, Mar 03, 2017 at 12:17:47PM +0000, Legacy, Allain wrote:
> > -----Original Message-----
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
>  > Also, for a single parameter like a comment char, I don't think we need to go
> > creating a separate structure. The current flags parameter is unused, so just
> > replace it with the comment char one. With using the structure, any additions
> In my earlier patch, I proposed using a "global" flag to indicate that an unnamed section exists, so the flags argument would still be needed.

Ok, good point, I missed that.

> 
> > to the struct would be an ABI change anyway, so I see little point in using it,
> > unless we already know of additional parameters we will be adding in future.
> We already have 2 parameters in mind - flags and comment char.  I don't feel that combining the two in a single enum is particularly good since it would be better to allow the application the freedom to set an arbitrary comment character and not be locked into any static list that we choose (see my previous email response).
>
I also agree on not using enums and not limiting comment chars.

I don't particularly like config structs, and would prefer individual
flags and comment char parameters - given it's not a huge list of
params, just 2 - but no big deal either way.

/Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 1/5] cfgfile: configurable comment character
  2017-03-03 12:10  4%           ` Bruce Richardson
@ 2017-03-03 12:17  0%             ` Legacy, Allain
  2017-03-03 13:10  0%               ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Legacy, Allain @ 2017-03-03 12:17 UTC (permalink / raw)
  To: RICHARDSON, BRUCE
  Cc: DUMITRESCU, CRISTIAN FLORIN, Yuanhan Liu, dev, Jolliffe, Ian

> -----Original Message-----
> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
 > Also, for a single parameter like a comment char, I don't think we need to go
> creating a separate structure. The current flags parameter is unused, so just
> replace it with the comment char one. With using the structure, any additions
In my earlier patch, I proposed using a "global" flag to indicate that an unnamed section exists, so the flags argument would still be needed.

> to the struct would be an ABI change anyway, so I see little point in using it,
> unless we already know of additional parameters we will be adding in future.
We already have 2 parameters in mind - flags and comment char.  I don't feel that combining the two in a single enum is particularly good since it would be better to allow the application the freedom to set an arbitrary comment character and not be locked into any static list that we choose (see my previous email response).

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 1/5] cfgfile: configurable comment character
  @ 2017-03-03 12:10  4%           ` Bruce Richardson
  2017-03-03 12:17  0%             ` Legacy, Allain
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2017-03-03 12:10 UTC (permalink / raw)
  To: Legacy, Allain
  Cc: Dumitrescu, Cristian, Yuanhan Liu, dev, Jolliffe, Ian (Wind River)

On Fri, Mar 03, 2017 at 11:31:11AM +0000, Legacy, Allain wrote:
> > -----Original Message-----
> > From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> > Possible options that I see:
> > 1. Add a new parameters argument to the load functions (e.g. struct
> > cfgfile_params *p), whit the comment char as one (and currently only) field
> > of this struct. Drawbacks: API change that might have to be announced one
> > release before the actual API change.
> 
> I would prefer this option as it provides more flexibility.  We can leave the existing API as is and add a wrapper that accepts additional parameters.   Something like the following (with implementations in the .c obviously, rather than inline in the header like I have it here).  There are several examples of this pattern already in the dpdk (e.g., ring APIs, mempool APIs) where we use a common function invoked by higher-level functions that pass in additional parameters to customize behavior.
> 
> struct rte_cfgfile *_rte_cfgfile_load(const char *filename,
>                                           const struct rte_cfgfile_params *params);
> 
> struct rte_cfgfile *rte_cfgfile_load(const char *filename, int flags)
> {
>         struct rte_cfgfile_params params;
> 
>         rte_cfgfile_set_default_params(&params);
>         params.flags |= flags;
>         return _rte_cfgfile_load(filename, &params);
> }
> 
> struct rte_cfgfile *rte_cfgfile_load_with_params(const char *filename,
>                                                     const struct rte_cfgfile_params *params)
> {
>         return _rte_cfgfile_load(filename, params);
> }
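
For illustration, a call site under that proposal, with a hypothetical
comment_character field, might look like:

    struct rte_cfgfile *cfg;
    struct rte_cfgfile_params params;

    rte_cfgfile_set_default_params(&params);
    params.comment_character = ';';
    cfg = rte_cfgfile_load_with_params("/etc/app.cfg", &params);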

No need for a new API. Just add the extra parameter to the existing load
parameter and use function versioning for ABI compatibility. Since it's
only one function, I don't think using versioning is a big deal, and
that's what it is there for.

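The pattern is roughly the following (a sketch using the rte_compat.h
versioning macros; the suffixes and version numbers are illustrative):

    struct rte_cfgfile *
    rte_cfgfile_load_v1604(const char *filename, int flags);
    VERSION_SYMBOL(rte_cfgfile_load, _v1604, 16.04);

    struct rte_cfgfile *
    rte_cfgfile_load_v1705(const char *filename, char comment_char);
    BIND_DEFAULT_SYMBOL(rte_cfgfile_load, _v1705, 17.05);
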
Also, for a single parameter like a comment char, I don't think we need
to go creating a separate structure. The current flags parameter is
unused, so just replace it with the comment char one. With using the
structure, any additions to the struct would be an ABI change anyway, so
I see little point in using it, unless we already know of additional
parameters we will be adding in future. [It's an ABI change even when
adding to the end, since the struct is allocated in the app itself, not
the library.]

/Bruce

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH 16/17] vhost: rename header file
  2017-03-03  9:51  4% [dpdk-dev] [PATCH 00/17] vhost: generic vhost API Yuanhan Liu
@ 2017-03-03  9:51  3% ` Yuanhan Liu
  2017-03-23  7:10  4% ` [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API Yuanhan Liu
  1 sibling, 0 replies; 200+ results
From: Yuanhan Liu @ 2017-03-03  9:51 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Harris James R, Liu Changpeng, Yuanhan Liu

Rename "rte_virtio_net.h" to "rte_vhost.h", to not let it be virtio
net specific.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
---
 doc/guides/rel_notes/deprecation.rst   |   9 --
 drivers/net/vhost/rte_eth_vhost.c      |   2 +-
 drivers/net/vhost/rte_eth_vhost.h      |   2 +-
 examples/tep_termination/main.c        |   2 +-
 examples/tep_termination/vxlan_setup.c |   2 +-
 examples/vhost/main.c                  |   2 +-
 lib/librte_vhost/Makefile              |   2 +-
 lib/librte_vhost/rte_vhost.h           | 259 +++++++++++++++++++++++++++++++++
 lib/librte_vhost/rte_virtio_net.h      | 259 ---------------------------------
 lib/librte_vhost/vhost.c               |   2 +-
 lib/librte_vhost/vhost.h               |   2 +-
 lib/librte_vhost/vhost_user.h          |   2 +-
 lib/librte_vhost/virtio_net.c          |   2 +-
 13 files changed, 269 insertions(+), 278 deletions(-)
 create mode 100644 lib/librte_vhost/rte_vhost.h
 delete mode 100644 lib/librte_vhost/rte_virtio_net.h

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 9d4dfcc..84c8b9d 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -104,15 +104,6 @@ Deprecation Notices
   Target release for removal of the legacy API will be defined once most
   PMDs have switched to rte_flow.
 
-* vhost: API/ABI changes are planned for 17.05, for making DPDK vhost library
-  generic enough so that applications can build different vhost-user drivers
-  (instead of vhost-user net only) on top of that.
-  Specifically, ``virtio_net_device_ops`` will be renamed to ``vhost_device_ops``.
-  Correspondingly, some API's parameter need be changed. Few more functions also
-  need be reworked to let it be device aware. For example, different virtio device
-  has different feature set, meaning functions like ``rte_vhost_feature_disable``
-  need be changed. Last, file rte_virtio_net.h will be renamed to rte_vhost.h.
-
 * kni: Remove :ref:`kni_vhost_backend-label` feature (KNI_VHOST) in 17.05 release.
   :doc:`Vhost Library </prog_guide/vhost_lib>` is currently preferred method for
   guest - host communication. Just for clarification, this is not to remove KNI
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index df1e386..f7c370e 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -40,7 +40,7 @@
 #include <rte_memcpy.h>
 #include <rte_vdev.h>
 #include <rte_kvargs.h>
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 #include <rte_spinlock.h>
 
 #include "rte_eth_vhost.h"
diff --git a/drivers/net/vhost/rte_eth_vhost.h b/drivers/net/vhost/rte_eth_vhost.h
index ea4bce4..39ca771 100644
--- a/drivers/net/vhost/rte_eth_vhost.h
+++ b/drivers/net/vhost/rte_eth_vhost.h
@@ -41,7 +41,7 @@
 #include <stdint.h>
 #include <stdbool.h>
 
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 
 /*
  * Event description.
diff --git a/examples/tep_termination/main.c b/examples/tep_termination/main.c
index fa1c7a4..63a5dd3 100644
--- a/examples/tep_termination/main.c
+++ b/examples/tep_termination/main.c
@@ -49,7 +49,7 @@
 #include <rte_log.h>
 #include <rte_string_fns.h>
 #include <rte_malloc.h>
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 
 #include "main.h"
 #include "vxlan.h"
diff --git a/examples/tep_termination/vxlan_setup.c b/examples/tep_termination/vxlan_setup.c
index 8f1f15b..87de74d 100644
--- a/examples/tep_termination/vxlan_setup.c
+++ b/examples/tep_termination/vxlan_setup.c
@@ -49,7 +49,7 @@
 #include <rte_tcp.h>
 
 #include "main.h"
-#include "rte_virtio_net.h"
+#include "rte_vhost.h"
 #include "vxlan.h"
 #include "vxlan_setup.h"
 
diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 080c60b..a9b5352 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -49,7 +49,7 @@
 #include <rte_log.h>
 #include <rte_string_fns.h>
 #include <rte_malloc.h>
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
 
diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 5cf4e93..4847069 100644
--- a/lib/librte_vhost/Makefile
+++ b/lib/librte_vhost/Makefile
@@ -51,7 +51,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c socket.c vhost.c vhost_user.c \
 				   virtio_net.c
 
 # install includes
-SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h
 
 # dependencies
 DEPDIRS-$(CONFIG_RTE_LIBRTE_VHOST) += lib/librte_eal
diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
new file mode 100644
index 0000000..cfb3507
--- /dev/null
+++ b/lib/librte_vhost/rte_vhost.h
@@ -0,0 +1,259 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_VHOST_H_
+#define _RTE_VHOST_H_
+
+/**
+ * @file
+ * Interface to vhost-user
+ */
+
+#include <stdint.h>
+#include <linux/vhost.h>
+#include <linux/virtio_ring.h>
+#include <sys/eventfd.h>
+
+#include <rte_memory.h>
+#include <rte_mempool.h>
+
+#define RTE_VHOST_USER_CLIENT		(1ULL << 0)
+#define RTE_VHOST_USER_NO_RECONNECT	(1ULL << 1)
+#define RTE_VHOST_USER_DEQUEUE_ZERO_COPY	(1ULL << 2)
+
+/**
+ * Information relating to memory regions including offsets to
+ * addresses in QEMUs memory file.
+ */
+struct rte_vhost_mem_region {
+	uint64_t guest_phys_addr;
+	uint64_t guest_user_addr;
+	uint64_t host_user_addr;
+	uint64_t size;
+	void	 *mmap_addr;
+	uint64_t mmap_size;
+	int fd;
+};
+
+/**
+ * Memory structure includes region and mapping information.
+ */
+struct rte_vhost_memory {
+	uint32_t nregions;
+	struct rte_vhost_mem_region regions[0];
+};
+
+struct rte_vhost_vring {
+	struct vring_desc	*desc;
+	struct vring_avail	*avail;
+	struct vring_used	*used;
+	uint64_t		log_guest_addr;
+
+	int			callfd;
+	int			kickfd;
+	uint16_t		size;
+};
+
+/**
+ * Device and vring operations.
+ */
+struct vhost_device_ops {
+	int (*new_device)(int vid);		/**< Add device. */
+	void (*destroy_device)(int vid);	/**< Remove device. */
+
+	int (*vring_state_changed)(int vid, uint16_t queue_id, int enable);	/**< triggered when a vring is enabled or disabled */
+
+	void *reserved[5]; /**< Reserved for future extension */
+};
+
+/**
+ * Convert guest physical address to host virtual address
+ */
+static inline uint64_t __attribute__((always_inline))
+rte_vhost_gpa_to_vva(struct rte_vhost_memory *mem, uint64_t gpa)
+{
+	struct rte_vhost_mem_region *reg;
+	uint32_t i;
+
+	for (i = 0; i < mem->nregions; i++) {
+		reg = &mem->regions[i];
+		if (gpa >= reg->guest_phys_addr &&
+		    gpa <  reg->guest_phys_addr + reg->size) {
+			return gpa - reg->guest_phys_addr +
+			       reg->host_user_addr;
+		}
+	}
+
+	return 0;
+}
+
+int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable);
+
+/**
+ * Register vhost driver. path could be different for multiple
+ * instance support.
+ */
+int rte_vhost_driver_register(const char *path, uint64_t flags);
+
+/* Unregister vhost driver. This is only meaningful to vhost user. */
+int rte_vhost_driver_unregister(const char *path);
+
+/**
+ * Set feature bits the vhost driver supports.
+ */
+int rte_vhost_driver_set_features(const char *path, uint64_t features);
+uint64_t rte_vhost_driver_get_features(const char *path);
+
+int rte_vhost_driver_enable_features(const char *path, uint64_t features);
+int rte_vhost_driver_disable_features(const char *path, uint64_t features);
+
+/* Register callbacks. */
+int rte_vhost_driver_callback_register(const char *path,
+	struct vhost_device_ops const * const);
+/* Start vhost driver session blocking loop. */
+int rte_vhost_driver_session_start(void);
+
+/**
+ * Get the numa node from which the virtio net device's memory
+ * is allocated.
+ *
+ * @param vid
+ *  vhost device ID
+ *
+ * @return
+ *  The numa node, -1 on failure
+ */
+int rte_vhost_get_numa_node(int vid);
+
+/**
+ * @deprecated
+ * Get the number of queues the device supports.
+ *
+ * Note this function is deprecated, as it returns a queue pair number,
+ * which is vhost specific. Instead, rte_vhost_get_vring_num should
+ * be used.
+ *
+ * @param vid
+ *  vhost device ID
+ *
+ * @return
+ *  The number of queues, 0 on failure
+ */
+__rte_deprecated
+uint32_t rte_vhost_get_queue_num(int vid);
+
+/**
+ * Get the number of vrings the device supports.
+ *
+ * @param vid
+ *  vhost device ID
+ *
+ * @return
+ *  The number of vrings, 0 on failure
+ */
+uint16_t rte_vhost_get_vring_num(int vid);
+
+/**
+ * Get the virtio net device's ifname, which is the vhost-user socket
+ * file path.
+ *
+ * @param vid
+ *  vhost device ID
+ * @param buf
+ *  The buffer to store the queried ifname
+ * @param len
+ *  The length of buf
+ *
+ * @return
+ *  0 on success, -1 on failure
+ */
+int rte_vhost_get_ifname(int vid, char *buf, size_t len);
+
+/**
+ * Get how many avail entries are left in the queue
+ *
+ * @param vid
+ *  vhost device ID
+ * @param queue_id
+ *  virtio queue index
+ *
+ * @return
+ *  num of avail entires left
+ */
+uint16_t rte_vhost_avail_entries(int vid, uint16_t queue_id);
+
+/**
+ * This function adds buffers to the virtio device's RX virtqueue. Buffers can
+ * be received from the physical port or from another virtual device. A packet
+ * count is returned to indicate the number of packets that were successfully
+ * added to the RX queue.
+ * @param vid
+ *  vhost device ID
+ * @param queue_id
+ *  virtio queue index in mq case
+ * @param pkts
+ *  array to contain packets to be enqueued
+ * @param count
+ *  packets num to be enqueued
+ * @return
+ *  num of packets enqueued
+ */
+uint16_t rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
+	struct rte_mbuf **pkts, uint16_t count);
+
+/**
+ * This function gets guest buffers from the virtio device TX virtqueue,
+ * constructs host mbufs, copies guest buffer content to host mbufs and
+ * stores them in pkts to be processed.
+ * @param vid
+ *  vhost device ID
+ * @param queue_id
+ *  virtio queue index in mq case
+ * @param mbuf_pool
+ *  mbuf_pool where host mbuf is allocated.
+ * @param pkts
+ *  array to contain packets to be dequeued
+ * @param count
+ *  packets num to be dequeued
+ * @return
+ *  num of packets dequeued
+ */
+uint16_t rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
+	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
+
+int rte_vhost_get_vhost_memory(int vid, struct rte_vhost_memory **mem);
+uint64_t rte_vhost_get_negotiated_features(int vid);
+int rte_vhost_get_vhost_vring(int vid, uint16_t vring_idx,
+			      struct rte_vhost_vring *vring);
+
+#endif /* _RTE_VHOST_H_ */
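
For illustration, the datapath side is untouched by the rename; on a
single-queue device a forwarding core still does roughly the following
(vring 1 is the guest's TX ring, vring 0 its RX ring):

    struct rte_mbuf *pkts[32];
    uint16_t n;

    n = rte_vhost_dequeue_burst(vid, 1, mbuf_pool, pkts, 32);
    /* ... process or forward the packets ... */
    rte_vhost_enqueue_burst(vid, 0, pkts, n);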
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
deleted file mode 100644
index 2f761da..0000000
--- a/lib/librte_vhost/rte_virtio_net.h
+++ /dev/null
@@ -1,259 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _VIRTIO_NET_H_
-#define _VIRTIO_NET_H_
-
-/**
- * @file
- * Interface to vhost net
- */
-
-#include <stdint.h>
-#include <linux/vhost.h>
-#include <linux/virtio_ring.h>
-#include <sys/eventfd.h>
-
-#include <rte_memory.h>
-#include <rte_mempool.h>
-
-#define RTE_VHOST_USER_CLIENT		(1ULL << 0)
-#define RTE_VHOST_USER_NO_RECONNECT	(1ULL << 1)
-#define RTE_VHOST_USER_DEQUEUE_ZERO_COPY	(1ULL << 2)
-
-/**
- * Information relating to memory regions including offsets to
- * addresses in QEMUs memory file.
- */
-struct rte_vhost_mem_region {
-	uint64_t guest_phys_addr;
-	uint64_t guest_user_addr;
-	uint64_t host_user_addr;
-	uint64_t size;
-	void	 *mmap_addr;
-	uint64_t mmap_size;
-	int fd;
-};
-
-/**
- * Memory structure includes region and mapping information.
- */
-struct rte_vhost_memory {
-	uint32_t nregions;
-	struct rte_vhost_mem_region regions[0];
-};
-
-struct rte_vhost_vring {
-	struct vring_desc	*desc;
-	struct vring_avail	*avail;
-	struct vring_used	*used;
-	uint64_t		log_guest_addr;
-
-	int			callfd;
-	int			kickfd;
-	uint16_t		size;
-};
-
-/**
- * Device and vring operations.
- */
-struct vhost_device_ops {
-	int (*new_device)(int vid);		/**< Add device. */
-	void (*destroy_device)(int vid);	/**< Remove device. */
-
-	int (*vring_state_changed)(int vid, uint16_t queue_id, int enable);	/**< triggered when a vring is enabled or disabled */
-
-	void *reserved[5]; /**< Reserved for future extension */
-};
-
-/**
- * Convert guest physical Address to host virtual address
- */
-static inline uint64_t __attribute__((always_inline))
-rte_vhost_gpa_to_vva(struct rte_vhost_memory *mem, uint64_t gpa)
-{
-	struct rte_vhost_mem_region *reg;
-	uint32_t i;
-
-	for (i = 0; i < mem->nregions; i++) {
-		reg = &mem->regions[i];
-		if (gpa >= reg->guest_phys_addr &&
-		    gpa <  reg->guest_phys_addr + reg->size) {
-			return gpa - reg->guest_phys_addr +
-			       reg->host_user_addr;
-		}
-	}
-
-	return 0;
-}
-
-int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable);
-
-/**
- * Register vhost driver. path could be different for multiple
- * instance support.
- */
-int rte_vhost_driver_register(const char *path, uint64_t flags);
-
-/* Unregister vhost driver. This is only meaningful to vhost user. */
-int rte_vhost_driver_unregister(const char *path);
-
-/**
- * Set feature bits the vhost driver supports.
- */
-int rte_vhost_driver_set_features(const char *path, uint64_t features);
-uint64_t rte_vhost_driver_get_features(const char *path);
-
-int rte_vhost_driver_enable_features(const char *path, uint64_t features);
-int rte_vhost_driver_disable_features(const char *path, uint64_t features);
-
-/* Register callbacks. */
-int rte_vhost_driver_callback_register(const char *path,
-	struct vhost_device_ops const * const);
-/* Start vhost driver session blocking loop. */
-int rte_vhost_driver_session_start(void);
-
-/**
- * Get the numa node from which the virtio net device's memory
- * is allocated.
- *
- * @param vid
- *  vhost device ID
- *
- * @return
- *  The numa node, -1 on failure
- */
-int rte_vhost_get_numa_node(int vid);
-
-/**
- * @deprecated
- * Get the number of queues the device supports.
- *
- * Note this function is deprecated, as it returns a queue pair number,
- * which is vhost specific. Instead, rte_vhost_get_vring_num should
- * be used.
- *
- * @param vid
- *  vhost device ID
- *
- * @return
- *  The number of queues, 0 on failure
- */
-__rte_deprecated
-uint32_t rte_vhost_get_queue_num(int vid);
-
-/**
- * Get the number of vrings the device supports.
- *
- * @param vid
- *  vhost device ID
- *
- * @return
- *  The number of vrings, 0 on failure
- */
-uint16_t rte_vhost_get_vring_num(int vid);
-
-/**
- * Get the virtio net device's ifname, which is the vhost-user socket
- * file path.
- *
- * @param vid
- *  vhost device ID
- * @param buf
- *  The buffer to stored the queried ifname
- * @param len
- *  The length of buf
- *
- * @return
- *  0 on success, -1 on failure
- */
-int rte_vhost_get_ifname(int vid, char *buf, size_t len);
-
-/**
- * Get how many avail entries are left in the queue
- *
- * @param vid
- *  vhost device ID
- * @param queue_id
- *  virtio queue index
- *
- * @return
- *  num of avail entires left
- */
-uint16_t rte_vhost_avail_entries(int vid, uint16_t queue_id);
-
-/**
- * This function adds buffers to the virtio devices RX virtqueue. Buffers can
- * be received from the physical port or from another virtual device. A packet
- * count is returned to indicate the number of packets that were succesfully
- * added to the RX queue.
- * @param vid
- *  vhost device ID
- * @param queue_id
- *  virtio queue index in mq case
- * @param pkts
- *  array to contain packets to be enqueued
- * @param count
- *  packets num to be enqueued
- * @return
- *  num of packets enqueued
- */
-uint16_t rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
-	struct rte_mbuf **pkts, uint16_t count);
-
-/**
- * This function gets guest buffers from the virtio device TX virtqueue,
- * construct host mbufs, copies guest buffer content to host mbufs and
- * store them in pkts to be processed.
- * @param vid
- *  vhost device ID
- * @param queue_id
- *  virtio queue index in mq case
- * @param mbuf_pool
- *  mbuf_pool where host mbuf is allocated.
- * @param pkts
- *  array to contain packets to be dequeued
- * @param count
- *  packets num to be dequeued
- * @return
- *  num of packets dequeued
- */
-uint16_t rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
-	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
-
-int rte_vhost_get_vhost_memory(int vid, struct rte_vhost_memory **mem);
-uint64_t rte_vhost_get_negotiated_features(int vid);
-int rte_vhost_get_vhost_vring(int vid, uint16_t vring_idx,
-			      struct rte_vhost_vring *vring);
-
-#endif /* _VIRTIO_NET_H_ */
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 0a27888..e0548fe 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -45,7 +45,7 @@
 #include <rte_string_fns.h>
 #include <rte_memory.h>
 #include <rte_malloc.h>
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 
 #include "vhost.h"
 
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index fc9e431..29132f3 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -46,7 +46,7 @@
 #include <rte_log.h>
 #include <rte_ether.h>
 
-#include "rte_virtio_net.h"
+#include "rte_vhost.h"
 
 #define VHOST_USER_F_PROTOCOL_FEATURES	30
 
diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 179e441..f1a7823 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -37,7 +37,7 @@
 #include <stdint.h>
 #include <linux/vhost.h>
 
-#include "rte_virtio_net.h"
+#include "rte_vhost.h"
 
 /* refer to hw/virtio/vhost-user.c */
 
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 8ed2b93..6287c7a 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -39,7 +39,7 @@
 #include <rte_memcpy.h>
 #include <rte_ether.h>
 #include <rte_ip.h>
-#include <rte_virtio_net.h>
+#include <rte_vhost.h>
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_sctp.h>
-- 
1.9.0

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH 00/17] vhost: generic vhost API
@ 2017-03-03  9:51  4% Yuanhan Liu
  2017-03-03  9:51  3% ` [dpdk-dev] [PATCH 16/17] vhost: rename header file Yuanhan Liu
  2017-03-23  7:10  4% ` [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API Yuanhan Liu
  0 siblings, 2 replies; 200+ results
From: Yuanhan Liu @ 2017-03-03  9:51 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Harris James R, Liu Changpeng, Yuanhan Liu

This is a first attempt to make the DPDK vhost library generic enough
that users can build their own vhost-user drivers on top of it. For
example, SPDK (Storage Performance Development Kit) is trying to enable
vhost-user SCSI.

The basic idea is to let DPDK vhost act as a vhost-user agent. It stores
all the info about the virtio device (i.e. vring addresses, negotiated
features, etc.) and lets the specific vhost-user driver fetch it via the
APIs provided by the DPDK vhost lib. With that info available, the
vhost-user driver can then get/put vring entries and thus exchange data
between the guest and the host.

The last patch demonstrates how to use these new APIs to implement a
very simple vhost-user net driver, without any fancy features enabled.


API/ABI Changes summary
=======================

- some renames
  * "struct virtio_net_device_ops" ==> "struct vhost_device_ops"
  * "rte_virtio_net.h"  ==> "rte_vhost.h"

- driver-related APIs are bound to the socket file
  * rte_vhost_driver_set_features(socket_file, features);
  * rte_vhost_driver_get_features(socket_file, features);
  * rte_vhost_driver_enable_features(socket_file, features)
  * rte_vhost_driver_disable_features(socket_file, features)
  * rte_vhost_driver_callback_register(socket_file, notify_ops);

- new APIs to fetch guest and vring info
  * rte_vhost_get_vhost_memory(int vid, struct rte_vhost_memory **mem);
  * rte_vhost_get_negotiated_features(int vid);
  * rte_vhost_get_vhost_vring(int vid, uint16_t vring_idx,
			      struct rte_vhost_vring *vring);

- new exported structures 
  * struct rte_vhost_vring
  * struct rte_vhost_mem_region
  * struct rte_vhost_memory
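
As a quick illustration, a vhost-user driver built on these APIs would be
set up roughly as below (a minimal sketch; the callback bodies and the
start_vhost() wrapper are illustrative only):

    #include <rte_vhost.h>

    static int  new_device(int vid)     { /* set up device "vid" */ return 0; }
    static void destroy_device(int vid) { /* tear down device "vid" */ }

    static const struct vhost_device_ops ops = {
        .new_device     = new_device,
        .destroy_device = destroy_device,
    };

    static int
    start_vhost(const char *path)   /* path: vhost-user socket file */
    {
        if (rte_vhost_driver_register(path, 0) < 0)
            return -1;
        rte_vhost_driver_callback_register(path, &ops);
        return rte_vhost_driver_session_start();    /* blocking loop */
    }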


Some design choices
===================

While making this patchset, I faced quite a few design choices; here are
two of them, each with the underlying issue and the reasoning behind the
choice I made. Please let me know if you have any comments (or better ideas).

Export public structures or not
-------------------------------

I made an ABI refactor last time (v16.07): all the structures were moved
internally, and applications use a "vid" to reference the internal
struct. With that, I hoped we would never again have to worry about the
annoying ABI issues.

It has worked great (and as expected) since then, as long as we only
support virtio-net and handle all the descs inside the vhost lib. It
becomes problematic when a user wants to implement a vhost-user driver
elsewhere. For example, such a driver needs to do the GPA to VVA
translation. Without any structs exported, functions like gpa_to_vva()
can't be inlined, and calling them would be costly, especially since this
is a function we have to invoke for every vring desc we process.

For that reason, the guest memory regions are exported. With that,
gpa_to_vva() can be inlined.
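
For example, with the regions exported, the per-desc translation can stay
inline on the driver side (a sketch; "desc" is assumed to be a struct
vring_desc pointer taken from the guest's vring):

    struct rte_vhost_memory *mem;
    uint64_t vva;

    if (rte_vhost_get_vhost_memory(vid, &mem) < 0)
        return -1;

    /* hot path: inlined, no call into the vhost lib per desc */
    vva = rte_vhost_gpa_to_vva(mem, desc->addr);
    if (vva == 0)
        return -1;  /* gpa not covered by any guest region */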

  
Add helper functions to fetch/update descs or not
-------------------------------------------------

I intended to do it this way: introduce one function to get @count
descs from a specific vring and another one to update the used descs.
Something like:
    rte_vhost_vring_get_descs(vid, vring_idx, count, offset, iov, descs);
    rte_vhost_vring_update_used_descs(vid, vring_idx, count, offset, descs);

With that, the vhost-user driver programmer's task would be easier, as
he/she would no longer have to parse the descs (e.g. to handle indirect descs).

But given that virtio 1.1 has just emerged, proposes a completely new
ring layout and, most importantly, also changes the vring desc structure,
I'd like to hold off on introducing those two functions. Otherwise, it's
very likely both will become invalid when virtio 1.1 is out. That said, I
think it could be addressed with a careful design, something like making
the IOV generic enough:

	struct rte_vhost_iov {
		uint64_t	gpa;
		uint64_t	vva;
		uint64_t	len;
	};

Instead, I went the other way: introduce a few APIs to export all the vring
info (vring size, vring addr, callfd, etc.), and let the vhost-user driver
read and update the descs itself. That info could be passed to the
vhost-user driver by introducing one API per field, but to save a few APIs
and spare the programmer a few calls, I packed the key fields into a new
structure, so that they can be fetched with one call:
        struct rte_vhost_vring {
                struct vring_desc       *desc;
                struct vring_avail      *avail;
                struct vring_used       *used;
                uint64_t                log_guest_addr;
       
                int                     callfd;
                int                     kickfd;
                uint16_t                size;
        };

When virtio 1.1 comes out, a simple change like the following would
likely just work:
        struct rte_vhost_vring {
		union {
			struct {
                		struct vring_desc       *desc;
                		struct vring_avail      *avail;
                		struct vring_used       *used;
                		uint64_t                log_guest_addr;
			};
			struct desc	*desc_1_1;	/* vring addr for virtio 1.1 */
		};
       
                int                     callfd;
                int                     kickfd;
                uint16_t                size;
        };

AFAIK, that is not an ABI breakage. Even if it were, we could introduce a
new API to get the virtio 1.1 ring address.

Those fields are the minimum set I arrived at for a specific vring, chosen
to minimize the chance of an ABI break on future extension. If we need
more info, we can introduce a new API.

OTOH, to get the best performance, the two functions would also have to be
inlined (the "vid + vring_idx" combo is replaced with "vring"):
    rte_vhost_vring_get_descs(vring, count, offset, iov, descs);
    rte_vhost_vring_update_used_descs(vring, count, offset, descs);

That said, one way or another, we have to export the rte_vhost_vring
struct. For this reason, I didn't rush into introducing the two APIs.
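
So the resulting flow on the driver side looks roughly like this (a sketch;
"last_avail" is the driver's own bookkeeping and error handling is omitted):

    struct rte_vhost_vring vring;
    struct vring_desc *desc;
    uint16_t idx;

    if (rte_vhost_get_vhost_vring(vid, vring_idx, &vring) < 0)
        return -1;

    /* the driver parses the ring itself */
    idx = vring.avail->ring[last_avail & (vring.size - 1)];
    desc = &vring.desc[idx];
    /* translate desc->addr with rte_vhost_gpa_to_vva(), consume the
     * buffer, fill vring.used, then signal the guest via vring.callfd */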


TODOs
=====

This series still got few small items to finish, and they are:
- update release note
- fill API comments
- set protocol features


	--yliu

---
Yuanhan Liu (17):
  vhost: introduce driver features related APIs
  net/vhost: remove feature related APIs
  vhost: use new APIs to handle features
  vhost: make notify ops per vhost driver
  vhost: export guest memory regions
  vhost: introduce API to fetch negotiated features
  vhost: export vhost vring info
  vhost: export API to translate gpa to vva
  vhost: turn queue pair to vring
  vhost: export the number of vrings
  vhost: move the device ready check at proper place
  vhost: drop the Rx and Tx queue macro
  vhost: do not include net specific headers
  vhost: rename device ops struct
  vhost: rename virtio-net to vhost
  vhost: rename header file
  examples/vhost: demonstrate the new generic vhost APIs

 doc/guides/rel_notes/deprecation.rst        |   9 -
 drivers/net/vhost/rte_eth_vhost.c           |  51 ++--
 drivers/net/vhost/rte_eth_vhost.h           |  32 +--
 drivers/net/vhost/rte_pmd_vhost_version.map |   3 -
 examples/tep_termination/main.c             |  11 +-
 examples/tep_termination/main.h             |   2 +
 examples/tep_termination/vxlan_setup.c      |   2 +-
 examples/vhost/Makefile                     |   2 +-
 examples/vhost/main.c                       |  88 ++++--
 examples/vhost/main.h                       |  33 ++-
 examples/vhost/virtio_net.c                 | 405 ++++++++++++++++++++++++++++
 lib/librte_vhost/Makefile                   |   4 +-
 lib/librte_vhost/rte_vhost.h                | 259 ++++++++++++++++++
 lib/librte_vhost/rte_vhost_version.map      |  18 +-
 lib/librte_vhost/rte_virtio_net.h           | 193 -------------
 lib/librte_vhost/socket.c                   | 143 ++++++++++
 lib/librte_vhost/vhost.c                    | 209 +++++++-------
 lib/librte_vhost/vhost.h                    |  82 +++---
 lib/librte_vhost/vhost_user.c               |  91 +++----
 lib/librte_vhost/vhost_user.h               |   2 +-
 lib/librte_vhost/virtio_net.c               |  35 +--
 21 files changed, 1140 insertions(+), 534 deletions(-)
 create mode 100644 examples/vhost/virtio_net.c
 create mode 100644 lib/librte_vhost/rte_vhost.h
 delete mode 100644 lib/librte_vhost/rte_virtio_net.h

-- 
1.9.0

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH 5/6] prgdev: add ABI control info
  2017-03-02  4:03  3% [dpdk-dev] [PATCH 0/6] introduce prgdev abstraction library Chen Jing D(Mark)
@ 2017-03-02  4:03  4% ` Chen Jing D(Mark)
  0 siblings, 0 replies; 200+ results
From: Chen Jing D(Mark) @ 2017-03-02  4:03 UTC (permalink / raw)
  To: dev
  Cc: cunming.liang, gerald.rogers, keith.wiles, bruce.richardson,
	Chen Jing D(Mark)

Add rte_prgdev_version.map file.

Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Signed-off-by: Gerald Rogers <gerald.rogers@intel.com>
---
 lib/librte_prgdev/rte_prgdev_version.map |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_prgdev/rte_prgdev_version.map

diff --git a/lib/librte_prgdev/rte_prgdev_version.map b/lib/librte_prgdev/rte_prgdev_version.map
new file mode 100644
index 0000000..51dc15a
--- /dev/null
+++ b/lib/librte_prgdev/rte_prgdev_version.map
@@ -0,0 +1,19 @@
+DPDK_17.05 {
+	global:
+
+	rte_prgdev_pci_probe;
+	rte_prgdev_pci_remove;
+	rte_prgdev_allocate;
+	rte_prgdev_release;
+	rte_prgdev_info_get;
+	rte_prgdev_is_valid_dev;
+	rte_prgdev_open;
+	rte_prgdev_img_download;
+	rte_prgdev_img_upload;
+	rte_prgdev_check_stat;
+	rte_prgdev_close;
+	rte_prgdev_bind;
+	rte_prgdev_unbind;
+
+	local: *;
+};
-- 
1.7.7.6

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH 0/6] introduce prgdev abstraction library
@ 2017-03-02  4:03  3% Chen Jing D(Mark)
  2017-03-02  4:03  4% ` [dpdk-dev] [PATCH 5/6] prgdev: add ABI control info Chen Jing D(Mark)
  0 siblings, 1 reply; 200+ results
From: Chen Jing D(Mark) @ 2017-03-02  4:03 UTC (permalink / raw)
  To: dev
  Cc: cunming.liang, gerald.rogers, keith.wiles, bruce.richardson,
	Chen Jing D(Mark)

This patch set intends to introduce a generic DPDK programming device layer,
called prgdev, to provide abstract, generic APIs for applications to
program hardware without knowing the details of the programmable devices.
From the drivers' perspective, they adapt their functions to the abstract
APIs defined in prgdev.

The major purpose of prgdev is to help DPDK users dynamically load/upgrade
RTL images for FPGA devices, or upgrade firmware for programmable NICs,
without interrupting running DPDK applications.
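
To give an idea of the intended call flow (a sketch only: the symbol names
below come from the version map in patch 5/6, while the signatures and
return-value semantics shown here are assumptions, not actual prototypes):

    /* download a new image, wait until the device reports it applied */
    if (rte_prgdev_open(dev_id) == 0 &&
        rte_prgdev_img_download(dev_id, image, image_len) == 0) {
        while (rte_prgdev_check_stat(dev_id) != 0) /* assumed semantics */
            rte_delay_ms(10);
    }
    rte_prgdev_close(dev_id);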


Chen Jing D(Mark) (5):
  prgdev: introduce new library
  prgdev: add debug macro for prgdev
  prgdev: add bus probe and remove functions
  prgdev: add prgdev API exposed to application
  prgdev: add ABI control info

Chen, Jing D (1):
  doc: introduction to prgdev

 config/common_base                       |    7 +
 doc/guides/prog_guide/index.rst          |    1 +
 doc/guides/prog_guide/prgdev_lib.rst     |  465 ++++++++++++++++++++++++++++++
 lib/Makefile                             |    1 +
 lib/librte_eal/common/include/rte_log.h  |    1 +
 lib/librte_prgdev/Makefile               |   57 ++++
 lib/librte_prgdev/rte_prgdev.c           |  459 +++++++++++++++++++++++++++++
 lib/librte_prgdev/rte_prgdev.h           |  401 ++++++++++++++++++++++++++
 lib/librte_prgdev/rte_prgdev_version.map |   19 ++
 mk/rte.app.mk                            |    1 +
 10 files changed, 1412 insertions(+), 0 deletions(-)
 create mode 100644 doc/guides/prog_guide/prgdev_lib.rst
 create mode 100644 lib/librte_prgdev/Makefile
 create mode 100644 lib/librte_prgdev/rte_prgdev.c
 create mode 100644 lib/librte_prgdev/rte_prgdev.h
 create mode 100644 lib/librte_prgdev/rte_prgdev_version.map

-- 
1.7.7.6

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v3 04/16] net/avp: add PMD version map file
  @ 2017-03-02  0:19  3%     ` Allain Legacy
    1 sibling, 0 replies; 200+ results
From: Allain Legacy @ 2017-03-02  0:19 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: ian.jolliffe, jerin.jacob, stephen, thomas.monjalon, dev

Adds a default ABI version file for the AVP PMD.

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
---
 drivers/net/avp/rte_pmd_avp_version.map | 4 ++++
 1 file changed, 4 insertions(+)
 create mode 100644 drivers/net/avp/rte_pmd_avp_version.map

diff --git a/drivers/net/avp/rte_pmd_avp_version.map b/drivers/net/avp/rte_pmd_avp_version.map
new file mode 100644
index 0000000..af8f3f4
--- /dev/null
+++ b/drivers/net/avp/rte_pmd_avp_version.map
@@ -0,0 +1,4 @@
+DPDK_17.05 {
+
+    local: *;
+};
-- 
1.8.3.1

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v8 09/18] lib: add symbol versioning to distributor
  2017-03-01  7:47  3%       ` [dpdk-dev] [PATCH v8 " David Hunt
@ 2017-03-01 14:50  0%         ` Hunt, David
  0 siblings, 0 replies; 200+ results
From: Hunt, David @ 2017-03-01 14:50 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson

ERROR:SPACING: space prohibited before that ',' (ctx:WxW)
#84: FILE: lib/librte_distributor/rte_distributor.c:172:
+BIND_DEFAULT_SYMBOL(rte_distributor_get_pkt, , 17.05);
                                               ^

FYI, checkpatch does not like this regardless of whether there's
a space there or not. It complains either way. :)

Regards,
Dave.



On 1/3/2017 7:47 AM, David Hunt wrote:
> Also bumped up the ABI version number in the Makefile
>
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>   lib/librte_distributor/Makefile                    |  2 +-
>   lib/librte_distributor/rte_distributor.c           |  8 ++++++++
>   lib/librte_distributor/rte_distributor_v20.c       | 10 ++++++++++
>   lib/librte_distributor/rte_distributor_version.map | 14 ++++++++++++++
>   4 files changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
> index 2b28eff..2f05cf3 100644
> --- a/lib/librte_distributor/Makefile
> +++ b/lib/librte_distributor/Makefile
> @@ -39,7 +39,7 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
>   
>   EXPORT_MAP := rte_distributor_version.map
>   
> -LIBABIVER := 1
> +LIBABIVER := 2
>   
>   # all source are stored in SRCS-y
>   SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor_v20.c
> diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
> index 6e1debf..2c5511d 100644
> --- a/lib/librte_distributor/rte_distributor.c
> +++ b/lib/librte_distributor/rte_distributor.c
> @@ -36,6 +36,7 @@
>   #include <rte_mbuf.h>
>   #include <rte_memory.h>
>   #include <rte_cycles.h>
> +#include <rte_compat.h>
>   #include <rte_memzone.h>
>   #include <rte_errno.h>
>   #include <rte_string_fns.h>
> @@ -168,6 +169,7 @@ rte_distributor_get_pkt(struct rte_distributor *d,
>   	}
>   	return count;
>   }
> +BIND_DEFAULT_SYMBOL(rte_distributor_get_pkt, , 17.05);
>   
>   int
>   rte_distributor_return_pkt(struct rte_distributor *d,
> @@ -197,6 +199,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
>   
>   	return 0;
>   }
> +BIND_DEFAULT_SYMBOL(rte_distributor_return_pkt, , 17.05);
>   
>   /**** APIs called on distributor core ***/
>   
> @@ -476,6 +479,7 @@ rte_distributor_process(struct rte_distributor *d,
>   
>   	return num_mbufs;
>   }
> +BIND_DEFAULT_SYMBOL(rte_distributor_process, , 17.05);
>   
>   /* return to the caller, packets returned from workers */
>   int
> @@ -504,6 +508,7 @@ rte_distributor_returned_pkts(struct rte_distributor *d,
>   
>   	return retval;
>   }
> +BIND_DEFAULT_SYMBOL(rte_distributor_returned_pkts, , 17.05);
>   
>   /*
>    * Return the number of packets in-flight in a distributor, i.e. packets
> @@ -549,6 +554,7 @@ rte_distributor_flush(struct rte_distributor *d)
>   
>   	return flushed;
>   }
> +BIND_DEFAULT_SYMBOL(rte_distributor_flush, , 17.05);
>   
>   /* clears the internal returns array in the distributor */
>   void
> @@ -565,6 +571,7 @@ rte_distributor_clear_returns(struct rte_distributor *d)
>   	for (wkr = 0; wkr < d->num_workers; wkr++)
>   		d->bufs[wkr].retptr64[0] = 0;
>   }
> +BIND_DEFAULT_SYMBOL(rte_distributor_clear_returns, , 17.05);
>   
>   /* creates a distributor instance */
>   struct rte_distributor *
> @@ -638,3 +645,4 @@ rte_distributor_create(const char *name,
>   
>   	return d;
>   }
> +BIND_DEFAULT_SYMBOL(rte_distributor_create, , 17.05);
> diff --git a/lib/librte_distributor/rte_distributor_v20.c b/lib/librte_distributor/rte_distributor_v20.c
> index 1f406c5..bb6c5d7 100644
> --- a/lib/librte_distributor/rte_distributor_v20.c
> +++ b/lib/librte_distributor/rte_distributor_v20.c
> @@ -38,6 +38,7 @@
>   #include <rte_memory.h>
>   #include <rte_memzone.h>
>   #include <rte_errno.h>
> +#include <rte_compat.h>
>   #include <rte_string_fns.h>
>   #include <rte_eal_memconfig.h>
>   #include "rte_distributor_v20.h"
> @@ -63,6 +64,7 @@ rte_distributor_request_pkt_v20(struct rte_distributor_v20 *d,
>   		rte_pause();
>   	buf->bufptr64 = req;
>   }
> +VERSION_SYMBOL(rte_distributor_request_pkt, _v20, 2.0);
>   
>   struct rte_mbuf *
>   rte_distributor_poll_pkt_v20(struct rte_distributor_v20 *d,
> @@ -76,6 +78,7 @@ rte_distributor_poll_pkt_v20(struct rte_distributor_v20 *d,
>   	int64_t ret = buf->bufptr64 >> RTE_DISTRIB_FLAG_BITS;
>   	return (struct rte_mbuf *)((uintptr_t)ret);
>   }
> +VERSION_SYMBOL(rte_distributor_poll_pkt, _v20, 2.0);
>   
>   struct rte_mbuf *
>   rte_distributor_get_pkt_v20(struct rte_distributor_v20 *d,
> @@ -87,6 +90,7 @@ rte_distributor_get_pkt_v20(struct rte_distributor_v20 *d,
>   		rte_pause();
>   	return ret;
>   }
> +VERSION_SYMBOL(rte_distributor_get_pkt, _v20, 2.0);
>   
>   int
>   rte_distributor_return_pkt_v20(struct rte_distributor_v20 *d,
> @@ -98,6 +102,7 @@ rte_distributor_return_pkt_v20(struct rte_distributor_v20 *d,
>   	buf->bufptr64 = req;
>   	return 0;
>   }
> +VERSION_SYMBOL(rte_distributor_return_pkt, _v20, 2.0);
>   
>   /**** APIs called on distributor core ***/
>   
> @@ -314,6 +319,7 @@ rte_distributor_process_v20(struct rte_distributor_v20 *d,
>   	d->returns.count = ret_count;
>   	return num_mbufs;
>   }
> +VERSION_SYMBOL(rte_distributor_process, _v20, 2.0);
>   
>   /* return to the caller, packets returned from workers */
>   int
> @@ -334,6 +340,7 @@ rte_distributor_returned_pkts_v20(struct rte_distributor_v20 *d,
>   
>   	return retval;
>   }
> +VERSION_SYMBOL(rte_distributor_returned_pkts, _v20, 2.0);
>   
>   /* return the number of packets in-flight in a distributor, i.e. packets
>    * being workered on or queued up in a backlog. */
> @@ -362,6 +369,7 @@ rte_distributor_flush_v20(struct rte_distributor_v20 *d)
>   
>   	return flushed;
>   }
> +VERSION_SYMBOL(rte_distributor_flush, _v20, 2.0);
>   
>   /* clears the internal returns array in the distributor */
>   void
> @@ -372,6 +380,7 @@ rte_distributor_clear_returns_v20(struct rte_distributor_v20 *d)
>   	memset(d->returns.mbufs, 0, sizeof(d->returns.mbufs));
>   #endif
>   }
> +VERSION_SYMBOL(rte_distributor_clear_returns, _v20, 2.0);
>   
>   /* creates a distributor instance */
>   struct rte_distributor_v20 *
> @@ -415,3 +424,4 @@ rte_distributor_create_v20(const char *name,
>   
>   	return d;
>   }
> +VERSION_SYMBOL(rte_distributor_create, _v20, 2.0);
> diff --git a/lib/librte_distributor/rte_distributor_version.map b/lib/librte_distributor/rte_distributor_version.map
> index 73fdc43..3a285b3 100644
> --- a/lib/librte_distributor/rte_distributor_version.map
> +++ b/lib/librte_distributor/rte_distributor_version.map
> @@ -13,3 +13,17 @@ DPDK_2.0 {
>   
>   	local: *;
>   };
> +
> +DPDK_17.05 {
> +	global:
> +
> +	rte_distributor_clear_returns;
> +	rte_distributor_create;
> +	rte_distributor_flush;
> +	rte_distributor_get_pkt;
> +	rte_distributor_poll_pkt;
> +	rte_distributor_process;
> +	rte_distributor_request_pkt;
> +	rte_distributor_return_pkt;
> +	rte_distributor_returned_pkts;
> +} DPDK_2.0;

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v8 09/18] lib: add symbol versioning to distributor
  2017-03-01  7:47  2%     ` [dpdk-dev] [PATCH v8 0/18] distributor library performance enhancements David Hunt
  2017-03-01  7:47  1%       ` [dpdk-dev] [PATCH v8 01/18] lib: rename legacy distributor lib files David Hunt
@ 2017-03-01  7:47  3%       ` David Hunt
  2017-03-01 14:50  0%         ` Hunt, David
  1 sibling, 1 reply; 200+ results
From: David Hunt @ 2017-03-01  7:47 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, David Hunt

Also bumped up the ABI version number in the Makefile

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/Makefile                    |  2 +-
 lib/librte_distributor/rte_distributor.c           |  8 ++++++++
 lib/librte_distributor/rte_distributor_v20.c       | 10 ++++++++++
 lib/librte_distributor/rte_distributor_version.map | 14 ++++++++++++++
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
index 2b28eff..2f05cf3 100644
--- a/lib/librte_distributor/Makefile
+++ b/lib/librte_distributor/Makefile
@@ -39,7 +39,7 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
 
 EXPORT_MAP := rte_distributor_version.map
 
-LIBABIVER := 1
+LIBABIVER := 2
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor_v20.c
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 6e1debf..2c5511d 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -36,6 +36,7 @@
 #include <rte_mbuf.h>
 #include <rte_memory.h>
 #include <rte_cycles.h>
+#include <rte_compat.h>
 #include <rte_memzone.h>
 #include <rte_errno.h>
 #include <rte_string_fns.h>
@@ -168,6 +169,7 @@ rte_distributor_get_pkt(struct rte_distributor *d,
 	}
 	return count;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_get_pkt, , 17.05);
 
 int
 rte_distributor_return_pkt(struct rte_distributor *d,
@@ -197,6 +199,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 
 	return 0;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_return_pkt, , 17.05);
 
 /**** APIs called on distributor core ***/
 
@@ -476,6 +479,7 @@ rte_distributor_process(struct rte_distributor *d,
 
 	return num_mbufs;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_process, , 17.05);
 
 /* return to the caller, packets returned from workers */
 int
@@ -504,6 +508,7 @@ rte_distributor_returned_pkts(struct rte_distributor *d,
 
 	return retval;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_returned_pkts, , 17.05);
 
 /*
  * Return the number of packets in-flight in a distributor, i.e. packets
@@ -549,6 +554,7 @@ rte_distributor_flush(struct rte_distributor *d)
 
 	return flushed;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_flush, , 17.05);
 
 /* clears the internal returns array in the distributor */
 void
@@ -565,6 +571,7 @@ rte_distributor_clear_returns(struct rte_distributor *d)
 	for (wkr = 0; wkr < d->num_workers; wkr++)
 		d->bufs[wkr].retptr64[0] = 0;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_clear_returns, , 17.05);
 
 /* creates a distributor instance */
 struct rte_distributor *
@@ -638,3 +645,4 @@ rte_distributor_create(const char *name,
 
 	return d;
 }
+BIND_DEFAULT_SYMBOL(rte_distributor_create, , 17.05);
diff --git a/lib/librte_distributor/rte_distributor_v20.c b/lib/librte_distributor/rte_distributor_v20.c
index 1f406c5..bb6c5d7 100644
--- a/lib/librte_distributor/rte_distributor_v20.c
+++ b/lib/librte_distributor/rte_distributor_v20.c
@@ -38,6 +38,7 @@
 #include <rte_memory.h>
 #include <rte_memzone.h>
 #include <rte_errno.h>
+#include <rte_compat.h>
 #include <rte_string_fns.h>
 #include <rte_eal_memconfig.h>
 #include "rte_distributor_v20.h"
@@ -63,6 +64,7 @@ rte_distributor_request_pkt_v20(struct rte_distributor_v20 *d,
 		rte_pause();
 	buf->bufptr64 = req;
 }
+VERSION_SYMBOL(rte_distributor_request_pkt, _v20, 2.0);
 
 struct rte_mbuf *
 rte_distributor_poll_pkt_v20(struct rte_distributor_v20 *d,
@@ -76,6 +78,7 @@ rte_distributor_poll_pkt_v20(struct rte_distributor_v20 *d,
 	int64_t ret = buf->bufptr64 >> RTE_DISTRIB_FLAG_BITS;
 	return (struct rte_mbuf *)((uintptr_t)ret);
 }
+VERSION_SYMBOL(rte_distributor_poll_pkt, _v20, 2.0);
 
 struct rte_mbuf *
 rte_distributor_get_pkt_v20(struct rte_distributor_v20 *d,
@@ -87,6 +90,7 @@ rte_distributor_get_pkt_v20(struct rte_distributor_v20 *d,
 		rte_pause();
 	return ret;
 }
+VERSION_SYMBOL(rte_distributor_get_pkt, _v20, 2.0);
 
 int
 rte_distributor_return_pkt_v20(struct rte_distributor_v20 *d,
@@ -98,6 +102,7 @@ rte_distributor_return_pkt_v20(struct rte_distributor_v20 *d,
 	buf->bufptr64 = req;
 	return 0;
 }
+VERSION_SYMBOL(rte_distributor_return_pkt, _v20, 2.0);
 
 /**** APIs called on distributor core ***/
 
@@ -314,6 +319,7 @@ rte_distributor_process_v20(struct rte_distributor_v20 *d,
 	d->returns.count = ret_count;
 	return num_mbufs;
 }
+VERSION_SYMBOL(rte_distributor_process, _v20, 2.0);
 
 /* return to the caller, packets returned from workers */
 int
@@ -334,6 +340,7 @@ rte_distributor_returned_pkts_v20(struct rte_distributor_v20 *d,
 
 	return retval;
 }
+VERSION_SYMBOL(rte_distributor_returned_pkts, _v20, 2.0);
 
 /* return the number of packets in-flight in a distributor, i.e. packets
  * being workered on or queued up in a backlog. */
@@ -362,6 +369,7 @@ rte_distributor_flush_v20(struct rte_distributor_v20 *d)
 
 	return flushed;
 }
+VERSION_SYMBOL(rte_distributor_flush, _v20, 2.0);
 
 /* clears the internal returns array in the distributor */
 void
@@ -372,6 +380,7 @@ rte_distributor_clear_returns_v20(struct rte_distributor_v20 *d)
 	memset(d->returns.mbufs, 0, sizeof(d->returns.mbufs));
 #endif
 }
+VERSION_SYMBOL(rte_distributor_clear_returns, _v20, 2.0);
 
 /* creates a distributor instance */
 struct rte_distributor_v20 *
@@ -415,3 +424,4 @@ rte_distributor_create_v20(const char *name,
 
 	return d;
 }
+VERSION_SYMBOL(rte_distributor_create, _v20, 2.0);
diff --git a/lib/librte_distributor/rte_distributor_version.map b/lib/librte_distributor/rte_distributor_version.map
index 73fdc43..3a285b3 100644
--- a/lib/librte_distributor/rte_distributor_version.map
+++ b/lib/librte_distributor/rte_distributor_version.map
@@ -13,3 +13,17 @@ DPDK_2.0 {
 
 	local: *;
 };
+
+DPDK_17.05 {
+	global:
+
+	rte_distributor_clear_returns;
+	rte_distributor_create;
+	rte_distributor_flush;
+	rte_distributor_get_pkt;
+	rte_distributor_poll_pkt;
+	rte_distributor_process;
+	rte_distributor_request_pkt;
+	rte_distributor_return_pkt;
+	rte_distributor_returned_pkts;
+} DPDK_2.0;
-- 
2.7.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v8 0/18] distributor library performance enhancements
  2017-02-21  3:17  1%   ` [dpdk-dev] [PATCH v7 01/17] lib: rename legacy distributor lib files David Hunt
  2017-02-21 10:27  0%     ` Hunt, David
  2017-02-24 14:03  0%     ` Bruce Richardson
@ 2017-03-01  7:47  2%     ` David Hunt
  2017-03-01  7:47  1%       ` [dpdk-dev] [PATCH v8 01/18] lib: rename legacy distributor lib files David Hunt
  2017-03-01  7:47  3%       ` [dpdk-dev] [PATCH v8 " David Hunt
  2 siblings, 2 replies; 200+ results
From: David Hunt @ 2017-03-01  7:47 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson

This patch set aims to improve the throughput of the distributor library.

It uses a similar handshake mechanism to the previous version of
the library, in that bits are used to indicate when packets are ready
to be sent to a worker and ready to be returned from a worker. One main
difference is that instead of sending one packet in a cache line, it makes
use of the 7 free spaces in the same cache line in order to send up to
8 packets at a time to/from a worker.

The flow matching algorithm has had significant re-work, and now keeps an
array of inflight flows and an array of backlog flows, and matches incoming
flows to the inflight/backlog flows of all workers so that flow pinning to
workers can be maintained.

The flow match algorithm has both scalar and vector versions, and a
function pointer is used to select the most appropriate version at run time,
depending on the presence of the SSE2 CPU flag. On non-x86 platforms,
the scalar match function is selected, which should still give a good boost
in performance over the non-burst API.
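
A worker loop under the new burst API then looks roughly as follows (a
sketch assuming the signatures from this series; "quit" and "worker_id"
are the application's own):

    struct rte_mbuf *bufs[8];
    unsigned int num = 0;

    while (!quit) {
        /* hand back the previous burst, receive the next one */
        num = rte_distributor_get_pkt(d, worker_id, bufs, bufs, num);
        /* process bufs[0..num-1] here */
    }
    rte_distributor_return_pkt(d, worker_id, bufs, num);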

v8 changes:
   * Changed the patch set to have a more logical order of the changes,
     but the end result is basically the same.
   * Fixed broken shared library build.
   * Split the updates to the example app down further
   * No longer changes the test app and sample app to use a temporary
     API.
   * No longer temporarily re-names the functions in the
     version.map file.

v7 changes:
   * Reorganised the patches so there's a more natural progression in the
     changes, and divided them into easier-to-review chunks.
   * Previous versions of this patch set were effectively two APIs.
     We now have a single API. Legacy functionality can
     be used by calling the rte_distributor_create API with the
     RTE_DISTRIBUTOR_SINGLE flag when creating a distributor instance.
   * Added symbol versioning for old API so that ABI is preserved.

v6 changes:
   * Fixed intermittent segfault where num pkts not divisible
     by BURST_SIZE
   * Cleanup due to review comments on mailing list
   * Renamed _priv.h to _private.h.

v5 changes:
   * Removed some un-needed code around retries in worker API calls
   * Cleanup due to review comments on mailing list
   * Cleanup of non-x86 platform compilation, fallback to scalar match

v4 changes:
   * fixed issue building shared libraries

v3 changes:
  * Addressed mailing list review comments
  * Test code removal
  * Split out SSE match into separate file to facilitate NEON addition
  * Cleaned up conditional compilation flags for SSE2
  * Addressed c99 style compilation errors
  * rebased on latest head (Jan 2 2017, Happy New Year to all)

v2 changes:
  * Created a common distributor_priv.h header file with common
    definitions and structures.
  * Added a scalar version so it can be built and used on machines without
    sse2 instruction set
  * Added unit autotests
  * Added perf autotest

Notes:
   Apps must now work in bursts, as up to 8 packets are given to a worker
   at a time.
   For matching performance, flow IDs are 15 bits.
   If 32-bit flow IDs are required, use the packet-at-a-time (SINGLE)
   mode.
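
   For example, selecting the legacy behaviour (a sketch; this assumes the
   flag is passed as an extra create-time argument, per the v7 notes above):
       d = rte_distributor_create("dist", rte_socket_id(), num_workers,
                                  RTE_DISTRIBUTOR_SINGLE);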

Performance Gains
   2.2GHz Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
   2 x XL710 40GbE NICs to 2 x 40Gbps traffic generator channels, 64B packets
   separate cores for rx, tx, distributor
    1 worker  - up to 4.8x
    4 workers - up to 2.9x
    8 workers - up to 1.8x
   12 workers - up to 2.1x
   16 workers - up to 1.8x

[01/18] lib: rename legacy distributor lib files
[02/18] lib: create private header file
[03/18] lib: add new burst oriented distributor structs
[04/18] lib: add new distributor code
[05/18] lib: add SIMD flow matching to distributor
[06/18] test/distributor: extra params for autotests
[07/18] lib: switch distributor over to new API
[08/18] lib: make v20 header file private
[09/18] lib: add symbol versioning to distributor
[10/18] test: test single and burst distributor API
[11/18] test: add perf test for distributor burst mode
[12/18] examples/distributor: allow for extra stats
[13/18] sample: distributor: wait for ports to come up
[14/18] examples/distributor: give distributor a core
[15/18] examples/distributor: limit number of Tx rings
[16/18] examples/distributor: give Rx thread a core
[17/18] doc: distributor library changes for new burst API
[18/18] maintainers: add to distributor lib maintainers

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v8 01/18] lib: rename legacy distributor lib files
  2017-03-01  7:47  2%     ` [dpdk-dev] [PATCH v8 0/18] distributor library performance enhancements David Hunt
@ 2017-03-01  7:47  1%       ` David Hunt
  2017-03-06  9:10  2%         ` [dpdk-dev] [PATCH v9 00/18] distributor lib performance enhancements David Hunt
  2017-03-01  7:47  3%       ` [dpdk-dev] [PATCH v8 " David Hunt
  1 sibling, 1 reply; 200+ results
From: David Hunt @ 2017-03-01  7:47 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, David Hunt

Move files out of the way so that we can replace them with new
versions of the distributor library. Files are named in
such a way as to match the symbol versioning that we will
apply for backward ABI compatibility.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/Makefile                    |   3 +-
 lib/librte_distributor/rte_distributor.h           | 210 +-----------------
 .../{rte_distributor.c => rte_distributor_v20.c}   |   2 +-
 lib/librte_distributor/rte_distributor_v20.h       | 247 +++++++++++++++++++++
 4 files changed, 251 insertions(+), 211 deletions(-)
 rename lib/librte_distributor/{rte_distributor.c => rte_distributor_v20.c} (99%)
 create mode 100644 lib/librte_distributor/rte_distributor_v20.h

diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
index 4c9af17..b314ca6 100644
--- a/lib/librte_distributor/Makefile
+++ b/lib/librte_distributor/Makefile
@@ -42,10 +42,11 @@ EXPORT_MAP := rte_distributor_version.map
 LIBABIVER := 1
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor.c
+SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor_v20.c
 
 # install this header file
 SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include := rte_distributor.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include += rte_distributor_v20.h
 
 # this lib needs eal
 DEPDIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += lib/librte_eal
diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
index 7d36bc8..e41d522 100644
--- a/lib/librte_distributor/rte_distributor.h
+++ b/lib/librte_distributor/rte_distributor.h
@@ -34,214 +34,6 @@
 #ifndef _RTE_DISTRIBUTE_H_
 #define _RTE_DISTRIBUTE_H_
 
-/**
- * @file
- * RTE distributor
- *
- * The distributor is a component which is designed to pass packets
- * one-at-a-time to workers, with dynamic load balancing.
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#define RTE_DISTRIBUTOR_NAMESIZE 32 /**< Length of name for instance */
-
-struct rte_distributor;
-struct rte_mbuf;
-
-/**
- * Function to create a new distributor instance
- *
- * Reserves the memory needed for the distributor operation and
- * initializes the distributor to work with the configured number of workers.
- *
- * @param name
- *   The name to be given to the distributor instance.
- * @param socket_id
- *   The NUMA node on which the memory is to be allocated
- * @param num_workers
- *   The maximum number of workers that will request packets from this
- *   distributor
- * @return
- *   The newly created distributor instance
- */
-struct rte_distributor *
-rte_distributor_create(const char *name, unsigned socket_id,
-		unsigned num_workers);
-
-/*  *** APIS to be called on the distributor lcore ***  */
-/*
- * The following APIs are the public APIs which are designed for use on a
- * single lcore which acts as the distributor lcore for a given distributor
- * instance. These functions cannot be called on multiple cores simultaneously
- * without using locking to protect access to the internals of the distributor.
- *
- * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
- * for the same distributor instance, otherwise deadlock will result.
- */
-
-/**
- * Process a set of packets by distributing them among workers that request
- * packets. The distributor will ensure that no two packets that have the
- * same flow id, or tag, in the mbuf will be procesed at the same time.
- *
- * The user is advocated to set tag for each mbuf before calling this function.
- * If user doesn't set the tag, the tag value can be various values depending on
- * driver implementation and configuration.
- *
- * This is not multi-thread safe and should only be called on a single lcore.
- *
- * @param d
- *   The distributor instance to be used
- * @param mbufs
- *   The mbufs to be distributed
- * @param num_mbufs
- *   The number of mbufs in the mbufs array
- * @return
- *   The number of mbufs processed.
- */
-int
-rte_distributor_process(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned num_mbufs);
-
-/**
- * Get a set of mbufs that have been returned to the distributor by workers
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- * @param mbufs
- *   The mbufs pointer array to be filled in
- * @param max_mbufs
- *   The size of the mbufs array
- * @return
- *   The number of mbufs returned in the mbufs array.
- */
-int
-rte_distributor_returned_pkts(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned max_mbufs);
-
-/**
- * Flush the distributor component, so that there are no in-flight or
- * backlogged packets awaiting processing
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- * @return
- *   The number of queued/in-flight packets that were completed by this call.
- */
-int
-rte_distributor_flush(struct rte_distributor *d);
-
-/**
- * Clears the array of returned packets used as the source for the
- * rte_distributor_returned_pkts() API call.
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- */
-void
-rte_distributor_clear_returns(struct rte_distributor *d);
-
-/*  *** APIS to be called on the worker lcores ***  */
-/*
- * The following APIs are the public APIs which are designed for use on
- * multiple lcores which act as workers for a distributor. Each lcore should use
- * a unique worker id when requesting packets.
- *
- * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
- * for the same distributor instance, otherwise deadlock will result.
- */
-
-/**
- * API called by a worker to get a new packet to process. Any previous packet
- * given to the worker is assumed to have completed processing, and may be
- * optionally returned to the distributor via the oldpkt parameter.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param oldpkt
- *   The previous packet, if any, being processed by the worker
- *
- * @return
- *   A new packet to be processed by the worker thread.
- */
-struct rte_mbuf *
-rte_distributor_get_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt);
-
-/**
- * API called by a worker to return a completed packet without requesting a
- * new packet, for example, because a worker thread is shutting down
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param mbuf
- *   The previous packet being processed by the worker
- */
-int
-rte_distributor_return_pkt(struct rte_distributor *d, unsigned worker_id,
-		struct rte_mbuf *mbuf);
-
-/**
- * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
- * processing, and may be optionally returned to the distributor via
- * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt(), this function does not wait for a new
- * packet to be provided by the distributor.
- *
- * NOTE: after calling this function, rte_distributor_poll_pkt() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt()
- * API should *not* be used to try and retrieve the new packet.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- * @param oldpkt
- *   The previous packet, if any, being processed by the worker
- */
-void
-rte_distributor_request_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt);
-
-/**
- * API called by a worker to check for a new packet that was previously
- * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
- * not yet been fulfilled by the distributor.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less that num_workers passed
- *   at distributor creation time.
- *
- * @return
- *   A new packet to be processed by the worker thread, or NULL if no
- *   packet is yet available.
- */
-struct rte_mbuf *
-rte_distributor_poll_pkt(struct rte_distributor *d,
-		unsigned worker_id);
-
-#ifdef __cplusplus
-}
-#endif
+#include <rte_distributor_v20.h>
 
 #endif
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor_v20.c
similarity index 99%
rename from lib/librte_distributor/rte_distributor.c
rename to lib/librte_distributor/rte_distributor_v20.c
index f3f778c..b890947 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor_v20.c
@@ -40,7 +40,7 @@
 #include <rte_errno.h>
 #include <rte_string_fns.h>
 #include <rte_eal_memconfig.h>
-#include "rte_distributor.h"
+#include "rte_distributor_v20.h"
 
 #define NO_FLAGS 0
 #define RTE_DISTRIB_PREFIX "DT_"
diff --git a/lib/librte_distributor/rte_distributor_v20.h b/lib/librte_distributor/rte_distributor_v20.h
new file mode 100644
index 0000000..b69aa27
--- /dev/null
+++ b/lib/librte_distributor/rte_distributor_v20.h
@@ -0,0 +1,247 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_DISTRIB_V20_H_
+#define _RTE_DISTRIB_V20_H_
+
+/**
+ * @file
+ * RTE distributor
+ *
+ * The distributor is a component which is designed to pass packets
+ * one-at-a-time to workers, with dynamic load balancing.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_DISTRIBUTOR_NAMESIZE 32 /**< Length of name for instance */
+
+struct rte_distributor;
+struct rte_mbuf;
+
+/**
+ * Function to create a new distributor instance
+ *
+ * Reserves the memory needed for the distributor operation and
+ * initializes the distributor to work with the configured number of workers.
+ *
+ * @param name
+ *   The name to be given to the distributor instance.
+ * @param socket_id
+ *   The NUMA node on which the memory is to be allocated
+ * @param num_workers
+ *   The maximum number of workers that will request packets from this
+ *   distributor
+ * @return
+ *   The newly created distributor instance
+ */
+struct rte_distributor *
+rte_distributor_create(const char *name, unsigned int socket_id,
+		unsigned int num_workers);
+
+/*  *** APIS to be called on the distributor lcore ***  */
+/*
+ * The following APIs are the public APIs which are designed for use on a
+ * single lcore which acts as the distributor lcore for a given distributor
+ * instance. These functions cannot be called on multiple cores simultaneously
+ * without using locking to protect access to the internals of the distributor.
+ *
+ * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
+ * for the same distributor instance, otherwise deadlock will result.
+ */
+
+/**
+ * Process a set of packets by distributing them among workers that request
+ * packets. The distributor will ensure that no two packets that have the
+ * same flow id, or tag, in the mbuf will be processed at the same time.
+ *
+ * The user is advocated to set tag for each mbuf before calling this function.
+ * If user doesn't set the tag, the tag value can be various values depending on
+ * driver implementation and configuration.
+ *
+ * This is not multi-thread safe and should only be called on a single lcore.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param mbufs
+ *   The mbufs to be distributed
+ * @param num_mbufs
+ *   The number of mbufs in the mbufs array
+ * @return
+ *   The number of mbufs processed.
+ */
+int
+rte_distributor_process(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int num_mbufs);
+
+/**
+ * Get a set of mbufs that have been returned to the distributor by workers
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param mbufs
+ *   The mbufs pointer array to be filled in
+ * @param max_mbufs
+ *   The size of the mbufs array
+ * @return
+ *   The number of mbufs returned in the mbufs array.
+ */
+int
+rte_distributor_returned_pkts(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned int max_mbufs);
+
+/**
+ * Flush the distributor component, so that there are no in-flight or
+ * backlogged packets awaiting processing
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @return
+ *   The number of queued/in-flight packets that were completed by this call.
+ */
+int
+rte_distributor_flush(struct rte_distributor *d);
+
+/**
+ * Clears the array of returned packets used as the source for the
+ * rte_distributor_returned_pkts() API call.
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ */
+void
+rte_distributor_clear_returns(struct rte_distributor *d);
+
+/*  *** APIS to be called on the worker lcores ***  */
+/*
+ * The following APIs are the public APIs which are designed for use on
+ * multiple lcores which act as workers for a distributor. Each lcore should use
+ * a unique worker id when requesting packets.
+ *
+ * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
+ * for the same distributor instance, otherwise deadlock will result.
+ */
+
+/**
+ * API called by a worker to get a new packet to process. Any previous packet
+ * given to the worker is assumed to have completed processing, and may be
+ * optionally returned to the distributor via the oldpkt parameter.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less that num_workers passed
+ *   at distributor creation time.
+ * @param oldpkt
+ *   The previous packet, if any, being processed by the worker
+ *
+ * @return
+ *   A new packet to be processed by the worker thread.
+ */
+struct rte_mbuf *
+rte_distributor_get_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf *oldpkt);
+
+/**
+ * API called by a worker to return a completed packet without requesting a
+ * new packet, for example, because a worker thread is shutting down
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less that num_workers passed
+ *   at distributor creation time.
+ * @param mbuf
+ *   The previous packet being processed by the worker
+ */
+int
+rte_distributor_return_pkt(struct rte_distributor *d, unsigned int worker_id,
+		struct rte_mbuf *mbuf);
+
+/**
+ * API called by a worker to request a new packet to process.
+ * Any previous packet given to the worker is assumed to have completed
+ * processing, and may be optionally returned to the distributor via
+ * the oldpkt parameter.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for a new
+ * packet to be provided by the distributor.
+ *
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packet requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packet.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less that num_workers passed
+ *   at distributor creation time.
+ * @param oldpkt
+ *   The previous packet, if any, being processed by the worker
+ */
+void
+rte_distributor_request_pkt(struct rte_distributor *d,
+		unsigned int worker_id, struct rte_mbuf *oldpkt);
+
+/**
+ * API called by a worker to check for a new packet that was previously
+ * requested by a call to rte_distributor_request_pkt(). It does not wait
+ * for the new packet to be available, but returns NULL if the request has
+ * not yet been fulfilled by the distributor.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ *
+ * @return
+ *   A new packet to be processed by the worker thread, or NULL if no
+ *   packet is yet available.
+ */
+struct rte_mbuf *
+rte_distributor_poll_pkt(struct rte_distributor *d,
+		unsigned int worker_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
2.7.4

^ permalink raw reply	[relevance 1%]
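
For a usage illustration of the worker-side and distributor-side calls
documented in the header above, a minimal sketch follows. It is not part of
the patchset: struct worker_params, do_work(), do_other_work() and
quit_signal are invented placeholders, and error handling is omitted.

#include <rte_distributor.h>
#include <rte_mbuf.h>

extern void do_work(struct rte_mbuf *pkt);	/* application processing */
extern void do_other_work(void);		/* unrelated per-lcore work */

static volatile int quit_signal;		/* set by the main lcore */

struct worker_params {		/* hypothetical per-worker argument */
	struct rte_distributor *d;
	unsigned int id;	/* must be < num_workers given at create time */
};

/* Blocking worker loop: hand back the previous packet, wait for a new one. */
static int
worker_thread(void *arg)
{
	struct worker_params *p = arg;
	struct rte_mbuf *pkt = NULL;

	while (!quit_signal) {
		pkt = rte_distributor_get_pkt(p->d, p->id, pkt);
		do_work(pkt);
	}
	if (pkt != NULL)	/* return the last packet on shutdown */
		rte_distributor_return_pkt(p->d, p->id, pkt);
	return 0;
}

/* Non-blocking variant using the request/poll pair, allowing the worker
 * to interleave other work while the distributor finds it a packet. */
static int
worker_thread_nonblocking(void *arg)
{
	struct worker_params *p = arg;
	struct rte_mbuf *pkt = NULL;

	while (!quit_signal) {
		rte_distributor_request_pkt(p->d, p->id, pkt);
		do {
			do_other_work();
			pkt = rte_distributor_poll_pkt(p->d, p->id);
		} while (pkt == NULL && !quit_signal);
		if (pkt != NULL)
			do_work(pkt);
	}
	if (pkt != NULL)
		rte_distributor_return_pkt(p->d, p->id, pkt);
	return 0;
}

/* On the distributor lcore, at shutdown: complete in-flight packets and
 * collect whatever the workers handed back via return_pkt(). */
static void
drain_distributor(struct rte_distributor *d)
{
	struct rte_mbuf *mbufs[64];
	int n;

	rte_distributor_flush(d);
	n = rte_distributor_returned_pkts(d, mbufs, 64);
	while (n-- > 0)
		rte_pktmbuf_free(mbufs[n]);
	rte_distributor_clear_returns(d);
}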

* Re: [dpdk-dev] [PATCH v2] mk: Provide option to set Major ABI version
  2017-03-01  9:34 20%           ` [dpdk-dev] [PATCH v2] " Christian Ehrhardt
@ 2017-03-01 14:35  4%             ` Jan Blunck
  2017-03-16 17:19  4%               ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Jan Blunck @ 2017-03-01 14:35 UTC (permalink / raw)
  To: Christian Ehrhardt
  Cc: dev, cjcollier @ linuxfoundation . org, ricardo.salveti, Luca Boccassi

On Wed, Mar 1, 2017 at 10:34 AM, Christian Ehrhardt
<christian.ehrhardt@canonical.com> wrote:
> Downstreams might want to provide different DPDK releases at the same
> time to support multiple consumers of DPDK linked against older and newer
> sonames.
>
> Also, due to the interdependencies that DPDK libraries can have,
> applications might end up with an executable space in which multiple
> versions of a library are mapped by ld.so.
>
> Think of LibA that got an ABI bump and LibB that did not get an ABI bump
> but depends on LibA.
>
>     Application
>     \-> LibA.old
>     \-> LibB.new -> LibA.new
>
> That is a conflict which can be avoided by setting CONFIG_RTE_MAJOR_ABI.
> If set CONFIG_RTE_MAJOR_ABI overwrites any LIBABIVER value.
> An example might be ``CONFIG_RTE_MAJOR_ABI=16.11`` which will make all
> libraries librte<?>.so.16.11 instead of librte<?>.so.<LIBABIVER>.
>
> We need to cut arbitrarily long strings after the .so now, and this would
> work for any ABI version in LIBABIVER:
>   $(Q)ln -s -f $< $(patsubst %.$(LIBABIVER),%,$@)
> But using the following instead additionally allows us to simplify the
> Makefile for the CONFIG_RTE_NEXT_ABI case.
>   $(Q)ln -s -f $< $(shell echo $@ | sed 's/\.so.*/.so/')
>
> Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> ---
>  config/common_base                     |  5 +++++
>  doc/guides/contributing/versioning.rst | 25 +++++++++++++++++++++++++
>  mk/rte.lib.mk                          | 14 +++++++++-----
>  3 files changed, 39 insertions(+), 5 deletions(-)
>
> diff --git a/config/common_base b/config/common_base
> index aeee13e..37aa1e1 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -75,6 +75,11 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
>  CONFIG_RTE_NEXT_ABI=y
>
>  #
> +# Major ABI to overwrite library specific LIBABIVER
> +#
> +CONFIG_RTE_MAJOR_ABI=
> +
> +#
>  # Machine's cache line size
>  #
>  CONFIG_RTE_CACHE_LINE_SIZE=64
> diff --git a/doc/guides/contributing/versioning.rst b/doc/guides/contributing/versioning.rst
> index fbc44a7..8aaf370 100644
> --- a/doc/guides/contributing/versioning.rst
> +++ b/doc/guides/contributing/versioning.rst
> @@ -133,6 +133,31 @@ The macros exported are:
>    fully qualified function ``p``, so that if a symbol becomes versioned, it
>    can still be mapped back to the public symbol name.
>
> +Setting a Major ABI version
> +---------------------------
> +
> +Downstreams might want to provide different DPDK releases at the same time to
> +support multiple consumers of DPDK linked against older and newer sonames.
> +
> +Also, due to the interdependencies that DPDK libraries can have, applications
> +might end up with an executable space in which multiple versions of a library
> +are mapped by ld.so.
> +
> +Think of LibA that got an ABI bump and LibB that did not get an ABI bump but
> +depends on LibA.
> +
> +.. note::
> +
> +    Application
> +    \-> LibA.old
> +    \-> LibB.new -> LibA.new
> +
> +That is a conflict which can be avoided by setting ``CONFIG_RTE_MAJOR_ABI``.
> +If set, the value of ``CONFIG_RTE_MAJOR_ABI`` overwrites all per-library
> +versions otherwise defined in the libraries' ``LIBABIVER``.
> +An example might be ``CONFIG_RTE_MAJOR_ABI=16.11`` which will make all libraries
> +``librte<?>.so.16.11`` instead of ``librte<?>.so.<LIBABIVER>``.
> +
>  Examples of ABI Macro use
>  -------------------------
>
> diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
> index 33a5f5a..1ffbf42 100644
> --- a/mk/rte.lib.mk
> +++ b/mk/rte.lib.mk
> @@ -40,12 +40,20 @@ EXTLIB_BUILD ?= n
>  # VPATH contains at least SRCDIR
>  VPATH += $(SRCDIR)
>
> +ifneq ($(CONFIG_RTE_MAJOR_ABI),)
> +ifneq ($(LIBABIVER),)
> +LIBABIVER := $(CONFIG_RTE_MAJOR_ABI)
> +endif
> +endif
> +
>  ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
>  LIB := $(patsubst %.a,%.so.$(LIBABIVER),$(LIB))
>  ifeq ($(EXTLIB_BUILD),n)
> +ifeq ($(CONFIG_RTE_MAJOR_ABI),)
>  ifeq ($(CONFIG_RTE_NEXT_ABI),y)
>  LIB := $(LIB).1
>  endif
> +endif
>  CPU_LDFLAGS += --version-script=$(SRCDIR)/$(EXPORT_MAP)
>  endif
>  endif
> @@ -156,11 +164,7 @@ $(RTE_OUTPUT)/lib/$(LIB): $(LIB)
>         @[ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib
>         $(Q)cp -f $(LIB) $(RTE_OUTPUT)/lib
>  ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
> -ifeq ($(CONFIG_RTE_NEXT_ABI)$(EXTLIB_BUILD),yn)
> -       $(Q)ln -s -f $< $(basename $(basename $@))
> -else
> -       $(Q)ln -s -f $< $(basename $@)
> -endif
> +       $(Q)ln -s -f $< $(shell echo $@ | sed 's/\.so.*/.so/')
>  endif
>
>  #
> --
> 2.7.4
>

Reviewed-by: Jan Blunck <jblunck@infradead.org>
Tested-by: Jan Blunck <jblunck@infradead.org>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 0/1] net/mlx5: add TSO support
  @ 2017-03-01 11:11  3% ` Shahaf Shuler
  0 siblings, 0 replies; 200+ results
From: Shahaf Shuler @ 2017-03-01 11:11 UTC (permalink / raw)
  To: nelio.laranjeiro, adrien.mazarguil; +Cc: dev

on v2:
* Dropped patches:
  [PATCH 1/4] ethdev: add Tx offload limitations.
  [PATCH 2/4] ethdev: add TSO disable flag.
  [PATCH 3/4] app/testpmd: add TSO disable to test options.
* The changes introduced above conflict with the tx_prepare API and break the ABI.
  A proposal to disable optional offloads by default, and a way to reflect HW offload
  limitations to the application, will be addressed in a different commit.
* TSO support modification 

[PATCH v2 1/1] net/mlx5: add hardware TSO support

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v7 01/17] lib: rename legacy distributor lib files
  2017-02-24 14:03  0%     ` Bruce Richardson
@ 2017-03-01  9:55  0%       ` Hunt, David
  0 siblings, 0 replies; 200+ results
From: Hunt, David @ 2017-03-01  9:55 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev



On 24/2/2017 2:03 PM, Bruce Richardson wrote:
> On Tue, Feb 21, 2017 at 03:17:37AM +0000, David Hunt wrote:
>> Move files out of the way so that we can replace them with new
>> versions of the distributor library. Files are named in
>> such a way as to match the symbol versioning that we will
>> apply for backward ABI compatibility.
>>
>> Signed-off-by: David Hunt <david.hunt@intel.com>
>> ---
>>   app/test/test_distributor.c                  |   2 +-
>>   app/test/test_distributor_perf.c             |   2 +-
>>   examples/distributor/main.c                  |   2 +-
>>   lib/librte_distributor/Makefile              |   4 +-
>>   lib/librte_distributor/rte_distributor.c     | 487 ---------------------------
>>   lib/librte_distributor/rte_distributor.h     | 247 --------------
>>   lib/librte_distributor/rte_distributor_v20.c | 487 +++++++++++++++++++++++++++
>>   lib/librte_distributor/rte_distributor_v20.h | 247 ++++++++++++++
> Rather than changing the unit tests and example applications, I think
> this patch would be better with a new rte_distributor.h file which
> simply does "#include  <rte_distributor_v20.h>". Alternatively, I
> recently upstreamed a patch, which went into 17.02, to allow symlinks in
> the folder so you could create a symlink to the renamed file.
>
> /Bruce

Thanks for the review, Bruce. I've just finished reworking the patchset
based on your review comments (including later emails) and will post soon.

Regards,
Dave.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting
  2017-02-28 17:54  0%           ` Jerin Jacob
@ 2017-03-01  9:47  0%             ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-03-01  9:47 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: olivier.matz, dev

On Tue, Feb 28, 2017 at 11:24:25PM +0530, Jerin Jacob wrote:
> On Tue, Feb 28, 2017 at 01:52:26PM +0000, Bruce Richardson wrote:
> > On Tue, Feb 28, 2017 at 05:38:34PM +0530, Jerin Jacob wrote:
> > > On Tue, Feb 28, 2017 at 11:57:03AM +0000, Bruce Richardson wrote:
> > > > On Tue, Feb 28, 2017 at 05:05:13PM +0530, Jerin Jacob wrote:
> > > > > On Thu, Feb 23, 2017 at 05:23:54PM +0000, Bruce Richardson wrote:
> > > > > > Users compiling DPDK should not need to know or care about the arrangement
> > > > > > of cachelines in the rte_ring structure. Therefore just remove the build
> > > > > > option and set the structures to be always split. For improved
> > > > > > performance use 128B rather than 64B alignment since it stops the producer
> > > > > > and consumer data being on adjacent cachelines.
> > > > > > 
> > > > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > > > ---
> > > > > >  config/common_base                     | 1 -
> > > > > >  doc/guides/rel_notes/release_17_05.rst | 6 ++++++
> > > > > >  lib/librte_ring/rte_ring.c             | 2 --
> > > > > >  lib/librte_ring/rte_ring.h             | 8 ++------
> > > > > >  4 files changed, 8 insertions(+), 9 deletions(-)
> > > > > > 
> > > > > > diff --git a/config/common_base b/config/common_base
> > > > > > index aeee13e..099ffda 100644
> > > > > > --- a/config/common_base
> > > > > > +++ b/config/common_base
> > > > > > @@ -448,7 +448,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
> > > > > >  #
> > > > > >  CONFIG_RTE_LIBRTE_RING=y
> > > > > >  CONFIG_RTE_LIBRTE_RING_DEBUG=n
> > > > > > -CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> > > > > >  CONFIG_RTE_RING_PAUSE_REP_COUNT=0
> > > > > >  
> > > > > >  #
> > > > > > diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
> > > > > > index e25ea9f..ea45e0c 100644
> > > > > > --- a/doc/guides/rel_notes/release_17_05.rst
> > > > > > +++ b/doc/guides/rel_notes/release_17_05.rst
> > > > > > @@ -110,6 +110,12 @@ API Changes
> > > > > >     Also, make sure to start the actual text at the margin.
> > > > > >     =========================================================
> > > > > >  
> > > > > > +* **Reworked rte_ring library**
> > > > > > +
> > > > > > +  The rte_ring library has been reworked and updated. The following changes
> > > > > > +  have been made to it:
> > > > > > +
> > > > > > +  * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
> > > > > >  
> > > > > >  ABI Changes
> > > > > >  -----------
> > > > > > diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> > > > > > index ca0a108..4bc6da1 100644
> > > > > > --- a/lib/librte_ring/rte_ring.c
> > > > > > +++ b/lib/librte_ring/rte_ring.c
> > > > > > @@ -127,10 +127,8 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
> > > > > >  	/* compilation-time checks */
> > > > > >  	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
> > > > > >  			  RTE_CACHE_LINE_MASK) != 0);
> > > > > > -#ifdef RTE_RING_SPLIT_PROD_CONS
> > > > > >  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, cons) &
> > > > > >  			  RTE_CACHE_LINE_MASK) != 0);
> > > > > > -#endif
> > > > > >  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
> > > > > >  			  RTE_CACHE_LINE_MASK) != 0);
> > > > > >  #ifdef RTE_LIBRTE_RING_DEBUG
> > > > > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > > > > > index 72ccca5..04fe667 100644
> > > > > > --- a/lib/librte_ring/rte_ring.h
> > > > > > +++ b/lib/librte_ring/rte_ring.h
> > > > > > @@ -168,7 +168,7 @@ struct rte_ring {
> > > > > >  		uint32_t mask;           /**< Mask (size-1) of ring. */
> > > > > >  		volatile uint32_t head;  /**< Producer head. */
> > > > > >  		volatile uint32_t tail;  /**< Producer tail. */
> > > > > > -	} prod __rte_cache_aligned;
> > > > > > +	} prod __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
> > > > > 
> > > > > I think we need to use RTE_CACHE_LINE_MIN_SIZE instead of
> > > > > RTE_CACHE_LINE_SIZE for alignment here. PPC and ThunderX1 targets have a
> > > > > cache line size of 128B.
> > > > > 
> > > > Sure.
> > > > 
> > > > However, can you perhaps try a performance test and check to see if
> > > > there is a performance difference between the two values before I change
> > > > it? In my tests I see improved performance by having an extra blank
> > > > cache-line between the producer and consumer data.
> > > 
> > > Sure. Which test are you running to measure the performance difference?
> > > Is it app/test/test_ring_perf.c?
> > > 
> > > > 
> > Yep, just the basic ring perf test. I look mostly at the core-to-core
> > numbers, since hyperthread-to-hyperthread or NUMA socket to NUMA socket
> > would be far less common use cases IMHO.
> 
> Performance test results show a regression with the RTE_CACHE_LINE_MIN_SIZE
> scheme in some use cases and higher performance in others (testing using
> two physical cores).
> 
> 
> # base code
> RTE>>ring_perf_autotest
> ### Testing single element and burst enq/deq ###
> SP/SC single enq/dequeue: 84
> MP/MC single enq/dequeue: 301
> SP/SC burst enq/dequeue (size: 8): 20
> MP/MC burst enq/dequeue (size: 8): 46
> SP/SC burst enq/dequeue (size: 32): 12
> MP/MC burst enq/dequeue (size: 32): 18
> 
> ### Testing empty dequeue ###
> SC empty dequeue: 7.11
> MC empty dequeue: 12.15
> 
> ### Testing using a single lcore ###
> SP/SC bulk enq/dequeue (size: 8): 19.08
> MP/MC bulk enq/dequeue (size: 8): 46.28
> SP/SC bulk enq/dequeue (size: 32): 11.89
> MP/MC bulk enq/dequeue (size: 32): 18.84
> 
> ### Testing using two physical cores ###
> SP/SC bulk enq/dequeue (size: 8): 37.42
> MP/MC bulk enq/dequeue (size: 8): 73.32
> SP/SC bulk enq/dequeue (size: 32): 18.69
> MP/MC bulk enq/dequeue (size: 32): 24.59
> Test OK
> 
> # with ring rework patch
> RTE>>ring_perf_autotest
> ### Testing single element and burst enq/deq ###
> SP/SC single enq/dequeue: 84
> MP/MC single enq/dequeue: 301
> SP/SC burst enq/dequeue (size: 8): 19
> MP/MC burst enq/dequeue (size: 8): 45
> SP/SC burst enq/dequeue (size: 32): 11
> MP/MC burst enq/dequeue (size: 32): 18
> 
> ### Testing empty dequeue ###
> SC empty dequeue: 7.10
> MC empty dequeue: 12.15
> 
> ### Testing using a single lcore ###
> SP/SC bulk enq/dequeue (size: 8): 18.59
> MP/MC bulk enq/dequeue (size: 8): 45.49
> SP/SC bulk enq/dequeue (size: 32): 11.67
> MP/MC bulk enq/dequeue (size: 32): 18.65
> 
> ### Testing using two physical cores ###
> SP/SC bulk enq/dequeue (size: 8): 37.41
> MP/MC bulk enq/dequeue (size: 8): 72.98
> SP/SC bulk enq/dequeue (size: 32): 18.69
> MP/MC bulk enq/dequeue (size: 32): 24.59
> Test OK
> RTE>>
> 
> # with ring rework patch + cache-line size change to one on 128BCL target
> RTE>>ring_perf_autotest
> ### Testing single element and burst enq/deq ###
> SP/SC single enq/dequeue: 90
> MP/MC single enq/dequeue: 317
> SP/SC burst enq/dequeue (size: 8): 20
> MP/MC burst enq/dequeue (size: 8): 48
> SP/SC burst enq/dequeue (size: 32): 11
> MP/MC burst enq/dequeue (size: 32): 18
> 
> ### Testing empty dequeue ###
> SC empty dequeue: 8.10
> MC empty dequeue: 11.15
> 
> ### Testing using a single lcore ###
> SP/SC bulk enq/dequeue (size: 8): 20.24
> MP/MC bulk enq/dequeue (size: 8): 48.43
> SP/SC bulk enq/dequeue (size: 32): 11.01
> MP/MC bulk enq/dequeue (size: 32): 18.43
> 
> ### Testing using two physical cores ###
> SP/SC bulk enq/dequeue (size: 8): 25.92
> MP/MC bulk enq/dequeue (size: 8): 69.76
> SP/SC bulk enq/dequeue (size: 32): 14.27
> MP/MC bulk enq/dequeue (size: 32): 22.94
> Test OK
> RTE>>

So, given that there is not much difference here, is MIN_SIZE, i.e. the
forced 64B, your preference rather than the actual cache line size?

/Bruce

^ permalink raw reply	[relevance 0%]
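
To make the layout being measured above concrete: the point of the
__rte_aligned(RTE_CACHE_LINE_SIZE * 2) change is that the producer and
consumer head/tail pairs not only sit on different cache lines, but also
have a full empty line between them, so the adjacent-cache-line hardware
prefetcher cannot couple the two. A standalone C11 sketch of the same
technique (invented names, not the actual rte_ring definition):

#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64	/* assumed x86 value; 128B on PPC/ThunderX1 */

struct toy_ring {
	alignas(CACHE_LINE_SIZE * 2) struct {
		uint32_t size;
		uint32_t mask;
		volatile uint32_t head;
		volatile uint32_t tail;
	} prod;
	alignas(CACHE_LINE_SIZE * 2) struct {
		uint32_t size;
		uint32_t mask;
		volatile uint32_t head;
		volatile uint32_t tail;
	} cons;
};

/* Equivalents of the RTE_BUILD_BUG_ON() checks in rte_ring_init(). */
static_assert(offsetof(struct toy_ring, prod) % CACHE_LINE_SIZE == 0,
	      "prod is not cache-line aligned");
static_assert(offsetof(struct toy_ring, cons) % CACHE_LINE_SIZE == 0,
	      "cons is not cache-line aligned");
/* The 2x alignment leaves one full spacer line between prod and cons. */
static_assert(offsetof(struct toy_ring, cons) == 2 * CACHE_LINE_SIZE,
	      "no spacer cache line between prod and cons");

On a 128B cache line target the same construction doubles the padding
cost, which is part of the trade-off discussed in the thread above.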

* [dpdk-dev] [PATCH v2] mk: Provide option to set Major ABI version
  2017-03-01  9:31  4%         ` Christian Ehrhardt
@ 2017-03-01  9:34 20%           ` Christian Ehrhardt
  2017-03-01 14:35  4%             ` Jan Blunck
  0 siblings, 1 reply; 200+ results
From: Christian Ehrhardt @ 2017-03-01  9:34 UTC (permalink / raw)
  To: dev
  Cc: Christian Ehrhardt, cjcollier @ linuxfoundation . org,
	ricardo.salveti, Luca Boccassi

Downstreams might want to provide different DPDK releases at the same
time to support multiple consumers of DPDK linked against older and newer
sonames.

Also, due to the interdependencies that DPDK libraries can have,
applications might end up with an executable space in which multiple
versions of a library are mapped by ld.so.

Think of LibA that got an ABI bump and LibB that did not get an ABI bump
but depends on LibA.

    Application
    \-> LibA.old
    \-> LibB.new -> LibA.new

That is a conflict which can be avoided by setting CONFIG_RTE_MAJOR_ABI.
If set CONFIG_RTE_MAJOR_ABI overwrites any LIBABIVER value.
An example might be ``CONFIG_RTE_MAJOR_ABI=16.11`` which will make all
libraries librte<?>.so.16.11 instead of librte<?>.so.<LIBABIVER>.

We need to cut arbitrarily long strings after the .so now, and this would
work for any ABI version in LIBABIVER:
  $(Q)ln -s -f $< $(patsubst %.$(LIBABIVER),%,$@)
But using the following instead additionally allows us to simplify the
Makefile for the CONFIG_RTE_NEXT_ABI case.
  $(Q)ln -s -f $< $(shell echo $@ | sed 's/\.so.*/.so/')

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
---
 config/common_base                     |  5 +++++
 doc/guides/contributing/versioning.rst | 25 +++++++++++++++++++++++++
 mk/rte.lib.mk                          | 14 +++++++++-----
 3 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/config/common_base b/config/common_base
index aeee13e..37aa1e1 100644
--- a/config/common_base
+++ b/config/common_base
@@ -75,6 +75,11 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 CONFIG_RTE_NEXT_ABI=y
 
 #
+# Major ABI to overwrite library specific LIBABIVER
+#
+CONFIG_RTE_MAJOR_ABI=
+
+#
 # Machine's cache line size
 #
 CONFIG_RTE_CACHE_LINE_SIZE=64
diff --git a/doc/guides/contributing/versioning.rst b/doc/guides/contributing/versioning.rst
index fbc44a7..8aaf370 100644
--- a/doc/guides/contributing/versioning.rst
+++ b/doc/guides/contributing/versioning.rst
@@ -133,6 +133,31 @@ The macros exported are:
   fully qualified function ``p``, so that if a symbol becomes versioned, it
   can still be mapped back to the public symbol name.
 
+Setting a Major ABI version
+---------------------------
+
+Downstreams might want to provide different DPDK releases at the same time to
+support multiple consumers of DPDK linked against older and newer sonames.
+
+Also, due to the interdependencies that DPDK libraries can have, applications
+might end up with an executable space in which multiple versions of a library
+are mapped by ld.so.
+
+Think of LibA that got an ABI bump and LibB that did not get an ABI bump but
+depends on LibA.
+
+.. note::
+
+    Application
+    \-> LibA.old
+    \-> LibB.new -> LibA.new
+
+That is a conflict which can be avoided by setting ``CONFIG_RTE_MAJOR_ABI``.
+If set, the value of ``CONFIG_RTE_MAJOR_ABI`` overwrites all per-library
+versions otherwise defined in the libraries' ``LIBABIVER``.
+An example might be ``CONFIG_RTE_MAJOR_ABI=16.11`` which will make all libraries
+``librte<?>.so.16.11`` instead of ``librte<?>.so.<LIBABIVER>``.
+
 Examples of ABI Macro use
 -------------------------
 
diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index 33a5f5a..1ffbf42 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -40,12 +40,20 @@ EXTLIB_BUILD ?= n
 # VPATH contains at least SRCDIR
 VPATH += $(SRCDIR)
 
+ifneq ($(CONFIG_RTE_MAJOR_ABI),)
+ifneq ($(LIBABIVER),)
+LIBABIVER := $(CONFIG_RTE_MAJOR_ABI)
+endif
+endif
+
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 LIB := $(patsubst %.a,%.so.$(LIBABIVER),$(LIB))
 ifeq ($(EXTLIB_BUILD),n)
+ifeq ($(CONFIG_RTE_MAJOR_ABI),)
 ifeq ($(CONFIG_RTE_NEXT_ABI),y)
 LIB := $(LIB).1
 endif
+endif
 CPU_LDFLAGS += --version-script=$(SRCDIR)/$(EXPORT_MAP)
 endif
 endif
@@ -156,11 +164,7 @@ $(RTE_OUTPUT)/lib/$(LIB): $(LIB)
 	@[ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib
 	$(Q)cp -f $(LIB) $(RTE_OUTPUT)/lib
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
-ifeq ($(CONFIG_RTE_NEXT_ABI)$(EXTLIB_BUILD),yn)
-	$(Q)ln -s -f $< $(basename $(basename $@))
-else
-	$(Q)ln -s -f $< $(basename $@)
-endif
+	$(Q)ln -s -f $< $(shell echo $@ | sed 's/\.so.*/.so/')
 endif
 
 #
-- 
2.7.4

^ permalink raw reply	[relevance 20%]

* Re: [dpdk-dev] [PATCH] mk: Provide option to set Major ABI version
  2017-02-28  8:34  4%       ` Jan Blunck
@ 2017-03-01  9:31  4%         ` Christian Ehrhardt
  2017-03-01  9:34 20%           ` [dpdk-dev] [PATCH v2] " Christian Ehrhardt
  0 siblings, 1 reply; 200+ results
From: Christian Ehrhardt @ 2017-03-01  9:31 UTC (permalink / raw)
  To: Jan Blunck
  Cc: dev, cjcollier @ linuxfoundation . org, ricardo.salveti, Luca Boccassi

On Tue, Feb 28, 2017 at 9:34 AM, Jan Blunck <jblunck@infradead.org> wrote:

> In case CONFIG_RTE_NEXT_ABI=y is set this is actually generating
> shared objects with suffix:
>
>   .so.$(CONFIG_RTE_MAJOR_ABI).1
>
> I don't think that this is the intention.
>

You are right, thanks for the catch Jan!
The fix is a trivial extra ifeq - a V2 is on the way soon.


-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting
  2017-02-28 13:52  0%         ` Bruce Richardson
@ 2017-02-28 17:54  0%           ` Jerin Jacob
  2017-03-01  9:47  0%             ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2017-02-28 17:54 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: olivier.matz, dev

On Tue, Feb 28, 2017 at 01:52:26PM +0000, Bruce Richardson wrote:
> On Tue, Feb 28, 2017 at 05:38:34PM +0530, Jerin Jacob wrote:
> > On Tue, Feb 28, 2017 at 11:57:03AM +0000, Bruce Richardson wrote:
> > > On Tue, Feb 28, 2017 at 05:05:13PM +0530, Jerin Jacob wrote:
> > > > On Thu, Feb 23, 2017 at 05:23:54PM +0000, Bruce Richardson wrote:
> > > > > Users compiling DPDK should not need to know or care about the arrangement
> > > > > of cachelines in the rte_ring structure. Therefore just remove the build
> > > > > option and set the structures to be always split. For improved
> > > > > performance use 128B rather than 64B alignment since it stops the producer
> > > > > and consumer data being on adjacent cachelines.
> > > > > 
> > > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > > ---
> > > > >  config/common_base                     | 1 -
> > > > >  doc/guides/rel_notes/release_17_05.rst | 6 ++++++
> > > > >  lib/librte_ring/rte_ring.c             | 2 --
> > > > >  lib/librte_ring/rte_ring.h             | 8 ++------
> > > > >  4 files changed, 8 insertions(+), 9 deletions(-)
> > > > > 
> > > > > diff --git a/config/common_base b/config/common_base
> > > > > index aeee13e..099ffda 100644
> > > > > --- a/config/common_base
> > > > > +++ b/config/common_base
> > > > > @@ -448,7 +448,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
> > > > >  #
> > > > >  CONFIG_RTE_LIBRTE_RING=y
> > > > >  CONFIG_RTE_LIBRTE_RING_DEBUG=n
> > > > > -CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> > > > >  CONFIG_RTE_RING_PAUSE_REP_COUNT=0
> > > > >  
> > > > >  #
> > > > > diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
> > > > > index e25ea9f..ea45e0c 100644
> > > > > --- a/doc/guides/rel_notes/release_17_05.rst
> > > > > +++ b/doc/guides/rel_notes/release_17_05.rst
> > > > > @@ -110,6 +110,12 @@ API Changes
> > > > >     Also, make sure to start the actual text at the margin.
> > > > >     =========================================================
> > > > >  
> > > > > +* **Reworked rte_ring library**
> > > > > +
> > > > > +  The rte_ring library has been reworked and updated. The following changes
> > > > > +  have been made to it:
> > > > > +
> > > > > +  * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
> > > > >  
> > > > >  ABI Changes
> > > > >  -----------
> > > > > diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> > > > > index ca0a108..4bc6da1 100644
> > > > > --- a/lib/librte_ring/rte_ring.c
> > > > > +++ b/lib/librte_ring/rte_ring.c
> > > > > @@ -127,10 +127,8 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
> > > > >  	/* compilation-time checks */
> > > > >  	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
> > > > >  			  RTE_CACHE_LINE_MASK) != 0);
> > > > > -#ifdef RTE_RING_SPLIT_PROD_CONS
> > > > >  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, cons) &
> > > > >  			  RTE_CACHE_LINE_MASK) != 0);
> > > > > -#endif
> > > > >  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
> > > > >  			  RTE_CACHE_LINE_MASK) != 0);
> > > > >  #ifdef RTE_LIBRTE_RING_DEBUG
> > > > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > > > > index 72ccca5..04fe667 100644
> > > > > --- a/lib/librte_ring/rte_ring.h
> > > > > +++ b/lib/librte_ring/rte_ring.h
> > > > > @@ -168,7 +168,7 @@ struct rte_ring {
> > > > >  		uint32_t mask;           /**< Mask (size-1) of ring. */
> > > > >  		volatile uint32_t head;  /**< Producer head. */
> > > > >  		volatile uint32_t tail;  /**< Producer tail. */
> > > > > -	} prod __rte_cache_aligned;
> > > > > +	} prod __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
> > > > 
> > > > I think we need to use RTE_CACHE_LINE_MIN_SIZE instead of
> > > > RTE_CACHE_LINE_SIZE for alignment here. PPC and ThunderX1 targets have a
> > > > cache line size of 128B.
> > > > 
> > > Sure.
> > > 
> > > However, can you perhaps try a performance test and check to see if
> > > there is a performance difference between the two values before I change
> > > it? In my tests I see improved performance by having an extra blank
> > > cache-line between the producer and consumer data.
> > 
> > Sure. Which test are you running to measure the performance difference?
> > Is it app/test/test_ring_perf.c?
> > 
> > > 
> Yep, just the basic ring perf test. I look mostly at the core-to-core
> numbers, since hyperthread-to-hyperthread or NUMA socket to NUMA socket
> would be far less common use cases IMHO.

Performance test results show a regression with the RTE_CACHE_LINE_MIN_SIZE
scheme in some use cases and higher performance in others (testing using
two physical cores).


# base code
RTE>>ring_perf_autotest
### Testing single element and burst enq/deq ###
SP/SC single enq/dequeue: 84
MP/MC single enq/dequeue: 301
SP/SC burst enq/dequeue (size: 8): 20
MP/MC burst enq/dequeue (size: 8): 46
SP/SC burst enq/dequeue (size: 32): 12
MP/MC burst enq/dequeue (size: 32): 18

### Testing empty dequeue ###
SC empty dequeue: 7.11
MC empty dequeue: 12.15

### Testing using a single lcore ###
SP/SC bulk enq/dequeue (size: 8): 19.08
MP/MC bulk enq/dequeue (size: 8): 46.28
SP/SC bulk enq/dequeue (size: 32): 11.89
MP/MC bulk enq/dequeue (size: 32): 18.84

### Testing using two physical cores ###
SP/SC bulk enq/dequeue (size: 8): 37.42
MP/MC bulk enq/dequeue (size: 8): 73.32
SP/SC bulk enq/dequeue (size: 32): 18.69
MP/MC bulk enq/dequeue (size: 32): 24.59
Test OK

# with ring rework patch
RTE>>ring_perf_autotest
### Testing single element and burst enq/deq ###
SP/SC single enq/dequeue: 84
MP/MC single enq/dequeue: 301
SP/SC burst enq/dequeue (size: 8): 19
MP/MC burst enq/dequeue (size: 8): 45
SP/SC burst enq/dequeue (size: 32): 11
MP/MC burst enq/dequeue (size: 32): 18

### Testing empty dequeue ###
SC empty dequeue: 7.10
MC empty dequeue: 12.15

### Testing using a single lcore ###
SP/SC bulk enq/dequeue (size: 8): 18.59
MP/MC bulk enq/dequeue (size: 8): 45.49
SP/SC bulk enq/dequeue (size: 32): 11.67
MP/MC bulk enq/dequeue (size: 32): 18.65

### Testing using two physical cores ###
SP/SC bulk enq/dequeue (size: 8): 37.41
MP/MC bulk enq/dequeue (size: 8): 72.98
SP/SC bulk enq/dequeue (size: 32): 18.69
MP/MC bulk enq/dequeue (size: 32): 24.59
Test OK
RTE>>

# with ring rework patch + cache-line size change to one on 128BCL target
RTE>>ring_perf_autotest
### Testing single element and burst enq/deq ###
SP/SC single enq/dequeue: 90
MP/MC single enq/dequeue: 317
SP/SC burst enq/dequeue (size: 8): 20
MP/MC burst enq/dequeue (size: 8): 48
SP/SC burst enq/dequeue (size: 32): 11
MP/MC burst enq/dequeue (size: 32): 18

### Testing empty dequeue ###
SC empty dequeue: 8.10
MC empty dequeue: 11.15

### Testing using a single lcore ###
SP/SC bulk enq/dequeue (size: 8): 20.24
MP/MC bulk enq/dequeue (size: 8): 48.43
SP/SC bulk enq/dequeue (size: 32): 11.01
MP/MC bulk enq/dequeue (size: 32): 18.43

### Testing using two physical cores ###
SP/SC bulk enq/dequeue (size: 8): 25.92
MP/MC bulk enq/dequeue (size: 8): 69.76
SP/SC bulk enq/dequeue (size: 32): 14.27
MP/MC bulk enq/dequeue (size: 32): 22.94
Test OK
RTE>>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting
  2017-02-28 12:08  0%       ` Jerin Jacob
@ 2017-02-28 13:52  0%         ` Bruce Richardson
  2017-02-28 17:54  0%           ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2017-02-28 13:52 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: olivier.matz, dev

On Tue, Feb 28, 2017 at 05:38:34PM +0530, Jerin Jacob wrote:
> On Tue, Feb 28, 2017 at 11:57:03AM +0000, Bruce Richardson wrote:
> > On Tue, Feb 28, 2017 at 05:05:13PM +0530, Jerin Jacob wrote:
> > > On Thu, Feb 23, 2017 at 05:23:54PM +0000, Bruce Richardson wrote:
> > > > Users compiling DPDK should not need to know or care about the arrangement
> > > > of cachelines in the rte_ring structure. Therefore just remove the build
> > > > option and set the structures to be always split. For improved
> > > > performance use 128B rather than 64B alignment since it stops the producer
> > > > and consumer data being on adjacent cachelines.
> > > > 
> > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > > ---
> > > >  config/common_base                     | 1 -
> > > >  doc/guides/rel_notes/release_17_05.rst | 6 ++++++
> > > >  lib/librte_ring/rte_ring.c             | 2 --
> > > >  lib/librte_ring/rte_ring.h             | 8 ++------
> > > >  4 files changed, 8 insertions(+), 9 deletions(-)
> > > > 
> > > > diff --git a/config/common_base b/config/common_base
> > > > index aeee13e..099ffda 100644
> > > > --- a/config/common_base
> > > > +++ b/config/common_base
> > > > @@ -448,7 +448,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
> > > >  #
> > > >  CONFIG_RTE_LIBRTE_RING=y
> > > >  CONFIG_RTE_LIBRTE_RING_DEBUG=n
> > > > -CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> > > >  CONFIG_RTE_RING_PAUSE_REP_COUNT=0
> > > >  
> > > >  #
> > > > diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
> > > > index e25ea9f..ea45e0c 100644
> > > > --- a/doc/guides/rel_notes/release_17_05.rst
> > > > +++ b/doc/guides/rel_notes/release_17_05.rst
> > > > @@ -110,6 +110,12 @@ API Changes
> > > >     Also, make sure to start the actual text at the margin.
> > > >     =========================================================
> > > >  
> > > > +* **Reworked rte_ring library**
> > > > +
> > > > +  The rte_ring library has been reworked and updated. The following changes
> > > > +  have been made to it:
> > > > +
> > > > +  * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
> > > >  
> > > >  ABI Changes
> > > >  -----------
> > > > diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> > > > index ca0a108..4bc6da1 100644
> > > > --- a/lib/librte_ring/rte_ring.c
> > > > +++ b/lib/librte_ring/rte_ring.c
> > > > @@ -127,10 +127,8 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
> > > >  	/* compilation-time checks */
> > > >  	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
> > > >  			  RTE_CACHE_LINE_MASK) != 0);
> > > > -#ifdef RTE_RING_SPLIT_PROD_CONS
> > > >  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, cons) &
> > > >  			  RTE_CACHE_LINE_MASK) != 0);
> > > > -#endif
> > > >  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
> > > >  			  RTE_CACHE_LINE_MASK) != 0);
> > > >  #ifdef RTE_LIBRTE_RING_DEBUG
> > > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > > > index 72ccca5..04fe667 100644
> > > > --- a/lib/librte_ring/rte_ring.h
> > > > +++ b/lib/librte_ring/rte_ring.h
> > > > @@ -168,7 +168,7 @@ struct rte_ring {
> > > >  		uint32_t mask;           /**< Mask (size-1) of ring. */
> > > >  		volatile uint32_t head;  /**< Producer head. */
> > > >  		volatile uint32_t tail;  /**< Producer tail. */
> > > > -	} prod __rte_cache_aligned;
> > > > +	} prod __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
> > > 
> > > I think we need to use RTE_CACHE_LINE_MIN_SIZE instead of
> > > RTE_CACHE_LINE_SIZE for alignment here. PPC and ThunderX1 targets have a
> > > cache line size of 128B.
> > > 
> > Sure.
> > 
> > However, can you perhaps try a performance test and check to see if
> > there is a performance difference between the two values before I change
> > it? In my tests I see improved performance by having an extra blank
> > cache-line between the producer and consumer data.
> 
> Sure. Which test are you running to measure the performance difference?
> Is it app/test/test_ring_perf.c?
> 
> > 
Yep, just the basic ring perf test. I look mostly at the core-to-core
numbers, since hyperthread-to-hyperthread or NUMA socket to NUMA socket
would be far less common use cases IMHO.

/Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting
  2017-02-28 11:57  0%     ` Bruce Richardson
@ 2017-02-28 12:08  0%       ` Jerin Jacob
  2017-02-28 13:52  0%         ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2017-02-28 12:08 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: olivier.matz, dev

On Tue, Feb 28, 2017 at 11:57:03AM +0000, Bruce Richardson wrote:
> On Tue, Feb 28, 2017 at 05:05:13PM +0530, Jerin Jacob wrote:
> > On Thu, Feb 23, 2017 at 05:23:54PM +0000, Bruce Richardson wrote:
> > > Users compiling DPDK should not need to know or care about the arrangement
> > > of cachelines in the rte_ring structure. Therefore just remove the build
> > > option and set the structures to be always split. For improved
> > > performance use 128B rather than 64B alignment since it stops the producer
> > > and consumer data being on adjacent cachelines.
> > > 
> > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > ---
> > >  config/common_base                     | 1 -
> > >  doc/guides/rel_notes/release_17_05.rst | 6 ++++++
> > >  lib/librte_ring/rte_ring.c             | 2 --
> > >  lib/librte_ring/rte_ring.h             | 8 ++------
> > >  4 files changed, 8 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/config/common_base b/config/common_base
> > > index aeee13e..099ffda 100644
> > > --- a/config/common_base
> > > +++ b/config/common_base
> > > @@ -448,7 +448,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
> > >  #
> > >  CONFIG_RTE_LIBRTE_RING=y
> > >  CONFIG_RTE_LIBRTE_RING_DEBUG=n
> > > -CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> > >  CONFIG_RTE_RING_PAUSE_REP_COUNT=0
> > >  
> > >  #
> > > diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
> > > index e25ea9f..ea45e0c 100644
> > > --- a/doc/guides/rel_notes/release_17_05.rst
> > > +++ b/doc/guides/rel_notes/release_17_05.rst
> > > @@ -110,6 +110,12 @@ API Changes
> > >     Also, make sure to start the actual text at the margin.
> > >     =========================================================
> > >  
> > > +* **Reworked rte_ring library**
> > > +
> > > +  The rte_ring library has been reworked and updated. The following changes
> > > +  have been made to it:
> > > +
> > > +  * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
> > >  
> > >  ABI Changes
> > >  -----------
> > > diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> > > index ca0a108..4bc6da1 100644
> > > --- a/lib/librte_ring/rte_ring.c
> > > +++ b/lib/librte_ring/rte_ring.c
> > > @@ -127,10 +127,8 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
> > >  	/* compilation-time checks */
> > >  	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
> > >  			  RTE_CACHE_LINE_MASK) != 0);
> > > -#ifdef RTE_RING_SPLIT_PROD_CONS
> > >  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, cons) &
> > >  			  RTE_CACHE_LINE_MASK) != 0);
> > > -#endif
> > >  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
> > >  			  RTE_CACHE_LINE_MASK) != 0);
> > >  #ifdef RTE_LIBRTE_RING_DEBUG
> > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > > index 72ccca5..04fe667 100644
> > > --- a/lib/librte_ring/rte_ring.h
> > > +++ b/lib/librte_ring/rte_ring.h
> > > @@ -168,7 +168,7 @@ struct rte_ring {
> > >  		uint32_t mask;           /**< Mask (size-1) of ring. */
> > >  		volatile uint32_t head;  /**< Producer head. */
> > >  		volatile uint32_t tail;  /**< Producer tail. */
> > > -	} prod __rte_cache_aligned;
> > > +	} prod __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
> > 
> > I think we need to use RTE_CACHE_LINE_MIN_SIZE instead of
> > RTE_CACHE_LINE_SIZE for alignment here. PPC and ThunderX1 targets have a
> > cache line size of 128B.
> > 
> Sure.
> 
> However, can you perhaps try a performance test and check to see if
> there is a performance difference between the two values before I change
> it? In my tests I see improved performance by having an extra blank
> cache-line between the producer and consumer data.

Sure. Which test are you running to measure the performance difference?
Is it app/test/test_ring_perf.c?

> 
> /Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting
  2017-02-28 11:35  0%   ` Jerin Jacob
@ 2017-02-28 11:57  0%     ` Bruce Richardson
  2017-02-28 12:08  0%       ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2017-02-28 11:57 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: olivier.matz, dev

On Tue, Feb 28, 2017 at 05:05:13PM +0530, Jerin Jacob wrote:
> On Thu, Feb 23, 2017 at 05:23:54PM +0000, Bruce Richardson wrote:
> > Users compiling DPDK should not need to know or care about the arrangement
> > of cachelines in the rte_ring structure. Therefore just remove the build
> > option and set the structures to be always split. For improved
> > performance use 128B rather than 64B alignment since it stops the producer
> > and consumer data being on adjacent cachelines.
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> >  config/common_base                     | 1 -
> >  doc/guides/rel_notes/release_17_05.rst | 6 ++++++
> >  lib/librte_ring/rte_ring.c             | 2 --
> >  lib/librte_ring/rte_ring.h             | 8 ++------
> >  4 files changed, 8 insertions(+), 9 deletions(-)
> > 
> > diff --git a/config/common_base b/config/common_base
> > index aeee13e..099ffda 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -448,7 +448,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
> >  #
> >  CONFIG_RTE_LIBRTE_RING=y
> >  CONFIG_RTE_LIBRTE_RING_DEBUG=n
> > -CONFIG_RTE_RING_SPLIT_PROD_CONS=n
> >  CONFIG_RTE_RING_PAUSE_REP_COUNT=0
> >  
> >  #
> > diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
> > index e25ea9f..ea45e0c 100644
> > --- a/doc/guides/rel_notes/release_17_05.rst
> > +++ b/doc/guides/rel_notes/release_17_05.rst
> > @@ -110,6 +110,12 @@ API Changes
> >     Also, make sure to start the actual text at the margin.
> >     =========================================================
> >  
> > +* **Reworked rte_ring library**
> > +
> > +  The rte_ring library has been reworked and updated. The following changes
> > +  have been made to it:
> > +
> > +  * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
> >  
> >  ABI Changes
> >  -----------
> > diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> > index ca0a108..4bc6da1 100644
> > --- a/lib/librte_ring/rte_ring.c
> > +++ b/lib/librte_ring/rte_ring.c
> > @@ -127,10 +127,8 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
> >  	/* compilation-time checks */
> >  	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
> >  			  RTE_CACHE_LINE_MASK) != 0);
> > -#ifdef RTE_RING_SPLIT_PROD_CONS
> >  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, cons) &
> >  			  RTE_CACHE_LINE_MASK) != 0);
> > -#endif
> >  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
> >  			  RTE_CACHE_LINE_MASK) != 0);
> >  #ifdef RTE_LIBRTE_RING_DEBUG
> > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > index 72ccca5..04fe667 100644
> > --- a/lib/librte_ring/rte_ring.h
> > +++ b/lib/librte_ring/rte_ring.h
> > @@ -168,7 +168,7 @@ struct rte_ring {
> >  		uint32_t mask;           /**< Mask (size-1) of ring. */
> >  		volatile uint32_t head;  /**< Producer head. */
> >  		volatile uint32_t tail;  /**< Producer tail. */
> > -	} prod __rte_cache_aligned;
> > +	} prod __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
> 
> I think we need to use RTE_CACHE_LINE_MIN_SIZE instead of
> RTE_CACHE_LINE_SIZE for alignment here. PPC and ThunderX1 targets have a
> cache line size of 128B.
> 
Sure.

However, can you perhaps try a performance test and check to see if
there is a performance difference between the two values before I change
it? In my tests I see improved performance by having an extra blank
cache-line between the producer and consumer data.

/Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting
  2017-02-23 17:23  4% ` [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting Bruce Richardson
@ 2017-02-28 11:35  0%   ` Jerin Jacob
  2017-02-28 11:57  0%     ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2017-02-28 11:35 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: olivier.matz, dev

On Thu, Feb 23, 2017 at 05:23:54PM +0000, Bruce Richardson wrote:
> Users compiling DPDK should not need to know or care about the arrangement
> of cachelines in the rte_ring structure. Therefore just remove the build
> option and set the structures to be always split. For improved
> performance use 128B rather than 64B alignment since it stops the producer
> and consumer data being on adjacent cachelines.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  config/common_base                     | 1 -
>  doc/guides/rel_notes/release_17_05.rst | 6 ++++++
>  lib/librte_ring/rte_ring.c             | 2 --
>  lib/librte_ring/rte_ring.h             | 8 ++------
>  4 files changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/config/common_base b/config/common_base
> index aeee13e..099ffda 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -448,7 +448,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
>  #
>  CONFIG_RTE_LIBRTE_RING=y
>  CONFIG_RTE_LIBRTE_RING_DEBUG=n
> -CONFIG_RTE_RING_SPLIT_PROD_CONS=n
>  CONFIG_RTE_RING_PAUSE_REP_COUNT=0
>  
>  #
> diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
> index e25ea9f..ea45e0c 100644
> --- a/doc/guides/rel_notes/release_17_05.rst
> +++ b/doc/guides/rel_notes/release_17_05.rst
> @@ -110,6 +110,12 @@ API Changes
>     Also, make sure to start the actual text at the margin.
>     =========================================================
>  
> +* **Reworked rte_ring library**
> +
> +  The rte_ring library has been reworked and updated. The following changes
> +  have been made to it:
> +
> +  * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
>  
>  ABI Changes
>  -----------
> diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> index ca0a108..4bc6da1 100644
> --- a/lib/librte_ring/rte_ring.c
> +++ b/lib/librte_ring/rte_ring.c
> @@ -127,10 +127,8 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
>  	/* compilation-time checks */
>  	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
>  			  RTE_CACHE_LINE_MASK) != 0);
> -#ifdef RTE_RING_SPLIT_PROD_CONS
>  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, cons) &
>  			  RTE_CACHE_LINE_MASK) != 0);
> -#endif
>  	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
>  			  RTE_CACHE_LINE_MASK) != 0);
>  #ifdef RTE_LIBRTE_RING_DEBUG
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index 72ccca5..04fe667 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -168,7 +168,7 @@ struct rte_ring {
>  		uint32_t mask;           /**< Mask (size-1) of ring. */
>  		volatile uint32_t head;  /**< Producer head. */
>  		volatile uint32_t tail;  /**< Producer tail. */
> -	} prod __rte_cache_aligned;
> +	} prod __rte_aligned(RTE_CACHE_LINE_SIZE * 2);

I think we need to use RTE_CACHE_LINE_MIN_SIZE instead of
RTE_CACHE_LINE_SIZE for alignment here. PPC and ThunderX1 targets have a
cache line size of 128B.

> +	} prod __rte_aligned(RTE_CACHE_LINE_SIZE * 2);


>  
>  	/** Ring consumer status. */
>  	struct cons {
> @@ -177,11 +177,7 @@ struct rte_ring {
>  		uint32_t mask;           /**< Mask (size-1) of ring. */
>  		volatile uint32_t head;  /**< Consumer head. */
>  		volatile uint32_t tail;  /**< Consumer tail. */
> -#ifdef RTE_RING_SPLIT_PROD_CONS
> -	} cons __rte_cache_aligned;
> -#else
> -	} cons;
> -#endif
> +	} cons __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
>  
>  #ifdef RTE_LIBRTE_RING_DEBUG
>  	struct rte_ring_debug_stats stats[RTE_MAX_LCORE];
> -- 
> 2.9.3
> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] mk: Provide option to set Major ABI version
  2017-02-22 13:24 20%     ` [dpdk-dev] [PATCH] mk: Provide option to set Major ABI version Christian Ehrhardt
@ 2017-02-28  8:34  4%       ` Jan Blunck
  2017-03-01  9:31  4%         ` Christian Ehrhardt
  0 siblings, 1 reply; 200+ results
From: Jan Blunck @ 2017-02-28  8:34 UTC (permalink / raw)
  To: Christian Ehrhardt
  Cc: dev, cjcollier @ linuxfoundation . org, ricardo.salveti, Luca Boccassi

On Wed, Feb 22, 2017 at 2:24 PM, Christian Ehrhardt
<christian.ehrhardt@canonical.com> wrote:
> --- a/mk/rte.lib.mk
> +++ b/mk/rte.lib.mk
> @@ -40,6 +40,12 @@ EXTLIB_BUILD ?= n
>  # VPATH contains at least SRCDIR
>  VPATH += $(SRCDIR)
>
> +ifneq ($(CONFIG_RTE_MAJOR_ABI),)
> +ifneq ($(LIBABIVER),)
> +LIBABIVER := $(CONFIG_RTE_MAJOR_ABI)
> +endif
> +endif
> +
>  ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
>  LIB := $(patsubst %.a,%.so.$(LIBABIVER),$(LIB))
>  ifeq ($(EXTLIB_BUILD),n)
> @@ -156,11 +162,7 @@ $(RTE_OUTPUT)/lib/$(LIB): $(LIB)
>         @[ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib
>         $(Q)cp -f $(LIB) $(RTE_OUTPUT)/lib
>  ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
> -ifeq ($(CONFIG_RTE_NEXT_ABI)$(EXTLIB_BUILD),yn)
> -       $(Q)ln -s -f $< $(basename $(basename $@))
> -else
> -       $(Q)ln -s -f $< $(basename $@)
> -endif
> +       $(Q)ln -s -f $< $(shell echo $@ | sed 's/\.so.*/.so/')
>  endif
>

In case CONFIG_RTE_NEXT_ABI=y is set this is actually generating
shared objects with suffix:

  .so.$(CONFIG_RTE_MAJOR_ABI).1

I don't think that this is the intention.

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 04/15] net/avp: add PMD version map file
  @ 2017-02-26 19:08  3%   ` Allain Legacy
    1 sibling, 0 replies; 200+ results
From: Allain Legacy @ 2017-02-26 19:08 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev

Adds a default ABI version file for the AVP PMD.

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
---
 drivers/net/avp/rte_pmd_avp_version.map | 4 ++++
 1 file changed, 4 insertions(+)
 create mode 100644 drivers/net/avp/rte_pmd_avp_version.map

diff --git a/drivers/net/avp/rte_pmd_avp_version.map b/drivers/net/avp/rte_pmd_avp_version.map
new file mode 100644
index 0000000..af8f3f4
--- /dev/null
+++ b/drivers/net/avp/rte_pmd_avp_version.map
@@ -0,0 +1,4 @@
+DPDK_17.05 {
+
+    local: *;
+};
-- 
1.8.3.1

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH 04/16] net/avp: add PMD version map file
  @ 2017-02-25  1:23  3% ` Allain Legacy
    1 sibling, 0 replies; 200+ results
From: Allain Legacy @ 2017-02-25  1:23 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev

Adds a default ABI version file for the AVP PMD.

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
---
 drivers/net/avp/rte_pmd_avp_version.map | 4 ++++
 1 file changed, 4 insertions(+)
 create mode 100644 drivers/net/avp/rte_pmd_avp_version.map

diff --git a/drivers/net/avp/rte_pmd_avp_version.map b/drivers/net/avp/rte_pmd_avp_version.map
new file mode 100644
index 0000000..af8f3f4
--- /dev/null
+++ b/drivers/net/avp/rte_pmd_avp_version.map
@@ -0,0 +1,4 @@
+DPDK_17.05 {
+
+    local: *;
+};
-- 
1.8.3.1

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2 2/2] ethdev: add hierarchical scheduler API
  @ 2017-02-24 16:28  1% ` Cristian Dumitrescu
  0 siblings, 0 replies; 200+ results
From: Cristian Dumitrescu @ 2017-02-24 16:28 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, jerin.jacob, hemant.agrawal

This patch introduces the generic ethdev API for the hierarchical scheduler
capability.

Main features:
- Exposed as ethdev plugin capability (similar to rte_flow approach)
- Capability query API per port, per hierarchy level and per hierarchy node
- Scheduling algorithms: Strict Priority (SP), Weighted Fair Queuing (WFQ),
  Weighted Round Robin (WRR)
- Traffic shaping: single/dual rate, private (per node) and shared (by multiple
  nodes) shapers
- Congestion management for hierarchy leaf nodes: algorithms of tail drop,
  head drop, WRED; private (per node) and shared (by multiple nodes) WRED
  contexts
- Packet marking: IEEE 802.1q (VLAN DEI), IETF RFC 3168 (IPv4/IPv6 ECN for
  TCP and SCTP), IETF RFC 2597 (IPv4 / IPv6 DSCP)

Changes in v2:
- Implemented feedback from Hemant [4]
- Improvements on the capability API
	- Added capability API for hierarchy level
	- Merged stats capability into the capability API
	- Added dynamic updates
	- Added non-leaf/leaf union to the node capability structure
	- Renamed sp_priority_min to sp_n_priorities_max, added clarifications
	- Fixed description for sp_n_children_max
- Clarified and enforced rule on node ID range for leaf and non-leaf nodes
	- Added API functions to get node type (i.e. leaf/non-leaf):
	  get_leaf_nodes(), node_type_get()
- Added clarification for the root node: its creation, its parent, its role
	- Macro NODE_ID_NULL as root node's parent
	- Description of the node_add() and node_parent_update() API functions
- Added clarification for the first time add vs. subsequent updates rule
	- Cleaned up the description for the node_add() function
- Statistics API improvements
	- Merged stats capability into the capability API
	- Added API function node_stats_update()
	- Added more stats per packet color
- Added more error types
- Fixed small Doxygen style issues

Changes in v1 (since RFC [1]):
- Implemented as ethdev plugin (similar to rte_flow) as opposed to more
  monolithic additions to ethdev itself
- Implemented feedback from Jerin [2] and Hemant [3]. Implemented all the
  suggested items with only one exception (see the long list below); hopefully
  nothing was forgotten.
    - The item not done (hopefully for a good reason): driver-generated object
      IDs. IMO the choice to have application-generated object IDs adds marginal
      complexity to the driver (search ID function required), but it provides
      huge simplification for the application. The app does not need to worry
      about building & managing tree-like structure for storing driver-generated
      object IDs, the app can use its own convention for node IDs depending on
      the specific hierarchy that it needs. Trivial example: identify all
      level-2 nodes with IDs like 100, 200, 300, … and the level-3 nodes based
      on their level-2 parents: 110, 120, 130, 140, …, 210, 220, 230, 240, …,
      310, 320, 330, … and level-4 nodes based on their level-3 parents: 111,
      112, 113, 114, …, 121, 122, 123, 124, … (a sketch of this convention
      follows the change log below). Moreover, see the change log for the
      other related simplification that was implemented: leaf nodes now have
      predefined IDs that are the same as their Ethernet TX queue ID
      (therefore no translation is required for leaf nodes).
- Capability API. Done per port and per node as well.
- Dual rate shapers
- Added configuration of private shaper (per node) directly from the shaper
  profile as part of node API (no shaper ID needed for private shapers), while
  the shared shapers are configured outside of the node API using shaper profile
  and communicated to the node using shared shaper ID. So there is no
  configuration overhead for shared shapers if the app does not use any of them.
- Leaf nodes now have predefined IDs that are the same as their Ethernet TX
  queue ID (therefore no translation is required for leaf nodes). This is also
  used to differentiate between a leaf node and a non-leaf node.
- Domain-specific errors to give a precise indication of the error cause (same
  as done by rte_flow)
- Packet marking API
- Optional packet length adjustment for shapers, positive (e.g. for adding
  Ethernet framing overhead of 20 bytes) or negative (e.g. for rate limiting
  based on IP packet bytes)
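
To make the application-generated node ID convention from the change log
above concrete, here is a hedged sketch of what an application could do
(the helper names are invented and are not part of the proposed API, which
only requires IDs to be unique, with leaf node IDs equal to the Ethernet
TX queue IDs):

#include <stdint.h>

/* Hypothetical application-side ID scheme following the example above:
 * level-2 nodes: 100, 200, 300, ...; level-3: 110, 120, ..., 210, ...;
 * level-4: 111, 112, ..., 121, ... */
static inline uint32_t
level2_node_id(uint32_t n)			/* n = 1, 2, 3, ... */
{
	return n * 100;
}

static inline uint32_t
level3_node_id(uint32_t parent_id, uint32_t n)	/* n = 1 .. 9 */
{
	return parent_id + n * 10;
}

static inline uint32_t
level4_node_id(uint32_t parent_id, uint32_t n)	/* n = 1 .. 9 */
{
	return parent_id + n;
}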

Next steps:
- SW fallback based on librte_sched library (to be later introduced by
  standalone patch set)

[1] RFC: http://dpdk.org/ml/archives/dev/2016-November/050956.html
[2] Jerin’s feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054484.html
[3] Hemant’s feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054866.html
[4] Hemant's feedback on v1: http://www.dpdk.org/ml/archives/dev/2017-February/058033.html

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 MAINTAINERS                            |    4 +
 lib/librte_ether/Makefile              |    5 +-
 lib/librte_ether/rte_ether_version.map |   30 +
 lib/librte_ether/rte_scheddev.c        |  781 ++++++++++++++++++
 lib/librte_ether/rte_scheddev.h        | 1416 ++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_scheddev_driver.h |  365 ++++++++
 6 files changed, 2600 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_ether/rte_scheddev.c
 create mode 100644 lib/librte_ether/rte_scheddev.h
 create mode 100644 lib/librte_ether/rte_scheddev_driver.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 24e0eff..8a8719f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -247,6 +247,10 @@ Flow API
 M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
 F: lib/librte_ether/rte_flow*
 
+SchedDev API
+M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
+F: lib/librte_ether/rte_scheddev*
+
 Crypto API
 M: Declan Doherty <declan.doherty@intel.com>
 F: lib/librte_cryptodev/
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index 1d095a9..7e0527f 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -45,6 +45,7 @@ LIBABIVER := 6
 
 SRCS-y += rte_ethdev.c
 SRCS-y += rte_flow.c
+SRCS-y += rte_scheddev.c
 
 #
 # Export include files
@@ -54,6 +55,8 @@ SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h
 SYMLINK-y-include += rte_flow.h
 SYMLINK-y-include += rte_flow_driver.h
+SYMLINK-y-include += rte_scheddev.h
+SYMLINK-y-include += rte_scheddev_driver.h
 
 # this lib depends upon:
 DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 637317c..4d67eee 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -159,5 +159,35 @@ DPDK_17.05 {
 	global:
 
 	rte_eth_dev_capability_ops_get;
+	rte_scheddev_get_leaf_nodes;
+	rte_scheddev_node_type_get;
+	rte_scheddev_capabilities_get;
+	rte_scheddev_level_capabilities_get;
+	rte_scheddev_node_capabilities_get;
+	rte_scheddev_wred_profile_add;
+	rte_scheddev_wred_profile_delete;
+	rte_scheddev_shared_wred_context_add_update;
+	rte_scheddev_shared_wred_context_delete;
+	rte_scheddev_shaper_profile_add;
+	rte_scheddev_shaper_profile_delete;
+	rte_scheddev_shared_shaper_add_update;
+	rte_scheddev_shared_shaper_delete;
+	rte_scheddev_node_add;
+	rte_scheddev_node_delete;
+	rte_scheddev_node_suspend;
+	rte_scheddev_node_resume;
+	rte_scheddev_hierarchy_set;
+	rte_scheddev_node_parent_update;
+	rte_scheddev_node_shaper_update;
+	rte_scheddev_node_shared_shaper_update;
+	rte_scheddev_node_stats_update;
+	rte_scheddev_node_scheduling_mode_update;
+	rte_scheddev_node_cman_update;
+	rte_scheddev_node_wred_context_update;
+	rte_scheddev_node_shared_wred_context_update;
+	rte_scheddev_node_stats_read;
+	rte_scheddev_mark_vlan_dei;
+	rte_scheddev_mark_ip_ecn;
+	rte_scheddev_mark_ip_dscp;
 
 } DPDK_17.02;
diff --git a/lib/librte_ether/rte_scheddev.c b/lib/librte_ether/rte_scheddev.c
new file mode 100644
index 0000000..d9c7dfe
--- /dev/null
+++ b/lib/librte_ether/rte_scheddev.c
@@ -0,0 +1,781 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include "rte_ethdev.h"
+#include "rte_scheddev_driver.h"
+#include "rte_scheddev.h"
+
+/* Get generic scheduler operations structure from a port. */
+const struct rte_scheddev_ops *
+rte_scheddev_ops_get(uint8_t port_id, struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		rte_scheddev_error_set(error,
+			ENODEV,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENODEV));
+		return NULL;
+	}
+
+	if ((dev->dev_ops->cap_ops_get == NULL) ||
+		(dev->dev_ops->cap_ops_get(dev, RTE_ETH_CAPABILITY_SCHED,
+		&ops) != 0) || (ops == NULL)) {
+		rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+		return NULL;
+	}
+
+	return ops;
+}
+
+/* Get number of leaf nodes */
+int
+rte_scheddev_get_leaf_nodes(uint8_t port_id,
+	uint32_t *n_leaf_nodes,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (n_leaf_nodes == NULL) {
+		rte_scheddev_error_set(error,
+			EINVAL,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(EINVAL));
+		return -rte_errno;
+	}
+
+	*n_leaf_nodes = dev->data->nb_tx_queues;
+	return 0;
+}
+
+/* Check node ID type (leaf or non-leaf) */
+int
+rte_scheddev_node_type_get(uint8_t port_id,
+	uint32_t node_id,
+	int *is_leaf,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_type_get == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_type_get(dev, node_id, is_leaf, error);
+}
+
+/* Get capabilities */
+int rte_scheddev_capabilities_get(uint8_t port_id,
+	struct rte_scheddev_capabilities *cap,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->capabilities_get == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->capabilities_get(dev, cap, error);
+}
+
+/* Get level capabilities */
+int rte_scheddev_level_capabilities_get(uint8_t port_id,
+	uint32_t level_id,
+	struct rte_scheddev_level_capabilities *cap,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->level_capabilities_get == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->level_capabilities_get(dev, level_id, cap, error);
+}
+
+/* Get node capabilities */
+int rte_scheddev_node_capabilities_get(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_node_capabilities *cap,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_capabilities_get == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_capabilities_get(dev, node_id, cap, error);
+}
+
+/* Add WRED profile */
+int rte_scheddev_wred_profile_add(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_wred_params *profile,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->wred_profile_add == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->wred_profile_add(dev, wred_profile_id, profile, error);
+}
+
+/* Delete WRED profile */
+int rte_scheddev_wred_profile_delete(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->wred_profile_delete == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->wred_profile_delete(dev, wred_profile_id, error);
+}
+
+/* Add/update shared WRED context */
+int rte_scheddev_shared_wred_context_add_update(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shared_wred_context_add_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shared_wred_context_add_update(dev, shared_wred_context_id,
+		wred_profile_id, error);
+}
+
+/* Delete shared WRED context */
+int rte_scheddev_shared_wred_context_delete(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shared_wred_context_delete == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shared_wred_context_delete(dev, shared_wred_context_id,
+		error);
+}
+
+/* Add shaper profile */
+int rte_scheddev_shaper_profile_add(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_shaper_params *profile,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shaper_profile_add == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shaper_profile_add(dev, shaper_profile_id, profile, error);
+}
+
+/* Delete shaper profile */
+int rte_scheddev_shaper_profile_delete(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shaper_profile_delete == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shaper_profile_delete(dev, shaper_profile_id, error);
+}
+
+/* Add shared shaper */
+int rte_scheddev_shared_shaper_add_update(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shared_shaper_add_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shared_shaper_add_update(dev, shared_shaper_id,
+		shaper_profile_id, error);
+}
+
+/* Delete shared shaper */
+int rte_scheddev_shared_shaper_delete(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shared_shaper_delete == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shared_shaper_delete(dev, shared_shaper_id, error);
+}
+
+/* Add node to port scheduler hierarchy */
+int rte_scheddev_node_add(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_node_params *params,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_add == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_add(dev, node_id, parent_node_id, priority, weight,
+		params, error);
+}
+
+/* Delete node from scheduler hierarchy */
+int rte_scheddev_node_delete(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_delete == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_delete(dev, node_id, error);
+}
+
+/* Suspend node */
+int rte_scheddev_node_suspend(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_suspend == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_suspend(dev, node_id, error);
+}
+
+/* Resume node */
+int rte_scheddev_node_resume(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_resume == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_resume(dev, node_id, error);
+}
+
+/* Set the initial port scheduler hierarchy */
+int rte_scheddev_hierarchy_set(uint8_t port_id,
+	int clear_on_fail,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->hierarchy_set == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->hierarchy_set(dev, clear_on_fail, error);
+}
+
+/* Update node parent  */
+int rte_scheddev_node_parent_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_parent_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_parent_update(dev, node_id, parent_node_id, priority,
+		weight, error);
+}
+
+/* Update node private shaper */
+int rte_scheddev_node_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_shaper_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_shaper_update(dev, node_id, shaper_profile_id,
+		error);
+}
+
+/* Update node shared shapers */
+int rte_scheddev_node_shared_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_shaper_id,
+	int add,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_shared_shaper_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_shared_shaper_update(dev, node_id, shared_shaper_id,
+		add, error);
+}
+
+/* Update node stats */
+int rte_scheddev_node_stats_update(uint8_t port_id,
+	uint32_t node_id,
+	uint64_t stats_mask,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_stats_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_stats_update(dev, node_id, stats_mask, error);
+}
+
+/* Update scheduling mode */
+int rte_scheddev_node_scheduling_mode_update(uint8_t port_id,
+	uint32_t node_id,
+	int *scheduling_mode_per_priority,
+	uint32_t n_priorities,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_scheduling_mode_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_scheduling_mode_update(dev, node_id,
+		scheduling_mode_per_priority, n_priorities, error);
+}
+
+/* Update node congestion management mode */
+int rte_scheddev_node_cman_update(uint8_t port_id,
+	uint32_t node_id,
+	enum rte_scheddev_cman_mode cman,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_cman_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_cman_update(dev, node_id, cman, error);
+}
+
+/* Update node private WRED context */
+int rte_scheddev_node_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_wred_context_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_wred_context_update(dev, node_id, wred_profile_id,
+		error);
+}
+
+/* Update node shared WRED context */
+int rte_scheddev_node_shared_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_wred_context_id,
+	int add,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_shared_wred_context_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_shared_wred_context_update(dev, node_id,
+		shared_wred_context_id, add, error);
+}
+
+/* Read and/or clear stats counters for specific node */
+int rte_scheddev_node_stats_read(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_node_stats *stats,
+	uint64_t *stats_mask,
+	int clear,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_stats_read == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_stats_read(dev, node_id, stats, stats_mask, clear,
+		error);
+}
+
+/* Packet marking - VLAN DEI */
+int rte_scheddev_mark_vlan_dei(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->mark_vlan_dei == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->mark_vlan_dei(dev, mark_green, mark_yellow, mark_red,
+		error);
+}
+
+/* Packet marking - IPv4/IPv6 ECN */
+int rte_scheddev_mark_ip_ecn(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->mark_ip_ecn == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->mark_ip_ecn(dev, mark_green, mark_yellow, mark_red, error);
+}
+
+/* Packet marking - IPv4/IPv6 DSCP */
+int rte_scheddev_mark_ip_dscp(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->mark_ip_dscp == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->mark_ip_dscp(dev, mark_green, mark_yellow, mark_red,
+		error);
+}
diff --git a/lib/librte_ether/rte_scheddev.h b/lib/librte_ether/rte_scheddev.h
new file mode 100644
index 0000000..1741f7a
--- /dev/null
+++ b/lib/librte_ether/rte_scheddev.h
@@ -0,0 +1,1416 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_SCHEDDEV_H__
+#define __INCLUDE_RTE_SCHEDDEV_H__
+
+/**
+ * @file
+ * RTE Generic Hierarchical Scheduler API
+ *
+ * This interface provides the ability to configure the hierarchical scheduler
+ * feature in a generic way.
+ */
+
+#include <stdint.h>
+
+#include <rte_red.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/** Ethernet framing overhead
+ *
+ * Overhead fields per Ethernet frame:
+ * 1. Preamble:                                            7 bytes;
+ * 2. Start of Frame Delimiter (SFD):                      1 byte;
+ * 3. Inter-Frame Gap (IFG):                              12 bytes.
+ */
+#define RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD                  20
+
+/**
+ * Ethernet framing overhead plus Frame Check Sequence (FCS). Useful when FCS
+ * is generated and added at the end of the Ethernet frame on TX side without
+ * any SW intervention.
+ */
+#define RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD_FCS              24
+
+/**< Invalid WRED profile ID */
+#define RTE_SCHEDDEV_WRED_PROFILE_ID_NONE                  UINT32_MAX
+
+/**< Invalid shaper profile ID */
+#define RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE                UINT32_MAX
+
+/**< Node ID for the parent of the root node */
+#define RTE_SCHEDDEV_NODE_ID_NULL                          UINT32_MAX
+
+/**
+ * Color
+ */
+enum rte_scheddev_color {
+	e_RTE_SCHEDDEV_GREEN = 0, /**< Green */
+	e_RTE_SCHEDDEV_YELLOW, /**< Yellow */
+	e_RTE_SCHEDDEV_RED, /**< Red */
+	e_RTE_SCHEDDEV_COLORS /**< Number of colors */
+};
+
+/**
+ * Node statistics counter type
+ */
+enum rte_scheddev_stats_type {
+	/**< Number of packets scheduled from current node. */
+	RTE_SCHEDDEV_STATS_N_PKTS = 1 << 0,
+
+	/**< Number of bytes scheduled from current node. */
+	RTE_SCHEDDEV_STATS_N_BYTES = 1 << 1,
+
+	/**< Number of green packets dropped by current leaf node.  */
+	RTE_SCHEDDEV_STATS_N_PKTS_GREEN_DROPPED = 1 << 2,
+
+	/**< Number of yellow packets dropped by current leaf node.  */
+	RTE_SCHEDDEV_STATS_N_PKTS_YELLOW_DROPPED = 1 << 3,
+
+	/**< Number of red packets dropped by current leaf node.  */
+	RTE_SCHEDDEV_STATS_N_PKTS_RED_DROPPED = 1 << 4,
+
+	/**< Number of green bytes dropped by current leaf node.  */
+	RTE_SCHEDDEV_STATS_N_BYTES_GREEN_DROPPED = 1 << 5,
+
+	/**< Number of yellow bytes dropped by current leaf node.  */
+	RTE_SCHEDDEV_STATS_N_BYTES_YELLOW_DROPPED = 1 << 6,
+
+	/**< Number of red bytes dropped by current leaf node.  */
+	RTE_SCHEDDEV_STATS_N_BYTES_RED_DROPPED = 1 << 7,
+
+	/**< Number of packets currently waiting in the packet queue of current
+	 * leaf node.
+	 */
+	RTE_SCHEDDEV_STATS_N_PKTS_QUEUED = 1 << 8,
+
+	/**< Number of bytes currently waiting in the packet queue of current
+	 * leaf node.
+	 */
+	RTE_SCHEDDEV_STATS_N_BYTES_QUEUED = 1 << 9,
+};
+
+/**
+ * Node statistics counters
+ */
+struct rte_scheddev_node_stats {
+	/**< Number of packets scheduled from current node. */
+	uint64_t n_pkts;
+
+	/**< Number of bytes scheduled from current node. */
+	uint64_t n_bytes;
+
+	/**< Statistics counters for leaf nodes only. */
+	struct {
+		/**< Number of packets dropped by current leaf node per each
+		 * color.
+		 */
+		uint64_t n_pkts_dropped[e_RTE_SCHEDDEV_COLORS];
+
+		/**< Number of bytes dropped by current leaf node per each
+		 * color.
+		 */
+		uint64_t n_bytes_dropped[e_RTE_SCHEDDEV_COLORS];
+
+		/**< Number of packets currently waiting in the packet queue of
+		 * current leaf node.
+		 */
+		uint64_t n_pkts_queued;
+
+		/**< Number of bytes currently waiting in the packet queue of
+		 * current leaf node.
+		 */
+		uint64_t n_bytes_queued;
+	} leaf;
+};
+
+/**
+ * Scheduler dynamic updates
+ */
+enum rte_scheddev_dynamic_update_type {
+	/**< Dynamic parent node update. */
+	RTE_SCHEDDEV_UPDATE_NODE_PARENT = 1 << 0,
+
+	/**< Dynamic node add/delete. */
+	RTE_SCHEDDEV_UPDATE_NODE_ADD_DELETE = 1 << 1,
+
+	/**< Suspend/resume nodes. */
+	RTE_SCHEDDEV_UPDATE_NODE_SUSPEND_RESUME = 1 << 2,
+
+	/**< Dynamic switch between WFQ and WRR per node SP priority level. */
+	RTE_SCHEDDEV_UPDATE_NODE_SCHEDULING_MODE = 1 << 3,
+
+	/**< Dynamic update of the set of enabled stats counter types. */
+	RTE_SCHEDDEV_UPDATE_NODE_STATS = 1 << 4,
+
+	/**< Dynamic update of congestion management mode for leaf nodes. */
+	RTE_SCHEDDEV_UPDATE_NODE_CMAN = 1 << 5,
+};
+
+/**
+ * Scheduler node capabilities
+ */
+struct rte_scheddev_node_capabilities {
+	/**< Private shaper support. */
+	int shaper_private_supported;
+
+	/**< Dual rate shaping support for private shaper. Valid only when
+	 * private shaper is supported.
+	 */
+	int shaper_private_dual_rate_supported;
+
+	/**< Minimum committed/peak rate (bytes per second) for private
+	 * shaper. Valid only when private shaper is supported.
+	 */
+	uint64_t shaper_private_rate_min;
+
+	/**< Maximum committed/peak rate (bytes per second) for private
+	 * shaper. Valid only when private shaper is supported.
+	 */
+	uint64_t shaper_private_rate_max;
+
+	/**< Maximum number of supported shared shapers. The value of zero
+	 * indicates that shared shapers are not supported.
+	 */
+	uint32_t shaper_shared_n_max;
+
+	/**< Mask of supported statistics counter types. */
+	uint64_t stats_mask;
+
+	union {
+		/**< Items valid only for non-leaf nodes. */
+		struct {
+			/**< Maximum number of children nodes. */
+			uint32_t n_children_max;
+
+			/**< Maximum number of supported priority levels. The
+			 * value of zero is invalid. The value of 1 indicates
+			 * that only priority 0 is supported, which essentially
+			 * means that Strict Priority (SP) algorithm is not
+			 * supported.
+			 */
+			uint32_t sp_n_priorities_max;
+
+			/**< Maximum number of sibling nodes that can have the
+			 * same priority at any given time. The value of zero is
+			 * invalid. The value of 1 indicates that WFQ/WRR
+			 * algorithms are not supported. The maximum value is
+			 * *n_children_max*.
+			 */
+			uint32_t sp_n_children_max;
+
+			/**< WFQ algorithm support. */
+			int wfq_supported;
+
+			/**< WRR algorithm support. */
+			int wrr_supported;
+
+			/**< Maximum WFQ/WRR weight. */
+			uint32_t wfq_wrr_weight_max;
+		} nonleaf;
+
+		/**< Items valid only for leaf nodes. */
+		struct {
+			/**< Head drop algorithm support. */
+			int cman_head_drop_supported;
+
+			/**< Private WRED context support. */
+			int cman_wred_context_private_supported;
+
+			/**< Maximum number of shared WRED contexts supported.
+			 * The value of zero indicates that shared WRED contexts
+			 * are not supported.
+			 */
+			uint32_t cman_wred_context_shared_n_max;
+		} leaf;
+	};
+};
+
+/**
+ * Scheduler level capabilities
+ */
+struct rte_scheddev_level_capabilities {
+	/**< Maximum number of nodes for the current hierarchy level. */
+	uint32_t n_nodes_max;
+
+	/**< Maximum number of non-leaf nodes for the current hierarchy level.
+	 * The value of 0 indicates that the current level only supports leaf
+	 * nodes. The maximum value is *n_nodes_max*.
+	 */
+	uint32_t n_nodes_nonleaf_max;
+
+	/**< Maximum number of leaf nodes for the current hierarchy level. The
+	 * value of 0 indicates that the current level only supports non-leaf
+	 * nodes. The maximum value is *n_nodes_max*.
+	 */
+	uint32_t n_nodes_leaf_max;
+
+	/**< Summary of node-level capabilities across all the non-leaf nodes
+	 * of the current hierarchy level. Valid only when *n_nodes_nonleaf_max*
+	 * is greater than 0.
+	 */
+	struct rte_scheddev_node_capabilities nonleaf;
+
+	/**< Summary of node-level capabilities across all the leaf nodes of the
+	 * current hierarchy level. Valid only when *n_nodes_leaf_max* is
+	 * greater than 0.
+	 */
+	struct rte_scheddev_node_capabilities leaf;
+};
+
+/**
+ * Scheduler capabilities
+ */
+struct rte_scheddev_capabilities {
+	/**< Maximum number of nodes. */
+	uint32_t n_nodes_max;
+
+	/**< Maximum number of levels (i.e. number of nodes connecting the root
+	 * node with any leaf node, including the root and the leaf).
+	 */
+	uint32_t n_levels_max;
+
+	/**< Maximum number of shapers, either private or shared. In case the
+	 * implementation does not share any resource between private and
+	 * shared shapers, it is typically equal to the sum between
+	 * *shaper_private_n_max* and *shaper_shared_n_max*.
+	 */
+	uint32_t shaper_n_max;
+
+	/**< Maximum number of private shapers. Indicates the maximum number of
+	 * nodes that can concurrently have the private shaper enabled.
+	 */
+	uint32_t shaper_private_n_max;
+
+	/**< Maximum number of shared shapers. The value of zero indicates that
+	 * shared shapers are not supported.
+	 */
+	uint32_t shaper_shared_n_max;
+
+	/**< Maximum number of nodes that can share the same shared shaper. Only
+	 * valid when shared shapers are supported.
+	 */
+	uint32_t shaper_shared_n_nodes_max;
+
+	/**< Maximum number of shared shapers that can be configured with dual
+	 * rate shaping. The value of zero indicates that dual rate shaping
+	 * support is not available for shared shapers.
+	 */
+	uint32_t shaper_shared_dual_rate_n_max;
+
+	/**< Minimum committed/peak rate (bytes per second) for shared
+	 * shapers. Only valid when shared shapers are supported.
+	 */
+	uint64_t shaper_shared_rate_min;
+
+	/**< Maximum committed/peak rate (bytes per second) for shared
+	 * shaper. Only valid when shared shapers are supported.
+	 */
+	uint64_t shaper_shared_rate_max;
+
+	/**< Minimum value allowed for packet length adjustment for
+	 * private/shared shapers.
+	 */
+	int shaper_pkt_length_adjust_min;
+
+	/**< Maximum value allowed for packet length adjustment for
+	 * private/shared shapers.
+	 */
+	int shaper_pkt_length_adjust_max;
+
+	/**< Maximum number of WRED contexts. */
+	uint32_t cman_wred_context_n_max;
+
+	/**< Maximum number of private WRED contexts. Indicates the maximum
+	 * number of leaf nodes that can concurrently have the private WRED
+	 * context enabled.
+	 */
+	uint32_t cman_wred_context_private_n_max;
+
+	/**< Maximum number of shared WRED contexts. The value of zero indicates
+	 * that shared WRED contexts are not supported.
+	 */
+	uint32_t cman_wred_context_shared_n_max;
+
+	/**< Maximum number of leaf nodes that can share the same WRED context.
+	 * Only valid when shared WRED contexts are supported.
+	 */
+	uint32_t cman_wred_context_shared_n_nodes_max;
+
+	/**< Support for VLAN DEI packet marking. */
+	int mark_vlan_dei_supported;
+
+	/**< Support for IPv4/IPv6 ECN marking of TCP packets. */
+	int mark_ip_ecn_tcp_supported;
+
+	/**< Support for IPv4/IPv6 ECN marking of SCTP packets. */
+	int mark_ip_ecn_sctp_supported;
+
+	/**< Support for IPv4/IPv6 DSCP packet marking. */
+	int mark_ip_dscp_supported;
+
+	/**< Set of supported dynamic update operations
+	 * (see enum rte_scheddev_dynamic_update_type).
+	 */
+	uint64_t dynamic_update_mask;
+
+	/**< Summary of node-level capabilities across all non-leaf nodes. */
+	struct rte_scheddev_node_capabilities nonleaf;
+
+	/**< Summary of node-level capabilities across all leaf nodes. */
+	struct rte_scheddev_node_capabilities leaf;
+};
+
+/**
+ * Congestion management (CMAN) mode
+ *
+ * This is used for controlling the admission of packets into a packet queue or
+ * group of packet queues on congestion. When a new packet is written to a
+ * queue that is already full, the *tail drop* algorithm drops the new packet
+ * while leaving the queue unmodified, as opposed to the *head drop* algorithm,
+ * which drops the packet at the head of the queue (the oldest
+ * packet waiting in the queue) and admits the new packet at the tail of the
+ * queue.
+ *
+ * The *Random Early Detection (RED)* algorithm works by proactively dropping
+ * more and more input packets as the queue occupancy builds up. When the queue
+ * is full or almost full, RED effectively works as *tail drop*. The *Weighted
+ * RED* algorithm uses a separate set of RED thresholds for each packet color.
+ */
+enum rte_scheddev_cman_mode {
+	RTE_SCHEDDEV_CMAN_TAIL_DROP = 0, /**< Tail drop */
+	RTE_SCHEDDEV_CMAN_HEAD_DROP, /**< Head drop */
+	RTE_SCHEDDEV_CMAN_WRED, /**< Weighted Random Early Detection (WRED) */
+};
+
+/**
+ * WRED profile
+ *
+ * Multiple WRED contexts can share the same WRED profile. Each leaf node with
+ * WRED enabled as its congestion management mode has zero or one private WRED
+ * context (only one leaf node using it) and/or zero, one or several shared
+ * WRED contexts (multiple leaf nodes use the same WRED context). A private
+ * WRED context is used to perform congestion management for a single leaf
+ * node, while a shared WRED context is used to perform congestion management
+ * for a group of leaf nodes.
+ */
+struct rte_scheddev_wred_params {
+	/**< One set of RED parameters per packet color */
+	struct rte_red_params red_params[e_RTE_SCHEDDEV_COLORS];
+};
+
+/**
+ * Token bucket
+ */
+struct rte_scheddev_token_bucket {
+	/**< Token bucket rate (bytes per second) */
+	uint64_t rate;
+
+	/**< Token bucket size (bytes), a.k.a. max burst size */
+	uint64_t size;
+};
+
+/**
+ * Shaper (rate limiter) profile
+ *
+ * Multiple shaper instances can share the same shaper profile. Each node has
+ * zero or one private shaper (only one node using it) and/or zero, one or
+ * several shared shapers (multiple nodes use the same shaper instance).
+ * A private shaper is used to perform traffic shaping for a single node, while
+ * a shared shaper is used to perform traffic shaping for a group of nodes.
+ *
+ * Single rate shapers use a single token bucket. A single rate shaper can be
+ * configured by setting the rate of the committed bucket to zero, which
+ * effectively disables this bucket. The peak bucket is used to limit the rate
+ * and the burst size for the current shaper.
+ *
+ * Dual rate shapers use both the committed and the peak token buckets. The
+ * rate of the committed bucket has to be less than or equal to the rate of the
+ * peak bucket.
+ */
+struct rte_scheddev_shaper_params {
+	/**< Committed token bucket */
+	struct rte_scheddev_token_bucket committed;
+
+	/**< Peak token bucket */
+	struct rte_scheddev_token_bucket peak;
+
+	/**< Signed value to be added to the length of each packet for the
+	 * purpose of shaping. Can be used to correct the packet length with
+	 * the framing overhead bytes that are also consumed on the wire (e.g.
+	 * RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD_FCS).
+	 */
+	int32_t pkt_length_adjust;
+};
+
+/**
+ * Node parameters
+ *
+ * Each scheduler hierarchy node has multiple inputs (children nodes of the
+ * current parent node) and a single output (which is input to its parent
+ * node). The current node arbitrates its inputs using Strict Priority (SP),
+ * Weighted Fair Queuing (WFQ) and Weighted Round Robin (WRR) algorithms to
+ * schedule input packets on its output while observing its shaping (rate
+ * limiting) constraints.
+ *
+ * Algorithms such as byte-level WRR, Deficit WRR (DWRR), etc. are considered
+ * approximations of ideal WFQ and are treated as WFQ, although an
+ * implementation-dependent trade-off on accuracy, performance and resource
+ * usage might exist.
+ *
+ * Children nodes with different priorities are scheduled using the SP
+ * algorithm, based on their priority, with zero (0) as the highest priority.
+ * Children with same priority are scheduled using the WFQ or WRR algorithm,
+ * based on their weight, which is relative to the sum of the weights of all
+ * siblings with same priority, with one (1) as the lowest weight.
+ *
+ * Each leaf node sits on top of a TX queue of the current Ethernet port.
+ * Therefore, the leaf nodes are predefined with the node IDs of 0 .. (N-1),
+ * where N is the number of TX queues configured for the current Ethernet port.
+ * The non-leaf nodes have their IDs generated by the application.
+ */
+struct rte_scheddev_node_params {
+	/**< Shaper profile for the private shaper. The absence of the private
+	 * shaper for the current node is indicated by setting this parameter
+	 * to RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE.
+	 */
+	uint32_t shaper_profile_id;
+
+	/**< User allocated array of valid shared shaper IDs. */
+	uint32_t *shared_shaper_id;
+
+	/**< Number of shared shaper IDs in the *shared_shaper_id* array. */
+	uint32_t n_shared_shapers;
+
+	/**< Mask of statistics counter types to be enabled for this node. This
+	 * needs to be a subset of the statistics counter types available for
+	 * the current node. Any statistics counter type not included in this
+	 * set is to be disabled for the current node.
+	 */
+	uint64_t stats_mask;
+
+	union {
+		/**< Parameters only valid for non-leaf nodes. */
+		struct {
+			/**< For each priority, indicates whether the children
+			 * nodes sharing the same priority are to be scheduled
+			 * by WFQ or by WRR. When NULL, it indicates that WFQ
+			 * is to be used for all priorities. When non-NULL, it
+			 * points to a pre-allocated array of *n_priority*
+			 * elements, with a non-zero value element indicating
+			 * WFQ and a zero value element for WRR.
+			 */
+			int *scheduling_mode_per_priority;
+
+			/**< Number of priorities. */
+			uint32_t n_priorities;
+		} nonleaf;
+
+		/**< Parameters only valid for leaf nodes. */
+		struct {
+			/**< Congestion management mode */
+			enum rte_scheddev_cman_mode cman;
+
+			/**< WRED parameters (valid when *cman* is WRED). */
+			struct {
+				/**< WRED profile for the private WRED context.
+				 * The absence of the private WRED context for
+				 * the current leaf node is indicated by value
+				 * RTE_SCHEDDEV_WRED_PROFILE_ID_NONE.
+				 */
+				uint32_t wred_profile_id;
+
+				/**< User allocated array of valid shared WRED
+				 * context IDs.
+				 */
+				uint32_t *shared_wred_context_id;
+
+				/**< Number of shared WRED context IDs in the
+				 * *shared_wred_context_id* array.
+				 */
+				uint32_t n_shared_wred_contexts;
+			} wred;
+		} leaf;
+	};
+};
+
+/**
+ * Verbose error types.
+ *
+ * Most of them provide the type of the object referenced by struct
+ * rte_scheddev_error::cause.
+ */
+enum rte_scheddev_error_type {
+	RTE_SCHEDDEV_ERROR_TYPE_NONE, /**< No error. */
+	RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+	RTE_SCHEDDEV_ERROR_TYPE_CAPABILITIES,
+	RTE_SCHEDDEV_ERROR_TYPE_LEVEL_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE,
+	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_GREEN,
+	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_YELLOW,
+	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_RED,
+	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_SHARED_WRED_CONTEXT_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_SHAPER_PROFILE,
+	RTE_SCHEDDEV_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_RATE,
+	RTE_SCHEDDEV_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_SIZE,
+	RTE_SCHEDDEV_ERROR_TYPE_SHAPER_PROFILE_PEAK_RATE,
+	RTE_SCHEDDEV_ERROR_TYPE_SHAPER_PROFILE_PEAK_SIZE,
+	RTE_SCHEDDEV_ERROR_TYPE_SHAPER_PROFILE_PKT_ADJUST_LEN,
+	RTE_SCHEDDEV_ERROR_TYPE_SHAPER_PROFILE_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_SHARED_SHAPER_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARENT_NODE_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PRIORITY,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_WEIGHT,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SHARED_SHAPER_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_STATS,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SCHEDULING_MODE,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_N_PRIORITIES,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_CMAN,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_WRED_PROFILE_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SHARED_WRED_CONTEXT_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_N_SHARED_WRED_CONTEXTS,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_ID,
+};
+
+/**
+ * Verbose error structure definition.
+ *
+ * This object is normally allocated by applications and set by PMDs, the
+ * message points to a constant string which does not need to be freed by
+ * the application, however its pointer can be considered valid only as long
+ * as its associated DPDK port remains configured. Closing the underlying
+ * device or unloading the PMD invalidates it.
+ *
+ * Both cause and message may be NULL regardless of the error type.
+ */
+struct rte_scheddev_error {
+	enum rte_scheddev_error_type type; /**< Cause field and error type. */
+	const void *cause; /**< Object responsible for the error. */
+	const char *message; /**< Human-readable error message. */
+};
+
+/**
+ * Scheduler get number of leaf nodes
+ *
+ * Each leaf node sits on top of a TX queue of the current Ethernet port.
+ * Therefore, the set of leaf nodes is predefined, their number is always equal
+ * to N (where N is the number of TX queues configured for the current port) and
+ * their IDs are 0 .. (N-1).
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param n_leaf_nodes
+ *   Number of leaf nodes for the current port.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_get_leaf_nodes(uint8_t port_id,
+	uint32_t *n_leaf_nodes,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node type (i.e. leaf or non-leaf) get
+ *
+ * The leaf nodes have predefined IDs in the range of 0 .. (N-1), where N is the
+ * number of TX queues of the current Ethernet port. The non-leaf nodes have
+ * their IDs generated by the application outside of the above range, which is
+ * reserved for leaf nodes.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID value. Needs to be valid.
+ * @param is_leaf
+ *   Set to non-zero value when node is leaf and to zero otherwise (non-leaf).
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_type_get(uint8_t port_id,
+	uint32_t node_id,
+	int *is_leaf,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler capabilities get
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param cap
+ *   Scheduler capabilities. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_capabilities_get(uint8_t port_id,
+	struct rte_scheddev_capabilities *cap,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler level capabilities get
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param level_id
+ *   The scheduler hierarchy level identifier. The value of 0 identifies the
+ *   level of the root node.
+ * @param cap
+ *   Scheduler level capabilities. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_level_capabilities_get(uint8_t port_id,
+	uint32_t level_id,
+	struct rte_scheddev_level_capabilities *cap,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node capabilities get
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param cap
+ *   Scheduler node capabilities. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_capabilities_get(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_node_capabilities *cap,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler WRED profile add
+ *
+ * Create a new WRED profile with ID set to *wred_profile_id*. The new profile
+ * is used to create one or several WRED contexts.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param wred_profile_id
+ *   WRED profile ID for the new profile. Needs to be unused.
+ * @param profile
+ *   WRED profile parameters. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_wred_profile_add(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_wred_params *profile,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler WRED profile delete
+ *
+ * Delete an existing WRED profile. This operation fails when there is currently
+ * at least one user (i.e. WRED context) of this WRED profile.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param wred_profile_id
+ *   WRED profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_wred_profile_delete(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler shared WRED context add or update
+ *
+ * When *shared_wred_context_id* is invalid, a new WRED context with this ID is
+ * created by using the WRED profile identified by *wred_profile_id*.
+ *
+ * When *shared_wred_context_id* is valid, this WRED context is no longer using
+ * the profile previously assigned to it and is updated to use the profile
+ * identified by *wred_profile_id*.
+ *
+ * A valid shared WRED context can be assigned to several scheduler hierarchy
+ * leaf nodes configured to use WRED as the congestion management mode.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_wred_context_id
+ *   Shared WRED context ID
+ * @param wred_profile_id
+ *   WRED profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_shared_wred_context_add_update(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler shared WRED context delete
+ *
+ * Delete an existing shared WRED context. This operation fails when there is
+ * currently at least one user (i.e. scheduler hierarchy leaf node) of this
+ * shared WRED context.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_wred_context_id
+ *   Shared WRED context ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_shared_wred_context_delete(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler shaper profile add
+ *
+ * Create a new shaper profile with ID set to *shaper_profile_id*. The new
+ * shaper profile is used to create one or several shapers.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shaper_profile_id
+ *   Shaper profile ID for the new profile. Needs to be unused.
+ * @param profile
+ *   Shaper profile parameters. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_shaper_profile_add(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_shaper_params *profile,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler shaper profile delete
+ *
+ * Delete an existing shaper profile. This operation fails when there is
+ * currently at least one user (i.e. shaper) of this shaper profile.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shaper_profile_id
+ *   Shaper profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_shaper_profile_delete(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler shared shaper add or update
+ *
+ * When *shared_shaper_id* is not a valid shared shaper ID, a new shared shaper
+ * with this ID is created using the shaper profile identified by
+ * *shaper_profile_id*.
+ *
+ * When *shared_shaper_id* is a valid shared shaper ID, this shared shaper is no
+ * longer using the shaper profile previously assigned to it and is updated to
+ * use the shaper profile identified by *shaper_profile_id*.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_shaper_id
+ *   Shared shaper ID
+ * @param shaper_profile_id
+ *   Shaper profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_shared_shaper_add_update(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler shared shaper delete
+ *
+ * Delete an existing shared shaper. This operation fails when there is
+ * currently at least one user (i.e. scheduler hierarchy node) of this shared
+ * shaper.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_shaper_id
+ *   Shared shaper ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_shared_shaper_delete(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node add
+ *
+ * Create a new node and connect it as a child of an existing node. The new
+ * node is further identified by *node_id*, which needs to be unused by any of
+ * the existing nodes. The parent node is identified by *parent_node_id*, which
+ * needs to be the valid ID of an existing non-leaf node. The parent node is
+ * going to use the provided SP *priority* and WFQ/WRR *weight* to schedule its
+ * new child node.
+ *
+ * This function has to be called for both leaf and non-leaf nodes. In the case
+ * of leaf nodes (i.e. *node_id* is within the range of 0 .. (N-1), with N as
+ * the number of configured TX queues of the current port), the leaf node is
+ * configured rather than created (as the set of leaf nodes is predefined) and
+ * it is also connected as child of an existing node.
+ *
+ * The first node that is added becomes the root node and all the nodes that are
+ * subsequently added have to be added as descendants of the root node. The
+ * parent of the root node has to be specified as RTE_SCHEDDEV_NODE_ID_NULL and
+ * there can only be one node with this parent ID (i.e. the root node).
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be unused by any of the existing nodes.
+ * @param parent_node_id
+ *   Parent node ID. Needs to be valid.
+ * @param priority
+ *   Node priority. The highest node priority is zero. Used by the SP algorithm
+ *   running on the parent of the current node for scheduling this child node.
+ * @param weight
+ *   Node weight. The node weight is relative to the weight sum of all siblings
+ *   that have the same priority. The lowest weight is one. Used by the WFQ/WRR
+ *   algorithm running on the parent of the current node for scheduling this
+ *   child node.
+ * @param params
+ *   Node parameters. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_add(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_node_params *params,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node delete
+ *
+ * Delete an existing node. This operation fails when this node currently has at
+ * least one user (i.e. child node).
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_delete(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node suspend
+ *
+ * Suspend an existing node.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_suspend(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node resume
+ *
+ * Resume an existing node that was previously suspended.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_resume(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler hierarchy set
+ *
+ * This function is called during the port initialization phase (before the
+ * Ethernet port is started) to freeze the scheduler start-up hierarchy.
+ *
+ * This function fails when the currently configured scheduler hierarchy is not
+ * supported by the Ethernet port, in which case the user can abort or try out
+ * another hierarchy configuration (e.g. a hierarchy with less leaf nodes),
+ * which can be built from scratch (when *clear_on_fail* is enabled) or by
+ * modifying the existing hierarchy configuration (when *clear_on_fail* is
+ * disabled).
+ *
+ * Note that, even when the configured scheduler hierarchy is supported (so this
+ * function is successful), the Ethernet port start might still fail, e.g. due
+ * to not enough memory being available in the system.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param clear_on_fail
+ *   On function call failure, hierarchy is cleared when this parameter is
+ *   non-zero and preserved when this parameter is equal to zero.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_hierarchy_set(uint8_t port_id,
+	int clear_on_fail,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node parent update
+ *
+ * The parent of the root node cannot be changed.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param parent_node_id
+ *   Node ID for the new parent. Needs to be valid.
+ * @param priority
+ *   Node priority. The highest node priority is zero. Used by the SP algorithm
+ *   running on the parent of the current node for scheduling this child node.
+ * @param weight
+ *   Node weight. The node weight is relative to the weight sum of all siblings
+ *   that have the same priority. The lowest weight is one. Used by the WFQ/WRR
+ *   algorithm running on the parent of the current node for scheduling this
+ *   child node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_parent_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node private shaper update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param shaper_profile_id
+ *   Shaper profile ID for the private shaper of the current node. Needs to be
+ *   either valid shaper profile ID or RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE, with
+ *   the latter disabling the private shaper of the current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node shared shapers update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param shared_shaper_id
+ *   Shared shaper ID. Needs to be valid.
+ * @param add
+ *   Set to non-zero value to add this shared shaper to current node or to zero
+ *   to delete this shared shaper from current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_shared_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_shaper_id,
+	int add,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node enabled statistics counters update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param stats_mask
+ *   Mask of statistics counter types to be enabled for the current node. This
+ *   needs to be a subset of the statistics counter types available for the
+ *   current node. Any statistics counter type not included in this set is to be
+ *   disabled for the current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_stats_update(uint8_t port_id,
+	uint32_t node_id,
+	uint64_t stats_mask,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node scheduling mode update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid leaf node ID.
+ * @param scheduling_mode_per_priority
+ *   For each priority, indicates whether the children nodes sharing the same
+ *   priority are to be scheduled by WFQ or by WRR. When NULL, it indicates that
+ *   WFQ is to be used for all priorities. When non-NULL, it points to a
+ *   pre-allocated array of *n_priorities* elements, with a non-zero value element
+ *   indicating WFQ and a zero value element for WRR.
+ * @param n_priorities
+ *   Number of priorities.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_scheduling_mode_update(uint8_t port_id,
+	uint32_t node_id,
+	int *scheduling_mode_per_priority,
+	uint32_t n_priorities,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node congestion management mode update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid leaf node ID.
+ * @param cman
+ *   Congestion management mode.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_cman_update(uint8_t port_id,
+	uint32_t node_id,
+	enum rte_scheddev_cman_mode cman,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node private WRED context update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid leaf node ID.
+ * @param wred_profile_id
+ *   WRED profile ID for the private WRED context of the current node. Needs to
+ *   be either valid WRED profile ID or RTE_SCHEDDEV_WRED_PROFILE_ID_NONE, with
+ *   the latter disabling the private WRED context of the current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node shared WRED context update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid leaf node ID.
+ * @param shared_wred_context_id
+ *   Shared WRED context ID. Needs to be valid.
+ * @param add
+ *   Set to non-zero value to add this shared WRED context to current node or to
+ *   zero to delete this shared WRED context from current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_shared_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_wred_context_id,
+	int add,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node statistics counters read
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param stats
+ *   When non-NULL, it contains the current value for the statistics counters
+ *   enabled for the current node.
+ * @param stats_mask
+ *   When non-NULL, it contains the mask of statistics counter types that are
+ *   currently enabled for this node, indicating which of the counters retrieved
+ *   with the *stats* structure are valid.
+ * @param clear
+ *   When this parameter has a non-zero value, the statistics counters are
+ *   cleared (i.e. set to zero) immediately after they have been read, otherwise
+ *   the statistics counters are left untouched.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_node_stats_read(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_node_stats *stats,
+	uint64_t *stats_mask,
+	int clear,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler packet marking - VLAN DEI (IEEE 802.1Q)
+ *
+ * IEEE 802.1p maps the traffic class to the VLAN Priority Code Point (PCP)
+ * field (3 bits), while IEEE 802.1Q maps the drop priority to the VLAN Drop
+ * Eligible Indicator (DEI) field (1 bit), which was previously named Canonical
+ * Format Indicator (CFI).
+ *
+ * All VLAN frames of a given color get their DEI bit set if marking is enabled
+ * for this color; otherwise, their DEI bit is left as is (either set or not).
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mark_green
+ *   Set to non-zero value to enable marking of green packets and to zero to
+ *   disable it.
+ * @param mark_yellow
+ *   Set to non-zero value to enable marking of yellow packets and to zero to
+ *   disable it.
+ * @param mark_red
+ *   Set to non-zero value to enable marking of red packets and to zero to
+ *   disable it.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_mark_vlan_dei(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler packet marking - IPv4 / IPv6 ECN (IETF RFC 3168)
+ *
+ * IETF RFCs 2474 and 3168 reorganize the IPv4 Type of Service (TOS) field
+ * (8 bits) and the IPv6 Traffic Class (TC) field (8 bits) into Differentiated
+ * Services Codepoint (DSCP) field (6 bits) and Explicit Congestion Notification
+ * (ECN) field (2 bits). The DSCP field is typically used to encode the traffic
+ * class and/or drop priority (RFC 2597), while the ECN field is used by RFC
+ * 3168 to implement a congestion notification mechanism to be leveraged by
+ * transport layer protocols such as TCP and SCTP that have congestion control
+ * mechanisms.
+ *
+ * When congestion is experienced, as alternative to dropping the packet,
+ * routers can change the ECN field of input packets from 2'b01 or 2'b10 (values
+ * indicating that source endpoint is ECN-capable) to 2'b11 (meaning that
+ * congestion is experienced). The destination endpoint can use the ECN-Echo
+ * (ECE) TCP flag to relay the congestion indication back to the source
+ * endpoint, which acknowledges it back to the destination endpoint with the
+ * Congestion Window Reduced (CWR) TCP flag.
+ *
+ * All IPv4/IPv6 packets of a given color with ECN set to 2'b01 or 2'b10
+ * carrying TCP or SCTP have their ECN set to 2'b11 if the marking feature is
+ * enabled for the current color, otherwise the ECN field is left as is.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mark_green
+ *   Set to non-zero value to enable marking of green packets and to zero to
+ *   disable it.
+ * @param mark_yellow
+ *   Set to non-zero value to enable marking of yellow packets and to zero to
+ *   disable it.
+ * @param mark_red
+ *   Set to non-zero value to enable marking of red packets and to zero to
+ *   disable it.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_mark_ip_ecn(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler packet marking - IPv4 / IPv6 DSCP (IETF RFC 2597)
+ *
+ * IETF RFC 2597 maps the traffic class and the drop priority to the IPv4/IPv6
+ * Differentiated Services Codepoint (DSCP) field (6 bits). Here are the DSCP
+ * values proposed by this RFC:
+ *
+ *                       Class 1    Class 2    Class 3    Class 4
+ *                     +----------+----------+----------+----------+
+ *    Low Drop Prec    |  001010  |  010010  |  011010  |  100010  |
+ *    Medium Drop Prec |  001100  |  010100  |  011100  |  100100  |
+ *    High Drop Prec   |  001110  |  010110  |  011110  |  100110  |
+ *                     +----------+----------+----------+----------+
+ *
+ * There are 4 traffic classes (classes 1 .. 4) encoded by DSCP bits 1 and 2, as
+ * well as 3 drop priorities (low/medium/high) encoded by DSCP bits 3 and 4.
+ *
+ * All IPv4/IPv6 packets have their color marked into DSCP bits 3 and 4 as
+ * follows: green mapped to Low Drop Precedence (2'b01), yellow to Medium
+ * (2'b10) and red to High (2'b11). Marking needs to be explicitly enabled
+ * for each color; when not enabled for a given color, the DSCP field of all
+ * packets with that color is left as is.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mark_green
+ *   Set to non-zero value to enable marking of green packets and to zero to
+ *   disable it.
+ * @param mark_yellow
+ *   Set to non-zero value to enable marking of yellow packets and to zero to
+ *   disable it.
+ * @param mark_red
+ *   Set to non-zero value to enable marking of red packets and to zero to
+ *   disable it.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_scheddev_mark_ip_dscp(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_SCHEDDEV_H__ */
diff --git a/lib/librte_ether/rte_scheddev_driver.h b/lib/librte_ether/rte_scheddev_driver.h
new file mode 100644
index 0000000..d245aea
--- /dev/null
+++ b/lib/librte_ether/rte_scheddev_driver.h
@@ -0,0 +1,365 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_SCHEDDEV_DRIVER_H__
+#define __INCLUDE_RTE_SCHEDDEV_DRIVER_H__
+
+/**
+ * @file
+ * RTE Generic Hierarchical Scheduler API (Driver Side)
+ *
+ * This file provides implementation helpers for internal use by PMDs; they
+ * are not intended to be exposed to applications and are not subject to ABI
+ * versioning.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include "rte_ethdev.h"
+#include "rte_scheddev.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+typedef int (*rte_scheddev_node_type_get_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	int *is_leaf,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node type get */
+
+typedef int (*rte_scheddev_capabilities_get_t)(struct rte_eth_dev *dev,
+	struct rte_scheddev_capabilities *cap,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler capabilities get */
+
+typedef int (*rte_scheddev_level_capabilities_get_t)(struct rte_eth_dev *dev,
+	uint32_t level_id,
+	struct rte_scheddev_level_capabilities *cap,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler level capabilities get */
+
+typedef int (*rte_scheddev_node_capabilities_get_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_scheddev_node_capabilities *cap,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node capabilities get */
+
+typedef int (*rte_scheddev_wred_profile_add_t)(struct rte_eth_dev *dev,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_wred_params *profile,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler WRED profile add */
+
+typedef int (*rte_scheddev_wred_profile_delete_t)(struct rte_eth_dev *dev,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler WRED profile delete */
+
+typedef int (*rte_scheddev_shared_wred_context_add_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t shared_wred_context_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shared WRED context add */
+
+typedef int (*rte_scheddev_shared_wred_context_delete_t)(
+	struct rte_eth_dev *dev,
+	uint32_t shared_wred_context_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shared WRED context delete */
+
+typedef int (*rte_scheddev_shaper_profile_add_t)(struct rte_eth_dev *dev,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_shaper_params *profile,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shaper profile add */
+
+typedef int (*rte_scheddev_shaper_profile_delete_t)(struct rte_eth_dev *dev,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shaper profile delete */
+
+typedef int (*rte_scheddev_shared_shaper_add_update_t)(struct rte_eth_dev *dev,
+	uint32_t shared_shaper_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shared shaper add/update */
+
+typedef int (*rte_scheddev_shared_shaper_delete_t)(struct rte_eth_dev *dev,
+	uint32_t shared_shaper_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shared shaper delete */
+
+typedef int (*rte_scheddev_node_add_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_node_params *params,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node add */
+
+typedef int (*rte_scheddev_node_delete_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node delete */
+
+typedef int (*rte_scheddev_node_suspend_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node suspend */
+
+typedef int (*rte_scheddev_node_resume_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node resume */
+
+typedef int (*rte_scheddev_hierarchy_set_t)(struct rte_eth_dev *dev,
+	int clear_on_fail,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler hierarchy set */
+
+typedef int (*rte_scheddev_node_parent_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node parent update */
+
+typedef int (*rte_scheddev_node_shaper_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node shaper update */
+
+typedef int (*rte_scheddev_node_shared_shaper_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t shared_shaper_id,
+	int32_t add,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node shared shaper update */
+
+typedef int (*rte_scheddev_node_stats_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint64_t stats_mask,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node stats update */
+
+typedef int (*rte_scheddev_node_scheduling_mode_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t node_id,
+	int *scheduling_mode_per_priority,
+	uint32_t n_priorities,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node scheduling mode update */
+
+typedef int (*rte_scheddev_node_cman_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	enum rte_scheddev_cman_mode cman,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node congestion management mode update */
+
+typedef int (*rte_scheddev_node_wred_context_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node WRED context update */
+
+typedef int (*rte_scheddev_node_shared_wred_context_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t shared_wred_context_id,
+	int add,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node shared WRED context update */
+
+typedef int (*rte_scheddev_node_stats_read_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_scheddev_node_stats *stats,
+	uint64_t *stats_mask,
+	int clear,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler read stats counters for specific node */
+
+typedef int (*rte_scheddev_mark_vlan_dei_t)(struct rte_eth_dev *dev,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler packet marking - VLAN DEI */
+
+typedef int (*rte_scheddev_mark_ip_ecn_t)(struct rte_eth_dev *dev,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler packet marking - IPv4/IPv6 ECN */
+
+typedef int (*rte_scheddev_mark_ip_dscp_t)(struct rte_eth_dev *dev,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler packet marking - IPv4/IPv6 DSCP */
+
+struct rte_scheddev_ops {
+	/** Scheduler node type get */
+	rte_scheddev_node_type_get_t node_type_get;
+
+	/** Scheduler capabilities get */
+	rte_scheddev_capabilities_get_t capabilities_get;
+	/** Scheduler level capabilities get */
+	rte_scheddev_level_capabilities_get_t level_capabilities_get;
+	/** Scheduler node capabilities get */
+	rte_scheddev_node_capabilities_get_t node_capabilities_get;
+
+	/** Scheduler WRED profile add */
+	rte_scheddev_wred_profile_add_t wred_profile_add;
+	/** Scheduler WRED profile delete */
+	rte_scheddev_wred_profile_delete_t wred_profile_delete;
+	/** Scheduler shared WRED context add/update */
+	rte_scheddev_shared_wred_context_add_update_t
+		shared_wred_context_add_update;
+	/** Scheduler shared WRED context delete */
+	rte_scheddev_shared_wred_context_delete_t
+		shared_wred_context_delete;
+
+	/** Scheduler shaper profile add */
+	rte_scheddev_shaper_profile_add_t shaper_profile_add;
+	/** Scheduler shaper profile delete */
+	rte_scheddev_shaper_profile_delete_t shaper_profile_delete;
+	/** Scheduler shared shaper add/update */
+	rte_scheddev_shared_shaper_add_update_t shared_shaper_add_update;
+	/** Scheduler shared shaper delete */
+	rte_scheddev_shared_shaper_delete_t shared_shaper_delete;
+
+	/** Scheduler node add */
+	rte_scheddev_node_add_t node_add;
+	/** Scheduler node delete */
+	rte_scheddev_node_delete_t node_delete;
+	/** Scheduler node suspend */
+	rte_scheddev_node_suspend_t node_suspend;
+	/** Scheduler node resume */
+	rte_scheddev_node_resume_t node_resume;
+	/** Scheduler hierarchy set */
+	rte_scheddev_hierarchy_set_t hierarchy_set;
+
+	/** Scheduler node parent update */
+	rte_scheddev_node_parent_update_t node_parent_update;
+	/** Scheduler node shaper update */
+	rte_scheddev_node_shaper_update_t node_shaper_update;
+	/** Scheduler node shared shaper update */
+	rte_scheddev_node_shared_shaper_update_t node_shared_shaper_update;
+	/** Scheduler node stats update */
+	rte_scheddev_node_stats_update_t node_stats_update;
+	/** Scheduler node scheduling mode update */
+	rte_scheddev_node_scheduling_mode_update_t node_scheduling_mode_update;
+	/** Scheduler node congestion management mode update */
+	rte_scheddev_node_cman_update_t node_cman_update;
+	/** Scheduler node WRED context update */
+	rte_scheddev_node_wred_context_update_t node_wred_context_update;
+	/** Scheduler node shared WRED context update */
+	rte_scheddev_node_shared_wred_context_update_t
+		node_shared_wred_context_update;
+	/** Scheduler read statistics counters for current node */
+	rte_scheddev_node_stats_read_t node_stats_read;
+
+	/** Scheduler packet marking - VLAN DEI */
+	rte_scheddev_mark_vlan_dei_t mark_vlan_dei;
+	/** Scheduler packet marking - IPv4/IPv6 ECN */
+	rte_scheddev_mark_ip_ecn_t mark_ip_ecn;
+	/** Scheduler packet marking - IPv4/IPv6 DSCP */
+	rte_scheddev_mark_ip_dscp_t mark_ip_dscp;
+};
+
+/**
+ * Initialize generic error structure.
+ *
+ * This function also sets rte_errno to a given value.
+ *
+ * @param error
+ *   Pointer to error structure (may be NULL).
+ * @param code
+ *   Related error code (rte_errno).
+ * @param type
+ *   Cause field and error type.
+ * @param cause
+ *   Object responsible for the error.
+ * @param message
+ *   Human-readable error message.
+ *
+ * @return
+ *   Error code.
+ */
+static inline int
+rte_scheddev_error_set(struct rte_scheddev_error *error,
+		   int code,
+		   enum rte_scheddev_error_type type,
+		   const void *cause,
+		   const char *message)
+{
+	if (error) {
+		*error = (struct rte_scheddev_error){
+			.type = type,
+			.cause = cause,
+			.message = message,
+		};
+	}
+	rte_errno = code;
+	return code;
+}
+
+/**
+ * Get generic hierarchical scheduler operations structure from a port
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param error
+ *   Error details
+ *
+ * @return
+ *   The hierarchical scheduler operations structure associated with port_id on
+ *   success, NULL otherwise.
+ */
+const struct rte_scheddev_ops *
+rte_scheddev_ops_get(uint8_t port_id, struct rte_scheddev_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_SCHEDDEV_DRIVER_H__ */
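
As a quick usage sketch (not part of the patch itself; the helper name
setup_flat_hierarchy() and its error handling are invented for illustration),
the intended call flow of the API declared above could look like:

	#include <rte_scheddev.h>

	/* Create a root node, configure one leaf per TX queue as its
	 * children, then freeze the start-up hierarchy. */
	static int
	setup_flat_hierarchy(uint8_t port_id, uint32_t n_txq)
	{
		struct rte_scheddev_error err;
		struct rte_scheddev_node_params np = {0};
		uint32_t q;

		/* The first node added becomes the root; its parent must be
		 * RTE_SCHEDDEV_NODE_ID_NULL. Leaf IDs are 0 .. n_txq - 1 by
		 * definition, so n_txq is a free ID for the root. */
		if (rte_scheddev_node_add(port_id, n_txq,
				RTE_SCHEDDEV_NODE_ID_NULL, 0, 1, &np, &err))
			return -1;

		/* Leaf nodes are predefined, so this configures and connects
		 * them rather than creating them. */
		for (q = 0; q < n_txq; q++)
			if (rte_scheddev_node_add(port_id, q, n_txq,
					0, 1, &np, &err))
				return -1;

		/* Freeze; clear the hierarchy again if the port rejects it. */
		return rte_scheddev_hierarchy_set(port_id, 1, &err);
	}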
-- 
2.5.0

^ permalink raw reply	[relevance 1%]

* Re: [dpdk-dev] [PATCH v7 01/17] lib: rename legacy distributor lib files
  2017-02-21  3:17  1%   ` [dpdk-dev] [PATCH v7 01/17] lib: rename legacy distributor lib files David Hunt
  2017-02-21 10:27  0%     ` Hunt, David
@ 2017-02-24 14:03  0%     ` Bruce Richardson
  2017-03-01  9:55  0%       ` Hunt, David
  2017-03-01  7:47  2%     ` [dpdk-dev] [PATCH v8 0/18] distributor library performance enhancements David Hunt
  2 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2017-02-24 14:03 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

On Tue, Feb 21, 2017 at 03:17:37AM +0000, David Hunt wrote:
> Move files out of the way so that we can replace them with new
> versions of the distributor library. Files are named in
> such a way as to match the symbol versioning that we will
> apply for backward ABI compatibility.
> 
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>  app/test/test_distributor.c                  |   2 +-
>  app/test/test_distributor_perf.c             |   2 +-
>  examples/distributor/main.c                  |   2 +-
>  lib/librte_distributor/Makefile              |   4 +-
>  lib/librte_distributor/rte_distributor.c     | 487 ---------------------------
>  lib/librte_distributor/rte_distributor.h     | 247 --------------
>  lib/librte_distributor/rte_distributor_v20.c | 487 +++++++++++++++++++++++++++
>  lib/librte_distributor/rte_distributor_v20.h | 247 ++++++++++++++

Rather than changing the unit tests and example applications, I think
this patch would be better with a new rte_distributor.h file which
simply does "#include  <rte_distributor_v20.h>". Alternatively, I
recently upstreamed a patch, which went into 17.02, to allow symlinks in
the folder so you could create a symlink to the renamed file.
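
For illustration, such a compatibility header would be trivial - a sketch,
assuming the rename done in this patch (the guard name is invented):

	/* rte_distributor.h - compatibility wrapper (sketch) */
	#ifndef _RTE_DISTRIBUTOR_WRAP_H_
	#define _RTE_DISTRIBUTOR_WRAP_H_

	#include <rte_distributor_v20.h>

	#endif /* _RTE_DISTRIBUTOR_WRAP_H_ */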

/Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v7 0/17] distributor library performance enhancements
  2017-02-21  3:17  3% ` [dpdk-dev] [PATCH v7 0/17] distributor library " David Hunt
  2017-02-21  3:17  1%   ` [dpdk-dev] [PATCH v7 01/17] lib: rename legacy distributor lib files David Hunt
@ 2017-02-24 14:01  0%   ` Bruce Richardson
  1 sibling, 0 replies; 200+ results
From: Bruce Richardson @ 2017-02-24 14:01 UTC (permalink / raw)
  To: David Hunt; +Cc: dev

On Tue, Feb 21, 2017 at 03:17:36AM +0000, David Hunt wrote:
> This patch aims to improve the throughput of the distributor library.
> 
> It uses a similar handshake mechanism to the previous version of
> the library, in that bits are used to indicate when packets are ready
> to be sent to a worker and ready to be returned from a worker. One main
> difference is that instead of sending one packet in a cache line, it makes
> use of the 7 free spaces in the same cache line in order to send up to
> 8 packets at a time to/from a worker.
> 
> The flow matching algorithm has had significant re-work, and now keeps an
> array of inflight flows and an array of backlog flows, and matches incoming
> flows to the inflight/backlog flows of all workers so that flow pinning to
> workers can be maintained.
> 
> The Flow Match algorithm has both scalar and vector versions, and a
> function pointer is used to select the most appropriate function at run time,
> depending on the presence of the SSE2 cpu flag. On non-x86 platforms,
> the scalar match function is selected, which should still give a good boost
> in performance over the non-burst API.
> 
> v2 changes:
>   * Created a common distributor_priv.h header file with common
>     definitions and structures.
>   * Added a scalar version so it can be built and used on machines without
>     sse2 instruction set
>   * Added unit autotests
>   * Added perf autotest

For future reference, I think it's better to put the list of deltas from
each version in reverse order, so that the latest changes are on top,
and save scrolling for those of us who have been tracking the set.

> 
> v3 changes:
>   * Addressed mailing list review comments
>   * Test code removal
>   * Split out SSE match into separate file to facilitate NEON addition
>   * Cleaned up conditional compilation flags for SSE2
>   * Addressed c99 style compilation errors
>   * rebased on latest head (Jan 2 2017, Happy New Year to all)
> 
> v4 changes:
>    * fixed issue building shared libraries
> 
> v5 changes:
>    * Removed some un-needed code around retries in worker API calls
>    * Cleanup due to review comments on mailing list
>    * Cleanup of non-x86 platform compilation, fallback to scalar match
> 
> v6 changes:
>    * Fixed intermittent segfault where num pkts not divisible
>      by BURST_SIZE
>    * Cleanup due to review comments on mailing list
>    * Renamed _priv.h to _private.h.
> 
> v7 changes:
>    * Reorganised patch so there's a more natural progression in the
>      changes, and divided them down into easier to review chunks.
>    * Previous versions of this patch set were effectively two APIs.
>      We now have a single API. Legacy functionality can
>      be used by using the rte_distributor_create API call with the
>      RTE_DISTRIBUTOR_SINGLE flag when creating a distributor instance.
>    * Added symbol versioning for old API so that ABI is preserved.
> 
The merging to a single API is great to see, making it so much easier
for app developers. Thanks for that.

/Bruce
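
For readers following the symbol versioning item in the v7 notes above: with
DPDK's rte_compat.h this is typically expressed along these lines (a hedged
sketch only - the exact function signatures in the patch set may differ):

	#include <rte_compat.h>

	/* Old implementation stays reachable for binaries linked
	 * against the DPDK_2.0 version node. */
	struct rte_distributor *
	rte_distributor_create_v20(const char *name, unsigned int socket_id,
			unsigned int num_workers);
	VERSION_SYMBOL(rte_distributor_create, _v20, 2.0);

	/* New implementation becomes the default for newly linked code. */
	struct rte_distributor *
	rte_distributor_create_v1705(const char *name, unsigned int socket_id,
			unsigned int num_workers, unsigned int alg_type);
	BIND_DEFAULT_SYMBOL(rte_distributor_create, _v1705, 17.05);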

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Further fun with ABI tracking
  2017-02-23 18:48  4%     ` [dpdk-dev] Further fun with ABI tracking Ferruh Yigit
@ 2017-02-24  7:32  8%       ` Christian Ehrhardt
  0 siblings, 0 replies; 200+ results
From: Christian Ehrhardt @ 2017-02-24  7:32 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: Jan Blunck, dev, cjcollier, ricardo.salveti, Luca Boccassi

On Thu, Feb 23, 2017 at 7:48 PM, Ferruh Yigit <ferruh.yigit@intel.com>
wrote:

> Can you please describe this option more?
>

Of course, I am happy about any engagement/discussion on that.
Much better than silently staying in queue.


> Does it mean for each DPDK release, the distro will release all libraries?
>

First of all it is opt-in. If nobody changes the default setting (="") to
anything, nothing happens.

A distribution _CAN_ use the feature to reliably avoid collisions between
DPDK releases and allow concurrent installations more easily.


> For 16.07:
> acl.2, eal.2, ethdev.4, pdump.1
>
> For 16.11:
> acl.2, eal.3, ethdev.5, pdump.1
>

That example is what we did so far, trying to follow the DPDK ABI
versioning.
But that caused the issue I described with (new)pdump.1->eal.3 and the base
app->eal.2


> Will the dpdk package have the following packages:
> acl.16.07.2, eal.16.07.2, ethdev.16.07.4, pdump.16.07.1
> acl.16.11.2, eal.16.11.3, ethdev.16.11.5, pdump.16.11.1
>

I thought about that, but Jan correctly brought up that if we do override we
should also trivialize and override fully - ignoring the per-lib LIBABIVER.
So it will not be eal.16.11.3 but instead just eal.16.11 (no subversion,
as there is no need). If that is wanted it can easily be done; please let me
know if I should run a v2 with that.

> And for the initial OVS usecase, will it be:
>
> OVS
>  +---> eal.16.07.2
>  +---> pdump.16.11.1
>         +---> eal.16.11.3
>
>
Not quite, the usecase would look like this:
The current DPDK generates LIBABIVER versions: eal.3, pdump.1, ...
OVS
 +---> eal.3
 +---> pdump.1
        +---> eal.3

Note: Packages are initially carried forward from the former distribution
release, so the next release would start where the former ended:
OVS
 +---> eal.3
 +---> pdump.1
        +---> eal.3

Then the new DPDK would come in and, using this feature, would
generate all libraries with the major version: eal.17.02, pdump.17.02, ...
But since OVS was not recompiled yet AND there is no collision OVS
would still look like:
OVS
 +---> eal.3
 +---> pdump.1
        +---> eal.3

Then we can recompile OVS and it will become
OVS
 +---> eal.17.02
 +---> pdump.17.02
        +---> eal.17.02

Going into the future with more apps depending on DPDK there can be many
scenarios:
1. all are fine rebuilding, there will be no dependency left on the older
   dpdk and it will be autoremoved after all upgrades
2. some packages are slow to adapt, but that is fine as we can still provide
   the old dependencies at the same time if needed
3. the time in between #2 and #1 is not wreaking havoc as the
   cross-dependency issue is no more


> Assuming the above understanding is correct :)
>
> - If the same version of the library will be delivered for each DPDK
> release, what is the benefit of having fine grained libraries really?
>

The benefit of the fine-grained versioning is for other ways of
distributing DPDK: recognizing an ABI bump for lib-consuming developers,
bundling it directly with your app, ...



> - The above OVS usage still does not look right, I don't believe this is the
> intention when library-level dependency resolving was introduced.
>
> Overall I am for single library, but I can see the benefit of having
> multiple small libraries, that is why I vote for option 4 in your
> initial mail.
>

A single library would solve it as well, but as mentioned - and as you all
remember - there were people with reasons for the split that I could not
challenge, being too far out of the application scenarios they had in mind.

> And I agree this can cause problems if not automated, but we already know
> the library dependencies, I think a script can be developed to warn at
> least, and they can be updated manually.
>
> And isn't the purpose of increasing LIBABIVER to notify the application that
> the library is modified and can't be used with that app anymore?
> For DPDK, even if the library is not changed, if another library that it
> depends on is modified, this may mean the behavior of the library may be
> changed, so it makes sense to me if the library notifies the user in this
> case, by increasing its version.
>
> Yes, this makes the effect of increasing a core library version big, but I
> believe this is also true: increasing a core library version almost
> means increasing the dpdk version.
>

Interesting - thanks for sharing your opinion here - I have been rethinking
that for a while now.

While this could work I consider it inferior to the approach I submitted
in the patch yesterday [1] for the following reasons:

- If we do infecting bumps (looking at the recent history) we most likely end up
  bumping all libraries at least every other release. Now there isn't much
  difference between bumping all of them +1 and just using a single increasing
  version. Except you could miss a few bumps or track it wrong.

- The new feature is opt-in, allowing those who want to do that major bump,
  but at the same time allowing those who don't to keep on tracking each lib
  individually and build/deliver it that way.

- I learned (often the hard way) that being different often causes problems
  that are hard to foresee.
  The infecting ABI would be "DPDK is different" again, while the major
  override is somewhat established.

For now I'd suggest taking the opt-in feature as suggested in [1] as a means
for those who need it (like us and maybe more downstreams over time).
If DPDK evolves to become more stable and develops a feature like
the #4 "infecting-abi-bump + tracking" it can still be picked up later by us
and by anybody else who needs it.
It will then "just" be a matter of dropping a config option we set before to get back.


TL;DR: I think DPDK is not yet stable enough for option #4 to be worth
implementing for now (it would cause a lot of work and error potential).
But since my code [1] implementing approach #1 and a later approach #4
are not mutually exclusive, I'd ask to go for #1 now and #4 later if
someone needs and implements it.

[1]: http://dpdk.org/ml/archives/dev/2017-February/058121.html


-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] Further fun with ABI tracking
  2017-02-22 13:12  7%   ` Christian Ehrhardt
  2017-02-22 13:24 20%     ` [dpdk-dev] [PATCH] mk: Provide option to set Major ABI version Christian Ehrhardt
@ 2017-02-23 18:48  4%     ` Ferruh Yigit
  2017-02-24  7:32  8%       ` Christian Ehrhardt
  1 sibling, 1 reply; 200+ results
From: Ferruh Yigit @ 2017-02-23 18:48 UTC (permalink / raw)
  To: Christian Ehrhardt, Jan Blunck
  Cc: dev, cjcollier, ricardo.salveti, Luca Boccassi

On 2/22/2017 1:12 PM, Christian Ehrhardt wrote:
> On Tue, Feb 14, 2017 at 9:31 PM, Jan Blunck <jblunck@infradead.org> wrote:
> 
>>> 1. Downstreams to insert Major version into soname
>>> Distributions could insert the DPDK major version (like 16.11) into the
>>> soname and package names. A common example of this is libboost [5].
>>> That would perfectly allow 16.07.<LIBABIVER> to coexist with
>>> 16.11.<LIBABIVER> even if for a given library LIBABIVER did not change.
>>> Yet it would mean that anything depending on the old library will have to
>>> be recompiled to pick up the new code, even if it depends on an ABI that
>> is
>>> still present in the new release.
>>> Also - not a technical reason - but it is clearly more work to force
>> update
>>> all dependencies and clean out old packages for every release.
>>
>> Actually this isn't exactly what I proposed during the summit. Just
>> keep it simple and fix the ABI version of all libraries at 16.11.0.
>> This is a proven approach and has been used for years with different
>> libraries.
> 
> 
> Since there was no other response I'll try to wrap up.
> 
> Yes #1 also is my preferred solution at the moment.
> We tried with individual following the tracking of LIBABIVER upstream but
> as outlined before we hit too many issues.
> I discussed it in the deb_dpdk group which acked as well to use this as
> general approach.
> The other options have too obvious flaws as I listed on my initial report
> and - thanks btw - you added a few more.
> 
> @Bruce - sorry I don't think dropping config options is the solution. Yet
> my suggestion does not prevent you from doing so.

Hi Christian,

Can you please describe this option more?

Does it mean for each DPDK release, the distro will release all libraries?

For 16.07:
acl.2, eal.2, ethdev.4, pdump.1

For 16.11:
acl.2, eal.3, ethdev.5, pdump.1

Will the dpdk package have the following packages:
acl.16.07.2, eal.16.07.2, ethdev.16.07.4, pdump.16.07.1
acl.16.11.2, eal.16.11.3, ethdev.16.11.5, pdump.16.11.1

And for the initial OVS usecase, will it be:

OVS
 +---> eal.16.07.2
 +---> pdump.16.11.1
        +---> eal.16.11.3


Assuming the above understanding is correct :)

- If the same version of the library will be delivered for each DPDK
release, what is the benefit of having fine grained libraries really?

- The above OVS usage still does not look right, I don't believe this is the
intention when library-level dependency resolving was introduced.

Overall I am for single library, but I can see the benefit of having
multiple small libraries, that is why I vote for option 4 in your
initial mail.

And I agree this can cause problems if not automated, but we already know
the library dependencies, I think a script can be developed to warn at
least, and they can be updated manually.

And isn't the purpose of increasing LIBABIVER to notify the application that
the library is modified and can't be used with that app anymore?
For DPDK, even if the library is not changed, if another library that it
depends on is modified, this may mean the behavior of the library may be
changed, so it makes sense to me if the library notifies the user in this
case, by increasing its version.

Yes, this makes the effect of increasing a core library version big, but I
believe this is also true: increasing a core library version almost
means increasing the dpdk version.

> 
> 
> 
>> You could easily do this independently of us upstream
>> fixing the ABI problems.
> 
> 
> 
> I agree, but I'd like to suggest the mechanism I want to implement.
> An ack by upstream for the Feature to set such a major ABI would be great.
> Actually since it is optional and can help more people integrating DPDK
> getting it accepted upstream be even better.
> 
> I'll send a patch in reply to this thread later today that implements what
> I have in mind.
> 
> 

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v1 09/14] ring: allow dequeue fns to return remaining entry count
                     ` (5 preceding siblings ...)
  2017-02-23 17:24  2% ` [dpdk-dev] [PATCH v1 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
@ 2017-02-23 17:24  2% ` Bruce Richardson
    7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-02-23 17:24 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, Bruce Richardson

Add an extra parameter to the ring dequeue burst/bulk functions so that
those functions can optionally return the number of remaining objs in the
ring. This information can be used by applications in a number of ways,
for instance, with single-consumer queues, it provides a max
dequeue size which is guaranteed to work.
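
As an illustration (a sketch against the changed API; the consumer loop and
the BURST_SIZE/process() names are invented), a single consumer can size its
next dequeue from the returned count:

	unsigned int avail = 0;
	void *objs[BURST_SIZE];
	unsigned int n;

	do {
		/* 'avail' reports how many entries remained in the ring
		 * after this call; pass NULL when the count is not needed. */
		n = rte_ring_sc_dequeue_burst(r, objs, BURST_SIZE, &avail);
		process(objs, n);
	} while (avail > 0);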

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/pdump/main.c                                   |  2 +-
 app/test-pipeline/runtime.c                        |  6 +-
 app/test/test_link_bonding_mode4.c                 |  3 +-
 app/test/test_pmd_ring_perf.c                      |  7 +-
 app/test/test_ring.c                               | 54 ++++++-------
 app/test/test_ring_perf.c                          | 20 +++--
 app/test/test_table_acl.c                          |  2 +-
 app/test/test_table_pipeline.c                     |  2 +-
 app/test/test_table_ports.c                        |  8 +-
 app/test/virtual_pmd.c                             |  4 +-
 doc/guides/rel_notes/release_17_05.rst             |  8 ++
 drivers/crypto/null/null_crypto_pmd.c              |  2 +-
 drivers/net/bonding/rte_eth_bond_pmd.c             |  3 +-
 drivers/net/ring/rte_eth_ring.c                    |  2 +-
 examples/distributor/main.c                        |  2 +-
 examples/load_balancer/runtime.c                   |  6 +-
 .../client_server_mp/mp_client/client.c            |  3 +-
 examples/packet_ordering/main.c                    |  6 +-
 examples/qos_sched/app_thread.c                    |  6 +-
 examples/quota_watermark/qw/main.c                 |  5 +-
 examples/server_node_efd/node/node.c               |  2 +-
 lib/librte_hash/rte_cuckoo_hash.c                  |  3 +-
 lib/librte_mempool/rte_mempool_ring.c              |  4 +-
 lib/librte_port/rte_port_frag.c                    |  3 +-
 lib/librte_port/rte_port_ring.c                    |  6 +-
 lib/librte_ring/rte_ring.h                         | 90 +++++++++++-----------
 26 files changed, 145 insertions(+), 114 deletions(-)

diff --git a/app/pdump/main.c b/app/pdump/main.c
index b88090d..3b13753 100644
--- a/app/pdump/main.c
+++ b/app/pdump/main.c
@@ -496,7 +496,7 @@ pdump_rxtx(struct rte_ring *ring, uint8_t vdev_id, struct pdump_stats *stats)
 
 	/* first dequeue packets from ring of primary process */
 	const uint16_t nb_in_deq = rte_ring_dequeue_burst(ring,
-			(void *)rxtx_bufs, BURST_SIZE);
+			(void *)rxtx_bufs, BURST_SIZE, NULL);
 	stats->dequeue_pkts += nb_in_deq;
 
 	if (nb_in_deq) {
diff --git a/app/test-pipeline/runtime.c b/app/test-pipeline/runtime.c
index c06ff54..8970e1c 100644
--- a/app/test-pipeline/runtime.c
+++ b/app/test-pipeline/runtime.c
@@ -121,7 +121,8 @@ app_main_loop_worker(void) {
 		ret = rte_ring_sc_dequeue_bulk(
 			app.rings_rx[i],
 			(void **) worker_mbuf->array,
-			app.burst_size_worker_read);
+			app.burst_size_worker_read,
+			NULL);
 
 		if (ret == 0)
 			continue;
@@ -151,7 +152,8 @@ app_main_loop_tx(void) {
 		ret = rte_ring_sc_dequeue_bulk(
 			app.rings_tx[i],
 			(void **) &app.mbuf_tx[i].array[n_mbufs],
-			app.burst_size_tx_read);
+			app.burst_size_tx_read,
+			NULL);
 
 		if (ret == 0)
 			continue;
diff --git a/app/test/test_link_bonding_mode4.c b/app/test/test_link_bonding_mode4.c
index 8df28b4..15091b1 100644
--- a/app/test/test_link_bonding_mode4.c
+++ b/app/test/test_link_bonding_mode4.c
@@ -193,7 +193,8 @@ static uint8_t lacpdu_rx_count[RTE_MAX_ETHPORTS] = {0, };
 static int
 slave_get_pkts(struct slave_conf *slave, struct rte_mbuf **buf, uint16_t size)
 {
-	return rte_ring_dequeue_burst(slave->tx_queue, (void **)buf, size);
+	return rte_ring_dequeue_burst(slave->tx_queue, (void **)buf,
+			size, NULL);
 }
 
 /*
diff --git a/app/test/test_pmd_ring_perf.c b/app/test/test_pmd_ring_perf.c
index 045a7f2..004882a 100644
--- a/app/test/test_pmd_ring_perf.c
+++ b/app/test/test_pmd_ring_perf.c
@@ -67,7 +67,7 @@ test_empty_dequeue(void)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0]);
+		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0], NULL);
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t eth_start = rte_rdtsc();
@@ -99,7 +99,7 @@ test_single_enqueue_dequeue(void)
 	rte_compiler_barrier();
 	for (i = 0; i < iterations; i++) {
 		rte_ring_enqueue_bulk(r, &burst, 1, NULL);
-		rte_ring_dequeue_bulk(r, &burst, 1);
+		rte_ring_dequeue_bulk(r, &burst, 1, NULL);
 	}
 	const uint64_t sc_end = rte_rdtsc_precise();
 	rte_compiler_barrier();
@@ -133,7 +133,8 @@ test_bulk_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_sp_enqueue_bulk(r, (void *)burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_sc_dequeue_bulk(r, (void *)burst, bulk_sizes[sz]);
+			rte_ring_sc_dequeue_bulk(r, (void *)burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t sc_end = rte_rdtsc();
 
diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index b0ca88b..858ebc1 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -119,7 +119,8 @@ test_ring_basic_full_empty(void * const src[], void *dst[])
 		    __func__, i, rand);
 		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rand,
 				NULL) != 0);
-		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rand) == rand);
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rand,
+				NULL) == rand);
 
 		/* fill the ring */
 		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rsz, NULL) != 0);
@@ -129,7 +130,8 @@ test_ring_basic_full_empty(void * const src[], void *dst[])
 		TEST_RING_VERIFY(0 == rte_ring_empty(r));
 
 		/* empty the ring */
-		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rsz) == rsz);
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rsz,
+				NULL) == rsz);
 		TEST_RING_VERIFY(rsz == rte_ring_free_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_full(r));
@@ -186,19 +188,19 @@ test_ring_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 1);
+	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 2);
+	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, MAX_BULK);
+	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if (ret == 0)
 		goto fail;
@@ -232,19 +234,19 @@ test_ring_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 1);
+	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 2);
+	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
+	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if (ret == 0)
 		goto fail;
@@ -265,7 +267,7 @@ test_ring_basic(void)
 		cur_src += MAX_BULK;
 		if (ret == 0)
 			goto fail;
-		ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
+		ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if (ret == 0)
 			goto fail;
@@ -303,13 +305,13 @@ test_ring_basic(void)
 		printf("Cannot enqueue\n");
 		goto fail;
 	}
-	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
+	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems, NULL);
 	cur_dst += num_elems;
 	if (ret == 0) {
 		printf("Cannot dequeue\n");
 		goto fail;
 	}
-	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
+	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems, NULL);
 	cur_dst += num_elems;
 	if (ret == 0) {
 		printf("Cannot dequeue2\n");
@@ -390,19 +392,19 @@ test_ring_burst_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 1) ;
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if ((ret & RTE_RING_SZ_MASK) != 1)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 		goto fail;
@@ -451,19 +453,19 @@ test_ring_burst_basic(void)
 
 	printf("Test dequeue without enough objects \n");
 	for (i = 0; i<RING_SIZE/MAX_BULK - 1; i++) {
-		ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
+		ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
 	}
 
 	/* Available memory space for the exact MAX_BULK entries */
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK - 3;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK - 3)
 		goto fail;
@@ -505,19 +507,19 @@ test_ring_burst_basic(void)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 1);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 1, NULL);
 	cur_dst += 1;
 	if ((ret & RTE_RING_SZ_MASK) != 1)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 		goto fail;
@@ -539,7 +541,7 @@ test_ring_burst_basic(void)
 		cur_src += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
-		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
@@ -578,19 +580,19 @@ test_ring_burst_basic(void)
 
 	printf("Test dequeue without enough objects \n");
 	for (i = 0; i<RING_SIZE/MAX_BULK - 1; i++) {
-		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+		ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 		cur_dst += MAX_BULK;
 		if ((ret & RTE_RING_SZ_MASK) != MAX_BULK)
 			goto fail;
 	}
 
 	/* Available objects - the exact MAX_BULK */
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK);
+	ret = rte_ring_mc_dequeue_burst(r, cur_dst, MAX_BULK, NULL);
 	cur_dst += MAX_BULK - 3;
 	if ((ret & RTE_RING_SZ_MASK) != MAX_BULK - 3)
 		goto fail;
@@ -613,7 +615,7 @@ test_ring_burst_basic(void)
 	if ((ret & RTE_RING_SZ_MASK) != 2)
 		goto fail;
 
-	ret = rte_ring_dequeue_burst(r, cur_dst, 2);
+	ret = rte_ring_dequeue_burst(r, cur_dst, 2, NULL);
 	cur_dst += 2;
 	if (ret != 2)
 		goto fail;
@@ -753,7 +755,7 @@ test_ring_basic_ex(void)
 		goto fail_test;
 	}
 
-	ret = rte_ring_dequeue_burst(rp, obj, 2);
+	ret = rte_ring_dequeue_burst(rp, obj, 2, NULL);
 	if (ret != 2) {
 		printf("test_ring_basic_ex: rte_ring_dequeue_burst fails \n");
 		goto fail_test;
diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
index f95a8e9..ed89896 100644
--- a/app/test/test_ring_perf.c
+++ b/app/test/test_ring_perf.c
@@ -152,12 +152,12 @@ test_empty_dequeue(void)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0]);
+		rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[0], NULL);
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t mc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		rte_ring_mc_dequeue_bulk(r, burst, bulk_sizes[0]);
+		rte_ring_mc_dequeue_bulk(r, burst, bulk_sizes[0], NULL);
 	const uint64_t mc_end = rte_rdtsc();
 
 	printf("SC empty dequeue: %.2F\n",
@@ -230,13 +230,13 @@ dequeue_bulk(void *p)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_sc_dequeue_bulk(r, burst, size) == 0)
+		while (rte_ring_sc_dequeue_bulk(r, burst, size, NULL) == 0)
 			rte_pause();
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t mc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_mc_dequeue_bulk(r, burst, size) == 0)
+		while (rte_ring_mc_dequeue_bulk(r, burst, size, NULL) == 0)
 			rte_pause();
 	const uint64_t mc_end = rte_rdtsc();
 
@@ -325,7 +325,8 @@ test_burst_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_sp_enqueue_burst(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_sc_dequeue_burst(r, burst, bulk_sizes[sz]);
+			rte_ring_sc_dequeue_burst(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t sc_end = rte_rdtsc();
 
@@ -333,7 +334,8 @@ test_burst_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_mp_enqueue_burst(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_mc_dequeue_burst(r, burst, bulk_sizes[sz]);
+			rte_ring_mc_dequeue_burst(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t mc_end = rte_rdtsc();
 
@@ -361,7 +363,8 @@ test_bulk_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_sp_enqueue_bulk(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_sc_dequeue_bulk(r, burst, bulk_sizes[sz]);
+			rte_ring_sc_dequeue_bulk(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t sc_end = rte_rdtsc();
 
@@ -369,7 +372,8 @@ test_bulk_enqueue_dequeue(void)
 		for (i = 0; i < iterations; i++) {
 			rte_ring_mp_enqueue_bulk(r, burst,
 					bulk_sizes[sz], NULL);
-			rte_ring_mc_dequeue_bulk(r, burst, bulk_sizes[sz]);
+			rte_ring_mc_dequeue_bulk(r, burst,
+					bulk_sizes[sz], NULL);
 		}
 		const uint64_t mc_end = rte_rdtsc();
 
diff --git a/app/test/test_table_acl.c b/app/test/test_table_acl.c
index b3bfda4..4d43be7 100644
--- a/app/test/test_table_acl.c
+++ b/app/test/test_table_acl.c
@@ -713,7 +713,7 @@ test_pipeline_single_filter(int expected_count)
 		void *objs[RING_TX_SIZE];
 		struct rte_mbuf *mbuf;
 
-		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10);
+		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10, NULL);
 		if (ret <= 0) {
 			printf("Got no objects from ring %d - error code %d\n",
 				i, ret);
diff --git a/app/test/test_table_pipeline.c b/app/test/test_table_pipeline.c
index 36bfeda..b58aa5d 100644
--- a/app/test/test_table_pipeline.c
+++ b/app/test/test_table_pipeline.c
@@ -494,7 +494,7 @@ test_pipeline_single_filter(int test_type, int expected_count)
 		void *objs[RING_TX_SIZE];
 		struct rte_mbuf *mbuf;
 
-		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10);
+		ret = rte_ring_sc_dequeue_burst(rings_tx[i], objs, 10, NULL);
 		if (ret <= 0)
 			printf("Got no objects from ring %d - error code %d\n",
 				i, ret);
diff --git a/app/test/test_table_ports.c b/app/test/test_table_ports.c
index 395f4f3..39592ce 100644
--- a/app/test/test_table_ports.c
+++ b/app/test/test_table_ports.c
@@ -163,7 +163,7 @@ test_port_ring_writer(void)
 	rte_port_ring_writer_ops.f_flush(port);
 	expected_pkts = 1;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -7;
@@ -178,7 +178,7 @@ test_port_ring_writer(void)
 
 	expected_pkts = RTE_PORT_IN_BURST_SIZE_MAX;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -8;
@@ -193,7 +193,7 @@ test_port_ring_writer(void)
 
 	expected_pkts = RTE_PORT_IN_BURST_SIZE_MAX;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -8;
@@ -208,7 +208,7 @@ test_port_ring_writer(void)
 
 	expected_pkts = RTE_PORT_IN_BURST_SIZE_MAX;
 	received_pkts = rte_ring_sc_dequeue_burst(port_ring_writer_params.ring,
-		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz);
+		(void **)res_mbuf, port_ring_writer_params.tx_burst_sz, NULL);
 
 	if (received_pkts < expected_pkts)
 		return -9;
diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index 39e070c..b209355 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -342,7 +342,7 @@ virtual_ethdev_rx_burst_success(void *queue __rte_unused,
 	dev_private = vrtl_eth_dev->data->dev_private;
 
 	rx_count = rte_ring_dequeue_burst(dev_private->rx_queue, (void **) bufs,
-			nb_pkts);
+			nb_pkts, NULL);
 
 	/* increments ipackets count */
 	dev_private->eth_stats.ipackets += rx_count;
@@ -508,7 +508,7 @@ virtual_ethdev_get_mbufs_from_tx_queue(uint8_t port_id,
 
 	dev_private = vrtl_eth_dev->data->dev_private;
 	return rte_ring_dequeue_burst(dev_private->tx_queue, (void **)pkt_burst,
-		burst_length);
+		burst_length, NULL);
 }
 
 static uint8_t
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 249ad6e..563a74c 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -123,6 +123,8 @@ API Changes
   * added an extra parameter to the burst/bulk enqueue functions to
     return the number of free spaces in the ring after enqueue. This can
     be used by an application to implement its own watermark functionality.
+  * added an extra parameter to the burst/bulk dequeue functions to return
+    the number of elements remaining in the ring after dequeue.
   * changed the return value of the enqueue and dequeue bulk functions to
     match that of the burst equivalents. In all cases, ring functions which
     operate on multiple packets now return the number of elements enqueued
@@ -135,6 +137,12 @@ API Changes
     - ``rte_ring_sc_dequeue_bulk``
     - ``rte_ring_dequeue_bulk``
 
+    NOTE: the above functions all have different parameters as well as
+    different return values, due to the other listed changes above. This
+    means that all instances of the functions in existing code will be
+    flagged by the compiler. The return value usage should be checked
+    while fixing the compiler error due to the extra parameter.
+
 ABI Changes
 -----------
 
diff --git a/drivers/crypto/null/null_crypto_pmd.c b/drivers/crypto/null/null_crypto_pmd.c
index ed5a9fc..f68ec8d 100644
--- a/drivers/crypto/null/null_crypto_pmd.c
+++ b/drivers/crypto/null/null_crypto_pmd.c
@@ -155,7 +155,7 @@ null_crypto_pmd_dequeue_burst(void *queue_pair, struct rte_crypto_op **ops,
 	unsigned nb_dequeued;
 
 	nb_dequeued = rte_ring_dequeue_burst(qp->processed_pkts,
-			(void **)ops, nb_ops);
+			(void **)ops, nb_ops, NULL);
 	qp->qp_stats.dequeued_count += nb_dequeued;
 
 	return nb_dequeued;
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index f3ac9e2..96638af 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1008,7 +1008,8 @@ bond_ethdev_tx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 		struct port *port = &mode_8023ad_ports[slaves[i]];
 
 		slave_slow_nb_pkts[i] = rte_ring_dequeue_burst(port->tx_ring,
-				slow_pkts, BOND_MODE_8023AX_SLAVE_TX_PKTS);
+				slow_pkts, BOND_MODE_8023AX_SLAVE_TX_PKTS,
+				NULL);
 		slave_nb_pkts[i] = slave_slow_nb_pkts[i];
 
 		for (j = 0; j < slave_slow_nb_pkts[i]; j++)
diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index adbf478..77ef3a1 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -88,7 +88,7 @@ eth_ring_rx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
 	void **ptrs = (void *)&bufs[0];
 	struct ring_queue *r = q;
 	const uint16_t nb_rx = (uint16_t)rte_ring_dequeue_burst(r->rng,
-			ptrs, nb_bufs);
+			ptrs, nb_bufs, NULL);
 	if (r->rng->flags & RING_F_SC_DEQ)
 		r->rx_pkts.cnt += nb_rx;
 	else
diff --git a/examples/distributor/main.c b/examples/distributor/main.c
index cfd360b..5cb6185 100644
--- a/examples/distributor/main.c
+++ b/examples/distributor/main.c
@@ -330,7 +330,7 @@ lcore_tx(struct rte_ring *in_r)
 
 			struct rte_mbuf *bufs[BURST_SIZE];
 			const uint16_t nb_rx = rte_ring_dequeue_burst(in_r,
-					(void *)bufs, BURST_SIZE);
+					(void *)bufs, BURST_SIZE, NULL);
 			app_stats.tx.dequeue_pkts += nb_rx;
 
 			/* if we get no traffic, flush anything we have */
diff --git a/examples/load_balancer/runtime.c b/examples/load_balancer/runtime.c
index 1645994..8192c08 100644
--- a/examples/load_balancer/runtime.c
+++ b/examples/load_balancer/runtime.c
@@ -349,7 +349,8 @@ app_lcore_io_tx(
 			ret = rte_ring_sc_dequeue_bulk(
 				ring,
 				(void **) &lp->tx.mbuf_out[port].array[n_mbufs],
-				bsz_rd);
+				bsz_rd,
+				NULL);
 
 			if (unlikely(ret == 0))
 				continue;
@@ -504,7 +505,8 @@ app_lcore_worker(
 		ret = rte_ring_sc_dequeue_bulk(
 			ring_in,
 			(void **) lp->mbuf_in.array,
-			bsz_rd);
+			bsz_rd,
+			NULL);
 
 		if (unlikely(ret == 0))
 			continue;
diff --git a/examples/multi_process/client_server_mp/mp_client/client.c b/examples/multi_process/client_server_mp/mp_client/client.c
index dca9eb9..01b535c 100644
--- a/examples/multi_process/client_server_mp/mp_client/client.c
+++ b/examples/multi_process/client_server_mp/mp_client/client.c
@@ -279,7 +279,8 @@ main(int argc, char *argv[])
 		uint16_t i, rx_pkts;
 		uint8_t port;
 
-		rx_pkts = rte_ring_dequeue_burst(rx_ring, pkts, PKT_READ_SIZE);
+		rx_pkts = rte_ring_dequeue_burst(rx_ring, pkts,
+				PKT_READ_SIZE, NULL);
 
 		if (unlikely(rx_pkts == 0)){
 			if (need_flush)
diff --git a/examples/packet_ordering/main.c b/examples/packet_ordering/main.c
index d268350..7719dad 100644
--- a/examples/packet_ordering/main.c
+++ b/examples/packet_ordering/main.c
@@ -462,7 +462,7 @@ worker_thread(void *args_ptr)
 
 		/* dequeue the mbufs from rx_to_workers ring */
 		burst_size = rte_ring_dequeue_burst(ring_in,
-				(void *)burst_buffer, MAX_PKTS_BURST);
+				(void *)burst_buffer, MAX_PKTS_BURST, NULL);
 		if (unlikely(burst_size == 0))
 			continue;
 
@@ -510,7 +510,7 @@ send_thread(struct send_thread_args *args)
 
 		/* deque the mbufs from workers_to_tx ring */
 		nb_dq_mbufs = rte_ring_dequeue_burst(args->ring_in,
-				(void *)mbufs, MAX_PKTS_BURST);
+				(void *)mbufs, MAX_PKTS_BURST, NULL);
 
 		if (unlikely(nb_dq_mbufs == 0))
 			continue;
@@ -595,7 +595,7 @@ tx_thread(struct rte_ring *ring_in)
 
 		/* deque the mbufs from workers_to_tx ring */
 		dqnum = rte_ring_dequeue_burst(ring_in,
-				(void *)mbufs, MAX_PKTS_BURST);
+				(void *)mbufs, MAX_PKTS_BURST, NULL);
 
 		if (unlikely(dqnum == 0))
 			continue;
diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index 0c81a15..15f117f 100644
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -179,7 +179,7 @@ app_tx_thread(struct thread_conf **confs)
 
 	while ((conf = confs[conf_idx])) {
 		retval = rte_ring_sc_dequeue_bulk(conf->tx_ring, (void **)mbufs,
-					burst_conf.qos_dequeue);
+					burst_conf.qos_dequeue, NULL);
 		if (likely(retval != 0)) {
 			app_send_packets(conf, mbufs, burst_conf.qos_dequeue);
 
@@ -218,7 +218,7 @@ app_worker_thread(struct thread_conf **confs)
 
 		/* Read packet from the ring */
 		nb_pkt = rte_ring_sc_dequeue_burst(conf->rx_ring, (void **)mbufs,
-					burst_conf.ring_burst);
+					burst_conf.ring_burst, NULL);
 		if (likely(nb_pkt)) {
 			int nb_sent = rte_sched_port_enqueue(conf->sched_port, mbufs,
 					nb_pkt);
@@ -254,7 +254,7 @@ app_mixed_thread(struct thread_conf **confs)
 
 		/* Read packet from the ring */
 		nb_pkt = rte_ring_sc_dequeue_burst(conf->rx_ring, (void **)mbufs,
-					burst_conf.ring_burst);
+					burst_conf.ring_burst, NULL);
 		if (likely(nb_pkt)) {
 			int nb_sent = rte_sched_port_enqueue(conf->sched_port, mbufs,
 					nb_pkt);
diff --git a/examples/quota_watermark/qw/main.c b/examples/quota_watermark/qw/main.c
index 57df8ef..2dcddea 100644
--- a/examples/quota_watermark/qw/main.c
+++ b/examples/quota_watermark/qw/main.c
@@ -247,7 +247,8 @@ pipeline_stage(__attribute__((unused)) void *args)
 			}
 
 			/* Dequeue up to quota mbuf from rx */
-			nb_dq_pkts = rte_ring_dequeue_burst(rx, pkts, *quota);
+			nb_dq_pkts = rte_ring_dequeue_burst(rx, pkts,
+					*quota, NULL);
 			if (unlikely(nb_dq_pkts < 0))
 				continue;
 
@@ -305,7 +306,7 @@ send_stage(__attribute__((unused)) void *args)
 
 			/* Dequeue packets from tx and send them */
 			nb_dq_pkts = (uint16_t) rte_ring_dequeue_burst(tx,
-					(void *) tx_pkts, *quota);
+					(void *) tx_pkts, *quota, NULL);
 			rte_eth_tx_burst(dest_port_id, 0, tx_pkts, nb_dq_pkts);
 
 			/* TODO: Check if nb_dq_pkts == nb_tx_pkts? */
diff --git a/examples/server_node_efd/node/node.c b/examples/server_node_efd/node/node.c
index 9ec6a05..f780b92 100644
--- a/examples/server_node_efd/node/node.c
+++ b/examples/server_node_efd/node/node.c
@@ -392,7 +392,7 @@ main(int argc, char *argv[])
 		 */
 		while (rx_pkts > 0 &&
 				unlikely(rte_ring_dequeue_bulk(rx_ring, pkts,
-					rx_pkts) == 0))
+					rx_pkts, NULL) == 0))
 			rx_pkts = (uint16_t)RTE_MIN(rte_ring_count(rx_ring),
 					PKT_READ_SIZE);
 
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 6552199..645c0cf 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -536,7 +536,8 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
 		if (cached_free_slots->len == 0) {
 			/* Need to get another burst of free slots from global ring */
 			n_slots = rte_ring_mc_dequeue_burst(h->free_slots,
-					cached_free_slots->objs, LCORE_CACHE_SIZE);
+					cached_free_slots->objs,
+					LCORE_CACHE_SIZE, NULL);
 			if (n_slots == 0)
 				return -ENOSPC;
 
diff --git a/lib/librte_mempool/rte_mempool_ring.c b/lib/librte_mempool/rte_mempool_ring.c
index 9b8fd2b..5c132bf 100644
--- a/lib/librte_mempool/rte_mempool_ring.c
+++ b/lib/librte_mempool/rte_mempool_ring.c
@@ -58,14 +58,14 @@ static int
 common_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
 	return rte_ring_mc_dequeue_bulk(mp->pool_data,
-			obj_table, n) == 0 ? -ENOBUFS : 0;
+			obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_sc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
 	return rte_ring_sc_dequeue_bulk(mp->pool_data,
-			obj_table, n) == 0 ? -ENOBUFS : 0;
+			obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
 }
 
 static unsigned
diff --git a/lib/librte_port/rte_port_frag.c b/lib/librte_port/rte_port_frag.c
index 0fcace9..320407e 100644
--- a/lib/librte_port/rte_port_frag.c
+++ b/lib/librte_port/rte_port_frag.c
@@ -186,7 +186,8 @@ rte_port_ring_reader_frag_rx(void *port,
 		/* If "pkts" buffer is empty, read packet burst from ring */
 		if (p->n_pkts == 0) {
 			p->n_pkts = rte_ring_sc_dequeue_burst(p->ring,
-				(void **) p->pkts, RTE_PORT_IN_BURST_SIZE_MAX);
+				(void **) p->pkts, RTE_PORT_IN_BURST_SIZE_MAX,
+				NULL);
 			RTE_PORT_RING_READER_FRAG_STATS_PKTS_IN_ADD(p, p->n_pkts);
 			if (p->n_pkts == 0)
 				return n_pkts_out;
diff --git a/lib/librte_port/rte_port_ring.c b/lib/librte_port/rte_port_ring.c
index 9fadac7..492b0e7 100644
--- a/lib/librte_port/rte_port_ring.c
+++ b/lib/librte_port/rte_port_ring.c
@@ -111,7 +111,8 @@ rte_port_ring_reader_rx(void *port, struct rte_mbuf **pkts, uint32_t n_pkts)
 	struct rte_port_ring_reader *p = (struct rte_port_ring_reader *) port;
 	uint32_t nb_rx;
 
-	nb_rx = rte_ring_sc_dequeue_burst(p->ring, (void **) pkts, n_pkts);
+	nb_rx = rte_ring_sc_dequeue_burst(p->ring, (void **) pkts,
+			n_pkts, NULL);
 	RTE_PORT_RING_READER_STATS_PKTS_IN_ADD(p, nb_rx);
 
 	return nb_rx;
@@ -124,7 +125,8 @@ rte_port_ring_multi_reader_rx(void *port, struct rte_mbuf **pkts,
 	struct rte_port_ring_reader *p = (struct rte_port_ring_reader *) port;
 	uint32_t nb_rx;
 
-	nb_rx = rte_ring_mc_dequeue_burst(p->ring, (void **) pkts, n_pkts);
+	nb_rx = rte_ring_mc_dequeue_burst(p->ring, (void **) pkts,
+			n_pkts, NULL);
 	RTE_PORT_RING_READER_STATS_PKTS_IN_ADD(p, nb_rx);
 
 	return nb_rx;
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index b5a995e..afd5367 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -483,7 +483,8 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 
 static inline unsigned int __attribute__((always_inline))
 __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
-		 unsigned n, enum rte_ring_queue_behavior behavior)
+		 unsigned int n, enum rte_ring_queue_behavior behavior,
+		 unsigned int *available)
 {
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
@@ -492,11 +493,6 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	unsigned int i;
 	uint32_t mask = r->mask;
 
-	/* Avoid the unnecessary cmpset operation below, which is also
-	 * potentially harmful when n equals 0. */
-	if (n == 0)
-		return 0;
-
 	/* move cons.head atomically */
 	do {
 		/* Restore n as it may change every loop */
@@ -511,15 +507,11 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		entries = (prod_tail - cons_head);
 
 		/* Set the actual entries for dequeue */
-		if (n > entries) {
-			if (behavior == RTE_RING_QUEUE_FIXED)
-				return 0;
-			else {
-				if (unlikely(entries == 0))
-					return 0;
-				n = entries;
-			}
-		}
+		if (n > entries)
+			n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : entries;
+
+		if (unlikely(n == 0))
+			goto end;
 
 		cons_next = cons_head + n;
 		success = rte_atomic32_cmpset(&r->cons.head, cons_head,
@@ -538,7 +530,9 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		rte_pause();
 
 	r->cons.tail = cons_next;
-
+end:
+	if (available != NULL)
+		*available = entries - n;
 	return n;
 }
 
@@ -562,7 +556,8 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
  */
 static inline unsigned int __attribute__((always_inline))
 __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
-		 unsigned n, enum rte_ring_queue_behavior behavior)
+		 unsigned int n, enum rte_ring_queue_behavior behavior,
+		 unsigned int *available)
 {
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
@@ -577,15 +572,11 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	 * and size(ring)-1. */
 	entries = prod_tail - cons_head;
 
-	if (n > entries) {
-		if (behavior == RTE_RING_QUEUE_FIXED)
-			return 0;
-		else {
-			if (unlikely(entries == 0))
-				return 0;
-			n = entries;
-		}
-	}
+	if (n > entries)
+		n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : entries;
+
+	if (unlikely(entries == 0))
+		goto end;
 
 	cons_next = cons_head + n;
 	r->cons.head = cons_next;
@@ -595,6 +586,9 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	rte_smp_rmb();
 
 	r->cons.tail = cons_next;
+end:
+	if (available != NULL)
+		*available = entries - n;
 	return n;
 }
 
@@ -741,9 +735,11 @@ rte_ring_enqueue(struct rte_ring *r, void *obj)
  *   The number of objects dequeued, either 0 or n
  */
 static inline unsigned int __attribute__((always_inline))
-rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
+	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED,
+			available);
 }
 
 /**
@@ -760,9 +756,11 @@ rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  *   The number of objects dequeued, either 0 or n
  */
 static inline unsigned int __attribute__((always_inline))
-rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
+	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED,
+			available);
 }
 
 /**
@@ -782,12 +780,13 @@ rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  *   The number of objects dequeued, either 0 or n
  */
 static inline unsigned int __attribute__((always_inline))
-rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned int n,
+		unsigned int *available)
 {
 	if (r->cons.sc_dequeue)
-		return rte_ring_sc_dequeue_bulk(r, obj_table, n);
+		return rte_ring_sc_dequeue_bulk(r, obj_table, n, available);
 	else
-		return rte_ring_mc_dequeue_bulk(r, obj_table, n);
+		return rte_ring_mc_dequeue_bulk(r, obj_table, n, available);
 }
 
 /**
@@ -808,7 +807,7 @@ rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 static inline int __attribute__((always_inline))
 rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_mc_dequeue_bulk(r, obj_p, 1)  ? 0 : -ENOBUFS;
+	return rte_ring_mc_dequeue_bulk(r, obj_p, 1, NULL)  ? 0 : -ENOBUFS;
 }
 
 /**
@@ -826,7 +825,7 @@ rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_sc_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOBUFS;
+	return rte_ring_sc_dequeue_bulk(r, obj_p, 1, NULL) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -848,7 +847,7 @@ rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOBUFS;
+	return rte_ring_dequeue_bulk(r, obj_p, 1, NULL) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -1038,9 +1037,11 @@ rte_ring_enqueue_burst(struct rte_ring *r, void * const *obj_table,
  *   - n: Actual number of objects dequeued, 0 if ring is empty
  */
 static inline unsigned __attribute__((always_inline))
-rte_ring_mc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_mc_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_VARIABLE);
+	return __rte_ring_mc_do_dequeue(r, obj_table, n,
+			RTE_RING_QUEUE_VARIABLE, available);
 }
 
 /**
@@ -1058,9 +1059,11 @@ rte_ring_mc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
  *   - n: Actual number of objects dequeued, 0 if ring is empty
  */
 static inline unsigned __attribute__((always_inline))
-rte_ring_sc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_sc_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
-	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_VARIABLE);
+	return __rte_ring_sc_do_dequeue(r, obj_table, n,
+			RTE_RING_QUEUE_VARIABLE, available);
 }
 
 /**
@@ -1080,12 +1083,13 @@ rte_ring_sc_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
  *   - Number of objects dequeued
  */
 static inline unsigned __attribute__((always_inline))
-rte_ring_dequeue_burst(struct rte_ring *r, void **obj_table, unsigned n)
+rte_ring_dequeue_burst(struct rte_ring *r, void **obj_table,
+		unsigned int n, unsigned int *available)
 {
 	if (r->cons.sc_dequeue)
-		return rte_ring_sc_dequeue_burst(r, obj_table, n);
+		return rte_ring_sc_dequeue_burst(r, obj_table, n, available);
 	else
-		return rte_ring_mc_dequeue_burst(r, obj_table, n);
+		return rte_ring_mc_dequeue_burst(r, obj_table, n, available);
 }
 
 #ifdef __cplusplus
-- 
2.9.3

^ permalink raw reply	[relevance 2%]
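
To illustrate the dequeue-side change in the patch above, here is a
minimal sketch of a consumer using the new out-parameter; the function
name and burst size are hypothetical, not taken from the patch:

#include <rte_ring.h>

#define BURST_SZ 32 /* hypothetical burst size */

static void
consume(struct rte_ring *r)
{
	void *objs[BURST_SZ];
	unsigned int nb, avail;

	do {
		/* dequeue up to BURST_SZ objects; on return, avail
		 * holds the number of entries still in the ring */
		nb = rte_ring_dequeue_burst(r, objs, BURST_SZ, &avail);

		/* ... process nb objects ... */
	} while (avail > 0);
}

Callers that do not need the count can pass NULL for the last argument,
as the tree-wide updates in the patch do.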

* [dpdk-dev] [PATCH v1 07/14] ring: make bulk and burst fn return vals consistent
                     ` (4 preceding siblings ...)
  2017-02-23 17:23  2% ` [dpdk-dev] [PATCH v1 06/14] ring: remove watermark support Bruce Richardson
@ 2017-02-23 17:24  2% ` Bruce Richardson
  2017-02-23 17:24  2% ` [dpdk-dev] [PATCH v1 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
    7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-02-23 17:24 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, Bruce Richardson

The bulk functions for rings return 0 when all elements are enqueued,
and a negative value when there is no space. Change that to make them
consistent with the burst functions
in returning the number of elements enqueued/dequeued, i.e. 0 or N.
This change also allows the return value from enq/deq to be used directly
without a branch for error checking.
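
As a hedged before/after sketch of this calling convention (the
function name and drop policy are illustrative only; note that a later
patch in this series adds a further out-parameter to these functions):

#include <rte_ring.h>
#include <rte_mbuf.h>

static void
send_burst(struct rte_ring *r, struct rte_mbuf **bufs, unsigned int n)
{
	/* Before this patch the bulk call returned 0 or -ENOBUFS:
	 *	if (rte_ring_sp_enqueue_bulk(r, (void **)bufs, n) != 0)
	 *		... drop the burst ...
	 * After it, the call returns the number enqueued (0 or n),
	 * matching the burst functions:
	 */
	if (rte_ring_sp_enqueue_bulk(r, (void **)bufs, n) == 0) {
		unsigned int i;

		for (i = 0; i < n; i++)
			rte_pktmbuf_free(bufs[i]);
	}
}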

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test-pipeline/pipeline_hash.c                  |   2 +-
 app/test-pipeline/runtime.c                        |   8 +-
 app/test/test_ring.c                               |  46 +++++----
 app/test/test_ring_perf.c                          |   8 +-
 doc/guides/rel_notes/release_17_05.rst             |  11 +++
 doc/guides/sample_app_ug/server_node_efd.rst       |   2 +-
 examples/load_balancer/runtime.c                   |  16 ++-
 .../client_server_mp/mp_client/client.c            |   8 +-
 .../client_server_mp/mp_server/main.c              |   2 +-
 examples/qos_sched/app_thread.c                    |   8 +-
 examples/server_node_efd/node/node.c               |   2 +-
 examples/server_node_efd/server/main.c             |   2 +-
 lib/librte_mempool/rte_mempool_ring.c              |  12 ++-
 lib/librte_ring/rte_ring.h                         | 109 +++++++--------------
 14 files changed, 106 insertions(+), 130 deletions(-)

diff --git a/app/test-pipeline/pipeline_hash.c b/app/test-pipeline/pipeline_hash.c
index 10d2869..1ac0aa8 100644
--- a/app/test-pipeline/pipeline_hash.c
+++ b/app/test-pipeline/pipeline_hash.c
@@ -547,6 +547,6 @@ app_main_loop_rx_metadata(void) {
 				app.rings_rx[i],
 				(void **) app.mbuf_rx.array,
 				n_mbufs);
-		} while (ret < 0);
+		} while (ret == 0);
 	}
 }
diff --git a/app/test-pipeline/runtime.c b/app/test-pipeline/runtime.c
index 42a6142..4e20669 100644
--- a/app/test-pipeline/runtime.c
+++ b/app/test-pipeline/runtime.c
@@ -98,7 +98,7 @@ app_main_loop_rx(void) {
 				app.rings_rx[i],
 				(void **) app.mbuf_rx.array,
 				n_mbufs);
-		} while (ret < 0);
+		} while (ret == 0);
 	}
 }
 
@@ -123,7 +123,7 @@ app_main_loop_worker(void) {
 			(void **) worker_mbuf->array,
 			app.burst_size_worker_read);
 
-		if (ret == -ENOENT)
+		if (ret == 0)
 			continue;
 
 		do {
@@ -131,7 +131,7 @@ app_main_loop_worker(void) {
 				app.rings_tx[i ^ 1],
 				(void **) worker_mbuf->array,
 				app.burst_size_worker_write);
-		} while (ret < 0);
+		} while (ret == 0);
 	}
 }
 
@@ -152,7 +152,7 @@ app_main_loop_tx(void) {
 			(void **) &app.mbuf_tx[i].array[n_mbufs],
 			app.burst_size_tx_read);
 
-		if (ret == -ENOENT)
+		if (ret == 0)
 			continue;
 
 		n_mbufs += app.burst_size_tx_read;
diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index 666a451..112433b 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -117,20 +117,18 @@ test_ring_basic_full_empty(void * const src[], void *dst[])
 		rand = RTE_MAX(rte_rand() % RING_SIZE, 1UL);
 		printf("%s: iteration %u, random shift: %u;\n",
 		    __func__, i, rand);
-		TEST_RING_VERIFY(-ENOBUFS != rte_ring_enqueue_bulk(r, src,
-		    rand));
-		TEST_RING_VERIFY(0 == rte_ring_dequeue_bulk(r, dst, rand));
+		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rand) != 0);
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rand) == rand);
 
 		/* fill the ring */
-		TEST_RING_VERIFY(-ENOBUFS != rte_ring_enqueue_bulk(r, src,
-		    rsz));
+		TEST_RING_VERIFY(rte_ring_enqueue_bulk(r, src, rsz) != 0);
 		TEST_RING_VERIFY(0 == rte_ring_free_count(r));
 		TEST_RING_VERIFY(rsz == rte_ring_count(r));
 		TEST_RING_VERIFY(rte_ring_full(r));
 		TEST_RING_VERIFY(0 == rte_ring_empty(r));
 
 		/* empty the ring */
-		TEST_RING_VERIFY(0 == rte_ring_dequeue_bulk(r, dst, rsz));
+		TEST_RING_VERIFY(rte_ring_dequeue_bulk(r, dst, rsz) == rsz);
 		TEST_RING_VERIFY(rsz == rte_ring_free_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_count(r));
 		TEST_RING_VERIFY(0 == rte_ring_full(r));
@@ -171,37 +169,37 @@ test_ring_basic(void)
 	printf("enqueue 1 obj\n");
 	ret = rte_ring_sp_enqueue_bulk(r, cur_src, 1);
 	cur_src += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue 2 objs\n");
 	ret = rte_ring_sp_enqueue_bulk(r, cur_src, 2);
 	cur_src += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue MAX_BULK objs\n");
 	ret = rte_ring_sp_enqueue_bulk(r, cur_src, MAX_BULK);
 	cur_src += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
 	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 1);
 	cur_dst += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
 	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, 2);
 	cur_dst += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
 	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, MAX_BULK);
 	cur_dst += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	/* check data */
@@ -217,37 +215,37 @@ test_ring_basic(void)
 	printf("enqueue 1 obj\n");
 	ret = rte_ring_mp_enqueue_bulk(r, cur_src, 1);
 	cur_src += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue 2 objs\n");
 	ret = rte_ring_mp_enqueue_bulk(r, cur_src, 2);
 	cur_src += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("enqueue MAX_BULK objs\n");
 	ret = rte_ring_mp_enqueue_bulk(r, cur_src, MAX_BULK);
 	cur_src += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 1 obj\n");
 	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 1);
 	cur_dst += 1;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue 2 objs\n");
 	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, 2);
 	cur_dst += 2;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	printf("dequeue MAX_BULK objs\n");
 	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
 	cur_dst += MAX_BULK;
-	if (ret != 0)
+	if (ret == 0)
 		goto fail;
 
 	/* check data */
@@ -264,11 +262,11 @@ test_ring_basic(void)
 	for (i = 0; i<RING_SIZE/MAX_BULK; i++) {
 		ret = rte_ring_mp_enqueue_bulk(r, cur_src, MAX_BULK);
 		cur_src += MAX_BULK;
-		if (ret != 0)
+		if (ret == 0)
 			goto fail;
 		ret = rte_ring_mc_dequeue_bulk(r, cur_dst, MAX_BULK);
 		cur_dst += MAX_BULK;
-		if (ret != 0)
+		if (ret == 0)
 			goto fail;
 	}
 
@@ -294,25 +292,25 @@ test_ring_basic(void)
 
 	ret = rte_ring_enqueue_bulk(r, cur_src, num_elems);
 	cur_src += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot enqueue\n");
 		goto fail;
 	}
 	ret = rte_ring_enqueue_bulk(r, cur_src, num_elems);
 	cur_src += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot enqueue\n");
 		goto fail;
 	}
 	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
 	cur_dst += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot dequeue\n");
 		goto fail;
 	}
 	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
 	cur_dst += num_elems;
-	if (ret != 0) {
+	if (ret == 0) {
 		printf("Cannot dequeue2\n");
 		goto fail;
 	}
diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
index 320c20c..8ccbdef 100644
--- a/app/test/test_ring_perf.c
+++ b/app/test/test_ring_perf.c
@@ -195,13 +195,13 @@ enqueue_bulk(void *p)
 
 	const uint64_t sp_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_sp_enqueue_bulk(r, burst, size) != 0)
+		while (rte_ring_sp_enqueue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t sp_end = rte_rdtsc();
 
 	const uint64_t mp_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_mp_enqueue_bulk(r, burst, size) != 0)
+		while (rte_ring_mp_enqueue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t mp_end = rte_rdtsc();
 
@@ -230,13 +230,13 @@ dequeue_bulk(void *p)
 
 	const uint64_t sc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_sc_dequeue_bulk(r, burst, size) != 0)
+		while (rte_ring_sc_dequeue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t sc_end = rte_rdtsc();
 
 	const uint64_t mc_start = rte_rdtsc();
 	for (i = 0; i < iterations; i++)
-		while (rte_ring_mc_dequeue_bulk(r, burst, size) != 0)
+		while (rte_ring_mc_dequeue_bulk(r, burst, size) == 0)
 			rte_pause();
 	const uint64_t mc_end = rte_rdtsc();
 
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 4e748dc..2b11765 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -120,6 +120,17 @@ API Changes
   * removed the build-time setting ``CONFIG_RTE_RING_PAUSE_REP_COUNT``
   * removed the function ``rte_ring_set_water_mark`` as part of a general
     removal of watermarks support in the library.
+  * changed the return value of the enqueue and dequeue bulk functions to
+    match that of the burst equivalents. In all cases, ring functions which
+    operate on multiple packets now return the number of elements enqueued
+    or dequeued, as appropriate. The updated functions are:
+
+    - ``rte_ring_mp_enqueue_bulk``
+    - ``rte_ring_sp_enqueue_bulk``
+    - ``rte_ring_enqueue_bulk``
+    - ``rte_ring_mc_dequeue_bulk``
+    - ``rte_ring_sc_dequeue_bulk``
+    - ``rte_ring_dequeue_bulk``
 
 ABI Changes
 -----------
diff --git a/doc/guides/sample_app_ug/server_node_efd.rst b/doc/guides/sample_app_ug/server_node_efd.rst
index 9b69cfe..e3a63c8 100644
--- a/doc/guides/sample_app_ug/server_node_efd.rst
+++ b/doc/guides/sample_app_ug/server_node_efd.rst
@@ -286,7 +286,7 @@ repeated infinitely.
 
         cl = &nodes[node];
         if (rte_ring_enqueue_bulk(cl->rx_q, (void **)cl_rx_buf[node].buffer,
-                cl_rx_buf[node].count) != 0){
+                cl_rx_buf[node].count) != cl_rx_buf[node].count){
             for (j = 0; j < cl_rx_buf[node].count; j++)
                 rte_pktmbuf_free(cl_rx_buf[node].buffer[j]);
             cl->stats.rx_drop += cl_rx_buf[node].count;
diff --git a/examples/load_balancer/runtime.c b/examples/load_balancer/runtime.c
index 6944325..82b10bc 100644
--- a/examples/load_balancer/runtime.c
+++ b/examples/load_balancer/runtime.c
@@ -146,7 +146,7 @@ app_lcore_io_rx_buffer_to_send (
 		(void **) lp->rx.mbuf_out[worker].array,
 		bsz);
 
-	if (unlikely(ret == -ENOBUFS)) {
+	if (unlikely(ret == 0)) {
 		uint32_t k;
 		for (k = 0; k < bsz; k ++) {
 			struct rte_mbuf *m = lp->rx.mbuf_out[worker].array[k];
@@ -312,7 +312,7 @@ app_lcore_io_rx_flush(struct app_lcore_params_io *lp, uint32_t n_workers)
 			(void **) lp->rx.mbuf_out[worker].array,
 			lp->rx.mbuf_out[worker].n_mbufs);
 
-		if (unlikely(ret < 0)) {
+		if (unlikely(ret == 0)) {
 			uint32_t k;
 			for (k = 0; k < lp->rx.mbuf_out[worker].n_mbufs; k ++) {
 				struct rte_mbuf *pkt_to_free = lp->rx.mbuf_out[worker].array[k];
@@ -349,9 +349,8 @@ app_lcore_io_tx(
 				(void **) &lp->tx.mbuf_out[port].array[n_mbufs],
 				bsz_rd);
 
-			if (unlikely(ret == -ENOENT)) {
+			if (unlikely(ret == 0))
 				continue;
-			}
 
 			n_mbufs += bsz_rd;
 
@@ -505,9 +504,8 @@ app_lcore_worker(
 			(void **) lp->mbuf_in.array,
 			bsz_rd);
 
-		if (unlikely(ret == -ENOENT)) {
+		if (unlikely(ret == 0))
 			continue;
-		}
 
 #if APP_WORKER_DROP_ALL_PACKETS
 		for (j = 0; j < bsz_rd; j ++) {
@@ -559,7 +557,7 @@ app_lcore_worker(
 
 #if APP_STATS
 			lp->rings_out_iters[port] ++;
-			if (ret == 0) {
+			if (ret > 0) {
 				lp->rings_out_count[port] += 1;
 			}
 			if (lp->rings_out_iters[port] == APP_STATS){
@@ -572,7 +570,7 @@ app_lcore_worker(
 			}
 #endif
 
-			if (unlikely(ret == -ENOBUFS)) {
+			if (unlikely(ret == 0)) {
 				uint32_t k;
 				for (k = 0; k < bsz_wr; k ++) {
 					struct rte_mbuf *pkt_to_free = lp->mbuf_out[port].array[k];
@@ -609,7 +607,7 @@ app_lcore_worker_flush(struct app_lcore_params_worker *lp)
 			(void **) lp->mbuf_out[port].array,
 			lp->mbuf_out[port].n_mbufs);
 
-		if (unlikely(ret < 0)) {
+		if (unlikely(ret == 0)) {
 			uint32_t k;
 			for (k = 0; k < lp->mbuf_out[port].n_mbufs; k ++) {
 				struct rte_mbuf *pkt_to_free = lp->mbuf_out[port].array[k];
diff --git a/examples/multi_process/client_server_mp/mp_client/client.c b/examples/multi_process/client_server_mp/mp_client/client.c
index d4f9ca3..dca9eb9 100644
--- a/examples/multi_process/client_server_mp/mp_client/client.c
+++ b/examples/multi_process/client_server_mp/mp_client/client.c
@@ -276,14 +276,10 @@ main(int argc, char *argv[])
 	printf("[Press Ctrl-C to quit ...]\n");
 
 	for (;;) {
-		uint16_t i, rx_pkts = PKT_READ_SIZE;
+		uint16_t i, rx_pkts;
 		uint8_t port;
 
-		/* try dequeuing max possible packets first, if that fails, get the
-		 * most we can. Loop body should only execute once, maximum */
-		while (rx_pkts > 0 &&
-				unlikely(rte_ring_dequeue_bulk(rx_ring, pkts, rx_pkts) != 0))
-			rx_pkts = (uint16_t)RTE_MIN(rte_ring_count(rx_ring), PKT_READ_SIZE);
+		rx_pkts = rte_ring_dequeue_burst(rx_ring, pkts, PKT_READ_SIZE);
 
 		if (unlikely(rx_pkts == 0)){
 			if (need_flush)
diff --git a/examples/multi_process/client_server_mp/mp_server/main.c b/examples/multi_process/client_server_mp/mp_server/main.c
index a6dc12d..19c95b2 100644
--- a/examples/multi_process/client_server_mp/mp_server/main.c
+++ b/examples/multi_process/client_server_mp/mp_server/main.c
@@ -227,7 +227,7 @@ flush_rx_queue(uint16_t client)
 
 	cl = &clients[client];
 	if (rte_ring_enqueue_bulk(cl->rx_q, (void **)cl_rx_buf[client].buffer,
-			cl_rx_buf[client].count) != 0){
+			cl_rx_buf[client].count) == 0){
 		for (j = 0; j < cl_rx_buf[client].count; j++)
 			rte_pktmbuf_free(cl_rx_buf[client].buffer[j]);
 		cl->stats.rx_drop += cl_rx_buf[client].count;
diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index 70fdcdb..dab4594 100644
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -107,7 +107,7 @@ app_rx_thread(struct thread_conf **confs)
 			}
 
 			if (unlikely(rte_ring_sp_enqueue_bulk(conf->rx_ring,
-								(void **)rx_mbufs, nb_rx) != 0)) {
+					(void **)rx_mbufs, nb_rx) == 0)) {
 				for(i = 0; i < nb_rx; i++) {
 					rte_pktmbuf_free(rx_mbufs[i]);
 
@@ -180,7 +180,7 @@ app_tx_thread(struct thread_conf **confs)
 	while ((conf = confs[conf_idx])) {
 		retval = rte_ring_sc_dequeue_bulk(conf->tx_ring, (void **)mbufs,
 					burst_conf.qos_dequeue);
-		if (likely(retval == 0)) {
+		if (likely(retval != 0)) {
 			app_send_packets(conf, mbufs, burst_conf.qos_dequeue);
 
 			conf->counter = 0; /* reset empty read loop counter */
@@ -230,7 +230,9 @@ app_worker_thread(struct thread_conf **confs)
 		nb_pkt = rte_sched_port_dequeue(conf->sched_port, mbufs,
 					burst_conf.qos_dequeue);
 		if (likely(nb_pkt > 0))
-			while (rte_ring_sp_enqueue_bulk(conf->tx_ring, (void **)mbufs, nb_pkt) != 0);
+			while (rte_ring_sp_enqueue_bulk(conf->tx_ring,
+					(void **)mbufs, nb_pkt) == 0)
+				; /* empty body */
 
 		conf_idx++;
 		if (confs[conf_idx] == NULL)
diff --git a/examples/server_node_efd/node/node.c b/examples/server_node_efd/node/node.c
index a6c0c70..9ec6a05 100644
--- a/examples/server_node_efd/node/node.c
+++ b/examples/server_node_efd/node/node.c
@@ -392,7 +392,7 @@ main(int argc, char *argv[])
 		 */
 		while (rx_pkts > 0 &&
 				unlikely(rte_ring_dequeue_bulk(rx_ring, pkts,
-					rx_pkts) != 0))
+					rx_pkts) == 0))
 			rx_pkts = (uint16_t)RTE_MIN(rte_ring_count(rx_ring),
 					PKT_READ_SIZE);
 
diff --git a/examples/server_node_efd/server/main.c b/examples/server_node_efd/server/main.c
index 1a54d1b..3eb7fac 100644
--- a/examples/server_node_efd/server/main.c
+++ b/examples/server_node_efd/server/main.c
@@ -247,7 +247,7 @@ flush_rx_queue(uint16_t node)
 
 	cl = &nodes[node];
 	if (rte_ring_enqueue_bulk(cl->rx_q, (void **)cl_rx_buf[node].buffer,
-			cl_rx_buf[node].count) != 0){
+			cl_rx_buf[node].count) != cl_rx_buf[node].count){
 		for (j = 0; j < cl_rx_buf[node].count; j++)
 			rte_pktmbuf_free(cl_rx_buf[node].buffer[j]);
 		cl->stats.rx_drop += cl_rx_buf[node].count;
diff --git a/lib/librte_mempool/rte_mempool_ring.c b/lib/librte_mempool/rte_mempool_ring.c
index b9aa64d..409b860 100644
--- a/lib/librte_mempool/rte_mempool_ring.c
+++ b/lib/librte_mempool/rte_mempool_ring.c
@@ -42,26 +42,30 @@ static int
 common_ring_mp_enqueue(struct rte_mempool *mp, void * const *obj_table,
 		unsigned n)
 {
-	return rte_ring_mp_enqueue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_mp_enqueue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_sp_enqueue(struct rte_mempool *mp, void * const *obj_table,
 		unsigned n)
 {
-	return rte_ring_sp_enqueue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_sp_enqueue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
-	return rte_ring_mc_dequeue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_mc_dequeue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 common_ring_sc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
 {
-	return rte_ring_sc_dequeue_bulk(mp->pool_data, obj_table, n);
+	return rte_ring_sc_dequeue_bulk(mp->pool_data,
+			obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static unsigned
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index e5fc751..6712f1f 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -344,14 +344,10 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
  *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items a possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects enqueue.
- *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects enqueued.
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 			 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -383,7 +379,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 		/* check that we have enough room in ring */
 		if (unlikely(n > free_entries)) {
 			if (behavior == RTE_RING_QUEUE_FIXED)
-				return -ENOBUFS;
+				return 0;
 			else {
 				/* No free entry available */
 				if (unlikely(free_entries == 0))
@@ -409,7 +405,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 		rte_pause();
 
 	r->prod.tail = prod_next;
-	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
+	return n;
 }
 
 /**
@@ -425,14 +421,10 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
  *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
  *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items a possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects enqueue.
- *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects enqueued.
+ *   Actual number of objects enqueued.
+ *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 			 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -452,7 +444,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	/* check that we have enough room in ring */
 	if (unlikely(n > free_entries)) {
 		if (behavior == RTE_RING_QUEUE_FIXED)
-			return -ENOBUFS;
+			return 0;
 		else {
 			/* No free entry available */
 			if (unlikely(free_entries == 0))
@@ -469,7 +461,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	r->prod.tail = prod_next;
-	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
+	return n;
 }
 
 /**
@@ -490,16 +482,11 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
  *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
  *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items a possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects dequeued.
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
 
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -531,7 +518,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 		/* Set the actual entries for dequeue */
 		if (n > entries) {
 			if (behavior == RTE_RING_QUEUE_FIXED)
-				return -ENOENT;
+				return 0;
 			else {
 				if (unlikely(entries == 0))
 					return 0;
@@ -557,7 +544,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 
 	r->cons.tail = cons_next;
 
-	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
+	return n;
 }
 
 /**
@@ -575,15 +562,10 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
  *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
  *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items a possible from ring
  * @return
- *   Depend on the behavior value
- *   if behavior = RTE_RING_QUEUE_FIXED
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
- *   if behavior = RTE_RING_QUEUE_VARIABLE
- *   - n: Actual number of objects dequeued.
+ *   - Actual number of objects dequeued.
+ *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 		 unsigned n, enum rte_ring_queue_behavior behavior)
 {
@@ -602,7 +584,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 
 	if (n > entries) {
 		if (behavior == RTE_RING_QUEUE_FIXED)
-			return -ENOENT;
+			return 0;
 		else {
 			if (unlikely(entries == 0))
 				return 0;
@@ -618,7 +600,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	rte_smp_rmb();
 
 	r->cons.tail = cons_next;
-	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
+	return n;
 }
 
 /**
@@ -634,10 +616,9 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
  * @param n
  *   The number of objects to add in the ring from the obj_table.
  * @return
- *   - 0: Success; objects enqueue.
- *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
+ *   The number of objects enqueued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_mp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 			 unsigned n)
 {
@@ -654,10 +635,9 @@ rte_ring_mp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  * @param n
  *   The number of objects to add in the ring from the obj_table.
  * @return
- *   - 0: Success; objects enqueued.
- *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
+ *   The number of objects enqueued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_sp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 			 unsigned n)
 {
@@ -678,10 +658,9 @@ rte_ring_sp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  * @param n
  *   The number of objects to add in the ring from the obj_table.
  * @return
- *   - 0: Success; objects enqueued.
- *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
+ *   The number of objects enqueued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 		      unsigned n)
 {
@@ -708,7 +687,7 @@ rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
 static inline int __attribute__((always_inline))
 rte_ring_mp_enqueue(struct rte_ring *r, void *obj)
 {
-	return rte_ring_mp_enqueue_bulk(r, &obj, 1);
+	return rte_ring_mp_enqueue_bulk(r, &obj, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -725,7 +704,7 @@ rte_ring_mp_enqueue(struct rte_ring *r, void *obj)
 static inline int __attribute__((always_inline))
 rte_ring_sp_enqueue(struct rte_ring *r, void *obj)
 {
-	return rte_ring_sp_enqueue_bulk(r, &obj, 1);
+	return rte_ring_sp_enqueue_bulk(r, &obj, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -746,10 +725,7 @@ rte_ring_sp_enqueue(struct rte_ring *r, void *obj)
 static inline int __attribute__((always_inline))
 rte_ring_enqueue(struct rte_ring *r, void *obj)
 {
-	if (r->prod.sp_enqueue)
-		return rte_ring_sp_enqueue(r, obj);
-	else
-		return rte_ring_mp_enqueue(r, obj);
+	return rte_ring_enqueue_bulk(r, &obj, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -765,11 +741,9 @@ rte_ring_enqueue(struct rte_ring *r, void *obj)
  * @param n
  *   The number of objects to dequeue from the ring to the obj_table.
  * @return
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
+ *   The number of objects dequeued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 {
 	return __rte_ring_mc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
@@ -786,11 +760,9 @@ rte_ring_mc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  *   The number of objects to dequeue from the ring to the obj_table,
  *   must be strictly positive.
  * @return
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue; no object is
- *     dequeued.
+ *   The number of objects dequeued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 {
 	return __rte_ring_sc_do_dequeue(r, obj_table, n, RTE_RING_QUEUE_FIXED);
@@ -810,11 +782,9 @@ rte_ring_sc_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
  * @param n
  *   The number of objects to dequeue from the ring to the obj_table.
  * @return
- *   - 0: Success; objects dequeued.
- *   - -ENOENT: Not enough entries in the ring to dequeue, no object is
- *     dequeued.
+ *   The number of objects dequeued, either 0 or n
  */
-static inline int __attribute__((always_inline))
+static inline unsigned int __attribute__((always_inline))
 rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 {
 	if (r->cons.sc_dequeue)
@@ -841,7 +811,7 @@ rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned n)
 static inline int __attribute__((always_inline))
 rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_mc_dequeue_bulk(r, obj_p, 1);
+	return rte_ring_mc_dequeue_bulk(r, obj_p, 1)  ? 0 : -ENOBUFS;
 }
 
 /**
@@ -859,7 +829,7 @@ rte_ring_mc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 {
-	return rte_ring_sc_dequeue_bulk(r, obj_p, 1);
+	return rte_ring_sc_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOBUFS;
 }
 
 /**
@@ -881,10 +851,7 @@ rte_ring_sc_dequeue(struct rte_ring *r, void **obj_p)
 static inline int __attribute__((always_inline))
 rte_ring_dequeue(struct rte_ring *r, void **obj_p)
 {
-	if (r->cons.sc_dequeue)
-		return rte_ring_sc_dequeue(r, obj_p);
-	else
-		return rte_ring_mc_dequeue(r, obj_p);
+	return rte_ring_dequeue_bulk(r, obj_p, 1) ? 0 : -ENOBUFS;
 }
 
 /**
-- 
2.9.3

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v1 06/14] ring: remove watermark support
                     ` (3 preceding siblings ...)
  2017-02-23 17:23  4% ` [dpdk-dev] [PATCH v1 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
@ 2017-02-23 17:23  2% ` Bruce Richardson
  2017-02-23 17:24  2% ` [dpdk-dev] [PATCH v1 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-02-23 17:23 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, Bruce Richardson

Remove the watermark support. A future commit will add support for having
enqueue functions return the amount of free space in the ring, which will
allow applications to implement their own watermark checks, while also
being more broadly useful.
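
As a rough sketch of the kind of application-level watermark check this
enables (assuming the free-space out-parameter added to the enqueue
functions later in this series; the threshold and names are made up):

#include <rte_ring.h>

#define APP_WM_THRESH 32 /* hypothetical application watermark */

static void
produce(struct rte_ring *r, void **objs, unsigned int n)
{
	unsigned int free_space;

	rte_ring_sp_enqueue_burst(r, objs, n, &free_space);
	if (free_space < APP_WM_THRESH) {
		/* ring nearly full: apply backpressure here, the role
		 * the removed -EDQUOT watermark return used to play */
	}
}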

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/commands.c                    |  52 ------------
 app/test/test_ring.c                   | 149 +--------------------------------
 doc/guides/rel_notes/release_17_05.rst |   2 +
 examples/Makefile                      |   2 +-
 lib/librte_ring/rte_ring.c             |  23 -----
 lib/librte_ring/rte_ring.h             |  58 +------------
 6 files changed, 8 insertions(+), 278 deletions(-)

diff --git a/app/test/commands.c b/app/test/commands.c
index 2df46b0..551c81d 100644
--- a/app/test/commands.c
+++ b/app/test/commands.c
@@ -228,57 +228,6 @@ cmdline_parse_inst_t cmd_dump_one = {
 
 /****************/
 
-struct cmd_set_ring_result {
-	cmdline_fixed_string_t set;
-	cmdline_fixed_string_t name;
-	uint32_t value;
-};
-
-static void cmd_set_ring_parsed(void *parsed_result, struct cmdline *cl,
-				__attribute__((unused)) void *data)
-{
-	struct cmd_set_ring_result *res = parsed_result;
-	struct rte_ring *r;
-	int ret;
-
-	r = rte_ring_lookup(res->name);
-	if (r == NULL) {
-		cmdline_printf(cl, "Cannot find ring\n");
-		return;
-	}
-
-	if (!strcmp(res->set, "set_watermark")) {
-		ret = rte_ring_set_water_mark(r, res->value);
-		if (ret != 0)
-			cmdline_printf(cl, "Cannot set water mark\n");
-	}
-}
-
-cmdline_parse_token_string_t cmd_set_ring_set =
-	TOKEN_STRING_INITIALIZER(struct cmd_set_ring_result, set,
-				 "set_watermark");
-
-cmdline_parse_token_string_t cmd_set_ring_name =
-	TOKEN_STRING_INITIALIZER(struct cmd_set_ring_result, name, NULL);
-
-cmdline_parse_token_num_t cmd_set_ring_value =
-	TOKEN_NUM_INITIALIZER(struct cmd_set_ring_result, value, UINT32);
-
-cmdline_parse_inst_t cmd_set_ring = {
-	.f = cmd_set_ring_parsed,  /* function to call */
-	.data = NULL,      /* 2nd arg of func */
-	.help_str = "set watermark: "
-			"set_watermark <ring_name> <value>",
-	.tokens = {        /* token list, NULL terminated */
-		(void *)&cmd_set_ring_set,
-		(void *)&cmd_set_ring_name,
-		(void *)&cmd_set_ring_value,
-		NULL,
-	},
-};
-
-/****************/
-
 struct cmd_quit_result {
 	cmdline_fixed_string_t quit;
 };
@@ -419,7 +368,6 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_autotest,
 	(cmdline_parse_inst_t *)&cmd_dump,
 	(cmdline_parse_inst_t *)&cmd_dump_one,
-	(cmdline_parse_inst_t *)&cmd_set_ring,
 	(cmdline_parse_inst_t *)&cmd_quit,
 	(cmdline_parse_inst_t *)&cmd_set_rxtx,
 	(cmdline_parse_inst_t *)&cmd_set_rxtx_anchor,
diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index 3891f5d..666a451 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -78,21 +78,6 @@
  *      - Dequeue one object, two objects, MAX_BULK objects
  *      - Check that dequeued pointers are correct
  *
- *    - Test watermark and default bulk enqueue/dequeue:
- *
- *      - Set watermark
- *      - Set default bulk value
- *      - Enqueue objects, check that -EDQUOT is returned when
- *        watermark is exceeded
- *      - Check that dequeued pointers are correct
- *
- * #. Check live watermark change
- *
- *    - Start a loop on another lcore that will enqueue and dequeue
- *      objects in a ring. It will monitor the value of watermark.
- *    - At the same time, change the watermark on the master lcore.
- *    - The slave lcore will check that watermark changes from 16 to 32.
- *
  * #. Performance tests.
  *
  * Tests done in test_ring_perf.c
@@ -115,123 +100,6 @@ static struct rte_ring *r;
 
 #define	TEST_RING_FULL_EMTPY_ITER	8
 
-static int
-check_live_watermark_change(__attribute__((unused)) void *dummy)
-{
-	uint64_t hz = rte_get_timer_hz();
-	void *obj_table[MAX_BULK];
-	unsigned watermark, watermark_old = 16;
-	uint64_t cur_time, end_time;
-	int64_t diff = 0;
-	int i, ret;
-	unsigned count = 4;
-
-	/* init the object table */
-	memset(obj_table, 0, sizeof(obj_table));
-	end_time = rte_get_timer_cycles() + (hz / 4);
-
-	/* check that bulk and watermark are 4 and 32 (respectively) */
-	while (diff >= 0) {
-
-		/* add in ring until we reach watermark */
-		ret = 0;
-		for (i = 0; i < 16; i ++) {
-			if (ret != 0)
-				break;
-			ret = rte_ring_enqueue_bulk(r, obj_table, count);
-		}
-
-		if (ret != -EDQUOT) {
-			printf("Cannot enqueue objects, or watermark not "
-			       "reached (ret=%d)\n", ret);
-			return -1;
-		}
-
-		/* read watermark, the only change allowed is from 16 to 32 */
-		watermark = r->watermark;
-		if (watermark != watermark_old &&
-		    (watermark_old != 16 || watermark != 32)) {
-			printf("Bad watermark change %u -> %u\n", watermark_old,
-			       watermark);
-			return -1;
-		}
-		watermark_old = watermark;
-
-		/* dequeue objects from ring */
-		while (i--) {
-			ret = rte_ring_dequeue_bulk(r, obj_table, count);
-			if (ret != 0) {
-				printf("Cannot dequeue (ret=%d)\n", ret);
-				return -1;
-			}
-		}
-
-		cur_time = rte_get_timer_cycles();
-		diff = end_time - cur_time;
-	}
-
-	if (watermark_old != 32 ) {
-		printf(" watermark was not updated (wm=%u)\n",
-		       watermark_old);
-		return -1;
-	}
-
-	return 0;
-}
-
-static int
-test_live_watermark_change(void)
-{
-	unsigned lcore_id = rte_lcore_id();
-	unsigned lcore_id2 = rte_get_next_lcore(lcore_id, 0, 1);
-
-	printf("Test watermark live modification\n");
-	rte_ring_set_water_mark(r, 16);
-
-	/* launch a thread that will enqueue and dequeue, checking
-	 * watermark and quota */
-	rte_eal_remote_launch(check_live_watermark_change, NULL, lcore_id2);
-
-	rte_delay_ms(100);
-	rte_ring_set_water_mark(r, 32);
-	rte_delay_ms(100);
-
-	if (rte_eal_wait_lcore(lcore_id2) < 0)
-		return -1;
-
-	return 0;
-}
-
-/* Test for catch on invalid watermark values */
-static int
-test_set_watermark( void ){
-	unsigned count;
-	int setwm;
-
-	struct rte_ring *r = rte_ring_lookup("test_ring_basic_ex");
-	if(r == NULL){
-		printf( " ring lookup failed\n" );
-		goto error;
-	}
-	count = r->size * 2;
-	setwm = rte_ring_set_water_mark(r, count);
-	if (setwm != -EINVAL){
-		printf("Test failed to detect invalid watermark count value\n");
-		goto error;
-	}
-
-	count = 0;
-	rte_ring_set_water_mark(r, count);
-	if (r->watermark != r->size) {
-		printf("Test failed to detect invalid watermark count value\n");
-		goto error;
-	}
-	return 0;
-
-error:
-	return -1;
-}
-
 /*
  * helper routine for test_ring_basic
  */
@@ -418,8 +286,7 @@ test_ring_basic(void)
 	cur_src = src;
 	cur_dst = dst;
 
-	printf("test watermark and default bulk enqueue / dequeue\n");
-	rte_ring_set_water_mark(r, 20);
+	printf("test default bulk enqueue / dequeue\n");
 	num_elems = 16;
 
 	cur_src = src;
@@ -433,8 +300,8 @@ test_ring_basic(void)
 	}
 	ret = rte_ring_enqueue_bulk(r, cur_src, num_elems);
 	cur_src += num_elems;
-	if (ret != -EDQUOT) {
-		printf("Watermark not exceeded\n");
+	if (ret != 0) {
+		printf("Cannot enqueue\n");
 		goto fail;
 	}
 	ret = rte_ring_dequeue_bulk(r, cur_dst, num_elems);
@@ -930,16 +797,6 @@ test_ring(void)
 		return -1;
 
 	/* basic operations */
-	if (test_live_watermark_change() < 0)
-		return -1;
-
-	if ( test_set_watermark() < 0){
-		printf ("Test failed to detect invalid parameter\n");
-		return -1;
-	}
-	else
-		printf ( "Test detected forced bad watermark values\n");
-
 	if ( test_create_count_odd() < 0){
 			printf ("Test failed to detect odd count\n");
 			return -1;
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index c69ca8f..4e748dc 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -118,6 +118,8 @@ API Changes
   * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
   * removed the build-time setting ``CONFIG_RTE_LIBRTE_RING_DEBUG``
   * removed the build-time setting ``CONFIG_RTE_RING_PAUSE_REP_COUNT``
+  * removed the function ``rte_ring_set_water_mark`` as part of a general
+    removal of watermarks support in the library.
 
 ABI Changes
 -----------
diff --git a/examples/Makefile b/examples/Makefile
index da2bfdd..19cd5ad 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -81,7 +81,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += packet_ordering
 DIRS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ptpclient
 DIRS-$(CONFIG_RTE_LIBRTE_METER) += qos_meter
 DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += qos_sched
-DIRS-y += quota_watermark
+#DIRS-y += quota_watermark
 DIRS-$(CONFIG_RTE_ETHDEV_RXTX_CALLBACKS) += rxtx_callbacks
 DIRS-y += skeleton
 ifeq ($(CONFIG_RTE_LIBRTE_HASH),y)
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index 90ee63f..18fb644 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -138,7 +138,6 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	if (ret < 0 || ret >= (int)sizeof(r->name))
 		return -ENAMETOOLONG;
 	r->flags = flags;
-	r->watermark = count;
 	r->prod.sp_enqueue = !!(flags & RING_F_SP_ENQ);
 	r->cons.sc_dequeue = !!(flags & RING_F_SC_DEQ);
 	r->size = count;
@@ -256,24 +255,6 @@ rte_ring_free(struct rte_ring *r)
 	rte_free(te);
 }
 
-/*
- * change the high water mark. If *count* is 0, water marking is
- * disabled
- */
-int
-rte_ring_set_water_mark(struct rte_ring *r, unsigned count)
-{
-	if (count >= r->size)
-		return -EINVAL;
-
-	/* if count is 0, disable the watermarking */
-	if (count == 0)
-		count = r->size;
-
-	r->watermark = count;
-	return 0;
-}
-
 /* dump the status of the ring on the console */
 void
 rte_ring_dump(FILE *f, const struct rte_ring *r)
@@ -287,10 +268,6 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 	fprintf(f, "  ph=%"PRIu32"\n", r->prod.head);
 	fprintf(f, "  used=%u\n", rte_ring_count(r));
 	fprintf(f, "  avail=%u\n", rte_ring_free_count(r));
-	if (r->watermark == r->size)
-		fprintf(f, "  watermark=0\n");
-	else
-		fprintf(f, "  watermark=%"PRIu32"\n", r->watermark);
 }
 
 /* dump the status of all rings on the console */
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 0f95c84..e5fc751 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -148,7 +148,6 @@ struct rte_ring {
 			/**< Memzone, if any, containing the rte_ring */
 	uint32_t size;           /**< Size of ring. */
 	uint32_t mask;           /**< Mask (size-1) of ring. */
-	uint32_t watermark;      /**< Max items before EDQUOT in producer. */
 
 	/** Ring producer status. */
 	struct rte_ring_ht_ptr prod __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
@@ -163,7 +162,6 @@ struct rte_ring {
 
 #define RING_F_SP_ENQ 0x0001 /**< The default enqueue is "single-producer". */
 #define RING_F_SC_DEQ 0x0002 /**< The default dequeue is "single-consumer". */
-#define RTE_RING_QUOT_EXCEED (1 << 31)  /**< Quota exceed for burst ops */
 #define RTE_RING_SZ_MASK  (unsigned)(0x0fffffff) /**< Ring size mask */
 
 /**
@@ -269,26 +267,6 @@ struct rte_ring *rte_ring_create(const char *name, unsigned count,
 void rte_ring_free(struct rte_ring *r);
 
 /**
- * Change the high water mark.
- *
- * If *count* is 0, water marking is disabled. Otherwise, it is set to the
- * *count* value. The *count* value must be greater than 0 and less
- * than the ring size.
- *
- * This function can be called at any time (not necessarily at
- * initialization).
- *
- * @param r
- *   A pointer to the ring structure.
- * @param count
- *   The new water mark value.
- * @return
- *   - 0: Success; water mark changed.
- *   - -EINVAL: Invalid water mark value.
- */
-int rte_ring_set_water_mark(struct rte_ring *r, unsigned count);
-
-/**
  * Dump the status of the ring to a file.
  *
  * @param f
@@ -369,8 +347,6 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  *   Depend on the behavior value
  *   if behavior = RTE_RING_QUEUE_FIXED
  *   - 0: Success; objects enqueue.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
  *   if behavior = RTE_RING_QUEUE_VARIABLE
  *   - n: Actual number of objects enqueued.
@@ -385,7 +361,6 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	int success;
 	unsigned int i;
 	uint32_t mask = r->mask;
-	int ret;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
 	 * potentially harmful when n equals 0. */
@@ -426,13 +401,6 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	ENQUEUE_PTRS();
 	rte_smp_wmb();
 
-	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
-				(int)(n | RTE_RING_QUOT_EXCEED);
-	else
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-
 	/*
 	 * If there are other enqueues in progress that preceded us,
 	 * we need to wait for them to complete
@@ -441,7 +409,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 		rte_pause();
 
 	r->prod.tail = prod_next;
-	return ret;
+	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
 }
 
 /**
@@ -460,8 +428,6 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
  *   Depend on the behavior value
  *   if behavior = RTE_RING_QUEUE_FIXED
  *   - 0: Success; objects enqueue.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
  *   if behavior = RTE_RING_QUEUE_VARIABLE
  *   - n: Actual number of objects enqueued.
@@ -474,7 +440,6 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t prod_next, free_entries;
 	unsigned int i;
 	uint32_t mask = r->mask;
-	int ret;
 
 	prod_head = r->prod.head;
 	cons_tail = r->cons.tail;
@@ -503,15 +468,8 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	ENQUEUE_PTRS();
 	rte_smp_wmb();
 
-	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
-			(int)(n | RTE_RING_QUOT_EXCEED);
-	else
-		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-
 	r->prod.tail = prod_next;
-	return ret;
+	return (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
 }
 
 /**
@@ -677,8 +635,6 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
  *   The number of objects to add in the ring from the obj_table.
  * @return
  *   - 0: Success; objects enqueue.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue, no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -699,8 +655,6 @@ rte_ring_mp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  *   The number of objects to add in the ring from the obj_table.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -725,8 +679,6 @@ rte_ring_sp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  *   The number of objects to add in the ring from the obj_table.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -751,8 +703,6 @@ rte_ring_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
  *   A pointer to the object to be added.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -770,8 +720,6 @@ rte_ring_mp_enqueue(struct rte_ring *r, void *obj)
  *   A pointer to the object to be added.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
@@ -793,8 +741,6 @@ rte_ring_sp_enqueue(struct rte_ring *r, void *obj)
  *   A pointer to the object to be added.
  * @return
  *   - 0: Success; objects enqueued.
- *   - -EDQUOT: Quota exceeded. The objects have been enqueued, but the
- *     high water mark is exceeded.
  *   - -ENOBUFS: Not enough room in the ring to enqueue; no object is enqueued.
  */
 static inline int __attribute__((always_inline))
-- 
2.9.3

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v1 05/14] ring: remove the yield when waiting for tail update
                     ` (2 preceding siblings ...)
  2017-02-23 17:23  2% ` [dpdk-dev] [PATCH v1 04/14] ring: remove debug setting Bruce Richardson
@ 2017-02-23 17:23  4% ` Bruce Richardson
  2017-02-23 17:23  2% ` [dpdk-dev] [PATCH v1 06/14] ring: remove watermark support Bruce Richardson
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-02-23 17:23 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, Bruce Richardson

There was a compile-time setting to make a ring yield the CPU when it
entered the loop in mp or mc rings waiting for the tail pointer update.
Build-time settings are not recommended for enabling/disabling features,
and since this one was off by default, remove it completely. If needed, a
runtime-enabled equivalent can be used.
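
The yield sat inside the library's wait for preceding tail updates and
cannot be re-inserted from outside, but an application can get a similar
effect at runtime by bounding its own retry loop. A minimal sketch for a
blocking consumer (helper name and spin budget are illustrative):

  #include <sched.h>
  #include <rte_ring.h>

  #define APP_MAX_SPINS 1000 /* illustrative spin budget */

  /* Retry on an empty ring for a bounded number of attempts, then
   * yield so a pre-empted thread sharing the core can progress. */
  static inline void
  app_ring_dequeue_blocking(struct rte_ring *r, void **obj)
  {
          unsigned int spins = 0;

          while (rte_ring_dequeue(r, obj) != 0) {
                  if (++spins == APP_MAX_SPINS) {
                          spins = 0;
                          sched_yield();
                  }
          }
  }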

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 config/common_base                              |  1 -
 doc/guides/prog_guide/env_abstraction_layer.rst |  5 ----
 doc/guides/rel_notes/release_17_05.rst          |  1 +
 lib/librte_ring/rte_ring.h                      | 35 +++++--------------------
 4 files changed, 7 insertions(+), 35 deletions(-)

diff --git a/config/common_base b/config/common_base
index b3d8272..d5beadd 100644
--- a/config/common_base
+++ b/config/common_base
@@ -447,7 +447,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
-CONFIG_RTE_RING_PAUSE_REP_COUNT=0
 
 #
 # Compile librte_mempool
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 10a10a8..7c39cd2 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -352,11 +352,6 @@ Known Issues
 
   3. It MUST not be used by multi-producer/consumer pthreads, whose scheduling policies are SCHED_FIFO or SCHED_RR.
 
-  ``RTE_RING_PAUSE_REP_COUNT`` is defined for rte_ring to reduce contention. It's mainly for case 2, a yield is issued after number of times pause repeat.
-
-  It adds a sched_yield() syscall if the thread spins for too long while waiting on the other thread to finish its operations on the ring.
-  This gives the preempted thread a chance to proceed and finish with the ring enqueue/dequeue operation.
-
 + rte_timer
 
   Running  ``rte_timer_manager()`` on a non-EAL pthread is not allowed. However, resetting/stopping the timer from a non-EAL pthread is allowed.
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index e0ebd71..c69ca8f 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -117,6 +117,7 @@ API Changes
 
   * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
   * removed the build-time setting ``CONFIG_RTE_LIBRTE_RING_DEBUG``
+  * removed the build-time setting ``CONFIG_RTE_RING_PAUSE_REP_COUNT``
 
 ABI Changes
 -----------
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 814f593..0f95c84 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -114,11 +114,6 @@ enum rte_ring_queue_behavior {
 #define RTE_RING_NAMESIZE (RTE_MEMZONE_NAMESIZE - \
 			   sizeof(RTE_RING_MZ_PREFIX) + 1)
 
-#ifndef RTE_RING_PAUSE_REP_COUNT
-#define RTE_RING_PAUSE_REP_COUNT 0 /**< Yield after pause num of times, no yield
-                                    *   if RTE_RING_PAUSE_REP not defined. */
-#endif
-
 struct rte_memzone; /* forward declaration, so as not to require memzone.h */
 
 /* structure to hold a pair of head/tail values and other metadata */
@@ -388,7 +383,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t cons_tail, free_entries;
 	const unsigned max = n;
 	int success;
-	unsigned i, rep = 0;
+	unsigned int i;
 	uint32_t mask = r->mask;
 	int ret;
 
@@ -442,18 +437,9 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	 * If there are other enqueues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->prod.tail != prod_head)) {
+	while (unlikely(r->prod.tail != prod_head))
 		rte_pause();
 
-		/* Set RTE_RING_PAUSE_REP_COUNT to avoid spin too long waiting
-		 * for other thread finish. It gives pre-empted thread a chance
-		 * to proceed and finish with ring dequeue operation. */
-		if (RTE_RING_PAUSE_REP_COUNT &&
-		    ++rep == RTE_RING_PAUSE_REP_COUNT) {
-			rep = 0;
-			sched_yield();
-		}
-	}
 	r->prod.tail = prod_next;
 	return ret;
 }
@@ -486,7 +472,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 {
 	uint32_t prod_head, cons_tail;
 	uint32_t prod_next, free_entries;
-	unsigned i;
+	unsigned int i;
 	uint32_t mask = r->mask;
 	int ret;
 
@@ -563,7 +549,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_next, entries;
 	const unsigned max = n;
 	int success;
-	unsigned i, rep = 0;
+	unsigned int i;
 	uint32_t mask = r->mask;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
@@ -608,18 +594,9 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	 * If there are other dequeues in progress that preceded us,
 	 * we need to wait for them to complete
 	 */
-	while (unlikely(r->cons.tail != cons_head)) {
+	while (unlikely(r->cons.tail != cons_head))
 		rte_pause();
 
-		/* Set RTE_RING_PAUSE_REP_COUNT to avoid spin too long waiting
-		 * for other thread finish. It gives pre-empted thread a chance
-		 * to proceed and finish with ring dequeue operation. */
-		if (RTE_RING_PAUSE_REP_COUNT &&
-		    ++rep == RTE_RING_PAUSE_REP_COUNT) {
-			rep = 0;
-			sched_yield();
-		}
-	}
 	r->cons.tail = cons_next;
 
 	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
@@ -654,7 +631,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 {
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
-	unsigned i;
+	unsigned int i;
 	uint32_t mask = r->mask;
 
 	cons_head = r->cons.head;
-- 
2.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v1 04/14] ring: remove debug setting
    2017-02-23 17:23  4% ` [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting Bruce Richardson
  2017-02-23 17:23  3% ` [dpdk-dev] [PATCH v1 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
@ 2017-02-23 17:23  2% ` Bruce Richardson
  2017-02-23 17:23  4% ` [dpdk-dev] [PATCH v1 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-02-23 17:23 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, Bruce Richardson

The debug option only provided statistics to the user, most of
which could be tracked by the application itself. Remove it, both as a
compile-time option and as a feature, simplifying the code.
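
Applications that want the counts can keep them in a wrapper of their
own. A minimal per-lcore sketch (struct and helper names are
illustrative; the lcore bound check mirrors the old __RING_STAT_ADD):

  #include <stdint.h>
  #include <rte_lcore.h>
  #include <rte_ring.h>

  /* Per-lcore counters standing in for the removed debug statistics,
   * covering only the events the application cares about. */
  struct app_ring_stats {
          uint64_t enq_success_objs;
          uint64_t enq_fail_objs;
  };

  static struct app_ring_stats app_stats[RTE_MAX_LCORE];

  static inline int
  app_ring_enqueue_counted(struct rte_ring *r, void *obj)
  {
          unsigned int lcore = rte_lcore_id();
          int ret = rte_ring_enqueue(r, obj);

          if (lcore < RTE_MAX_LCORE) {
                  if (ret == 0)
                          app_stats[lcore].enq_success_objs++;
                  else
                          app_stats[lcore].enq_fail_objs++;
          }
          return ret;
  }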

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/test_ring.c                   | 410 ---------------------------------
 config/common_base                     |   1 -
 doc/guides/prog_guide/ring_lib.rst     |   7 -
 doc/guides/rel_notes/release_17_05.rst |   1 +
 lib/librte_ring/rte_ring.c             |  41 ----
 lib/librte_ring/rte_ring.h             |  97 +-------
 6 files changed, 13 insertions(+), 544 deletions(-)

diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index 5f09097..3891f5d 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -763,412 +763,6 @@ test_ring_burst_basic(void)
 	return -1;
 }
 
-static int
-test_ring_stats(void)
-{
-
-#ifndef RTE_LIBRTE_RING_DEBUG
-	printf("Enable RTE_LIBRTE_RING_DEBUG to test ring stats.\n");
-	return 0;
-#else
-	void **src = NULL, **cur_src = NULL, **dst = NULL, **cur_dst = NULL;
-	int ret;
-	unsigned i;
-	unsigned num_items            = 0;
-	unsigned failed_enqueue_ops   = 0;
-	unsigned failed_enqueue_items = 0;
-	unsigned failed_dequeue_ops   = 0;
-	unsigned failed_dequeue_items = 0;
-	unsigned last_enqueue_ops     = 0;
-	unsigned last_enqueue_items   = 0;
-	unsigned last_quota_ops       = 0;
-	unsigned last_quota_items     = 0;
-	unsigned lcore_id = rte_lcore_id();
-	struct rte_ring_debug_stats *ring_stats = &r->stats[lcore_id];
-
-	printf("Test the ring stats.\n");
-
-	/* Reset the watermark in case it was set in another test. */
-	rte_ring_set_water_mark(r, 0);
-
-	/* Reset the ring stats. */
-	memset(&r->stats[lcore_id], 0, sizeof(r->stats[lcore_id]));
-
-	/* Allocate some dummy object pointers. */
-	src = malloc(RING_SIZE*2*sizeof(void *));
-	if (src == NULL)
-		goto fail;
-
-	for (i = 0; i < RING_SIZE*2 ; i++) {
-		src[i] = (void *)(unsigned long)i;
-	}
-
-	/* Allocate some memory for copied objects. */
-	dst = malloc(RING_SIZE*2*sizeof(void *));
-	if (dst == NULL)
-		goto fail;
-
-	memset(dst, 0, RING_SIZE*2*sizeof(void *));
-
-	/* Set the head and tail pointers. */
-	cur_src = src;
-	cur_dst = dst;
-
-	/* Do Enqueue tests. */
-	printf("Test the dequeue stats.\n");
-
-	/* Fill the ring up to RING_SIZE -1. */
-	printf("Fill the ring.\n");
-	for (i = 0; i< (RING_SIZE/MAX_BULK); i++) {
-		rte_ring_sp_enqueue_burst(r, cur_src, MAX_BULK);
-		cur_src += MAX_BULK;
-	}
-
-	/* Adjust for final enqueue = MAX_BULK -1. */
-	cur_src--;
-
-	printf("Verify that the ring is full.\n");
-	if (rte_ring_full(r) != 1)
-		goto fail;
-
-
-	printf("Verify the enqueue success stats.\n");
-	/* Stats should match above enqueue operations to fill the ring. */
-	if (ring_stats->enq_success_bulk != (RING_SIZE/MAX_BULK))
-		goto fail;
-
-	/* Current max objects is RING_SIZE -1. */
-	if (ring_stats->enq_success_objs != RING_SIZE -1)
-		goto fail;
-
-	/* Shouldn't have any failures yet. */
-	if (ring_stats->enq_fail_bulk != 0)
-		goto fail;
-	if (ring_stats->enq_fail_objs != 0)
-		goto fail;
-
-
-	printf("Test stats for SP burst enqueue to a full ring.\n");
-	num_items = 2;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	printf("Test stats for SP bulk enqueue to a full ring.\n");
-	num_items = 4;
-	ret = rte_ring_sp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -ENOBUFS)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	printf("Test stats for MP burst enqueue to a full ring.\n");
-	num_items = 8;
-	ret = rte_ring_mp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	printf("Test stats for MP bulk enqueue to a full ring.\n");
-	num_items = 16;
-	ret = rte_ring_mp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -ENOBUFS)
-		goto fail;
-
-	failed_enqueue_ops   += 1;
-	failed_enqueue_items += num_items;
-
-	/* The enqueue should have failed. */
-	if (ring_stats->enq_fail_bulk != failed_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_fail_objs != failed_enqueue_items)
-		goto fail;
-
-
-	/* Do Dequeue tests. */
-	printf("Test the dequeue stats.\n");
-
-	printf("Empty the ring.\n");
-	for (i = 0; i<RING_SIZE/MAX_BULK; i++) {
-		rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
-		cur_dst += MAX_BULK;
-	}
-
-	/* There was only RING_SIZE -1 objects to dequeue. */
-	cur_dst++;
-
-	printf("Verify ring is empty.\n");
-	if (1 != rte_ring_empty(r))
-		goto fail;
-
-	printf("Verify the dequeue success stats.\n");
-	/* Stats should match above dequeue operations. */
-	if (ring_stats->deq_success_bulk != (RING_SIZE/MAX_BULK))
-		goto fail;
-
-	/* Objects dequeued is RING_SIZE -1. */
-	if (ring_stats->deq_success_objs != RING_SIZE -1)
-		goto fail;
-
-	/* Shouldn't have any dequeue failure stats yet. */
-	if (ring_stats->deq_fail_bulk != 0)
-		goto fail;
-
-	printf("Test stats for SC burst dequeue with an empty ring.\n");
-	num_items = 2;
-	ret = rte_ring_sc_dequeue_burst(r, cur_dst, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test stats for SC bulk dequeue with an empty ring.\n");
-	num_items = 4;
-	ret = rte_ring_sc_dequeue_bulk(r, cur_dst, num_items);
-	if (ret != -ENOENT)
-		goto fail;
-
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test stats for MC burst dequeue with an empty ring.\n");
-	num_items = 8;
-	ret = rte_ring_mc_dequeue_burst(r, cur_dst, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != 0)
-		goto fail;
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test stats for MC bulk dequeue with an empty ring.\n");
-	num_items = 16;
-	ret = rte_ring_mc_dequeue_bulk(r, cur_dst, num_items);
-	if (ret != -ENOENT)
-		goto fail;
-
-	failed_dequeue_ops   += 1;
-	failed_dequeue_items += num_items;
-
-	/* The dequeue should have failed. */
-	if (ring_stats->deq_fail_bulk != failed_dequeue_ops)
-		goto fail;
-	if (ring_stats->deq_fail_objs != failed_dequeue_items)
-		goto fail;
-
-
-	printf("Test total enqueue/dequeue stats.\n");
-	/* At this point the enqueue and dequeue stats should be the same. */
-	if (ring_stats->enq_success_bulk != ring_stats->deq_success_bulk)
-		goto fail;
-	if (ring_stats->enq_success_objs != ring_stats->deq_success_objs)
-		goto fail;
-	if (ring_stats->enq_fail_bulk    != ring_stats->deq_fail_bulk)
-		goto fail;
-	if (ring_stats->enq_fail_objs    != ring_stats->deq_fail_objs)
-		goto fail;
-
-
-	/* Watermark Tests. */
-	printf("Test the watermark/quota stats.\n");
-
-	printf("Verify the initial watermark stats.\n");
-	/* Watermark stats should be 0 since there is no watermark. */
-	if (ring_stats->enq_quota_bulk != 0)
-		goto fail;
-	if (ring_stats->enq_quota_objs != 0)
-		goto fail;
-
-	/* Set a watermark. */
-	rte_ring_set_water_mark(r, 16);
-
-	/* Reset pointers. */
-	cur_src = src;
-	cur_dst = dst;
-
-	last_enqueue_ops   = ring_stats->enq_success_bulk;
-	last_enqueue_items = ring_stats->enq_success_objs;
-
-
-	printf("Test stats for SP burst enqueue below watermark.\n");
-	num_items = 8;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should still be 0. */
-	if (ring_stats->enq_quota_bulk != 0)
-		goto fail;
-	if (ring_stats->enq_quota_objs != 0)
-		goto fail;
-
-	/* Success stats should have increased. */
-	if (ring_stats->enq_success_bulk != last_enqueue_ops + 1)
-		goto fail;
-	if (ring_stats->enq_success_objs != last_enqueue_items + num_items)
-		goto fail;
-
-	last_enqueue_ops   = ring_stats->enq_success_bulk;
-	last_enqueue_items = ring_stats->enq_success_objs;
-
-
-	printf("Test stats for SP burst enqueue at watermark.\n");
-	num_items = 8;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != 1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for SP burst enqueue above watermark.\n");
-	num_items = 1;
-	ret = rte_ring_sp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for MP burst enqueue above watermark.\n");
-	num_items = 2;
-	ret = rte_ring_mp_enqueue_burst(r, cur_src, num_items);
-	if ((ret & RTE_RING_SZ_MASK) != num_items)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for SP bulk enqueue above watermark.\n");
-	num_items = 4;
-	ret = rte_ring_sp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -EDQUOT)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	last_quota_ops   = ring_stats->enq_quota_bulk;
-	last_quota_items = ring_stats->enq_quota_objs;
-
-
-	printf("Test stats for MP bulk enqueue above watermark.\n");
-	num_items = 8;
-	ret = rte_ring_mp_enqueue_bulk(r, cur_src, num_items);
-	if (ret != -EDQUOT)
-		goto fail;
-
-	/* Watermark stats should have changed. */
-	if (ring_stats->enq_quota_bulk != last_quota_ops +1)
-		goto fail;
-	if (ring_stats->enq_quota_objs != last_quota_items + num_items)
-		goto fail;
-
-	printf("Test watermark success stats.\n");
-	/* Success stats should be same as last non-watermarked enqueue. */
-	if (ring_stats->enq_success_bulk != last_enqueue_ops)
-		goto fail;
-	if (ring_stats->enq_success_objs != last_enqueue_items)
-		goto fail;
-
-
-	/* Cleanup. */
-
-	/* Empty the ring. */
-	for (i = 0; i<RING_SIZE/MAX_BULK; i++) {
-		rte_ring_sc_dequeue_burst(r, cur_dst, MAX_BULK);
-		cur_dst += MAX_BULK;
-	}
-
-	/* Reset the watermark. */
-	rte_ring_set_water_mark(r, 0);
-
-	/* Reset the ring stats. */
-	memset(&r->stats[lcore_id], 0, sizeof(r->stats[lcore_id]));
-
-	/* Free memory before test completed */
-	free(src);
-	free(dst);
-	return 0;
-
-fail:
-	free(src);
-	free(dst);
-	return -1;
-#endif
-}
-
 /*
  * it will always fail to create ring with a wrong ring size number in this function
  */
@@ -1335,10 +929,6 @@ test_ring(void)
 	if (test_ring_basic() < 0)
 		return -1;
 
-	/* ring stats */
-	if (test_ring_stats() < 0)
-		return -1;
-
 	/* basic operations */
 	if (test_live_watermark_change() < 0)
 		return -1;
diff --git a/config/common_base b/config/common_base
index 099ffda..b3d8272 100644
--- a/config/common_base
+++ b/config/common_base
@@ -447,7 +447,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
-CONFIG_RTE_LIBRTE_RING_DEBUG=n
 CONFIG_RTE_RING_PAUSE_REP_COUNT=0
 
 #
diff --git a/doc/guides/prog_guide/ring_lib.rst b/doc/guides/prog_guide/ring_lib.rst
index 9f69753..d4ab502 100644
--- a/doc/guides/prog_guide/ring_lib.rst
+++ b/doc/guides/prog_guide/ring_lib.rst
@@ -110,13 +110,6 @@ Once an enqueue operation reaches the high water mark, the producer is notified,
 
 This mechanism can be used, for example, to exert a back pressure on I/O to inform the LAN to PAUSE.
 
-Debug
-~~~~~
-
-When debug is enabled (CONFIG_RTE_LIBRTE_RING_DEBUG is set),
-the library stores some per-ring statistic counters about the number of enqueues/dequeues.
-These statistics are per-core to avoid concurrent accesses or atomic operations.
-
 Use Cases
 ---------
 
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index ea45e0c..e0ebd71 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -116,6 +116,7 @@ API Changes
   have been made to it:
 
   * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
+  * removed the build-time setting ``CONFIG_RTE_LIBRTE_RING_DEBUG``
 
 ABI Changes
 -----------
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index 80fc356..90ee63f 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -131,12 +131,6 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 			  RTE_CACHE_LINE_MASK) != 0);
 	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
 			  RTE_CACHE_LINE_MASK) != 0);
-#ifdef RTE_LIBRTE_RING_DEBUG
-	RTE_BUILD_BUG_ON((sizeof(struct rte_ring_debug_stats) &
-			  RTE_CACHE_LINE_MASK) != 0);
-	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, stats) &
-			  RTE_CACHE_LINE_MASK) != 0);
-#endif
 
 	/* init the ring structure */
 	memset(r, 0, sizeof(*r));
@@ -284,11 +278,6 @@ rte_ring_set_water_mark(struct rte_ring *r, unsigned count)
 void
 rte_ring_dump(FILE *f, const struct rte_ring *r)
 {
-#ifdef RTE_LIBRTE_RING_DEBUG
-	struct rte_ring_debug_stats sum;
-	unsigned lcore_id;
-#endif
-
 	fprintf(f, "ring <%s>@%p\n", r->name, r);
 	fprintf(f, "  flags=%x\n", r->flags);
 	fprintf(f, "  size=%"PRIu32"\n", r->size);
@@ -302,36 +291,6 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 		fprintf(f, "  watermark=0\n");
 	else
 		fprintf(f, "  watermark=%"PRIu32"\n", r->watermark);
-
-	/* sum and dump statistics */
-#ifdef RTE_LIBRTE_RING_DEBUG
-	memset(&sum, 0, sizeof(sum));
-	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
-		sum.enq_success_bulk += r->stats[lcore_id].enq_success_bulk;
-		sum.enq_success_objs += r->stats[lcore_id].enq_success_objs;
-		sum.enq_quota_bulk += r->stats[lcore_id].enq_quota_bulk;
-		sum.enq_quota_objs += r->stats[lcore_id].enq_quota_objs;
-		sum.enq_fail_bulk += r->stats[lcore_id].enq_fail_bulk;
-		sum.enq_fail_objs += r->stats[lcore_id].enq_fail_objs;
-		sum.deq_success_bulk += r->stats[lcore_id].deq_success_bulk;
-		sum.deq_success_objs += r->stats[lcore_id].deq_success_objs;
-		sum.deq_fail_bulk += r->stats[lcore_id].deq_fail_bulk;
-		sum.deq_fail_objs += r->stats[lcore_id].deq_fail_objs;
-	}
-	fprintf(f, "  size=%"PRIu32"\n", r->size);
-	fprintf(f, "  enq_success_bulk=%"PRIu64"\n", sum.enq_success_bulk);
-	fprintf(f, "  enq_success_objs=%"PRIu64"\n", sum.enq_success_objs);
-	fprintf(f, "  enq_quota_bulk=%"PRIu64"\n", sum.enq_quota_bulk);
-	fprintf(f, "  enq_quota_objs=%"PRIu64"\n", sum.enq_quota_objs);
-	fprintf(f, "  enq_fail_bulk=%"PRIu64"\n", sum.enq_fail_bulk);
-	fprintf(f, "  enq_fail_objs=%"PRIu64"\n", sum.enq_fail_objs);
-	fprintf(f, "  deq_success_bulk=%"PRIu64"\n", sum.deq_success_bulk);
-	fprintf(f, "  deq_success_objs=%"PRIu64"\n", sum.deq_success_objs);
-	fprintf(f, "  deq_fail_bulk=%"PRIu64"\n", sum.deq_fail_bulk);
-	fprintf(f, "  deq_fail_objs=%"PRIu64"\n", sum.deq_fail_objs);
-#else
-	fprintf(f, "  no statistics available\n");
-#endif
 }
 
 /* dump the status of all rings on the console */
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 6e75c15..814f593 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -109,24 +109,6 @@ enum rte_ring_queue_behavior {
 	RTE_RING_QUEUE_VARIABLE   /* Enq/Deq as many items as possible from ring */
 };
 
-#ifdef RTE_LIBRTE_RING_DEBUG
-/**
- * A structure that stores the ring statistics (per-lcore).
- */
-struct rte_ring_debug_stats {
-	uint64_t enq_success_bulk; /**< Successful enqueues number. */
-	uint64_t enq_success_objs; /**< Objects successfully enqueued. */
-	uint64_t enq_quota_bulk;   /**< Successful enqueues above watermark. */
-	uint64_t enq_quota_objs;   /**< Objects enqueued above watermark. */
-	uint64_t enq_fail_bulk;    /**< Failed enqueues number. */
-	uint64_t enq_fail_objs;    /**< Objects that failed to be enqueued. */
-	uint64_t deq_success_bulk; /**< Successful dequeues number. */
-	uint64_t deq_success_objs; /**< Objects successfully dequeued. */
-	uint64_t deq_fail_bulk;    /**< Failed dequeues number. */
-	uint64_t deq_fail_objs;    /**< Objects that failed to be dequeued. */
-} __rte_cache_aligned;
-#endif
-
 #define RTE_RING_MZ_PREFIX "RG_"
 /**< The maximum length of a ring name. */
 #define RTE_RING_NAMESIZE (RTE_MEMZONE_NAMESIZE - \
@@ -179,10 +161,6 @@ struct rte_ring {
 	/** Ring consumer status. */
 	struct rte_ring_ht_ptr cons __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
 
-#ifdef RTE_LIBRTE_RING_DEBUG
-	struct rte_ring_debug_stats stats[RTE_MAX_LCORE];
-#endif
-
 	void *ring[] __rte_cache_aligned;   /**< Memory space of ring starts here.
 	                                     * not volatile so need to be careful
 	                                     * about compiler re-ordering */
@@ -194,27 +172,6 @@ struct rte_ring {
 #define RTE_RING_SZ_MASK  (unsigned)(0x0fffffff) /**< Ring size mask */
 
 /**
- * @internal When debug is enabled, store ring statistics.
- * @param r
- *   A pointer to the ring.
- * @param name
- *   The name of the statistics field to increment in the ring.
- * @param n
- *   The number to add to the object-oriented statistics.
- */
-#ifdef RTE_LIBRTE_RING_DEBUG
-#define __RING_STAT_ADD(r, name, n) do {                        \
-		unsigned __lcore_id = rte_lcore_id();           \
-		if (__lcore_id < RTE_MAX_LCORE) {               \
-			r->stats[__lcore_id].name##_objs += n;  \
-			r->stats[__lcore_id].name##_bulk += 1;  \
-		}                                               \
-	} while(0)
-#else
-#define __RING_STAT_ADD(r, name, n) do {} while(0)
-#endif
-
-/**
  * Calculate the memory size needed for a ring
  *
  * This function returns the number of bytes needed for a ring, given
@@ -455,17 +412,12 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 
 		/* check that we have enough room in ring */
 		if (unlikely(n > free_entries)) {
-			if (behavior == RTE_RING_QUEUE_FIXED) {
-				__RING_STAT_ADD(r, enq_fail, n);
+			if (behavior == RTE_RING_QUEUE_FIXED)
 				return -ENOBUFS;
-			}
 			else {
 				/* No free entry available */
-				if (unlikely(free_entries == 0)) {
-					__RING_STAT_ADD(r, enq_fail, n);
+				if (unlikely(free_entries == 0))
 					return 0;
-				}
-
 				n = free_entries;
 			}
 		}
@@ -480,15 +432,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 				(int)(n | RTE_RING_QUOT_EXCEED);
-		__RING_STAT_ADD(r, enq_quota, n);
-	}
-	else {
+	else
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-		__RING_STAT_ADD(r, enq_success, n);
-	}
 
 	/*
 	 * If there are other enqueues in progress that preceded us,
@@ -552,17 +500,12 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 
 	/* check that we have enough room in ring */
 	if (unlikely(n > free_entries)) {
-		if (behavior == RTE_RING_QUEUE_FIXED) {
-			__RING_STAT_ADD(r, enq_fail, n);
+		if (behavior == RTE_RING_QUEUE_FIXED)
 			return -ENOBUFS;
-		}
 		else {
 			/* No free entry available */
-			if (unlikely(free_entries == 0)) {
-				__RING_STAT_ADD(r, enq_fail, n);
+			if (unlikely(free_entries == 0))
 				return 0;
-			}
-
 			n = free_entries;
 		}
 	}
@@ -575,15 +518,11 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark))
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 			(int)(n | RTE_RING_QUOT_EXCEED);
-		__RING_STAT_ADD(r, enq_quota, n);
-	}
-	else {
+	else
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : n;
-		__RING_STAT_ADD(r, enq_success, n);
-	}
 
 	r->prod.tail = prod_next;
 	return ret;
@@ -647,16 +586,11 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 
 		/* Set the actual entries for dequeue */
 		if (n > entries) {
-			if (behavior == RTE_RING_QUEUE_FIXED) {
-				__RING_STAT_ADD(r, deq_fail, n);
+			if (behavior == RTE_RING_QUEUE_FIXED)
 				return -ENOENT;
-			}
 			else {
-				if (unlikely(entries == 0)){
-					__RING_STAT_ADD(r, deq_fail, n);
+				if (unlikely(entries == 0))
 					return 0;
-				}
-
 				n = entries;
 			}
 		}
@@ -686,7 +620,6 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 			sched_yield();
 		}
 	}
-	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;
 
 	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
@@ -733,16 +666,11 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	entries = prod_tail - cons_head;
 
 	if (n > entries) {
-		if (behavior == RTE_RING_QUEUE_FIXED) {
-			__RING_STAT_ADD(r, deq_fail, n);
+		if (behavior == RTE_RING_QUEUE_FIXED)
 			return -ENOENT;
-		}
 		else {
-			if (unlikely(entries == 0)){
-				__RING_STAT_ADD(r, deq_fail, n);
+			if (unlikely(entries == 0))
 				return 0;
-			}
-
 			n = entries;
 		}
 	}
@@ -754,7 +682,6 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	DEQUEUE_PTRS();
 	rte_smp_rmb();
 
-	__RING_STAT_ADD(r, deq_success, n);
 	r->cons.tail = cons_next;
 	return behavior == RTE_RING_QUEUE_FIXED ? 0 : n;
 }
-- 
2.9.3

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v1 03/14] ring: eliminate duplication of size and mask fields
    2017-02-23 17:23  4% ` [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting Bruce Richardson
@ 2017-02-23 17:23  3% ` Bruce Richardson
  2017-02-23 17:23  2% ` [dpdk-dev] [PATCH v1 04/14] ring: remove debug setting Bruce Richardson
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-02-23 17:23 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, Bruce Richardson

The size and mask fields are duplicated in both the producer and
consumer data structures. Move them out of those structures and into the
top-level structure so they are not duplicated.
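
The caller-visible effect is only where the fields are read. A minimal
sketch (accessor name is illustrative):

  #include <rte_ring.h>

  /* Code that previously read r->prod.mask or r->cons.mask (two
   * identical copies) now reads the single top-level field. */
  static inline unsigned int
  app_ring_mask(const struct rte_ring *r)
  {
          return r->mask; /* was r->prod.mask == r->cons.mask */
  }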

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/test_ring.c       |  6 +++---
 lib/librte_ring/rte_ring.c | 20 ++++++++++----------
 lib/librte_ring/rte_ring.h | 32 ++++++++++++++++----------------
 3 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index ebcb896..5f09097 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -148,7 +148,7 @@ check_live_watermark_change(__attribute__((unused)) void *dummy)
 		}
 
 		/* read watermark, the only change allowed is from 16 to 32 */
-		watermark = r->prod.watermark;
+		watermark = r->watermark;
 		if (watermark != watermark_old &&
 		    (watermark_old != 16 || watermark != 32)) {
 			printf("Bad watermark change %u -> %u\n", watermark_old,
@@ -213,7 +213,7 @@ test_set_watermark( void ){
 		printf( " ring lookup failed\n" );
 		goto error;
 	}
-	count = r->prod.size*2;
+	count = r->size * 2;
 	setwm = rte_ring_set_water_mark(r, count);
 	if (setwm != -EINVAL){
 		printf("Test failed to detect invalid watermark count value\n");
@@ -222,7 +222,7 @@ test_set_watermark( void ){
 
 	count = 0;
 	rte_ring_set_water_mark(r, count);
-	if (r->prod.watermark != r->prod.size) {
+	if (r->watermark != r->size) {
 		printf("Test failed to detect invalid watermark count value\n");
 		goto error;
 	}
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index 4bc6da1..80fc356 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -144,11 +144,11 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	if (ret < 0 || ret >= (int)sizeof(r->name))
 		return -ENAMETOOLONG;
 	r->flags = flags;
-	r->prod.watermark = count;
+	r->watermark = count;
 	r->prod.sp_enqueue = !!(flags & RING_F_SP_ENQ);
 	r->cons.sc_dequeue = !!(flags & RING_F_SC_DEQ);
-	r->prod.size = r->cons.size = count;
-	r->prod.mask = r->cons.mask = count-1;
+	r->size = count;
+	r->mask = count - 1;
 	r->prod.head = r->cons.head = 0;
 	r->prod.tail = r->cons.tail = 0;
 
@@ -269,14 +269,14 @@ rte_ring_free(struct rte_ring *r)
 int
 rte_ring_set_water_mark(struct rte_ring *r, unsigned count)
 {
-	if (count >= r->prod.size)
+	if (count >= r->size)
 		return -EINVAL;
 
 	/* if count is 0, disable the watermarking */
 	if (count == 0)
-		count = r->prod.size;
+		count = r->size;
 
-	r->prod.watermark = count;
+	r->watermark = count;
 	return 0;
 }
 
@@ -291,17 +291,17 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 
 	fprintf(f, "ring <%s>@%p\n", r->name, r);
 	fprintf(f, "  flags=%x\n", r->flags);
-	fprintf(f, "  size=%"PRIu32"\n", r->prod.size);
+	fprintf(f, "  size=%"PRIu32"\n", r->size);
 	fprintf(f, "  ct=%"PRIu32"\n", r->cons.tail);
 	fprintf(f, "  ch=%"PRIu32"\n", r->cons.head);
 	fprintf(f, "  pt=%"PRIu32"\n", r->prod.tail);
 	fprintf(f, "  ph=%"PRIu32"\n", r->prod.head);
 	fprintf(f, "  used=%u\n", rte_ring_count(r));
 	fprintf(f, "  avail=%u\n", rte_ring_free_count(r));
-	if (r->prod.watermark == r->prod.size)
+	if (r->watermark == r->size)
 		fprintf(f, "  watermark=0\n");
 	else
-		fprintf(f, "  watermark=%"PRIu32"\n", r->prod.watermark);
+		fprintf(f, "  watermark=%"PRIu32"\n", r->watermark);
 
 	/* sum and dump statistics */
 #ifdef RTE_LIBRTE_RING_DEBUG
@@ -318,7 +318,7 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 		sum.deq_fail_bulk += r->stats[lcore_id].deq_fail_bulk;
 		sum.deq_fail_objs += r->stats[lcore_id].deq_fail_objs;
 	}
-	fprintf(f, "  size=%"PRIu32"\n", r->prod.size);
+	fprintf(f, "  size=%"PRIu32"\n", r->size);
 	fprintf(f, "  enq_success_bulk=%"PRIu64"\n", sum.enq_success_bulk);
 	fprintf(f, "  enq_success_objs=%"PRIu64"\n", sum.enq_success_objs);
 	fprintf(f, "  enq_quota_bulk=%"PRIu64"\n", sum.enq_quota_bulk);
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 0c8defd..6e75c15 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -143,13 +143,10 @@ struct rte_memzone; /* forward declaration, so as not to require memzone.h */
 struct rte_ring_ht_ptr {
 	volatile uint32_t head;  /**< Prod/consumer head. */
 	volatile uint32_t tail;  /**< Prod/consumer tail. */
-	uint32_t size;           /**< Size of ring. */
-	uint32_t mask;           /**< Mask (size-1) of ring. */
 	union {
 		uint32_t sp_enqueue; /**< True, if single producer. */
 		uint32_t sc_dequeue; /**< True, if single consumer. */
 	};
-	uint32_t watermark;      /**< Max items before EDQUOT in producer. */
 };
 
 /**
@@ -169,9 +166,12 @@ struct rte_ring {
 	 * next time the ABI changes
 	 */
 	char name[RTE_MEMZONE_NAMESIZE];    /**< Name of the ring. */
-	int flags;                       /**< Flags supplied at creation. */
+	int flags;               /**< Flags supplied at creation. */
 	const struct rte_memzone *memzone;
 			/**< Memzone, if any, containing the rte_ring */
+	uint32_t size;           /**< Size of ring. */
+	uint32_t mask;           /**< Mask (size-1) of ring. */
+	uint32_t watermark;      /**< Max items before EDQUOT in producer. */
 
 	/** Ring producer status. */
 	struct rte_ring_ht_ptr prod __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
@@ -350,7 +350,7 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * Placed here since identical code needed in both
  * single and multi producer enqueue functions */
 #define ENQUEUE_PTRS() do { \
-	const uint32_t size = r->prod.size; \
+	const uint32_t size = r->size; \
 	uint32_t idx = prod_head & mask; \
 	if (likely(idx + n < size)) { \
 		for (i = 0; i < (n & ((~(unsigned)0x3))); i+=4, idx+=4) { \
@@ -377,7 +377,7 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * single and multi consumer dequeue functions */
 #define DEQUEUE_PTRS() do { \
 	uint32_t idx = cons_head & mask; \
-	const uint32_t size = r->cons.size; \
+	const uint32_t size = r->size; \
 	if (likely(idx + n < size)) { \
 		for (i = 0; i < (n & (~(unsigned)0x3)); i+=4, idx+=4) {\
 			obj_table[i] = r->ring[idx]; \
@@ -432,7 +432,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	const unsigned max = n;
 	int success;
 	unsigned i, rep = 0;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 	int ret;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
@@ -480,7 +480,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->prod.watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 				(int)(n | RTE_RING_QUOT_EXCEED);
 		__RING_STAT_ADD(r, enq_quota, n);
@@ -539,7 +539,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t prod_head, cons_tail;
 	uint32_t prod_next, free_entries;
 	unsigned i;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 	int ret;
 
 	prod_head = r->prod.head;
@@ -575,7 +575,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->prod.watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 			(int)(n | RTE_RING_QUOT_EXCEED);
 		__RING_STAT_ADD(r, enq_quota, n);
@@ -625,7 +625,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	const unsigned max = n;
 	int success;
 	unsigned i, rep = 0;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
 	 * potentially harmful when n equals 0. */
@@ -722,7 +722,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
 	unsigned i;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 
 	cons_head = r->cons.head;
 	prod_tail = r->prod.tail;
@@ -1051,7 +1051,7 @@ rte_ring_full(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return ((cons_tail - prod_tail - 1) & r->prod.mask) == 0;
+	return ((cons_tail - prod_tail - 1) & r->mask) == 0;
 }
 
 /**
@@ -1084,7 +1084,7 @@ rte_ring_count(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return (prod_tail - cons_tail) & r->prod.mask;
+	return (prod_tail - cons_tail) & r->mask;
 }
 
 /**
@@ -1100,7 +1100,7 @@ rte_ring_free_count(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return (cons_tail - prod_tail - 1) & r->prod.mask;
+	return (cons_tail - prod_tail - 1) & r->mask;
 }
 
 /**
@@ -1114,7 +1114,7 @@ rte_ring_free_count(const struct rte_ring *r)
 static inline unsigned int
 rte_ring_get_size(const struct rte_ring *r)
 {
-	return r->prod.size;
+	return r->size;
 }
 
 /**
-- 
2.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting
  @ 2017-02-23 17:23  4% ` Bruce Richardson
  2017-02-28 11:35  0%   ` Jerin Jacob
  2017-02-23 17:23  3% ` [dpdk-dev] [PATCH v1 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2017-02-23 17:23 UTC (permalink / raw)
  To: olivier.matz; +Cc: dev, Bruce Richardson

Users compiling DPDK should not need to know or care about the arrangement
of cachelines in the rte_ring structure. Therefore just remove the build
option and always split the structures. For improved performance, use 128B
rather than 64B alignment, since it stops the producer and consumer data
from sharing adjacent cachelines.
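
The layout property this buys can be spelled out as a compile-time
check. A sketch, assuming a C11 toolchain (the assertion is
illustrative, not part of the patch):

  #include <stddef.h>
  #include <rte_memory.h>
  #include <rte_ring.h>

  /* prod and cons are each aligned to two cache lines, so they sit at
   * least 128B apart and the adjacent-cacheline prefetcher on one core
   * cannot drag the other side's line into its cache. */
  _Static_assert(offsetof(struct rte_ring, cons) -
                 offsetof(struct rte_ring, prod) >=
                 2 * RTE_CACHE_LINE_SIZE,
                 "prod and cons must be two cache lines apart");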

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 config/common_base                     | 1 -
 doc/guides/rel_notes/release_17_05.rst | 6 ++++++
 lib/librte_ring/rte_ring.c             | 2 --
 lib/librte_ring/rte_ring.h             | 8 ++------
 4 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/config/common_base b/config/common_base
index aeee13e..099ffda 100644
--- a/config/common_base
+++ b/config/common_base
@@ -448,7 +448,6 @@ CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
 #
 CONFIG_RTE_LIBRTE_RING=y
 CONFIG_RTE_LIBRTE_RING_DEBUG=n
-CONFIG_RTE_RING_SPLIT_PROD_CONS=n
 CONFIG_RTE_RING_PAUSE_REP_COUNT=0
 
 #
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index e25ea9f..ea45e0c 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -110,6 +110,12 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =========================================================
 
+* **Reworked rte_ring library**
+
+  The rte_ring library has been reworked and updated. The following changes
+  have been made to it:
+
+  * removed the build-time setting ``CONFIG_RTE_RING_SPLIT_PROD_CONS``
 
 ABI Changes
 -----------
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index ca0a108..4bc6da1 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -127,10 +127,8 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
 			  RTE_CACHE_LINE_MASK) != 0);
-#ifdef RTE_RING_SPLIT_PROD_CONS
 	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, cons) &
 			  RTE_CACHE_LINE_MASK) != 0);
-#endif
 	RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
 			  RTE_CACHE_LINE_MASK) != 0);
 #ifdef RTE_LIBRTE_RING_DEBUG
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 72ccca5..04fe667 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -168,7 +168,7 @@ struct rte_ring {
 		uint32_t mask;           /**< Mask (size-1) of ring. */
 		volatile uint32_t head;  /**< Producer head. */
 		volatile uint32_t tail;  /**< Producer tail. */
-	} prod __rte_cache_aligned;
+	} prod __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
 
 	/** Ring consumer status. */
 	struct cons {
@@ -177,11 +177,7 @@ struct rte_ring {
 		uint32_t mask;           /**< Mask (size-1) of ring. */
 		volatile uint32_t head;  /**< Consumer head. */
 		volatile uint32_t tail;  /**< Consumer tail. */
-#ifdef RTE_RING_SPLIT_PROD_CONS
-	} cons __rte_cache_aligned;
-#else
-	} cons;
-#endif
+	} cons __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
 
 #ifdef RTE_LIBRTE_RING_DEBUG
 	struct rte_ring_debug_stats stats[RTE_MAX_LCORE];
-- 
2.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH] mk: Provide option to set Major ABI version
  2017-02-22 13:12  7%   ` Christian Ehrhardt
@ 2017-02-22 13:24 20%     ` Christian Ehrhardt
  2017-02-28  8:34  4%       ` Jan Blunck
  2017-02-23 18:48  4%     ` [dpdk-dev] Further fun with ABI tracking Ferruh Yigit
  1 sibling, 1 reply; 200+ results
From: Christian Ehrhardt @ 2017-02-22 13:24 UTC (permalink / raw)
  To: dev
  Cc: Christian Ehrhardt, cjcollier @ linuxfoundation . org,
	ricardo.salveti, Luca Boccassi

Downstreams might want to provide different DPDK releases at the same
time to support multiple consumers of DPDK linked against older and newer
sonames.

Also, due to the interdependencies that DPDK libraries can have,
applications might end up with an executable space in which multiple
versions of a library are mapped by ld.so.

Think of LibA that got an ABI bump and LibB that did not get an ABI bump
but is depending on LibA.

    Application
    \-> LibA.old
    \-> LibB.new -> LibA.new

That is a conflict which can be avoided by setting CONFIG_RTE_MAJOR_ABI.
If set CONFIG_RTE_MAJOR_ABI overwrites any LIBABIVER value.
An example might be ``CONFIG_RTE_MAJOR_ABI=16.11`` which will make all
libraries librte<?>.so.16.11 instead of librte<?>.so.<LIBABIVER>.

We need to cut arbitrarily long strings after the .so now, and this would
work for any ABI version in LIBABIVER:
  $(Q)ln -s -f $< $(patsubst %.$(LIBABIVER),%,$@)
But using the following instead additionally allows simplifying the
Makefile for the CONFIG_RTE_NEXT_ABI case.
  $(Q)ln -s -f $< $(shell echo $@ | sed 's/\.so.*/.so/')

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
---
 config/common_base                     |  5 +++++
 doc/guides/contributing/versioning.rst | 25 +++++++++++++++++++++++++
 mk/rte.lib.mk                          | 12 +++++++-----
 3 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/config/common_base b/config/common_base
index aeee13e..37aa1e1 100644
--- a/config/common_base
+++ b/config/common_base
@@ -75,6 +75,11 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 CONFIG_RTE_NEXT_ABI=y
 
 #
+# Major ABI to overwrite library specific LIBABIVER
+#
+CONFIG_RTE_MAJOR_ABI=
+
+#
 # Machine's cache line size
 #
 CONFIG_RTE_CACHE_LINE_SIZE=64
diff --git a/doc/guides/contributing/versioning.rst b/doc/guides/contributing/versioning.rst
index fbc44a7..8aaf370 100644
--- a/doc/guides/contributing/versioning.rst
+++ b/doc/guides/contributing/versioning.rst
@@ -133,6 +133,31 @@ The macros exported are:
   fully qualified function ``p``, so that if a symbol becomes versioned, it
   can still be mapped back to the public symbol name.
 
+Setting a Major ABI version
+---------------------------
+
+Downstreams might want to provide different DPDK releases at the same time to
+support multiple consumers of DPDK linked against older and newer sonames.
+
+Also due to the interdependencies that DPDK libraries can have applications
+might end up with an executable space in which multiple versions of a library
+are mapped by ld.so.
+
+Think of LibA that got an ABI bump and LibB that did not get an ABI bump but is
+depending on LibA.
+
+.. note::
+
+    Application
+    \-> LibA.old
+    \-> LibB.new -> LibA.new
+
+That is a conflict which can be avoided by setting ``CONFIG_RTE_MAJOR_ABI``.
+If set, the value of ``CONFIG_RTE_MAJOR_ABI`` overwrites all - otherwise per
+library - versions defined in the libraries ``LIBABIVER``.
+An example might be ``CONFIG_RTE_MAJOR_ABI=16.11`` which will make all libraries
+``librte<?>.so.16.11`` instead of ``librte<?>.so.<LIBABIVER>``.
+
 Examples of ABI Macro use
 -------------------------
 
diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index 33a5f5a..06046c2 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -40,6 +40,12 @@ EXTLIB_BUILD ?= n
 # VPATH contains at least SRCDIR
 VPATH += $(SRCDIR)
 
+ifneq ($(CONFIG_RTE_MAJOR_ABI),)
+ifneq ($(LIBABIVER),)
+LIBABIVER := $(CONFIG_RTE_MAJOR_ABI)
+endif
+endif
+
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 LIB := $(patsubst %.a,%.so.$(LIBABIVER),$(LIB))
 ifeq ($(EXTLIB_BUILD),n)
@@ -156,11 +162,7 @@ $(RTE_OUTPUT)/lib/$(LIB): $(LIB)
 	@[ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib
 	$(Q)cp -f $(LIB) $(RTE_OUTPUT)/lib
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
-ifeq ($(CONFIG_RTE_NEXT_ABI)$(EXTLIB_BUILD),yn)
-	$(Q)ln -s -f $< $(basename $(basename $@))
-else
-	$(Q)ln -s -f $< $(basename $@)
-endif
+	$(Q)ln -s -f $< $(shell echo $@ | sed 's/\.so.*/.so/')
 endif
 
 #
-- 
2.7.4
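
For illustration only, a small C sketch of the failure mode described in
the commit message: both ABI versions of the same library ending up
mapped into one process (the library names here are hypothetical):

#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* LibA.old, linked directly by the application ... (hypothetical names) */
    void *a_old = dlopen("librte_liba.so.2", RTLD_NOW | RTLD_GLOBAL);
    /* ... and LibA.new, pulled in indirectly through LibB.new. */
    void *a_new = dlopen("librte_liba.so.16.11", RTLD_NOW | RTLD_GLOBAL);

    if (a_old != NULL && a_new != NULL)
        printf("both ABI versions of LibA are mapped: conflict\n");
    return 0;
}

With CONFIG_RTE_MAJOR_ABI set, both dependency chains resolve to the same
soname, so the second mapping never happens.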

^ permalink raw reply	[relevance 20%]

* Re: [dpdk-dev] Further fun with ABI tracking
  2017-02-14 20:31  9% ` Jan Blunck
@ 2017-02-22 13:12  7%   ` Christian Ehrhardt
  2017-02-22 13:24 20%     ` [dpdk-dev] [PATCH] mk: Provide option to set Major ABI version Christian Ehrhardt
  2017-02-23 18:48  4%     ` [dpdk-dev] Further fun with ABI tracking Ferruh Yigit
  0 siblings, 2 replies; 200+ results
From: Christian Ehrhardt @ 2017-02-22 13:12 UTC (permalink / raw)
  To: Jan Blunck; +Cc: dev, cjcollier, ricardo.salveti, Luca Boccassi

On Tue, Feb 14, 2017 at 9:31 PM, Jan Blunck <jblunck@infradead.org> wrote:

> > 1. Downstreams to insert Major version into soname
> > Distributions could insert the DPDK major version (like 16.11) into the
> > soname and package names. A common example of this is libboost [5].
> > That would perfectly allow 16.07.<LIBABIVER> to coexist with
> > 16.11.<LIBABIVER> even if for a given library LIBABIVER did not change.
> > Yet it would mean that anything depending on the old library will have to
> > be recompiled to pick up the new code, even if it depends on an ABI that
> is
> > still present in the new release.
> > Also - not a technical reason - but it is clearly more work to force
> update
> > all dependencies and clean out old packages for every release.
>
> Actually this isn't exactly what I proposed during the summit. Just
> keep it simple and fix the ABI version of all libraries at 16.11.0.
> This is a proven approach and has been used for years with different
> libraries.


Since there was no other response I'll try to wrap up.

Yes, #1 is also my preferred solution at the moment.
We tried following the individual LIBABIVER tracking done upstream, but
as outlined before we hit too many issues.
I discussed it in the deb_dpdk group, which also acked using this as the
general approach.
The other options have too obvious flaws as I listed on my initial report
and - thanks btw - you added a few more.

@Bruce - sorry I don't think dropping config options is the solution. Yet
my suggestion does not prevent you from doing so.



> You could easily do this independently of us upstream
> fixing the ABI problems.



I agree, but I'd like to suggest the mechanism I want to implement.
An upstream ack for the feature to set such a major ABI would be great.
Actually, since it is optional and can help more people integrating DPDK,
getting it accepted upstream would be even better.

I'll send a patch in reply to this thread later today that implements what
I have in mind.


-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

^ permalink raw reply	[relevance 7%]

* [dpdk-dev] [PATCH v2] lpm: extend IPv6 next hop field
  2017-02-19 17:14  4% [dpdk-dev] [PATCH] lpm: extend IPv6 next hop field Vladyslav Buslov
@ 2017-02-21 14:46  4% ` Vladyslav Buslov
  0 siblings, 0 replies; 200+ results
From: Vladyslav Buslov @ 2017-02-21 14:46 UTC (permalink / raw)
  To: bruce.richardson; +Cc: dev

This patch extends the next_hop field from 8 bits to 21 bits in the LPM
library for IPv6.

Added versioning symbols to the changed functions and updated the
libraries and applications that have a dependency on the LPM library.

Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com>
---
 app/test/test_lpm6.c                            | 115 ++++++++++++++------
 app/test/test_lpm6_perf.c                       |   4 +-
 doc/guides/prog_guide/lpm6_lib.rst              |   2 +-
 doc/guides/rel_notes/release_17_05.rst          |   5 +
 examples/ip_fragmentation/main.c                |  17 +--
 examples/ip_reassembly/main.c                   |  17 +--
 examples/ipsec-secgw/ipsec-secgw.c              |   2 +-
 examples/l3fwd/l3fwd_lpm_sse.h                  |  24 ++---
 examples/performance-thread/l3fwd-thread/main.c |  11 +-
 lib/librte_lpm/rte_lpm6.c                       | 134 +++++++++++++++++++++---
 lib/librte_lpm/rte_lpm6.h                       |  32 +++++-
 lib/librte_lpm/rte_lpm_version.map              |  10 ++
 lib/librte_table/rte_table_lpm_ipv6.c           |   9 +-
 13 files changed, 292 insertions(+), 90 deletions(-)

diff --git a/app/test/test_lpm6.c b/app/test/test_lpm6.c
index 61134f7..e0e7bf0 100644
--- a/app/test/test_lpm6.c
+++ b/app/test/test_lpm6.c
@@ -79,6 +79,7 @@ static int32_t test24(void);
 static int32_t test25(void);
 static int32_t test26(void);
 static int32_t test27(void);
+static int32_t test28(void);
 
 rte_lpm6_test tests6[] = {
 /* Test Cases */
@@ -110,6 +111,7 @@ rte_lpm6_test tests6[] = {
 	test25,
 	test26,
 	test27,
+	test28,
 };
 
 #define NUM_LPM6_TESTS                (sizeof(tests6)/sizeof(tests6[0]))
@@ -354,7 +356,7 @@ test6(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t next_hop_return = 0;
+	uint32_t next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -392,7 +394,7 @@ test7(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[10][16];
-	int16_t next_hop_return[10];
+	int32_t next_hop_return[10];
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -469,7 +471,8 @@ test9(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 16, next_hop_add = 100, next_hop_return = 0;
+	uint8_t depth = 16;
+	uint32_t next_hop_add = 100, next_hop_return = 0;
 	int32_t status = 0;
 	uint8_t i;
 
@@ -513,7 +516,8 @@ test10(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 	int i;
 
@@ -557,7 +561,8 @@ test11(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -617,7 +622,8 @@ test12(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -655,7 +661,8 @@ test13(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 
 	config.max_rules = 2;
@@ -702,7 +709,8 @@ test14(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 25, next_hop_add = 100;
+	uint8_t depth = 25;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 	int i;
 
@@ -748,7 +756,8 @@ test15(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 24, next_hop_add = 100, next_hop_return = 0;
+	uint8_t depth = 24;
+	uint32_t next_hop_add = 100, next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -784,7 +793,8 @@ test16(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {12,12,1,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 128, next_hop_add = 100, next_hop_return = 0;
+	uint8_t depth = 128;
+	uint32_t next_hop_add = 100, next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -828,7 +838,8 @@ test17(void)
 	uint8_t ip1[] = {127,255,255,255,255,255,255,255,255,
 			255,255,255,255,255,255,255};
 	uint8_t ip2[] = {128,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -857,7 +868,7 @@ test17(void)
 
 	/* Loop with rte_lpm6_delete. */
 	for (depth = 16; depth >= 1; depth--) {
-		next_hop_add = (uint8_t) (depth - 1);
+		next_hop_add = (depth - 1);
 
 		status = rte_lpm6_delete(lpm, ip2, depth);
 		TEST_LPM_ASSERT(status == 0);
@@ -893,8 +904,9 @@ test18(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[16], ip_1[16], ip_2[16];
-	uint8_t depth, depth_1, depth_2, next_hop_add, next_hop_add_1,
-		next_hop_add_2, next_hop_return;
+	uint8_t depth, depth_1, depth_2;
+	uint32_t next_hop_add, next_hop_add_1,
+			next_hop_add_2, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1055,7 +1067,8 @@ test19(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[16];
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1253,7 +1266,8 @@ test20(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[16];
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1320,8 +1334,9 @@ test21(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip_batch[4][16];
-	uint8_t depth, next_hop_add;
-	int16_t next_hop_return[4];
+	uint8_t depth;
+	uint32_t next_hop_add;
+	int32_t next_hop_return[4];
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1378,8 +1393,9 @@ test22(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip_batch[5][16];
-	uint8_t depth[5], next_hop_add;
-	int16_t next_hop_return[5];
+	uint8_t depth[5];
+	uint32_t next_hop_add;
+	int32_t next_hop_return[5];
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1495,7 +1511,8 @@ test23(void)
 	struct rte_lpm6_config config;
 	uint32_t i;
 	uint8_t ip[16];
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1579,7 +1596,8 @@ test25(void)
 	struct rte_lpm6_config config;
 	uint8_t ip[16];
 	uint32_t i;
-	uint8_t depth, next_hop_add, next_hop_return, next_hop_expected;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return, next_hop_expected;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1632,10 +1650,10 @@ test26(void)
 	uint8_t d_ip_10_32 = 32;
 	uint8_t	d_ip_10_24 = 24;
 	uint8_t	d_ip_20_25 = 25;
-	uint8_t next_hop_ip_10_32 = 100;
-	uint8_t	next_hop_ip_10_24 = 105;
-	uint8_t	next_hop_ip_20_25 = 111;
-	uint8_t next_hop_return = 0;
+	uint32_t next_hop_ip_10_32 = 100;
+	uint32_t next_hop_ip_10_24 = 105;
+	uint32_t next_hop_ip_20_25 = 111;
+	uint32_t next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1650,7 +1668,7 @@ test26(void)
 		return -1;
 
 	status = rte_lpm6_lookup(lpm, ip_10_32, &next_hop_return);
-	uint8_t test_hop_10_32 = next_hop_return;
+	uint32_t test_hop_10_32 = next_hop_return;
 	TEST_LPM_ASSERT(status == 0);
 	TEST_LPM_ASSERT(next_hop_return == next_hop_ip_10_32);
 
@@ -1659,7 +1677,7 @@ test26(void)
 			return -1;
 
 	status = rte_lpm6_lookup(lpm, ip_10_24, &next_hop_return);
-	uint8_t test_hop_10_24 = next_hop_return;
+	uint32_t test_hop_10_24 = next_hop_return;
 	TEST_LPM_ASSERT(status == 0);
 	TEST_LPM_ASSERT(next_hop_return == next_hop_ip_10_24);
 
@@ -1668,7 +1686,7 @@ test26(void)
 		return -1;
 
 	status = rte_lpm6_lookup(lpm, ip_20_25, &next_hop_return);
-	uint8_t test_hop_20_25 = next_hop_return;
+	uint32_t test_hop_20_25 = next_hop_return;
 	TEST_LPM_ASSERT(status == 0);
 	TEST_LPM_ASSERT(next_hop_return == next_hop_ip_20_25);
 
@@ -1707,7 +1725,8 @@ test27(void)
 		struct rte_lpm6 *lpm = NULL;
 		struct rte_lpm6_config config;
 		uint8_t ip[] = {128,128,128,128,128,128,128,128,128,128,128,128,128,128,0,0};
-		uint8_t depth = 128, next_hop_add = 100, next_hop_return;
+		uint8_t depth = 128;
+		uint32_t next_hop_add = 100, next_hop_return;
 		int32_t status = 0;
 		int i, j;
 
@@ -1746,6 +1765,42 @@ test27(void)
 }
 
 /*
+ * Call add, lookup and delete for a single rule with maximum 21bit next_hop
+ * size.
+ * Check that next_hop returned from lookup is equal to provisioned value.
+ * Delete the rule and check that the same test returns a miss.
+ */
+int32_t
+test28(void)
+{
+	struct rte_lpm6 *lpm = NULL;
+	struct rte_lpm6_config config;
+	uint8_t ip[] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
+	uint8_t depth = 16;
+	uint32_t next_hop_add = 0x001FFFFF, next_hop_return = 0;
+	int32_t status = 0;
+
+	config.max_rules = MAX_RULES;
+	config.number_tbl8s = NUMBER_TBL8S;
+	config.flags = 0;
+
+	lpm = rte_lpm6_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_LPM_ASSERT(lpm != NULL);
+
+	status = rte_lpm6_add(lpm, ip, depth, next_hop_add);
+	TEST_LPM_ASSERT(status == 0);
+
+	status = rte_lpm6_lookup(lpm, ip, &next_hop_return);
+	TEST_LPM_ASSERT((status == 0) && (next_hop_return == next_hop_add));
+
+	status = rte_lpm6_delete(lpm, ip, depth);
+	TEST_LPM_ASSERT(status == 0);
+	rte_lpm6_free(lpm);
+
+	return PASS;
+}
+
+/*
  * Do all unit tests.
  */
 static int
diff --git a/app/test/test_lpm6_perf.c b/app/test/test_lpm6_perf.c
index 0723081..30be430 100644
--- a/app/test/test_lpm6_perf.c
+++ b/app/test/test_lpm6_perf.c
@@ -86,7 +86,7 @@ test_lpm6_perf(void)
 	struct rte_lpm6_config config;
 	uint64_t begin, total_time;
 	unsigned i, j;
-	uint8_t next_hop_add = 0xAA, next_hop_return = 0;
+	uint32_t next_hop_add = 0xAA, next_hop_return = 0;
 	int status = 0;
 	int64_t count = 0;
 
@@ -148,7 +148,7 @@ test_lpm6_perf(void)
 	count = 0;
 
 	uint8_t ip_batch[NUM_IPS_ENTRIES][16];
-	int16_t next_hops[NUM_IPS_ENTRIES];
+	int32_t next_hops[NUM_IPS_ENTRIES];
 
 	for (i = 0; i < NUM_IPS_ENTRIES; i++)
 		memcpy(ip_batch[i], large_ips_table[i].ip, 16);
diff --git a/doc/guides/prog_guide/lpm6_lib.rst b/doc/guides/prog_guide/lpm6_lib.rst
index 0aea5c5..f791507 100644
--- a/doc/guides/prog_guide/lpm6_lib.rst
+++ b/doc/guides/prog_guide/lpm6_lib.rst
@@ -53,7 +53,7 @@ several thousand IPv6 rules, but the number can vary depending on the case.
 An LPM prefix is represented by a pair of parameters (128-bit key, depth), with depth in the range of 1 to 128.
 An LPM rule is represented by an LPM prefix and some user data associated with the prefix.
 The prefix serves as the unique identifier for the LPM rule.
-In this implementation, the user data is 1-byte long and is called "next hop",
+In this implementation, the user data is 21-bits long and is called "next hop",
 which corresponds to its main use of storing the ID of the next hop in a routing table entry.
 
 The main methods exported for the LPM component are:
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 48fb5bd..723e085 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -41,6 +41,9 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Increased number of next hops for LPM IPv6 to 2^21.**
+
+  The next_hop field is extended from 8 bits to 21 bits for IPv6.
 
 Resolved Issues
 ---------------
@@ -110,6 +113,8 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =========================================================
 
+* The LPM ``next_hop`` field is extended from 8 bits to 21 bits for IPv6
+  while keeping ABI compatibility.
 
 ABI Changes
 -----------
diff --git a/examples/ip_fragmentation/main.c b/examples/ip_fragmentation/main.c
index e1e32c6..89d08c8 100644
--- a/examples/ip_fragmentation/main.c
+++ b/examples/ip_fragmentation/main.c
@@ -265,8 +265,8 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 		uint8_t queueid, uint8_t port_in)
 {
 	struct rx_queue *rxq;
-	uint32_t i, len, next_hop_ipv4;
-	uint8_t next_hop_ipv6, port_out, ipv6;
+	uint32_t i, len, next_hop;
+	uint8_t port_out, ipv6;
 	int32_t len2;
 
 	ipv6 = 0;
@@ -290,9 +290,9 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 		ip_dst = rte_be_to_cpu_32(ip_hdr->dst_addr);
 
 		/* Find destination port */
-		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop_ipv4) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv4) != 0) {
-			port_out = next_hop_ipv4;
+		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			port_out = next_hop;
 
 			/* Build transmission burst for new port */
 			len = qconf->tx_mbufs[port_out].len;
@@ -326,9 +326,10 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 		ip_hdr = rte_pktmbuf_mtod(m, struct ipv6_hdr *);
 
 		/* Find destination port */
-		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr, &next_hop_ipv6) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv6) != 0) {
-			port_out = next_hop_ipv6;
+		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr,
+						&next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			port_out = next_hop;
 
 			/* Build transmission burst for new port */
 			len = qconf->tx_mbufs[port_out].len;
diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
index 50fe422..661b64f 100644
--- a/examples/ip_reassembly/main.c
+++ b/examples/ip_reassembly/main.c
@@ -346,8 +346,8 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 	struct rte_ip_frag_death_row *dr;
 	struct rx_queue *rxq;
 	void *d_addr_bytes;
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6, dst_port;
+	uint32_t next_hop;
+	uint8_t dst_port;
 
 	rxq = &qconf->rx_queue_list[queue];
 
@@ -390,9 +390,9 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 		ip_dst = rte_be_to_cpu_32(ip_hdr->dst_addr);
 
 		/* Find destination port */
-		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop_ipv4) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv4) != 0) {
-			dst_port = next_hop_ipv4;
+		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			dst_port = next_hop;
 		}
 
 		eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv4);
@@ -427,9 +427,10 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 		}
 
 		/* Find destination port */
-		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr, &next_hop_ipv6) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv6) != 0) {
-			dst_port = next_hop_ipv6;
+		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr,
+						&next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			dst_port = next_hop;
 		}
 
 		eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv6);
diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c
index 5a4c9b7..5744c46 100644
--- a/examples/ipsec-secgw/ipsec-secgw.c
+++ b/examples/ipsec-secgw/ipsec-secgw.c
@@ -618,7 +618,7 @@ route4_pkts(struct rt_ctx *rt_ctx, struct rte_mbuf *pkts[], uint8_t nb_pkts)
 static inline void
 route6_pkts(struct rt_ctx *rt_ctx, struct rte_mbuf *pkts[], uint8_t nb_pkts)
 {
-	int16_t hop[MAX_PKT_BURST * 2];
+	int32_t hop[MAX_PKT_BURST * 2];
 	uint8_t dst_ip[MAX_PKT_BURST * 2][16];
 	uint8_t *ip6_dst;
 	uint16_t i, offset;
diff --git a/examples/l3fwd/l3fwd_lpm_sse.h b/examples/l3fwd/l3fwd_lpm_sse.h
index 538fe3d..aa06b6d 100644
--- a/examples/l3fwd/l3fwd_lpm_sse.h
+++ b/examples/l3fwd/l3fwd_lpm_sse.h
@@ -40,8 +40,7 @@ static inline __attribute__((always_inline)) uint16_t
 lpm_get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		uint8_t portid)
 {
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6;
+	uint32_t next_hop;
 	struct ipv6_hdr *ipv6_hdr;
 	struct ipv4_hdr *ipv4_hdr;
 	struct ether_hdr *eth_hdr;
@@ -51,9 +50,11 @@ lpm_get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
 		ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
 
-		return (uint16_t) ((rte_lpm_lookup(qconf->ipv4_lookup_struct,
-				rte_be_to_cpu_32(ipv4_hdr->dst_addr), &next_hop_ipv4) == 0) ?
-						next_hop_ipv4 : portid);
+		return (uint16_t) (
+			(rte_lpm_lookup(qconf->ipv4_lookup_struct,
+					rte_be_to_cpu_32(ipv4_hdr->dst_addr),
+					&next_hop) == 0) ?
+						next_hop : portid);
 
 	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
 
@@ -61,8 +62,8 @@ lpm_get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		ipv6_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
 
 		return (uint16_t) ((rte_lpm6_lookup(qconf->ipv6_lookup_struct,
-				ipv6_hdr->dst_addr, &next_hop_ipv6) == 0)
-				? next_hop_ipv6 : portid);
+				ipv6_hdr->dst_addr, &next_hop) == 0)
+				? next_hop : portid);
 
 	}
 
@@ -78,14 +79,13 @@ static inline __attribute__((always_inline)) uint16_t
 lpm_get_dst_port_with_ipv4(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 	uint32_t dst_ipv4, uint8_t portid)
 {
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6;
+	uint32_t next_hop;
 	struct ipv6_hdr *ipv6_hdr;
 	struct ether_hdr *eth_hdr;
 
 	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
 		return (uint16_t) ((rte_lpm_lookup(qconf->ipv4_lookup_struct, dst_ipv4,
-			&next_hop_ipv4) == 0) ? next_hop_ipv4 : portid);
+			&next_hop) == 0) ? next_hop : portid);
 
 	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
 
@@ -93,8 +93,8 @@ lpm_get_dst_port_with_ipv4(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		ipv6_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
 
 		return (uint16_t) ((rte_lpm6_lookup(qconf->ipv6_lookup_struct,
-				ipv6_hdr->dst_addr, &next_hop_ipv6) == 0)
-				? next_hop_ipv6 : portid);
+				ipv6_hdr->dst_addr, &next_hop) == 0)
+				? next_hop : portid);
 
 	}
 
diff --git a/examples/performance-thread/l3fwd-thread/main.c b/examples/performance-thread/l3fwd-thread/main.c
index 53083df..fa99daf 100644
--- a/examples/performance-thread/l3fwd-thread/main.c
+++ b/examples/performance-thread/l3fwd-thread/main.c
@@ -909,7 +909,7 @@ static inline uint8_t
 get_ipv6_dst_port(void *ipv6_hdr,  uint8_t portid,
 		lookup6_struct_t *ipv6_l3fwd_lookup_struct)
 {
-	uint8_t next_hop;
+	uint32_t next_hop;
 
 	return (uint8_t) ((rte_lpm6_lookup(ipv6_l3fwd_lookup_struct,
 			((struct ipv6_hdr *)ipv6_hdr)->dst_addr, &next_hop) == 0) ?
@@ -1396,15 +1396,14 @@ rfc1812_process(struct ipv4_hdr *ipv4_hdr, uint16_t *dp, uint32_t ptype)
 static inline __attribute__((always_inline)) uint16_t
 get_dst_port(struct rte_mbuf *pkt, uint32_t dst_ipv4, uint8_t portid)
 {
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6;
+	uint32_t next_hop;
 	struct ipv6_hdr *ipv6_hdr;
 	struct ether_hdr *eth_hdr;
 
 	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
 		return (uint16_t) ((rte_lpm_lookup(
 				RTE_PER_LCORE(lcore_conf)->ipv4_lookup_struct, dst_ipv4,
-				&next_hop_ipv4) == 0) ? next_hop_ipv4 : portid);
+				&next_hop) == 0) ? next_hop : portid);
 
 	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
 
@@ -1413,8 +1412,8 @@ get_dst_port(struct rte_mbuf *pkt, uint32_t dst_ipv4, uint8_t portid)
 
 		return (uint16_t) ((rte_lpm6_lookup(
 				RTE_PER_LCORE(lcore_conf)->ipv6_lookup_struct,
-				ipv6_hdr->dst_addr, &next_hop_ipv6) == 0) ? next_hop_ipv6 :
-						portid);
+				ipv6_hdr->dst_addr, &next_hop) == 0) ?
+				next_hop : portid);
 
 	}
 
diff --git a/lib/librte_lpm/rte_lpm6.c b/lib/librte_lpm/rte_lpm6.c
index 32fdba0..9cc7be7 100644
--- a/lib/librte_lpm/rte_lpm6.c
+++ b/lib/librte_lpm/rte_lpm6.c
@@ -97,7 +97,7 @@ struct rte_lpm6_tbl_entry {
 /** Rules tbl entry structure. */
 struct rte_lpm6_rule {
 	uint8_t ip[RTE_LPM6_IPV6_ADDR_SIZE]; /**< Rule IP address. */
-	uint8_t next_hop; /**< Rule next hop. */
+	uint32_t next_hop; /**< Rule next hop. */
 	uint8_t depth; /**< Rule depth. */
 };
 
@@ -297,7 +297,7 @@ rte_lpm6_free(struct rte_lpm6 *lpm)
  * the nexthop if so. Otherwise it adds a new rule if enough space is available.
  */
 static inline int32_t
-rule_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t next_hop, uint8_t depth)
+rule_add(struct rte_lpm6 *lpm, uint8_t *ip, uint32_t next_hop, uint8_t depth)
 {
 	uint32_t rule_index;
 
@@ -340,7 +340,7 @@ rule_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t next_hop, uint8_t depth)
  */
 static void
 expand_rule(struct rte_lpm6 *lpm, uint32_t tbl8_gindex, uint8_t depth,
-		uint8_t next_hop)
+		uint32_t next_hop)
 {
 	uint32_t tbl8_group_end, tbl8_gindex_next, j;
 
@@ -377,7 +377,7 @@ expand_rule(struct rte_lpm6 *lpm, uint32_t tbl8_gindex, uint8_t depth,
 static inline int
 add_step(struct rte_lpm6 *lpm, struct rte_lpm6_tbl_entry *tbl,
 		struct rte_lpm6_tbl_entry **tbl_next, uint8_t *ip, uint8_t bytes,
-		uint8_t first_byte, uint8_t depth, uint8_t next_hop)
+		uint8_t first_byte, uint8_t depth, uint32_t next_hop)
 {
 	uint32_t tbl_index, tbl_range, tbl8_group_start, tbl8_group_end, i;
 	int32_t tbl8_gindex;
@@ -507,9 +507,17 @@ add_step(struct rte_lpm6 *lpm, struct rte_lpm6_tbl_entry *tbl,
  * Add a route
  */
 int
-rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+rte_lpm6_add_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 		uint8_t next_hop)
 {
+	return rte_lpm6_add_v1705(lpm, ip, depth, next_hop);
+}
+VERSION_SYMBOL(rte_lpm6_add, _v20, 2.0);
+
+int
+rte_lpm6_add_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t next_hop)
+{
 	struct rte_lpm6_tbl_entry *tbl;
 	struct rte_lpm6_tbl_entry *tbl_next;
 	int32_t rule_index;
@@ -560,6 +568,10 @@ rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 
 	return status;
 }
+BIND_DEFAULT_SYMBOL(rte_lpm6_add, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip,
+				uint8_t depth, uint32_t next_hop),
+		rte_lpm6_add_v1705);
 
 /*
  * Takes a pointer to a table entry and inspect one level.
@@ -569,7 +581,7 @@ rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 static inline int
 lookup_step(const struct rte_lpm6 *lpm, const struct rte_lpm6_tbl_entry *tbl,
 		const struct rte_lpm6_tbl_entry **tbl_next, uint8_t *ip,
-		uint8_t first_byte, uint8_t *next_hop)
+		uint8_t first_byte, uint32_t *next_hop)
 {
 	uint32_t tbl8_index, tbl_entry;
 
@@ -589,7 +601,7 @@ lookup_step(const struct rte_lpm6 *lpm, const struct rte_lpm6_tbl_entry *tbl,
 		return 1;
 	} else {
 		/* If not extended then we can have a match. */
-		*next_hop = (uint8_t)tbl_entry;
+		*next_hop = ((uint32_t)tbl_entry & RTE_LPM6_TBL8_BITMASK);
 		return (tbl_entry & RTE_LPM6_LOOKUP_SUCCESS) ? 0 : -ENOENT;
 	}
 }
@@ -598,7 +610,26 @@ lookup_step(const struct rte_lpm6 *lpm, const struct rte_lpm6_tbl_entry *tbl,
  * Looks up an IP
  */
 int
-rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop)
+rte_lpm6_lookup_v20(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop)
+{
+	uint32_t next_hop32 = 0;
+	int32_t status;
+
+	/* DEBUG: Check user input arguments. */
+	if (next_hop == NULL)
+		return -EINVAL;
+
+	status = rte_lpm6_lookup_v1705(lpm, ip, &next_hop32);
+	if (status == 0)
+		*next_hop = (uint8_t)next_hop32;
+
+	return status;
+}
+VERSION_SYMBOL(rte_lpm6_lookup, _v20, 2.0);
+
+int
+rte_lpm6_lookup_v1705(const struct rte_lpm6 *lpm, uint8_t *ip,
+		uint32_t *next_hop)
 {
 	const struct rte_lpm6_tbl_entry *tbl;
 	const struct rte_lpm6_tbl_entry *tbl_next = NULL;
@@ -625,20 +656,23 @@ rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop)
 
 	return status;
 }
+BIND_DEFAULT_SYMBOL(rte_lpm6_lookup, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip,
+				uint32_t *next_hop), rte_lpm6_lookup_v1705);
 
 /*
  * Looks up a group of IP addresses
  */
 int
-rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
+rte_lpm6_lookup_bulk_func_v20(const struct rte_lpm6 *lpm,
 		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
 		int16_t * next_hops, unsigned n)
 {
 	unsigned i;
 	const struct rte_lpm6_tbl_entry *tbl;
 	const struct rte_lpm6_tbl_entry *tbl_next = NULL;
-	uint32_t tbl24_index;
-	uint8_t first_byte, next_hop;
+	uint32_t tbl24_index, next_hop;
+	uint8_t first_byte;
 	int status;
 
 	/* DEBUG: Check user input arguments. */
@@ -664,11 +698,59 @@ rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
 		if (status < 0)
 			next_hops[i] = -1;
 		else
-			next_hops[i] = next_hop;
+			next_hops[i] = (int16_t)next_hop;
+	}
+
+	return 0;
+}
+VERSION_SYMBOL(rte_lpm6_lookup_bulk_func, _v20, 2.0);
+
+int
+rte_lpm6_lookup_bulk_func_v1705(const struct rte_lpm6 *lpm,
+		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+		int32_t *next_hops, unsigned int n)
+{
+	unsigned int i;
+	const struct rte_lpm6_tbl_entry *tbl;
+	const struct rte_lpm6_tbl_entry *tbl_next = NULL;
+	uint32_t tbl24_index, next_hop;
+	uint8_t first_byte;
+	int status;
+
+	/* DEBUG: Check user input arguments. */
+	if ((lpm == NULL) || (ips == NULL) || (next_hops == NULL))
+		return -EINVAL;
+
+	for (i = 0; i < n; i++) {
+		first_byte = LOOKUP_FIRST_BYTE;
+		tbl24_index = (ips[i][0] << BYTES2_SIZE) |
+				(ips[i][1] << BYTE_SIZE) | ips[i][2];
+
+		/* Calculate pointer to the first entry to be inspected */
+		tbl = &lpm->tbl24[tbl24_index];
+
+		do {
+			/* Continue inspecting following levels
+			 * until success or failure
+			 */
+			status = lookup_step(lpm, tbl, &tbl_next, ips[i],
+					first_byte++, &next_hop);
+			tbl = tbl_next;
+		} while (status == 1);
+
+		if (status < 0)
+			next_hops[i] = -1;
+		else
+			next_hops[i] = (int32_t)next_hop;
 	}
 
 	return 0;
 }
+BIND_DEFAULT_SYMBOL(rte_lpm6_lookup_bulk_func, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
+				uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+				int32_t *next_hops, unsigned int n),
+		rte_lpm6_lookup_bulk_func_v1705);
 
 /*
  * Finds a rule in rule table.
@@ -698,8 +780,28 @@ rule_find(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth)
  * Look for a rule in the high-level rules table
  */
 int
-rte_lpm6_is_rule_present(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
-uint8_t *next_hop)
+rte_lpm6_is_rule_present_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint8_t *next_hop)
+{
+	uint32_t next_hop32 = 0;
+	int32_t status;
+
+	/* DEBUG: Check user input arguments. */
+	if (next_hop == NULL)
+		return -EINVAL;
+
+	status = rte_lpm6_is_rule_present_v1705(lpm, ip, depth, &next_hop32);
+	if (status > 0)
+		*next_hop = (uint8_t)next_hop32;
+
+	return status;
+
+}
+VERSION_SYMBOL(rte_lpm6_is_rule_present, _v20, 2.0);
+
+int
+rte_lpm6_is_rule_present_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t *next_hop)
 {
 	uint8_t ip_masked[RTE_LPM6_IPV6_ADDR_SIZE];
 	int32_t rule_index;
@@ -724,6 +826,10 @@ uint8_t *next_hop)
 	/* If rule is not found return 0. */
 	return 0;
 }
+BIND_DEFAULT_SYMBOL(rte_lpm6_is_rule_present, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_is_rule_present(struct rte_lpm6 *lpm,
+				uint8_t *ip, uint8_t depth, uint32_t *next_hop),
+		rte_lpm6_is_rule_present_v1705);
 
 /*
  * Delete a rule from the rule table.
diff --git a/lib/librte_lpm/rte_lpm6.h b/lib/librte_lpm/rte_lpm6.h
index 13d027f..3a3342d 100644
--- a/lib/librte_lpm/rte_lpm6.h
+++ b/lib/librte_lpm/rte_lpm6.h
@@ -39,6 +39,7 @@
  */
 
 #include <stdint.h>
+#include <rte_compat.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -123,7 +124,13 @@ rte_lpm6_free(struct rte_lpm6 *lpm);
  */
 int
 rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t next_hop);
+int
+rte_lpm6_add_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 		uint8_t next_hop);
+int
+rte_lpm6_add_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t next_hop);
 
 /**
  * Check if a rule is present in the LPM table,
@@ -142,7 +149,13 @@ rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
  */
 int
 rte_lpm6_is_rule_present(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
-uint8_t *next_hop);
+		uint32_t *next_hop);
+int
+rte_lpm6_is_rule_present_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint8_t *next_hop);
+int
+rte_lpm6_is_rule_present_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t *next_hop);
 
 /**
  * Delete a rule from the LPM table.
@@ -199,7 +212,12 @@ rte_lpm6_delete_all(struct rte_lpm6 *lpm);
  *   -EINVAL for incorrect arguments, -ENOENT on lookup miss, 0 on lookup hit
  */
 int
-rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop);
+rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint32_t *next_hop);
+int
+rte_lpm6_lookup_v20(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop);
+int
+rte_lpm6_lookup_v1705(const struct rte_lpm6 *lpm, uint8_t *ip,
+		uint32_t *next_hop);
 
 /**
  * Lookup multiple IP addresses in an LPM table.
@@ -220,7 +238,15 @@ rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop);
 int
 rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
 		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
-		int16_t * next_hops, unsigned n);
+		int32_t *next_hops, unsigned int n);
+int
+rte_lpm6_lookup_bulk_func_v20(const struct rte_lpm6 *lpm,
+		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+		int16_t *next_hops, unsigned int n);
+int
+rte_lpm6_lookup_bulk_func_v1705(const struct rte_lpm6 *lpm,
+		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+		int32_t *next_hops, unsigned int n);
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_lpm/rte_lpm_version.map b/lib/librte_lpm/rte_lpm_version.map
index 239b371..90beac8 100644
--- a/lib/librte_lpm/rte_lpm_version.map
+++ b/lib/librte_lpm/rte_lpm_version.map
@@ -34,3 +34,13 @@ DPDK_16.04 {
 	rte_lpm_delete_all;
 
 } DPDK_2.0;
+
+DPDK_17.05 {
+	global:
+
+	rte_lpm6_add;
+	rte_lpm6_is_rule_present;
+	rte_lpm6_lookup;
+	rte_lpm6_lookup_bulk_func;
+
+} DPDK_16.04;
diff --git a/lib/librte_table/rte_table_lpm_ipv6.c b/lib/librte_table/rte_table_lpm_ipv6.c
index 836f4cf..1e1a173 100644
--- a/lib/librte_table/rte_table_lpm_ipv6.c
+++ b/lib/librte_table/rte_table_lpm_ipv6.c
@@ -211,9 +211,8 @@ rte_table_lpm_ipv6_entry_add(
 	struct rte_table_lpm_ipv6 *lpm = (struct rte_table_lpm_ipv6 *) table;
 	struct rte_table_lpm_ipv6_key *ip_prefix =
 		(struct rte_table_lpm_ipv6_key *) key;
-	uint32_t nht_pos, nht_pos0_valid;
+	uint32_t nht_pos, nht_pos0, nht_pos0_valid;
 	int status;
-	uint8_t nht_pos0;
 
 	/* Check input parameters */
 	if (lpm == NULL) {
@@ -256,7 +255,7 @@ rte_table_lpm_ipv6_entry_add(
 
 	/* Add rule to low level LPM table */
 	if (rte_lpm6_add(lpm->lpm, ip_prefix->ip, ip_prefix->depth,
-		(uint8_t) nht_pos) < 0) {
+		nht_pos) < 0) {
 		RTE_LOG(ERR, TABLE, "%s: LPM IPv6 rule add failed\n", __func__);
 		return -1;
 	}
@@ -280,7 +279,7 @@ rte_table_lpm_ipv6_entry_delete(
 	struct rte_table_lpm_ipv6 *lpm = (struct rte_table_lpm_ipv6 *) table;
 	struct rte_table_lpm_ipv6_key *ip_prefix =
 		(struct rte_table_lpm_ipv6_key *) key;
-	uint8_t nht_pos;
+	uint32_t nht_pos;
 	int status;
 
 	/* Check input parameters */
@@ -356,7 +355,7 @@ rte_table_lpm_ipv6_lookup(
 			uint8_t *ip = RTE_MBUF_METADATA_UINT8_PTR(pkt,
 				lpm->offset);
 			int status;
-			uint8_t nht_pos;
+			uint32_t nht_pos;
 
 			status = rte_lpm6_lookup(lpm->lpm, ip, &nht_pos);
 			if (status == 0) {
-- 
2.1.4
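
A short usage sketch of the widened API above (error handling elided;
-1 stands in for SOCKET_ID_ANY):

#include <stdint.h>
#include <rte_lpm6.h>

static int
lpm6_next_hop_demo(void)
{
    struct rte_lpm6_config cfg = {
        .max_rules = 1024,
        .number_tbl8s = 1 << 16,
        .flags = 0,
    };
    uint8_t prefix[16] = { 0x20, 0x01, 0x0d, 0xb8 }; /* 2001:db8::/32 */
    uint32_t next_hop = 0x001FFFFF; /* 21-bit value, impossible with uint8_t */
    uint32_t found = 0;
    struct rte_lpm6 *lpm = rte_lpm6_create("demo", -1, &cfg);

    if (lpm == NULL)
        return -1;
    rte_lpm6_add(lpm, prefix, 32, next_hop);
    rte_lpm6_lookup(lpm, prefix, &found); /* found == 0x001FFFFF */
    rte_lpm6_free(lpm);
    return 0;
}

Existing binaries keep resolving to the _v20 symbols through the version
map, while code compiled against 17.05 picks up the uint32_t variants.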

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2 1/8] eal: use different constructor priorities for initcalls
  @ 2017-02-21 12:30  3%   ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2017-02-21 12:30 UTC (permalink / raw)
  To: Jan Blunck, dev; +Cc: david.marchand, shreyansh.jain

On 2/20/2017 2:17 PM, Jan Blunck wrote:
> This introduces different initcall macros to allow for late registration of
> the virtual device bus.
> 
> Signed-off-by: Jan Blunck <jblunck@infradead.org>
> Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>

<...>

>  
> -#define RTE_INIT(func) \
> -static void __attribute__((constructor, used)) func(void)
> +#define RTE_EAL_INIT(func) \
> +static void __attribute__((constructor(101), used)) func(void)
> +
> +#define RTE_POST_EAL_INIT(func) \
> +static void __attribute__((constructor(102), used)) func(void)
> +
> +#define RTE_DEV_INIT(func) \
> +static void __attribute__((constructor(103), used)) func(void)
> +
> +#define RTE_INIT(func) RTE_DEV_INIT(func)

Does it make sense to leave some gaps between the priorities,
i.e. 101, 102, 103 --> 100, 200, 300?

When new priorities are added (not sure if that will ever happen), would
changing the previous priorities cause an ABI breakage?
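
For reference, the ordering semantics in question can be tried in a
standalone program (outside DPDK): lower constructor priority values run
earlier, and the values only need to be ordered, so leaving gaps is
harmless:

#include <stdio.h>

/* Sketch: priorities 0-100 are reserved by the toolchain; among the
 * rest, lower numbers run first regardless of link order. */
static void __attribute__((constructor(101), used)) init_bus(void)
{
    printf("1: bus registered\n");
}

static void __attribute__((constructor(103), used)) init_driver(void)
{
    printf("2: driver registered\n");
}

int main(void)
{
    printf("3: main\n");
    return 0;
}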

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH 2/2] ethdev: add hierarchical scheduler API
  2017-02-10 14:05  1% ` [dpdk-dev] [PATCH 2/2] ethdev: add hierarchical scheduler API Cristian Dumitrescu
@ 2017-02-21 10:35  0%   ` Hemant Agrawal
  0 siblings, 0 replies; 200+ results
From: Hemant Agrawal @ 2017-02-21 10:35 UTC (permalink / raw)
  To: Cristian Dumitrescu, dev; +Cc: thomas.monjalon, jerin.jacob

On 2/10/2017 7:35 PM, Cristian Dumitrescu wrote:
> This patch introduces the generic ethdev API for the hierarchical scheduler
> capability.
>
> Main features:
> - Exposed as ethdev plugin capability (similar to rte_flow approach)
> - Capability query API per port and per hierarchy node
> - Scheduling algorithms: strict priority (SP), Weighted Fair Queuing (WFQ),
>   Weighted Round Robin (WRR)
> - Traffic shaping: single/dual rate, private (per node) and shared (by multiple
>   nodes) shapers
> - Congestion management for hierarchy leaf nodes: algorithms of tail drop,
>   head drop, WRED; private (per node) and shared (by multiple nodes) WRED
>   contexts
> - Packet marking: IEEE 802.1q (VLAN DEI), IETF RFC 3168 (IPv4/IPv6 ECN for
>   TCP and SCTP), IETF RFC 2597 (IPv4 / IPv6 DSCP)
>
> Changes since RFC [1]:
> - Implemented as ethdev plugin (similar to rte_flow) as opposed to more
>   monolithic additions to ethdev itself
> - Implemented feedback from Jerin [2] and Hemant [3]. Implemented all the
>   suggested items with only one exception, see the long list below, hopefully
>   nothing was forgotten.
>     - The item not done (hopefully for a good reason): driver-generated object
>       IDs. IMO the choice to have application-generated object IDs adds marginal
>       complexity to the driver (search ID function required), but it provides
>       huge simplification for the application. The app does not need to worry
>       about building & managing tree-like structure for storing driver-generated
>       object IDs, the app can use its own convention for node IDs depending on
>       the specific hierarchy that it needs. Trivial example: identify all
>       level-2 nodes with IDs like 100, 200, 300, … and the level-3 nodes based
>       on their level-2 parents: 110, 120, 130, 140, …, 210, 220, 230, 240, …,
>       310, 320, 330, … and level-4 nodes based on their level-3 parents: 111,
>       112, 113, 114, …, 121, 122, 123, 124, …). Moreover, see the change log for
>       the other related simplification that was implemented: leaf nodes now have
>       predefined IDs that are the same with their Ethernet TX queue ID (
>       therefore no translation is required for leaf nodes).
> - Capability API. Done per port and per node as well.
> - Dual rate shapers
> - Added configuration of private shaper (per node) directly from the shaper
>   profile as part of node API (no shaper ID needed for private shapers), while
>   the shared shapers are configured outside of the node API using shaper profile
>   and communicated to the node using shared shaper ID. So there is no
>   configuration overhead for shared shapers if the app does not use any of them.
> - Leaf nodes now have predefined IDs that are the same with their Ethernet TX
>   queue ID (therefore no translation is required for leaf nodes). This is also
>   used to differentiate between a leaf node and a non-leaf node.
> - Domain-specific errors to give a precise indication of the error cause (same
>   as done by rte_flow)
> - Packet marking API
> - Packet length optional adjustment for shapers, positive (e.g. for adding
>   Ethernet framing overhead of 20 bytes) or negative (e.g. for rate limiting
>   based on IP packet bytes)
>
> Next steps:
> - SW fallback based on librte_sched library (to be later introduced by
>   standalone patch set)
>
> [1] RFC: http://dpdk.org/ml/archives/dev/2016-November/050956.html
> [2] Jerin’s feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054484.html
> [3] Hemant's feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054866.html
>
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---
>  MAINTAINERS                            |    4 +
>  lib/librte_ether/Makefile              |    5 +-
>  lib/librte_ether/rte_ether_version.map |   30 +
>  lib/librte_ether/rte_scheddev.c        |  790 ++++++++++++++++++++
>  lib/librte_ether/rte_scheddev.h        | 1273 ++++++++++++++++++++++++++++++++
>  lib/librte_ether/rte_scheddev_driver.h |  374 ++++++++++
>  6 files changed, 2475 insertions(+), 1 deletion(-)
>  create mode 100644 lib/librte_ether/rte_scheddev.c
>  create mode 100644 lib/librte_ether/rte_scheddev.h
>  create mode 100644 lib/librte_ether/rte_scheddev_driver.h
>

...<snip>

> +
> +#ifndef __INCLUDE_RTE_SCHEDDEV_H__
> +#define __INCLUDE_RTE_SCHEDDEV_H__
> +
> +/**
> + * @file
> + * RTE Generic Hierarchical Scheduler API
> + *
> + * This interface provides the ability to configure the hierarchical scheduler
> + * feature in a generic way.
> + */
> +
> +#include <stdint.h>
> +
> +#include <rte_red.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/** Ethernet framing overhead
> +  *
> +  * Overhead fields per Ethernet frame:
> +  * 1. Preamble:                                            7 bytes;
> +  * 2. Start of Frame Delimiter (SFD):                      1 byte;
> +  * 3. Inter-Frame Gap (IFG):                              12 bytes.
> +  */
> +#define RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD                  20
> +
> +/**
> +  * Ethernet framing overhead plus Frame Check Sequence (FCS). Useful when FCS
> +  * is generated and added at the end of the Ethernet frame on TX side without
> +  * any SW intervention.
> +  */
> +#define RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD_FCS              24
> +
> +/**< Invalid WRED profile ID */
> +#define RTE_SCHEDDEV_WRED_PROFILE_ID_NONE                  UINT32_MAX
> +
> +/**< Invalid shaper profile ID */
> +#define RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE                UINT32_MAX
> +
> +/**< Scheduler hierarchy root node ID */
> +#define RTE_SCHEDDEV_ROOT_NODE_ID                          UINT32_MAX
> +
> +
> +/**
> +  * Scheduler node capabilities
> +  */
> +struct rte_scheddev_node_capabilities {
> +	/**< Private shaper support. */
> +	int shaper_private_supported;
> +
> +	/**< Dual rate shaping support for private shaper. Valid only when
> +	 * private shaper is supported.
> +	 */
> +	int shaper_private_dual_rate_supported;
> +
> +	/**< Minimum committed/peak rate (bytes per second) for private
> +	 * shaper. Valid only when private shaper is supported.
> +	 */
> +	uint64_t shaper_private_rate_min;
> +
> +	/**< Maximum committed/peak rate (bytes per second) for private
> +	 * shaper. Valid only when private shaper is supported.
> +	 */
> +	uint64_t shaper_private_rate_max;
> +
> +	/**< Maximum number of supported shared shapers. The value of zero
> +	 * indicates that shared shapers are not supported.
> +	 */
> +	uint32_t shaper_shared_n_max;
> +
> +	/**< Items valid only for non-leaf nodes. */
> +	struct {
> +		/**< Maximum number of children nodes. */
> +		uint32_t n_children_max;
> +
> +		/**< Lowest priority supported. The value of 1 indicates that
> +		 * only priority 0 is supported, which essentially means that
> +		 * Strict Priority (SP) algorithm is not supported.
> +		 */
> +		uint32_t sp_priority_min;
> +
This can simply be sp_priority_level, with 0 indicating no SP support,
1 indicating priorities '0' and '1', and 7 indicating priorities '0' to
'7', i.e. a total of 8 priorities.

> +		/**< Maximum number of sibling nodes that can have the same
> +		 * priority at any given time. When equal to *n_children_max*,
> +		 * it indicates that WFQ/WRR algorithms are not supported.
> +		 */
> +		uint32_t sp_n_children_max;
Not clear to me.
OK, more than one child can have the same priority, and then you apply
WRR/WFQ among them.

However, there can be different sets, e.g. priorities '0' and '1' have
only one child each, while priority '2' has 6 children among which you
apply WRR/WFQ; a single per-node maximum does not describe that well.

> +
> +		/**< WFQ algorithm support. */
> +		int scheduling_wfq_supported;
> +
> +		/**< WRR algorithm support. */
> +		int scheduling_wrr_supported;
> +
> +		/**< Maximum WFQ/WRR weight. */
> +		uint32_t scheduling_wfq_wrr_weight_max;
> +	} nonleaf;
> +
> +	/**< Items valid only for leaf nodes. */
> +	struct {
> +		/**< Head drop algorithm support. */
> +		int cman_head_drop_supported;
> +
> +		/**< Private WRED context support. */
> +		int cman_wred_context_private_supported;
> +

The context part is not clear to me.

> +		/**< Maximum number of shared WRED contexts supported. The value
> +		 * of zero indicates that shared WRED contexts are not
> +		 * supported.
> +		 */
> +		uint32_t cman_wred_context_shared_n_max;
> +	} leaf;

Non-leaf nodes may have different capabilities.

Your leaf node is like a QoS queue; are you supporting a shaper on the
leaf node as well?


I would still prefer if you separate the QoS queue from a standard sched
node; the capabilities are different and it will be cleaner, at the cost
of more structures and APIs.

> +};
> +
> +/**
> +  * Scheduler capabilities
> +  */
> +struct rte_scheddev_capabilities {
> +	/**< Maximum number of nodes. */
> +	uint32_t n_nodes_max;
> +
> +	/**< Maximum number of levels (i.e. number of nodes connecting the root
> +	 * node with any leaf node, including the root and the leaf).
> +	 */
> +	uint32_t n_levels_max;
> +
> +	/**< Maximum number of shapers, either private or shared. In case the
> +	 * implementation does not share any resource between private and
> +	 * shared shapers, it is typically equal to the sum between
> +	 * *shaper_private_n_max* and *shaper_shared_n_max*.
> +	 */
> +	uint32_t shaper_n_max;
> +
> +	/**< Maximum number of private shapers. Indicates the maximum number of
> +	 * nodes that can concurrently have the private shaper enabled.
> +	 */
> +	uint32_t shaper_private_n_max;
> +
> +	/**< Maximum number of shared shapers. The value of zero indicates that
> +	  * shared shapers are not supported.
> +	  */
> +	uint32_t shaper_shared_n_max;
> +
> +	/**< Maximum number of nodes that can share the same shared shaper. Only
> +	  * valid when shared shapers are supported.
> +	  */
> +	uint32_t shaper_shared_n_nodes_max;
> +
> +	/**< Maximum number of shared shapers that can be configured with dual
> +	  * rate shaping. The value of zero indicates that dual rate shaping
> +	  * support is not available for shared shapers.
> +	  */
> +	uint32_t shaper_shared_dual_rate_n_max;
> +
> +	/**< Minimum committed/peak rate (bytes per second) for shared
> +	  * shapers. Only valid when shared shapers are supported.
> +	  */
> +	uint64_t shaper_shared_rate_min;
> +
> +	/**< Maximum committed/peak rate (bytes per second) for shared
> +	  * shaper. Only valid when shared shapers are supported.
> +	  */
> +	uint64_t shaper_shared_rate_max;
> +
> +	/**< Minimum value allowed for packet length adjustment for
> +	  * private/shared shapers.
> +	  */
> +	int shaper_pkt_length_adjust_min;
> +
> +	/**< Maximum value allowed for packet length adjustment for
> +	  * private/shared shapers.
> +	  */
> +	int shaper_pkt_length_adjust_max;
> +
> +	/**< Maximum number of WRED contexts. */
> +	uint32_t cman_wred_context_n_max;
> +
> +	/**< Maximum number of private WRED contexts. Indicates the maximum
> +	  * number of leaf nodes that can concurrently have the private WRED
> +	  * context enabled.
> +	  */
> +	uint32_t cman_wred_context_private_n_max;
> +
> +	/**< Maximum number of shared WRED contexts. The value of zero indicates
> +	  * that shared WRED contexts are not supported.
> +	  */
> +	uint32_t cman_wred_context_shared_n_max;
> +
> +	/**< Maximum number of leaf nodes that can share the same WRED context.
> +	  * Only valid when shared WRED contexts are supported.
> +	  */
> +	uint32_t cman_wred_context_shared_n_nodes_max;
> +
> +	/**< Support for VLAN DEI packet marking. */
> +	int mark_vlan_dei_supported;
> +
> +	/**< Support for IPv4/IPv6 ECN marking of TCP packets. */
> +	int mark_ip_ecn_tcp_supported;
> +
> +	/**< Support for IPv4/IPv6 ECN marking of SCTP packets. */
> +	int mark_ip_ecn_sctp_supported;
> +
> +	/**< Support for IPv4/IPv6 DSCP packet marking. */
> +	int mark_ip_dscp_supported;
> +
> +	/**< Summary of node-level capabilities across all nodes. */
> +	struct rte_scheddev_node_capabilities node;

This should be an array with one entry per level supported in the
system: a non-leaf node at level 2 can have different capabilities than
a node at level 3.

> +};
> +
> +/**
> +  * Congestion management (CMAN) mode
> +  *
> +  * This is used for controlling the admission of packets into a packet queue or
> +  * group of packet queues on congestion. On request of writing a new packet
> +  * into the current queue while the queue is full, the *tail drop* algorithm
> +  * drops the new packet while leaving the queue unmodified, as opposed to *head
> +  * drop* algorithm, which drops the packet at the head of the queue (the oldest
> +  * packet waiting in the queue) and admits the new packet at the tail of the
> +  * queue.
> +  *
> +  * The *Random Early Detection (RED)* algorithm works by proactively dropping
> +  * more and more input packets as the queue occupancy builds up. When the queue
> +  * is full or almost full, RED effectively works as *tail drop*. The *Weighted
> +  * RED* algorithm uses a separate set of RED thresholds for each packet color.
> +  */
> +enum rte_scheddev_cman_mode {
> +	RTE_SCHEDDEV_CMAN_TAIL_DROP = 0, /**< Tail drop */
> +	RTE_SCHEDDEV_CMAN_HEAD_DROP, /**< Head drop */
> +	RTE_SCHEDDEV_CMAN_WRED, /**< Weighted Random Early Detection (WRED) */
> +};
> +
> +/**
> +  * Color
> +  */
> +enum rte_scheddev_color {
> +	e_RTE_SCHEDDEV_GREEN = 0, /**< Green */
> +	e_RTE_SCHEDDEV_YELLOW,    /**< Yellow */
> +	e_RTE_SCHEDDEV_RED,       /**< Red */
> +	e_RTE_SCHEDDEV_COLORS     /**< Number of colors */
> +};
> +
> +/**
> +  * WRED profile
> +  */
> +struct rte_scheddev_wred_params {
> +	/**< One set of RED parameters per packet color */
> +	struct rte_red_params red_params[e_RTE_SCHEDDEV_COLORS];
> +};
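
To make this concrete, a possible per-color initialization (a sketch:
the field names are those of struct rte_red_params from rte_red.h, the
threshold values are only illustrative):

/* Sketch only; thresholds get progressively tighter per color. */
struct rte_scheddev_wred_params wred = {
    .red_params = {
        [e_RTE_SCHEDDEV_GREEN]  = { .min_th = 48, .max_th = 64,
                                    .maxp_inv = 10, .wq_log2 = 9 },
        [e_RTE_SCHEDDEV_YELLOW] = { .min_th = 32, .max_th = 48,
                                    .maxp_inv = 10, .wq_log2 = 9 },
        [e_RTE_SCHEDDEV_RED]    = { .min_th = 16, .max_th = 32,
                                    .maxp_inv = 10, .wq_log2 = 9 },
    },
};
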
> +
> +/**
> +  * Token bucket
> +  */
> +struct rte_scheddev_token_bucket {
> +	/**< Token bucket rate (bytes per second) */
> +	uint64_t rate;
> +
> +	/**< Token bucket size (bytes), a.k.a. max burst size */
> +	uint64_t size;
> +};
> +
> +/**
> +  * Shaper (rate limiter) profile
> +  *
> +  * Multiple shaper instances can share the same shaper profile. Each node has
> +  * zero or one private shaper (only one node using it) and/or zero, one or
> +  * several shared shapers (multiple nodes use the same shaper instance).
> +  *
> +  * Single rate shapers use a single token bucket. A single rate shaper can be
> +  * configured by setting the rate of the committed bucket to zero, which
> +  * effectively disables this bucket. The peak bucket is used to limit the rate
> +  * and the burst size for the current shaper.
> +  *
> +  * Dual rate shapers use both the committed and the peak token buckets. The
> +  * rate of the committed bucket has to be less than or equal to the rate of the
> +  * peak bucket.
> +  */
> +struct rte_scheddev_shaper_params {
> +	/**< Committed token bucket */
> +	struct rte_scheddev_token_bucket committed;
> +
> +	/**< Peak token bucket */
> +	struct rte_scheddev_token_bucket peak;
> +
> +	/**< Signed value to be added to the length of each packet for the
> +	 * purpose of shaping. Can be used to correct the packet length with
> +	 * the framing overhead bytes that are also consumed on the wire (e.g.
> +	 * RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD_FCS).
> +	 */
> +	int32_t pkt_length_adjust;
> +};
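
To make the single-rate convention above concrete, a possible
initialization (sketch; values illustrative): the committed rate of 0
disables that bucket, so only the peak bucket limits rate and burst.

/* Sketch: single-rate shaper at 10 Mbps (1250000 bytes/s), 4 KB burst. */
struct rte_scheddev_shaper_params single_rate = {
    .committed = { .rate = 0, .size = 0 }, /* disabled per the rule above */
    .peak = { .rate = 1250000, .size = 4096 },
    .pkt_length_adjust = RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD_FCS,
};
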
> +
> +/**
> +  * Node parameters
> +  *
> +  * Each scheduler hierarchy node has multiple inputs (children nodes of the
> +  * current parent node) and a single output (which is input to its parent
> +  * node). The current node arbitrates its inputs using Strict Priority (SP),
> +  * Weighted Fair Queuing (WFQ) and Weighted Round Robin (WRR) algorithms to
> +  * schedule input packets on its output while observing its shaping (rate
> +  * limiting) constraints.
> +  *
> +  * Algorithms such as byte-level WRR, Deficit WRR (DWRR), etc are considered
> +  * approximations of the ideal of WFQ and are assimilated to WFQ, although
> +  * an associated implementation-dependent trade-off on accuracy, performance
> +  * and resource usage might exist.
> +  *
> +  * Children nodes with different priorities are scheduled using the SP
> +  * algorithm, based on their priority, with zero (0) as the highest priority.
> +  * Children with same priority are scheduled using the WFQ or WRR algorithm,
> +  * based on their weight, which is relative to the sum of the weights of all
> +  * siblings with same priority, with one (1) as the lowest weight.
> +  *
> +  * Each leaf node sits on on top of a TX queue of the current Ethernet port.
> +  * Therefore, the leaf nodes are predefined with the node IDs of 0 .. (N-1),
> +  * where N is the number of TX queues configured for the current Ethernet port.
> +  * The non-leaf nodes have their IDs generated by the application.
> +  */


OK, that means IDs 0 to (N-1) are reserved for leaf nodes. Will the
application choose any value for non-leaf nodes?
What will be the parent node id for the root node?

> +struct rte_scheddev_node_params {
> +	/**< Shaper profile for the private shaper. The absence of the private
> +	 * shaper for the current node is indicated by setting this parameter
> +	 * to RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE.
> +	 */
> +	uint32_t shaper_profile_id;
> +
> +	/**< User allocated array of valid shared shaper IDs. */
> +	uint32_t *shared_shaper_id;
> +
> +	/**< Number of shared shaper IDs in the *shared_shaper_id* array. */
> +	uint32_t n_shared_shapers;
> +
> +	union {
> +		/**< Parameters only valid for non-leaf nodes. */
> +		struct {
> +			/**< For each priority, indicates whether the children
> +			 * nodes sharing the same priority are to be scheduled
> +			 * by WFQ or by WRR. When NULL, it indicates that WFQ
> +			 * is to be used for all priorities. When non-NULL, it
> +			 * points to a pre-allocated array of *n_priority*
> +			 * elements, with a non-zero value element indicating
> +			 * WFQ and a zero value element for WRR.
> +			 */
> +			int *scheduling_mode_per_priority;

What is the structure of the element this pointer refers to? Just a bool array?

> +
> +			/**< Number of priorities. */
> +			uint32_t n_priorities;
> +		} nonleaf;
> +
> +		/**< Parameters only valid for leaf nodes. */
> +		struct {
> +			/**< Congestion management mode */
> +			enum rte_scheddev_cman_mode cman;
> +
> +			/**< WRED parameters (valid when *cman* is WRED). */
> +			struct {
> +				/**< WRED profile for private WRED context. */
> +				uint32_t wred_profile_id;
> +
> +				/**< User allocated array of shared WRED context
> +				 * IDs. The absence of a private WRED context
> +				 * for current leaf node is indicated by value
> +				 * RTE_SCHEDDEV_WRED_PROFILE_ID_NONE.
> +				 */
> +				uint32_t *shared_wred_context_id;
> +
> +				/**< Number of shared WRED context IDs in the
> +				 * *shared_wred_context_id* array.
> +				 */
> +				uint32_t n_shared_wred_contexts;
> +			} wred;
> +		} leaf;

A bool is_leaf is needed here to differentiate between leaf and non-leaf nodes.

> +	};
> +};
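
For illustration, a sketch (assuming the structure as posted) of leaf node
parameters using tail drop and one shared shaper:

/* Sketch only: leaf node with no private shaper, one shared shaper
 * (ID 5) and tail drop as congestion management.
 */
uint32_t shared_shapers[] = { 5 };
struct rte_scheddev_node_params np = {
	.shaper_profile_id = RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE,
	.shared_shaper_id = shared_shapers,
	.n_shared_shapers = 1,
	.leaf = {
		.cman = RTE_SCHEDDEV_CMAN_TAIL_DROP,
	},
};

Note that the PMD can only tell leaf from non-leaf by the node ID range
(0 .. N-1 for leaves), which is why the is_leaf flag suggested above would
make the union self-describing.
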
> +
> +/**
> +  * Node statistics counter type
> +  */
> +enum rte_scheddev_stats_counter {
> +	/**< Number of packets scheduled from current node. */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_PKTS = 1 << 0,
> +
> +	/**< Number of bytes scheduled from current node. */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_BYTES = 1 << 1,
> +
> +	/**< Number of packets dropped by current node.  */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_PKTS_DROPPED = 1 << 2,
> +
> +	/**< Number of bytes dropped by current node.  */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_BYTES_DROPPED = 1 << 3,
> +
> +	/**< Number of packets currently waiting in the packet queue of current
> +	 * leaf node.
> +	 */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_PKTS_QUEUED = 1 << 4,
> +
> +	/**< Number of bytes currently waiting in the packet queue of current
> +	 * leaf node.
> +	 */
> +	RTE_SCHEDDEV_STATS_COUNTER_N_BYTES_QUEUED = 1 << 5,
> +};
> +
> +/**
> +  * Node statistics counters
> +  */
> +struct rte_scheddev_node_stats {
> +	/**< Number of packets scheduled from current node. */
> +	uint64_t n_pkts;
> +
> +	/**< Number of bytes scheduled from current node. */
> +	uint64_t n_bytes;
> +
> +	/**< Statistics counters for leaf nodes only. */
> +	struct {
> +		/**< Number of packets dropped by current leaf node. */
> +		uint64_t n_pkts_dropped;
> +
> +		/**< Number of bytes dropped by current leaf node. */
> +		uint64_t n_bytes_dropped;
> +
> +		/**< Number of packets currently waiting in the packet queue of
> +		 * current leaf node.
> +		 */
> +		uint64_t n_pkts_queued;
> +
> +		/**< Number of bytes currently waiting in the packet queue of
> +		 * current leaf node.
> +		 */
> +		uint64_t n_bytes_queued;
> +	} leaf;
> +};
> +
> +/**
> + * Verbose error types.
> + *
> + * Most of them provide the type of the object referenced by struct
> + * rte_scheddev_error::cause.
> + */
> +enum rte_scheddev_error_type {
> +	RTE_SCHEDDEV_ERROR_TYPE_NONE, /**< No error. */
> +	RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
> +	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE,
> +	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_GREEN,
> +	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_YELLOW,
> +	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_RED,
> +	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_SHARED_WRED_CONTEXT_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_SHAPER_PROFILE,
> +	RTE_SCHEDDEV_ERROR_TYPE_SHARED_SHAPER_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_PARENT_NODE_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_PRIORITY,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_WEIGHT,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SCHEDULING_MODE,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SHARED_SHAPER_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF_CMAN,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF_WRED_PROFILE_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF_SHARED_WRED_CONTEXT_ID,
> +	RTE_SCHEDDEV_ERROR_TYPE_NODE_ID,
> +};
> +
> +/**
> + * Verbose error structure definition.
> + *
> + * This object is normally allocated by applications and set by PMDs; the
> + * message points to a constant string which does not need to be freed by
> + * the application. However, its pointer can be considered valid only as long
> + * as its associated DPDK port remains configured. Closing the underlying
> + * device or unloading the PMD invalidates it.
> + *
> + * Both cause and message may be NULL regardless of the error type.
> + */
> +struct rte_scheddev_error {
> +	enum rte_scheddev_error_type type; /**< Cause field and error type. */
> +	const void *cause; /**< Object responsible for the error. */
> +	const char *message; /**< Human-readable error message. */
> +};
> +
> +/**
> + * Scheduler capabilities get
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param cap
> + *   Scheduler capabilities. Needs to be pre-allocated and valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_capabilities_get(uint8_t port_id,
> +	struct rte_scheddev_capabilities *cap,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node capabilities get
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param cap
> + *   Scheduler node capabilities. Needs to be pre-allocated and valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_capabilities_get(uint8_t port_id,
> +	uint32_t node_id,
> +	struct rte_scheddev_node_capabilities *cap,
> +	struct rte_scheddev_error *error);
> +

Node capabilities are already part of scheddev_capabilities?

What do you expect to be different here? Unless you support different
capabilities for each level, this may not be useful.

> +/**
> + * Scheduler WRED profile add
> + *
> + * Create a new WRED profile with ID set to *wred_profile_id*. The new profile
> + * is used to create one or several WRED contexts.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param wred_profile_id
> + *   WRED profile ID for the new profile. Needs to be unused.
> + * @param profile
> + *   WRED profile parameters. Needs to be pre-allocated and valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_wred_profile_add(uint8_t port_id,
> +	uint32_t wred_profile_id,
> +	struct rte_scheddev_wred_params *profile,
> +	struct rte_scheddev_error *error);
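
As a usage sketch (my example; the rte_red_params field names are assumed
to follow the existing librte_sched RED implementation):

/* Sketch only: drop onset moved earlier for yellow and red packets. */
struct rte_scheddev_wred_params wp = {
	.red_params = {
		[e_RTE_SCHEDDEV_GREEN] = {
			.min_th = 48, .max_th = 64, .maxp_inv = 10,
			.wq_log2 = 9 },
		[e_RTE_SCHEDDEV_YELLOW] = {
			.min_th = 40, .max_th = 64, .maxp_inv = 10,
			.wq_log2 = 9 },
		[e_RTE_SCHEDDEV_RED] = {
			.min_th = 32, .max_th = 64, .maxp_inv = 10,
			.wq_log2 = 9 },
	},
};
int ret = rte_scheddev_wred_profile_add(port_id, 0, &wp, &error);
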
> +
> +/**
> + * Scheduler WRED profile delete
> + *
> + * Delete an existing WRED profile. This operation fails when there is currently
> + * at least one user (i.e. WRED context) of this WRED profile.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param wred_profile_id
> + *   WRED profile ID. Needs to be valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_wred_profile_delete(uint8_t port_id,
> +	uint32_t wred_profile_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shared WRED context add or update
> + *
> + * When *shared_wred_context_id* is invalid, a new WRED context with this ID is
> + * created by using the WRED profile identified by *wred_profile_id*.
> + *
> + * When *shared_wred_context_id* is valid, this WRED context is no longer using
> + * the profile previously assigned to it and is updated to use the profile
> + * identified by *wred_profile_id*.
> + *
> + * A valid shared WRED context can be assigned to several scheduler hierarchy
> + * leaf nodes configured to use WRED as the congestion management mode.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shared_wred_context_id
> + *   Shared WRED context ID
> + * @param wred_profile_id
> + *   WRED profile ID. Needs to be valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shared_wred_context_add_update(uint8_t port_id,
> +	uint32_t shared_wred_context_id,
> +	uint32_t wred_profile_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shared WRED context delete
> + *
> + * Delete an existing shared WRED context. This operation fails when there is
> + * currently at least one user (i.e. scheduler hierarchy leaf node) of this
> + * shared WRED context.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shared_wred_context_id
> + *   Shared WRED context ID. Needs to be valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shared_wred_context_delete(uint8_t port_id,
> +	uint32_t shared_wred_context_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shaper profile add
> + *
> + * Create a new shaper profile with ID set to *shaper_profile_id*. The new
> + * shaper profile is used to create one or several shapers.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shaper_profile_id
> + *   Shaper profile ID for the new profile. Needs to be unused.
> + * @param profile
> + *   Shaper profile parameters. Needs to be pre-allocated and valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shaper_profile_add(uint8_t port_id,
> +	uint32_t shaper_profile_id,
> +	struct rte_scheddev_shaper_params *profile,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shaper profile delete
> + *
> + * Delete an existing shaper profile. This operation fails when there is
> + * currently at least one user (i.e. shaper) of this shaper profile.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shaper_profile_id
> + *   Shaper profile ID. Needs to be valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shaper_profile_delete(uint8_t port_id,
> +	uint32_t shaper_profile_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shared shaper add or update
> + *
> + * When *shared_shaper_id* is not a valid shared shaper ID, a new shared shaper
> + * with this ID is created using the shaper profile identified by
> + * *shaper_profile_id*.
> + *
> + * When *shared_shaper_id* is a valid shared shaper ID, this shared shaper is no
> + * longer using the shaper profile previously assigned to it and is updated to
> + * use the shaper profile identified by *shaper_profile_id*.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shared_shaper_id
> + *   Shared shaper ID
> + * @param shaper_profile_id
> + *   Shaper profile ID. Needs to be valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shared_shaper_add_update(uint8_t port_id,
> +	uint32_t shared_shaper_id,
> +	uint32_t shaper_profile_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler shared shaper delete
> + *
> + * Delete an existing shared shaper. This operation fails when there is
> + * currently at least one user (i.e. scheduler hierarchy node) of this shared
> + * shaper.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param shared_shaper_id
> + *   Shared shaper ID. Needs to be valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_shared_shaper_delete(uint8_t port_id,
> +	uint32_t shared_shaper_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node add
> + *
> + * When *node_id* is not a valid node ID, a new node with this ID is created and
> + * connected as child to the existing node identified by *parent_node_id*.
> + *
> + * When *node_id* is a valid node ID, this node is disconnected from its current
> + * parent and connected as child to another existing node identified by
> + * *parent_node_id*.
> + *
> + * This function can be called during port initialization phase (before the
> + * Ethernet port is started) for building the scheduler start-up hierarchy.
> + * Subject to the specific Ethernet port supporting on-the-fly scheduler
> + * hierarchy updates, this function can also be called during run-time (after
> + * the Ethernet port is started).

This should be a capability, indicating whether dynamic_hierarchy_updates
are supported or not.

> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID
> + * @param parent_node_id
> + *   Parent node ID. Needs to be valid.

What will be the parent node ID for the root node? How is the root node
created on the Ethernet port?

> + * @param priority
> + *   Node priority. The highest node priority is zero. Used by the SP algorithm
> + *   running on the parent of the current node for scheduling this child node.
> + * @param weight
> + *   Node weight. The node weight is relative to the weight sum of all siblings
> + *   that have the same priority. The lowest weight is one. Used by the WFQ/WRR
> + *   algorithm running on the parent of the current node for scheduling this
> + *   child node.
> + * @param params
> + *   Node parameters. Needs to be pre-allocated and valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_add(uint8_t port_id,
> +	uint32_t node_id,
> +	uint32_t parent_node_id,
> +	uint32_t priority,
> +	uint32_t weight,
> +	struct rte_scheddev_node_params *params,
> +	struct rte_scheddev_error *error);
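
A sketch of building a two-level start-up hierarchy with this call (the
node parameter structures are assumed filled in; ROOT_PARENT_ID is a
placeholder of mine, since, as asked above, the patch does not say what
parent ID the root node takes):

#define ROOT_NODE_ID   100          /* application-picked non-leaf ID */
#define ROOT_PARENT_ID UINT32_MAX   /* assumption, undefined in patch */

uint32_t i;

/* Root first, then the four leaves (IDs 0..3 = TX queues 0..3),
 * all at priority 0 with equal WFQ weights.
 */
rte_scheddev_node_add(port_id, ROOT_NODE_ID, ROOT_PARENT_ID, 0, 1,
		&nonleaf_params, &error);
for (i = 0; i < 4; i++)
	rte_scheddev_node_add(port_id, i, ROOT_NODE_ID, 0, 1,
			&leaf_params, &error);
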
> +
> +/**
> + * Scheduler node delete
> + *
> + * Delete an existing node. This operation fails when this node currently has at
> + * least one user (i.e. child node).
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_delete(uint8_t port_id,
> +	uint32_t node_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node suspend
> + *
> + * Suspend an existing node.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_suspend(uint8_t port_id,
> +	uint32_t node_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node resume
> + *
> + * Resume an existing node that was previously suspended.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_resume(uint8_t port_id,
> +	uint32_t node_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler hierarchy set
> + *
> + * This function is called during the port initialization phase (before the
> + * Ethernet port is started) to freeze the scheduler start-up hierarchy.
> + *
> + * This function fails when the currently configured scheduler hierarchy is not
> + * supported by the Ethernet port, in which case the user can abort or try out
> + * another hierarchy configuration (e.g. a hierarchy with less leaf nodes),
> + * which can be built from scratch (when *clear_on_fail* is enabled) or by
> + * modifying the existing hierarchy configuration (when *clear_on_fail* is
> + * disabled).
> + *
> + * Note that, even when the configured scheduler hierarchy is supported (so this
> + * function is successful), the Ethernet port start might still fail due to e.g.
> + * not enough memory being available in the system, etc.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param clear_on_fail
> + *   On function call failure, hierarchy is cleared when this parameter is
> + *   non-zero and preserved when this parameter is equal to zero.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_hierarchy_set(uint8_t port_id,
> +	int clear_on_fail,
> +	struct rte_scheddev_error *error);
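
The clear_on_fail semantics suggest an init-time pattern like this sketch:

/* Sketch only: with clear_on_fail = 0 a rejected hierarchy is kept,
 * so the application can trim it (e.g. delete leaf nodes) and retry.
 */
if (rte_scheddev_hierarchy_set(port_id, 0, &error) != 0) {
	printf("hierarchy rejected: %s\n",
		error.message != NULL ? error.message : "(no message)");
	/* modify the hierarchy here and call hierarchy_set again */
}
rte_eth_dev_start(port_id);
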
> +
> +/**
> + * Scheduler node parent update
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param parent_node_id
> + *   Node ID for the new parent. Needs to be valid.
> + * @param priority
> + *   Node priority. The highest node priority is zero. Used by the SP algorithm
> + *   running on the parent of the current node for scheduling this child node.
> + * @param weight
> + *   Node weight. The node weight is relative to the weight sum of all siblings
> + *   that have the same priority. The lowest weight is one. Used by the WFQ/WRR
> + *   algorithm running on the parent of the current node for scheduling this
> + *   child node.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_parent_update(uint8_t port_id,
> +	uint32_t node_id,
> +	uint32_t parent_node_id,
> +	uint32_t priority,
> +	uint32_t weight,
> +	struct rte_scheddev_error *error);
> +

The usage is not clear. How is it different from the node_add API? Is
the intention to update a specific node, or to change the connection of
a specific node to an existing or new parent?


> +/**
> + * Scheduler node private shaper update
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param shaper_profile_id
> + *   Shaper profile ID for the private shaper of the current node. Needs to be
> + *   either valid shaper profile ID or RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE, with
> + *   the latter disabling the private shaper of the current node.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_shaper_update(uint8_t port_id,
> +	uint32_t node_id,
> +	uint32_t shaper_profile_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node shared shapers update
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param shared_shaper_id
> + *   Shared shaper ID. Needs to be valid.
> + * @param add
> + *   Set to non-zero value to add this shared shaper to current node or to zero
> + *   to delete this shared shaper from current node.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_shared_shaper_update(uint8_t port_id,
> +	uint32_t node_id,
> +	uint32_t shared_shaper_id,
> +	int add,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node scheduling mode update
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid non-leaf node ID.
> + * @param scheduling_mode_per_priority
> + *   For each priority, indicates whether the children nodes sharing the same
> + *   priority are to be scheduled by WFQ or by WRR. When NULL, it indicates that
> + *   WFQ is to be used for all priorities. When non-NULL, it points to a
> + *   pre-allocated array of *n_priority* elements, with a non-zero value element
> + *   indicating WFQ and a zero value element for WRR.
> + * @param n_priorities
> + *   Number of priorities.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_scheduling_mode_update(uint8_t port_id,
> +	uint32_t node_id,
> +	int *scheduling_mode_per_priority,
> +	uint32_t n_priorities,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node congestion management mode update
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid leaf node ID.
> + * @param cman
> + *   Congestion management mode.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_cman_update(uint8_t port_id,
> +	uint32_t node_id,
> +	enum rte_scheddev_cman_mode cman,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node private WRED context update
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid leaf node ID.
> + * @param wred_profile_id
> + *   WRED profile ID for the private WRED context of the current node. Needs to
> + *   be either valid WRED profile ID or RTE_SCHEDDEV_WRED_PROFILE_ID_NONE, with
> + *   the latter disabling the private WRED context of the current node.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_wred_context_update(uint8_t port_id,
> +	uint32_t node_id,
> +	uint32_t wred_profile_id,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node shared WRED context update
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid leaf node ID.
> + * @param shared_wred_context_id
> + *   Shared WRED context ID. Needs to be valid.
> + * @param add
> + *   Set to non-zero value to add this shared WRED context to current node or to
> + *   zero to delete this shared WRED context from current node.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_shared_wred_context_update(uint8_t port_id,
> +	uint32_t node_id,
> +	uint32_t shared_wred_context_id,
> +	int add,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler packet marking - VLAN DEI (IEEE 802.1Q)
> + *
> + * IEEE 802.1p maps the traffic class to the VLAN Priority Code Point (PCP)
> + * field (3 bits), while IEEE 802.1q maps the drop priority to the VLAN Drop
> + * Eligible Indicator (DEI) field (1 bit), which was previously named Canonical
> + * Format Indicator (CFI).
> + *
> + * All VLAN frames of a given color get their DEI bit set if marking is enabled
> + * for this color; otherwise, their DEI bit is left as is (either set or not).
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param mark_green
> + *   Set to non-zero value to enable marking of green packets and to zero to
> + *   disable it.
> + * @param mark_yellow
> + *   Set to non-zero value to enable marking of yellow packets and to zero to
> + *   disable it.
> + * @param mark_red
> + *   Set to non-zero value to enable marking of red packets and to zero to
> + *   disable it.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_mark_vlan_dei(uint8_t port_id,
> +	int mark_green,
> +	int mark_yellow,
> +	int mark_red,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler packet marking - IPv4 / IPv6 ECN (IETF RFC 3168)
> + *
> + * IETF RFCs 2474 and 3168 reorganize the IPv4 Type of Service (TOS) field
> + * (8 bits) and the IPv6 Traffic Class (TC) field (8 bits) into Differentiated
> + * Services Codepoint (DSCP) field (6 bits) and Explicit Congestion Notification
> + * (ECN) field (2 bits). The DSCP field is typically used to encode the traffic
> + * class and/or drop priority (RFC 2597), while the ECN field is used by RFC
> + * 3168 to implement a congestion notification mechanism to be leveraged by
> + * transport layer protocols such as TCP and SCTP that have congestion control
> + * mechanisms.
> + *
> + * When congestion is experienced, as alternative to dropping the packet,
> + * routers can change the ECN field of input packets from 2'b01 or 2'b10 (values
> + * indicating that source endpoint is ECN-capable) to 2'b11 (meaning that
> + * congestion is experienced). The destination endpoint can use the ECN-Echo
> + * (ECE) TCP flag to relay the congestion indication back to the source
> + * endpoint, which acknowledges it back to the destination endpoint with the
> + * Congestion Window Reduced (CWR) TCP flag.
> + *
> + * All IPv4/IPv6 packets of a given color with ECN set to 2'b01 or 2'b10
> + * carrying TCP or SCTP have their ECN set to 2'b11 if the marking feature is
> + * enabled for the current color, otherwise the ECN field is left as is.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param mark_green
> + *   Set to non-zero value to enable marking of green packets and to zero to
> + *   disable it.
> + * @param mark_yellow
> + *   Set to non-zero value to enable marking of yellow packets and to zero to
> + *   disable it.
> + * @param mark_red
> + *   Set to non-zero value to enable marking of red packets and to zero to
> + *   disable it.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_mark_ip_ecn(uint8_t port_id,
> +	int mark_green,
> +	int mark_yellow,
> +	int mark_red,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler packet marking - IPv4 / IPv6 DSCP (IETF RFC 2597)
> + *
> + * IETF RFC 2597 maps the traffic class and the drop priority to the IPv4/IPv6
> + * Differentiated Services Codepoint (DSCP) field (6 bits). Here are the DSCP
> + * values proposed by this RFC:
> + *
> + *                       Class 1    Class 2    Class 3    Class 4
> + *                     +----------+----------+----------+----------+
> + *    Low Drop Prec    |  001010  |  010010  |  011010  |  100010  |
> + *    Medium Drop Prec |  001100  |  010100  |  011100  |  100100  |
> + *    High Drop Prec   |  001110  |  010110  |  011110  |  100110  |
> + *                     +----------+----------+----------+----------+
> + *
> + * There are 4 traffic classes (classes 1 .. 4) encoded by DSCP bits 1 and 2, as
> + * well as 3 drop priorities (low/medium/high) encoded by DSCP bits 3 and 4.
> + *
> + * All IPv4/IPv6 packets have their color marked into DSCP bits 3 and 4 as
> + * follows: green mapped to Low Drop Precedence (2'b01), yellow to Medium
> + * (2'b10) and red to High (2'b11). Marking needs to be explicitly enabled
> + * for each color; when not enabled for a given color, the DSCP field of all
> + * packets with that color is left as is.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param mark_green
> + *   Set to non-zero value to enable marking of green packets and to zero to
> + *   disable it.
> + * @param mark_yellow
> + *   Set to non-zero value to enable marking of yellow packets and to zero to
> + *   disable it.
> + * @param mark_red
> + *   Set to non-zero value to enable marking of red packets and to zero to
> + *   disable it.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_mark_ip_dscp(uint8_t port_id,
> +	int mark_green,
> +	int mark_yellow,
> +	int mark_red,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler get statistics counter types enabled for all nodes
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param nonleaf_node_capability_stats_mask
> + *   Statistics counter types available per node for all non-leaf nodes. Needs
> + *   to be pre-allocated.
> + * @param nonleaf_node_enabled_stats_mask
> + *   Statistics counter types currently enabled per node for each non-leaf node.
> + *   This is a subset of *nonleaf_node_capability_stats_mask*. Needs to be
> + *   pre-allocated.
> + * @param leaf_node_capability_stats_mask
> + *   Statistics counter types available per node for all leaf nodes. Needs to
> + *   be pre-allocated.
> + * @param leaf_node_enabled_stats_mask
> + *   Statistics counter types currently enabled for each leaf node. This is
> + *   a subset of *leaf_node_capability_stats_mask*. Needs to be pre-allocated.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_stats_get_enabled(uint8_t port_id,
> +	uint64_t *nonleaf_node_capability_stats_mask,
> +	uint64_t *nonleaf_node_enabled_stats_mask,
> +	uint64_t *leaf_node_capability_stats_mask,
> +	uint64_t *leaf_node_enabled_stats_mask,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler enable selected statistics counters for all nodes
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param nonleaf_node_enabled_stats_mask
> + *   Statistics counter types to be enabled per node for each non-leaf node.
> + *   This needs to be a subset of the statistics counter types available per
> + *   node for all non-leaf nodes. Any statistics counter type not included in
> + *   this set is to be disabled for all non-leaf nodes.
> + * @param leaf_node_enabled_stats_mask
> + *   Statistics counter types to be enabled per node for each leaf node. This
> + *   needs to be a subset of the statistics counter types available per node for
> + *   all leaf nodes. Any statistics counter type not included in this set is to
> + *   be disabled for all leaf nodes.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_stats_enable(uint8_t port_id,
> +	uint64_t nonleaf_node_enabled_stats_mask,
> +	uint64_t leaf_node_enabled_stats_mask,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler get statistics counter types enabled for current node
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param capability_stats_mask
> + *   Statistics counter types available for the current node. Needs to be
> + *   pre-allocated.
> + * @param enabled_stats_mask
> + *   Statistics counter types currently enabled for the current node. This is
> + *   a subset of *capability_stats_mask*. Needs to be pre-allocated.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_stats_get_enabled(uint8_t port_id,
> +	uint32_t node_id,
> +	uint64_t *capability_stats_mask,
> +	uint64_t *enabled_stats_mask,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler enable selected statistics counters for current node
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param enabled_stats_mask
> + *   Statistics counter types to be enabled for the current node. This needs to
> + *   be a subset of the statistics counter types available for the current node.
> + *   Any statistics counter type not included in this set is to be disabled for
> + *   the current node.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_stats_enable(uint8_t port_id,
> +	uint32_t node_id,
> +	uint64_t enabled_stats_mask,
> +	struct rte_scheddev_error *error);
> +
> +/**
> + * Scheduler node statistics counters read
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param node_id
> + *   Node ID. Needs to be valid.
> + * @param stats
> + *   When non-NULL, it contains the current value for the statistics counters
> + *   enabled for the current node.
> + * @param clear
> + *   When this parameter has a non-zero value, the statistics counters are
> + *   cleared (i.e. set to zero) immediately after they have been read, otherwise
> + *   the statistics counters are left untouched.
> + * @param error
> + *   Error details. Filled in only on error, when not NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int rte_scheddev_node_stats_read(uint8_t port_id,
> +	uint32_t node_id,
> +	struct rte_scheddev_node_stats *stats,
> +	int clear,
> +	struct rte_scheddev_error *error);
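
Putting the per-node stats calls together, a sketch of enabling and then
periodically draining packet/byte counters on leaf node 0:

uint64_t mask = RTE_SCHEDDEV_STATS_COUNTER_N_PKTS |
		RTE_SCHEDDEV_STATS_COUNTER_N_BYTES;
struct rte_scheddev_node_stats stats;

rte_scheddev_node_stats_enable(port_id, 0, mask, &error);
/* ... traffic runs ... */
rte_scheddev_node_stats_read(port_id, 0, &stats, 1 /* clear */, &error);
printf("leaf 0: %" PRIu64 " pkts, %" PRIu64 " bytes\n",
	stats.n_pkts, stats.n_bytes);
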
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* __INCLUDE_RTE_SCHEDDEV_H__ */
> diff --git a/lib/librte_ether/rte_scheddev_driver.h b/lib/librte_ether/rte_scheddev_driver.h
> new file mode 100644
> index 0000000..c0a0321
> --- /dev/null
> +++ b/lib/librte_ether/rte_scheddev_driver.h
> @@ -0,0 +1,374 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef __INCLUDE_RTE_SCHEDDEV_DRIVER_H__
> +#define __INCLUDE_RTE_SCHEDDEV_DRIVER_H__
> +
> +/**
> + * @file
> + * RTE Generic Hierarchical Scheduler API (Driver Side)
> + *
> + * This file provides implementation helpers for internal use by PMDs, they
> + * are not intended to be exposed to applications and are not subject to ABI
> + * versioning.
> + */
> +
> +#include <stdint.h>
> +
> +#include <rte_errno.h>
> +#include "rte_ethdev.h"
> +#include "rte_scheddev.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +typedef int (*rte_scheddev_capabilities_get_t)(struct rte_eth_dev *dev,
> +	struct rte_scheddev_capabilities *cap,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler capabilities get */
> +
> +typedef int (*rte_scheddev_node_capabilities_get_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	struct rte_scheddev_node_capabilities *cap,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node capabilities get */
> +
> +typedef int (*rte_scheddev_wred_profile_add_t)(struct rte_eth_dev *dev,
> +	uint32_t wred_profile_id,
> +	struct rte_scheddev_wred_params *profile,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler WRED profile add */
> +
> +typedef int (*rte_scheddev_wred_profile_delete_t)(struct rte_eth_dev *dev,
> +	uint32_t wred_profile_id,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler WRED profile delete */
> +
> +typedef int (*rte_scheddev_shared_wred_context_add_update_t)(
> +	struct rte_eth_dev *dev,
> +	uint32_t shared_wred_context_id,
> +	uint32_t wred_profile_id,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler shared WRED context add */
> +
> +typedef int (*rte_scheddev_shared_wred_context_delete_t)(
> +	struct rte_eth_dev *dev,
> +	uint32_t shared_wred_context_id,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler shared WRED context delete */
> +
> +typedef int (*rte_scheddev_shaper_profile_add_t)(struct rte_eth_dev *dev,
> +	uint32_t shaper_profile_id,
> +	struct rte_scheddev_shaper_params *profile,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler shaper profile add */
> +
> +typedef int (*rte_scheddev_shaper_profile_delete_t)(struct rte_eth_dev *dev,
> +	uint32_t shaper_profile_id,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler shaper profile delete */
> +
> +typedef int (*rte_scheddev_shared_shaper_add_update_t)(struct rte_eth_dev *dev,
> +	uint32_t shared_shaper_id,
> +	uint32_t shaper_profile_id,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler shared shaper add/update */
> +
> +typedef int (*rte_scheddev_shared_shaper_delete_t)(struct rte_eth_dev *dev,
> +	uint32_t shared_shaper_id,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler shared shaper delete */
> +
> +typedef int (*rte_scheddev_node_add_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	uint32_t parent_node_id,
> +	uint32_t priority,
> +	uint32_t weight,
> +	struct rte_scheddev_node_params *params,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node add */
> +
> +typedef int (*rte_scheddev_node_delete_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node delete */
> +
> +typedef int (*rte_scheddev_node_suspend_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node suspend */
> +
> +typedef int (*rte_scheddev_node_resume_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node resume */
> +
> +typedef int (*rte_scheddev_hierarchy_set_t)(struct rte_eth_dev *dev,
> +	int clear_on_fail,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler hierarchy set */
> +
> +typedef int (*rte_scheddev_node_parent_update_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	uint32_t parent_node_id,
> +	uint32_t priority,
> +	uint32_t weight,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node parent update */
> +
> +typedef int (*rte_scheddev_node_shaper_update_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	uint32_t shaper_profile_id,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node shaper update */
> +
> +typedef int (*rte_scheddev_node_shared_shaper_update_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	uint32_t shared_shaper_id,
> +	int32_t add,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node shared shaper update */
> +
> +typedef int (*rte_scheddev_node_scheduling_mode_update_t)(
> +	struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	int *scheduling_mode_per_priority,
> +	uint32_t n_priorities,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node scheduling mode update */
> +
> +typedef int (*rte_scheddev_node_cman_update_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	enum rte_scheddev_cman_mode cman,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node congestion management mode update */
> +
> +typedef int (*rte_scheddev_node_wred_context_update_t)(
> +	struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	uint32_t wred_profile_id,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node WRED context update */
> +
> +typedef int (*rte_scheddev_node_shared_wred_context_update_t)(
> +	struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	uint32_t shared_wred_context_id,
> +	int add,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler node shared WRED context update */
> +
> +typedef int (*rte_scheddev_mark_vlan_dei_t)(struct rte_eth_dev *dev,
> +	int mark_green,
> +	int mark_yellow,
> +	int mark_red,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler packet marking - VLAN DEI */
> +
> +typedef int (*rte_scheddev_mark_ip_ecn_t)(struct rte_eth_dev *dev,
> +	int mark_green,
> +	int mark_yellow,
> +	int mark_red,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler packet marking - IPv4/IPv6 ECN */
> +
> +typedef int (*rte_scheddev_mark_ip_dscp_t)(struct rte_eth_dev *dev,
> +	int mark_green,
> +	int mark_yellow,
> +	int mark_red,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler packet marking - IPv4/IPv6 DSCP */
> +
> +typedef int (*rte_scheddev_stats_get_enabled_t)(struct rte_eth_dev *dev,
> +	uint64_t *nonleaf_node_capability_stats_mask,
> +	uint64_t *nonleaf_node_enabled_stats_mask,
> +	uint64_t *leaf_node_capability_stats_mask,
> +	uint64_t *leaf_node_enabled_stats_mask,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler get set of stats counters enabled for all nodes */
> +
> +typedef int (*rte_scheddev_stats_enable_t)(struct rte_eth_dev *dev,
> +	uint64_t nonleaf_node_enabled_stats_mask,
> +	uint64_t leaf_node_enabled_stats_mask,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler enable selected stats counters for all nodes */
> +
> +typedef int (*rte_scheddev_node_stats_get_enabled_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	uint64_t *capability_stats_mask,
> +	uint64_t *enabled_stats_mask,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler get set of stats counters enabled for specific node */
> +
> +typedef int (*rte_scheddev_node_stats_enable_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	uint64_t enabled_stats_mask,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler enable selected stats counters for specific node */
> +
> +typedef int (*rte_scheddev_node_stats_read_t)(struct rte_eth_dev *dev,
> +	uint32_t node_id,
> +	struct rte_scheddev_node_stats *stats,
> +	int clear,
> +	struct rte_scheddev_error *error);
> +/**< @internal Scheduler read stats counters for specific node */
> +
> +struct rte_scheddev_ops {
> +	/** Scheduler capabilities_get */
> +	rte_scheddev_capabilities_get_t capabilities_get;
> +	/** Scheduler node capabilities get */
> +	rte_scheddev_node_capabilities_get_t node_capabilities_get;
> +
> +	/** Scheduler WRED profile add */
> +	rte_scheddev_wred_profile_add_t wred_profile_add;
> +	/** Scheduler WRED profile delete */
> +	rte_scheddev_wred_profile_delete_t wred_profile_delete;
> +	/** Scheduler shared WRED context add/update */
> +	rte_scheddev_shared_wred_context_add_update_t
> +		shared_wred_context_add_update;
> +	/** Scheduler shared WRED context delete */
> +	rte_scheddev_shared_wred_context_delete_t
> +		shared_wred_context_delete;
> +	/** Scheduler shaper profile add */
> +	rte_scheddev_shaper_profile_add_t shaper_profile_add;
> +	/** Scheduler shaper profile delete */
> +	rte_scheddev_shaper_profile_delete_t shaper_profile_delete;
> +	/** Scheduler shared shaper add/update */
> +	rte_scheddev_shared_shaper_add_update_t shared_shaper_add_update;
> +	/** Scheduler shared shaper delete */
> +	rte_scheddev_shared_shaper_delete_t shared_shaper_delete;
> +
> +	/** Scheduler node add */
> +	rte_scheddev_node_add_t node_add;
> +	/** Scheduler node delete */
> +	rte_scheddev_node_delete_t node_delete;
> +	/** Scheduler node suspend */
> +	rte_scheddev_node_suspend_t node_suspend;
> +	/** Scheduler node resume */
> +	rte_scheddev_node_resume_t node_resume;
> +	/** Scheduler hierarchy set */
> +	rte_scheddev_hierarchy_set_t hierarchy_set;
> +
> +	/** Scheduler node parent update */
> +	rte_scheddev_node_parent_update_t node_parent_update;
> +	/** Scheduler node shaper update */
> +	rte_scheddev_node_shaper_update_t node_shaper_update;
> +	/** Scheduler node shared shaper update */
> +	rte_scheddev_node_shared_shaper_update_t node_shared_shaper_update;
> +	/** Scheduler node scheduling mode update */
> +	rte_scheddev_node_scheduling_mode_update_t node_scheduling_mode_update;
> +	/** Scheduler node congestion management mode update */
> +	rte_scheddev_node_cman_update_t node_cman_update;
> +	/** Scheduler node WRED context update */
> +	rte_scheddev_node_wred_context_update_t node_wred_context_update;
> +	/** Scheduler node shared WRED context update */
> +	rte_scheddev_node_shared_wred_context_update_t
> +		node_shared_wred_context_update;
> +
> +	/** Scheduler packet marking - VLAN DEI */
> +	rte_scheddev_mark_vlan_dei_t mark_vlan_dei;
> +	/** Scheduler packet marking - IPv4/IPv6 ECN */
> +	rte_scheddev_mark_ip_ecn_t mark_ip_ecn;
> +	/** Scheduler packet marking - IPv4/IPv6 DSCP */
> +	rte_scheddev_mark_ip_dscp_t mark_ip_dscp;
> +
> +	/** Scheduler get statistics counter type enabled for all nodes */
> +	rte_scheddev_stats_get_enabled_t stats_get_enabled;
> +	/** Scheduler enable selected statistics counters for all nodes */
> +	rte_scheddev_stats_enable_t stats_enable;
> +	/** Scheduler get statistics counter type enabled for current node */
> +	rte_scheddev_node_stats_get_enabled_t node_stats_get_enabled;
> +	/** Scheduler enable selected statistics counters for current node */
> +	rte_scheddev_node_stats_enable_t node_stats_enable;
> +	/** Scheduler read statistics counters for current node */
> +	rte_scheddev_node_stats_read_t node_stats_read;
> +};
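
On the driver side I would expect a PMD to fill in only what it
implements, e.g. (sketch; the foo_* callbacks are hypothetical):

static const struct rte_scheddev_ops foo_scheddev_ops = {
	.capabilities_get = foo_capabilities_get,
	.node_add = foo_node_add,
	.node_delete = foo_node_delete,
	.hierarchy_set = foo_hierarchy_set,
	/* remaining callbacks left NULL */
};

It may be worth specifying what the generic layer returns when a
callback is NULL.
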
> +
> +/**
> + * Initialize generic error structure.
> + *
> + * This function also sets rte_errno to a given value.
> + *
> + * @param error
> + *   Pointer to error structure (may be NULL).
> + * @param code
> + *   Related error code (rte_errno).
> + * @param type
> + *   Cause field and error type.
> + * @param cause
> + *   Object responsible for the error.
> + * @param message
> + *   Human-readable error message.
> + *
> + * @return
> + *   Error code.
> + */
> +static inline int
> +rte_scheddev_error_set(struct rte_scheddev_error *error,
> +		   int code,
> +		   enum rte_scheddev_error_type type,
> +		   const void *cause,
> +		   const char *message)
> +{
> +	if (error) {
> +		*error = (struct rte_scheddev_error){
> +			.type = type,
> +			.cause = cause,
> +			.message = message,
> +		};
> +	}
> +	rte_errno = code;
> +	return code;
> +}
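
A PMD callback would then report failures along these lines (sketch;
foo_node_delete and the node_exists() helper are hypothetical):

static int
foo_node_delete(struct rte_eth_dev *dev, uint32_t node_id,
	struct rte_scheddev_error *error)
{
	if (!node_exists(dev, node_id))   /* hypothetical helper */
		return rte_scheddev_error_set(error, EINVAL,
			RTE_SCHEDDEV_ERROR_TYPE_NODE_ID,
			NULL, "node id does not exist");

	/* ... detach and free the node ... */
	return 0;
}
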
> +
> +/**
> + * Get generic hierarchical scheduler operations structure from a port
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param error
> + *   Error details
> + *
> + * @return
> + *   The hierarchical scheduler operations structure associated with port_id on
> + *   success, NULL otherwise.
> + */
> +const struct rte_scheddev_ops *
> +rte_scheddev_ops_get(uint8_t port_id, struct rte_scheddev_error *error);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* __INCLUDE_RTE_SCHEDDEV_DRIVER_H__ */
>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v7 01/17] lib: rename legacy distributor lib files
  2017-02-21  3:17  1%   ` [dpdk-dev] [PATCH v7 01/17] lib: rename legacy distributor lib files David Hunt
@ 2017-02-21 10:27  0%     ` Hunt, David
  2017-02-24 14:03  0%     ` Bruce Richardson
  2017-03-01  7:47  2%     ` [dpdk-dev] [PATCH v8 0/18] distributor library performance enhancements David Hunt
  2 siblings, 0 replies; 200+ results
From: Hunt, David @ 2017-02-21 10:27 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson


On 21/2/2017 3:17 AM, David Hunt wrote:
> Move files out of the way so that we can replace them with new
> versions of the distributor library. Files are named in
> such a way as to match the symbol versioning that we will
> apply for backward ABI compatibility.
>
> Signed-off-by: David Hunt <david.hunt@intel.com>
> ---
>
---snip--

Apologies, this patch should have been sent with '--find-renames', thus
reducing the size of this patch significantly and eliminating checkpatch
warnings/errors.

Dave.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] maintainers: fix script paths
@ 2017-02-21 10:22 16% Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-21 10:22 UTC (permalink / raw)
  To: dev

The scripts directory does not exist anymore.
The files have been moved but some paths were not updated
in the maintainers list.

Fixes: 9a98f50e890b ("scripts: move to devtools")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
 MAINTAINERS | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8305237..24e0eff 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -70,7 +70,7 @@ ABI versioning
 M: Neil Horman <nhorman@tuxdriver.com>
 F: lib/librte_compat/
 F: doc/guides/rel_notes/deprecation.rst
-F: scripts/validate-abi.sh
+F: devtools/validate-abi.sh
 
 Driver information
 F: buildtools/pmdinfogen/
@@ -241,7 +241,7 @@ F: app/test/test_mbuf.c
 Ethernet API
 M: Thomas Monjalon <thomas.monjalon@6wind.com>
 F: lib/librte_ether/
-F: scripts/test-null.sh
+F: devtools/test-null.sh
 
 Flow API
 M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
-- 
2.7.0

^ permalink raw reply	[relevance 16%]

* [dpdk-dev] [PATCH v7 01/17] lib: rename legacy distributor lib files
  2017-02-21  3:17  3% ` [dpdk-dev] [PATCH v7 0/17] distributor library " David Hunt
@ 2017-02-21  3:17  1%   ` David Hunt
  2017-02-21 10:27  0%     ` Hunt, David
                       ` (2 more replies)
  2017-02-24 14:01  0%   ` [dpdk-dev] [PATCH v7 0/17] distributor library performance enhancements Bruce Richardson
  1 sibling, 3 replies; 200+ results
From: David Hunt @ 2017-02-21  3:17 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, David Hunt

Move files out of the way so that we can replace them with new
versions of the distributor library. Files are named in
such a way as to match the symbol versioning that we will
apply for backward ABI compatibility.

Signed-off-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c                  |   2 +-
 app/test/test_distributor_perf.c             |   2 +-
 examples/distributor/main.c                  |   2 +-
 lib/librte_distributor/Makefile              |   4 +-
 lib/librte_distributor/rte_distributor.c     | 487 ---------------------------
 lib/librte_distributor/rte_distributor.h     | 247 --------------
 lib/librte_distributor/rte_distributor_v20.c | 487 +++++++++++++++++++++++++++
 lib/librte_distributor/rte_distributor_v20.h | 247 ++++++++++++++
 8 files changed, 739 insertions(+), 739 deletions(-)
 delete mode 100644 lib/librte_distributor/rte_distributor.c
 delete mode 100644 lib/librte_distributor/rte_distributor.h
 create mode 100644 lib/librte_distributor/rte_distributor_v20.c
 create mode 100644 lib/librte_distributor/rte_distributor_v20.h

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 85cb8f3..ba402e2 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -39,7 +39,7 @@
 #include <rte_errno.h>
 #include <rte_mempool.h>
 #include <rte_mbuf.h>
-#include <rte_distributor.h>
+#include <rte_distributor_v20.h>
 
 #define ITER_POWER 20 /* log 2 of how many iterations we do when timing. */
 #define BURST 32
diff --git a/app/test/test_distributor_perf.c b/app/test/test_distributor_perf.c
index 7947fe9..fe0c97d 100644
--- a/app/test/test_distributor_perf.c
+++ b/app/test/test_distributor_perf.c
@@ -39,7 +39,7 @@
 #include <rte_cycles.h>
 #include <rte_common.h>
 #include <rte_mbuf.h>
-#include <rte_distributor.h>
+#include <rte_distributor_v20.h>
 
 #define ITER_POWER 20 /* log 2 of how many iterations we do when timing. */
 #define BURST 32
diff --git a/examples/distributor/main.c b/examples/distributor/main.c
index e7641d2..fba5446 100644
--- a/examples/distributor/main.c
+++ b/examples/distributor/main.c
@@ -43,7 +43,7 @@
 #include <rte_malloc.h>
 #include <rte_debug.h>
 #include <rte_prefetch.h>
-#include <rte_distributor.h>
+#include <rte_distributor_v20.h>
 
 #define RX_RING_SIZE 256
 #define TX_RING_SIZE 512
diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
index 4c9af17..60837ed 100644
--- a/lib/librte_distributor/Makefile
+++ b/lib/librte_distributor/Makefile
@@ -42,10 +42,10 @@ EXPORT_MAP := rte_distributor_version.map
 LIBABIVER := 1
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor.c
+SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor_v20.c
 
 # install this header file
-SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include := rte_distributor.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include := rte_distributor_v20.h
 
 # this lib needs eal
 DEPDIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += lib/librte_eal
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
deleted file mode 100644
index f3f778c..0000000
--- a/lib/librte_distributor/rte_distributor.c
+++ /dev/null
@@ -1,487 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include <stdio.h>
-#include <sys/queue.h>
-#include <string.h>
-#include <rte_mbuf.h>
-#include <rte_memory.h>
-#include <rte_memzone.h>
-#include <rte_errno.h>
-#include <rte_string_fns.h>
-#include <rte_eal_memconfig.h>
-#include "rte_distributor.h"
-
-#define NO_FLAGS 0
-#define RTE_DISTRIB_PREFIX "DT_"
-
-/* we will use the bottom four bits of pointer for flags, shifting out
- * the top four bits to make room (since a 64-bit pointer actually only uses
- * 48 bits). An arithmetic-right-shift will then appropriately restore the
- * original pointer value with proper sign extension into the top bits. */
-#define RTE_DISTRIB_FLAG_BITS 4
-#define RTE_DISTRIB_FLAGS_MASK (0x0F)
-#define RTE_DISTRIB_NO_BUF 0       /**< empty flags: no buffer requested */
-#define RTE_DISTRIB_GET_BUF (1)    /**< worker requests a buffer, returns old */
-#define RTE_DISTRIB_RETURN_BUF (2) /**< worker returns a buffer, no request */
-
-#define RTE_DISTRIB_BACKLOG_SIZE 8
-#define RTE_DISTRIB_BACKLOG_MASK (RTE_DISTRIB_BACKLOG_SIZE - 1)
-
-#define RTE_DISTRIB_MAX_RETURNS 128
-#define RTE_DISTRIB_RETURNS_MASK (RTE_DISTRIB_MAX_RETURNS - 1)
-
-/**
- * Maximum number of workers allowed.
- * Be aware of increasing the limit, because it is limited by how we track
- * in-flight tags. See @in_flight_bitmask and @rte_distributor_process
- */
-#define RTE_DISTRIB_MAX_WORKERS	64
-
-/**
- * Buffer structure used to pass the pointer data between cores. This is cache
- * line aligned, but to improve performance and prevent adjacent cache-line
- * prefetches of buffers for other workers, e.g. when worker 1's buffer is on
- * the next cache line to worker 0, we pad this out to three cache lines.
- * Only 64-bits of the memory is actually used though.
- */
-union rte_distributor_buffer {
-	volatile int64_t bufptr64;
-	char pad[RTE_CACHE_LINE_SIZE*3];
-} __rte_cache_aligned;
-
-struct rte_distributor_backlog {
-	unsigned start;
-	unsigned count;
-	int64_t pkts[RTE_DISTRIB_BACKLOG_SIZE];
-};
-
-struct rte_distributor_returned_pkts {
-	unsigned start;
-	unsigned count;
-	struct rte_mbuf *mbufs[RTE_DISTRIB_MAX_RETURNS];
-};
-
-struct rte_distributor {
-	TAILQ_ENTRY(rte_distributor) next;    /**< Next in list. */
-
-	char name[RTE_DISTRIBUTOR_NAMESIZE];  /**< Name of the ring. */
-	unsigned num_workers;                 /**< Number of workers polling */
-
-	uint32_t in_flight_tags[RTE_DISTRIB_MAX_WORKERS];
-		/**< Tracks the tag being processed per core */
-	uint64_t in_flight_bitmask;
-		/**< on/off bits for in-flight tags.
-		 * Note that if RTE_DISTRIB_MAX_WORKERS is larger than 64 then
-		 * the bitmask has to expand.
-		 */
-
-	struct rte_distributor_backlog backlog[RTE_DISTRIB_MAX_WORKERS];
-
-	union rte_distributor_buffer bufs[RTE_DISTRIB_MAX_WORKERS];
-
-	struct rte_distributor_returned_pkts returns;
-};
-
-TAILQ_HEAD(rte_distributor_list, rte_distributor);
-
-static struct rte_tailq_elem rte_distributor_tailq = {
-	.name = "RTE_DISTRIBUTOR",
-};
-EAL_REGISTER_TAILQ(rte_distributor_tailq)
-
-/**** APIs called by workers ****/
-
-void
-rte_distributor_request_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt)
-{
-	union rte_distributor_buffer *buf = &d->bufs[worker_id];
-	int64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS)
-			| RTE_DISTRIB_GET_BUF;
-	while (unlikely(buf->bufptr64 & RTE_DISTRIB_FLAGS_MASK))
-		rte_pause();
-	buf->bufptr64 = req;
-}
-
-struct rte_mbuf *
-rte_distributor_poll_pkt(struct rte_distributor *d,
-		unsigned worker_id)
-{
-	union rte_distributor_buffer *buf = &d->bufs[worker_id];
-	if (buf->bufptr64 & RTE_DISTRIB_GET_BUF)
-		return NULL;
-
-	/* since bufptr64 is signed, this should be an arithmetic shift */
-	int64_t ret = buf->bufptr64 >> RTE_DISTRIB_FLAG_BITS;
-	return (struct rte_mbuf *)((uintptr_t)ret);
-}
-
-struct rte_mbuf *
-rte_distributor_get_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt)
-{
-	struct rte_mbuf *ret;
-	rte_distributor_request_pkt(d, worker_id, oldpkt);
-	while ((ret = rte_distributor_poll_pkt(d, worker_id)) == NULL)
-		rte_pause();
-	return ret;
-}
-
-int
-rte_distributor_return_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt)
-{
-	union rte_distributor_buffer *buf = &d->bufs[worker_id];
-	uint64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS)
-			| RTE_DISTRIB_RETURN_BUF;
-	buf->bufptr64 = req;
-	return 0;
-}
-
-/**** APIs called on distributor core ***/
-
-/* as name suggests, adds a packet to the backlog for a particular worker */
-static int
-add_to_backlog(struct rte_distributor_backlog *bl, int64_t item)
-{
-	if (bl->count == RTE_DISTRIB_BACKLOG_SIZE)
-		return -1;
-
-	bl->pkts[(bl->start + bl->count++) & (RTE_DISTRIB_BACKLOG_MASK)]
-			= item;
-	return 0;
-}
-
-/* takes the next packet for a worker off the backlog */
-static int64_t
-backlog_pop(struct rte_distributor_backlog *bl)
-{
-	bl->count--;
-	return bl->pkts[bl->start++ & RTE_DISTRIB_BACKLOG_MASK];
-}
-
-/* stores a packet returned from a worker inside the returns array */
-static inline void
-store_return(uintptr_t oldbuf, struct rte_distributor *d,
-		unsigned *ret_start, unsigned *ret_count)
-{
-	/* store returns in a circular buffer - code is branch-free */
-	d->returns.mbufs[(*ret_start + *ret_count) & RTE_DISTRIB_RETURNS_MASK]
-			= (void *)oldbuf;
-	*ret_start += (*ret_count == RTE_DISTRIB_RETURNS_MASK) & !!(oldbuf);
-	*ret_count += (*ret_count != RTE_DISTRIB_RETURNS_MASK) & !!(oldbuf);
-}
-
-static inline void
-handle_worker_shutdown(struct rte_distributor *d, unsigned wkr)
-{
-	d->in_flight_tags[wkr] = 0;
-	d->in_flight_bitmask &= ~(1UL << wkr);
-	d->bufs[wkr].bufptr64 = 0;
-	if (unlikely(d->backlog[wkr].count != 0)) {
-		/* On return of a packet, we need to move the
-		 * queued packets for this core elsewhere.
-		 * Easiest solution is to set things up for
-		 * a recursive call. That will cause those
-		 * packets to be queued up for the next free
-		 * core, i.e. it will return as soon as a
-		 * core becomes free to accept the first
-		 * packet, as subsequent ones will be added to
-		 * the backlog for that core.
-		 */
-		struct rte_mbuf *pkts[RTE_DISTRIB_BACKLOG_SIZE];
-		unsigned i;
-		struct rte_distributor_backlog *bl = &d->backlog[wkr];
-
-		for (i = 0; i < bl->count; i++) {
-			unsigned idx = (bl->start + i) &
-					RTE_DISTRIB_BACKLOG_MASK;
-			pkts[i] = (void *)((uintptr_t)(bl->pkts[idx] >>
-					RTE_DISTRIB_FLAG_BITS));
-		}
-		/* recursive call.
-		 * Note that the tags were set before first level call
-		 * to rte_distributor_process.
-		 */
-		rte_distributor_process(d, pkts, i);
-		bl->count = bl->start = 0;
-	}
-}
-
-/* this function is called when process() fn is called without any new
- * packets. It goes through all the workers and clears any returned packets
- * to do a partial flush.
- */
-static int
-process_returns(struct rte_distributor *d)
-{
-	unsigned wkr;
-	unsigned flushed = 0;
-	unsigned ret_start = d->returns.start,
-			ret_count = d->returns.count;
-
-	for (wkr = 0; wkr < d->num_workers; wkr++) {
-
-		const int64_t data = d->bufs[wkr].bufptr64;
-		uintptr_t oldbuf = 0;
-
-		if (data & RTE_DISTRIB_GET_BUF) {
-			flushed++;
-			if (d->backlog[wkr].count)
-				d->bufs[wkr].bufptr64 =
-						backlog_pop(&d->backlog[wkr]);
-			else {
-				d->bufs[wkr].bufptr64 = RTE_DISTRIB_GET_BUF;
-				d->in_flight_tags[wkr] = 0;
-				d->in_flight_bitmask &= ~(1UL << wkr);
-			}
-			oldbuf = data >> RTE_DISTRIB_FLAG_BITS;
-		} else if (data & RTE_DISTRIB_RETURN_BUF) {
-			handle_worker_shutdown(d, wkr);
-			oldbuf = data >> RTE_DISTRIB_FLAG_BITS;
-		}
-
-		store_return(oldbuf, d, &ret_start, &ret_count);
-	}
-
-	d->returns.start = ret_start;
-	d->returns.count = ret_count;
-
-	return flushed;
-}
-
-/* process a set of packets to distribute them to workers */
-int
-rte_distributor_process(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned num_mbufs)
-{
-	unsigned next_idx = 0;
-	unsigned wkr = 0;
-	struct rte_mbuf *next_mb = NULL;
-	int64_t next_value = 0;
-	uint32_t new_tag = 0;
-	unsigned ret_start = d->returns.start,
-			ret_count = d->returns.count;
-
-	if (unlikely(num_mbufs == 0))
-		return process_returns(d);
-
-	while (next_idx < num_mbufs || next_mb != NULL) {
-
-		int64_t data = d->bufs[wkr].bufptr64;
-		uintptr_t oldbuf = 0;
-
-		if (!next_mb) {
-			next_mb = mbufs[next_idx++];
-			next_value = (((int64_t)(uintptr_t)next_mb)
-					<< RTE_DISTRIB_FLAG_BITS);
-			/*
-			 * User is advised to set a tag value for each
-			 * mbuf before calling rte_distributor_process.
-			 * User defined tags are used to identify flows,
-			 * or sessions.
-			 */
-			new_tag = next_mb->hash.usr;
-
-			/*
-			 * Note that if RTE_DISTRIB_MAX_WORKERS is larger than 64
-			 * then the size of match has to be expanded.
-			 */
-			uint64_t match = 0;
-			unsigned i;
-			/*
-			 * to scan for a match use "xor" and "not" to get a 0/1
-			 * value, then use shifting to merge to single "match"
-			 * variable, where a one-bit indicates a match for the
-			 * worker given by the bit-position
-			 */
-			for (i = 0; i < d->num_workers; i++)
-				match |= (!(d->in_flight_tags[i] ^ new_tag)
-					<< i);
-
-			/* Only turned-on bits are considered as match */
-			match &= d->in_flight_bitmask;
-
-			if (match) {
-				next_mb = NULL;
-				unsigned worker = __builtin_ctzl(match);
-				if (add_to_backlog(&d->backlog[worker],
-						next_value) < 0)
-					next_idx--;
-			}
-		}
-
-		if ((data & RTE_DISTRIB_GET_BUF) &&
-				(d->backlog[wkr].count || next_mb)) {
-
-			if (d->backlog[wkr].count)
-				d->bufs[wkr].bufptr64 =
-						backlog_pop(&d->backlog[wkr]);
-
-			else {
-				d->bufs[wkr].bufptr64 = next_value;
-				d->in_flight_tags[wkr] = new_tag;
-				d->in_flight_bitmask |= (1UL << wkr);
-				next_mb = NULL;
-			}
-			oldbuf = data >> RTE_DISTRIB_FLAG_BITS;
-		} else if (data & RTE_DISTRIB_RETURN_BUF) {
-			handle_worker_shutdown(d, wkr);
-			oldbuf = data >> RTE_DISTRIB_FLAG_BITS;
-		}
-
-		/* store returns in a circular buffer */
-		store_return(oldbuf, d, &ret_start, &ret_count);
-
-		if (++wkr == d->num_workers)
-			wkr = 0;
-	}
-	/* to finish, check all workers for backlog and schedule work for them
-	 * if they are ready */
-	for (wkr = 0; wkr < d->num_workers; wkr++)
-		if (d->backlog[wkr].count &&
-				(d->bufs[wkr].bufptr64 & RTE_DISTRIB_GET_BUF)) {
-
-			int64_t oldbuf = d->bufs[wkr].bufptr64 >>
-					RTE_DISTRIB_FLAG_BITS;
-			store_return(oldbuf, d, &ret_start, &ret_count);
-
-			d->bufs[wkr].bufptr64 = backlog_pop(&d->backlog[wkr]);
-		}
-
-	d->returns.start = ret_start;
-	d->returns.count = ret_count;
-	return num_mbufs;
-}
-
-/* return to the caller, packets returned from workers */
-int
-rte_distributor_returned_pkts(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned max_mbufs)
-{
-	struct rte_distributor_returned_pkts *returns = &d->returns;
-	unsigned retval = (max_mbufs < returns->count) ?
-			max_mbufs : returns->count;
-	unsigned i;
-
-	for (i = 0; i < retval; i++) {
-		unsigned idx = (returns->start + i) & RTE_DISTRIB_RETURNS_MASK;
-		mbufs[i] = returns->mbufs[idx];
-	}
-	returns->start += i;
-	returns->count -= i;
-
-	return retval;
-}
-
-/* return the number of packets in-flight in a distributor, i.e. packets
- * being worked on or queued up in a backlog. */
-static inline unsigned
-total_outstanding(const struct rte_distributor *d)
-{
-	unsigned wkr, total_outstanding;
-
-	total_outstanding = __builtin_popcountl(d->in_flight_bitmask);
-
-	for (wkr = 0; wkr < d->num_workers; wkr++)
-		total_outstanding += d->backlog[wkr].count;
-
-	return total_outstanding;
-}
-
-/* flush the distributor, so that there are no outstanding packets in flight or
- * queued up. */
-int
-rte_distributor_flush(struct rte_distributor *d)
-{
-	const unsigned flushed = total_outstanding(d);
-
-	while (total_outstanding(d) > 0)
-		rte_distributor_process(d, NULL, 0);
-
-	return flushed;
-}
-
-/* clears the internal returns array in the distributor */
-void
-rte_distributor_clear_returns(struct rte_distributor *d)
-{
-	d->returns.start = d->returns.count = 0;
-#ifndef __OPTIMIZE__
-	memset(d->returns.mbufs, 0, sizeof(d->returns.mbufs));
-#endif
-}
-
-/* creates a distributor instance */
-struct rte_distributor *
-rte_distributor_create(const char *name,
-		unsigned socket_id,
-		unsigned num_workers)
-{
-	struct rte_distributor *d;
-	struct rte_distributor_list *distributor_list;
-	char mz_name[RTE_MEMZONE_NAMESIZE];
-	const struct rte_memzone *mz;
-
-	/* compilation-time checks */
-	RTE_BUILD_BUG_ON((sizeof(*d) & RTE_CACHE_LINE_MASK) != 0);
-	RTE_BUILD_BUG_ON((RTE_DISTRIB_MAX_WORKERS & 7) != 0);
-	RTE_BUILD_BUG_ON(RTE_DISTRIB_MAX_WORKERS >
-				sizeof(d->in_flight_bitmask) * CHAR_BIT);
-
-	if (name == NULL || num_workers >= RTE_DISTRIB_MAX_WORKERS) {
-		rte_errno = EINVAL;
-		return NULL;
-	}
-
-	snprintf(mz_name, sizeof(mz_name), RTE_DISTRIB_PREFIX"%s", name);
-	mz = rte_memzone_reserve(mz_name, sizeof(*d), socket_id, NO_FLAGS);
-	if (mz == NULL) {
-		rte_errno = ENOMEM;
-		return NULL;
-	}
-
-	d = mz->addr;
-	snprintf(d->name, sizeof(d->name), "%s", name);
-	d->num_workers = num_workers;
-
-	distributor_list = RTE_TAILQ_CAST(rte_distributor_tailq.head,
-					  rte_distributor_list);
-
-	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
-	TAILQ_INSERT_TAIL(distributor_list, d, next);
-	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
-
-	return d;
-}
diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
deleted file mode 100644
index 7d36bc8..0000000
--- a/lib/librte_distributor/rte_distributor.h
+++ /dev/null
@@ -1,247 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _RTE_DISTRIBUTE_H_
-#define _RTE_DISTRIBUTE_H_
-
-/**
- * @file
- * RTE distributor
- *
- * The distributor is a component which is designed to pass packets
- * one-at-a-time to workers, with dynamic load balancing.
- */
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#define RTE_DISTRIBUTOR_NAMESIZE 32 /**< Length of name for instance */
-
-struct rte_distributor;
-struct rte_mbuf;
-
-/**
- * Function to create a new distributor instance
- *
- * Reserves the memory needed for the distributor operation and
- * initializes the distributor to work with the configured number of workers.
- *
- * @param name
- *   The name to be given to the distributor instance.
- * @param socket_id
- *   The NUMA node on which the memory is to be allocated
- * @param num_workers
- *   The maximum number of workers that will request packets from this
- *   distributor
- * @return
- *   The newly created distributor instance
- */
-struct rte_distributor *
-rte_distributor_create(const char *name, unsigned socket_id,
-		unsigned num_workers);
-
-/*  *** APIS to be called on the distributor lcore ***  */
-/*
- * The following APIs are the public APIs which are designed for use on a
- * single lcore which acts as the distributor lcore for a given distributor
- * instance. These functions cannot be called on multiple cores simultaneously
- * without using locking to protect access to the internals of the distributor.
- *
- * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
- * for the same distributor instance, otherwise deadlock will result.
- */
-
-/**
- * Process a set of packets by distributing them among workers that request
- * packets. The distributor will ensure that no two packets that have the
- * same flow ID, or tag, in the mbuf will be processed at the same time.
- *
- * The user is advised to set a tag for each mbuf before calling this function.
- * If the user does not set the tag, its value may vary depending on the
- * driver implementation and configuration.
- *
- * This is not multi-thread safe and should only be called on a single lcore.
- *
- * @param d
- *   The distributor instance to be used
- * @param mbufs
- *   The mbufs to be distributed
- * @param num_mbufs
- *   The number of mbufs in the mbufs array
- * @return
- *   The number of mbufs processed.
- */
-int
-rte_distributor_process(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned num_mbufs);
-
-/**
- * Get a set of mbufs that have been returned to the distributor by workers
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- * @param mbufs
- *   The mbufs pointer array to be filled in
- * @param max_mbufs
- *   The size of the mbufs array
- * @return
- *   The number of mbufs returned in the mbufs array.
- */
-int
-rte_distributor_returned_pkts(struct rte_distributor *d,
-		struct rte_mbuf **mbufs, unsigned max_mbufs);
-
-/**
- * Flush the distributor component, so that there are no in-flight or
- * backlogged packets awaiting processing
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- * @return
- *   The number of queued/in-flight packets that were completed by this call.
- */
-int
-rte_distributor_flush(struct rte_distributor *d);
-
-/**
- * Clears the array of returned packets used as the source for the
- * rte_distributor_returned_pkts() API call.
- *
- * This should only be called on the same lcore as rte_distributor_process()
- *
- * @param d
- *   The distributor instance to be used
- */
-void
-rte_distributor_clear_returns(struct rte_distributor *d);
-
-/*  *** APIS to be called on the worker lcores ***  */
-/*
- * The following APIs are the public APIs which are designed for use on
- * multiple lcores which act as workers for a distributor. Each lcore should use
- * a unique worker id when requesting packets.
- *
- * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
- * for the same distributor instance, otherwise deadlock will result.
- */
-
-/**
- * API called by a worker to get a new packet to process. Any previous packet
- * given to the worker is assumed to have completed processing, and may be
- * optionally returned to the distributor via the oldpkt parameter.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less than num_workers passed
- *   at distributor creation time.
- * @param oldpkt
- *   The previous packet, if any, being processed by the worker
- *
- * @return
- *   A new packet to be processed by the worker thread.
- */
-struct rte_mbuf *
-rte_distributor_get_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt);
-
-/**
- * API called by a worker to return a completed packet without requesting a
- * new packet, for example, because a worker thread is shutting down
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less than num_workers passed
- *   at distributor creation time.
- * @param mbuf
- *   The previous packet being processed by the worker
- */
-int
-rte_distributor_return_pkt(struct rte_distributor *d, unsigned worker_id,
-		struct rte_mbuf *mbuf);
-
-/**
- * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
- * processing, and may be optionally returned to the distributor via
- * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt(), this function does not wait for a new
- * packet to be provided by the distributor.
- *
- * NOTE: after calling this function, rte_distributor_poll_pkt() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt()
- * API should *not* be used to try and retrieve the new packet.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less than num_workers passed
- *   at distributor creation time.
- * @param oldpkt
- *   The previous packet, if any, being processed by the worker
- */
-void
-rte_distributor_request_pkt(struct rte_distributor *d,
-		unsigned worker_id, struct rte_mbuf *oldpkt);
-
-/**
- * API called by a worker to check for a new packet that was previously
- * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
- * not yet been fulfilled by the distributor.
- *
- * @param d
- *   The distributor instance to be used
- * @param worker_id
- *   The worker instance number to use - must be less than num_workers passed
- *   at distributor creation time.
- *
- * @return
- *   A new packet to be processed by the worker thread, or NULL if no
- *   packet is yet available.
- */
-struct rte_mbuf *
-rte_distributor_poll_pkt(struct rte_distributor *d,
-		unsigned worker_id);
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif
diff --git a/lib/librte_distributor/rte_distributor_v20.c b/lib/librte_distributor/rte_distributor_v20.c
new file mode 100644
index 0000000..b890947
--- /dev/null
+++ b/lib/librte_distributor/rte_distributor_v20.c
@@ -0,0 +1,487 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <sys/queue.h>
+#include <string.h>
+#include <rte_mbuf.h>
+#include <rte_memory.h>
+#include <rte_memzone.h>
+#include <rte_errno.h>
+#include <rte_string_fns.h>
+#include <rte_eal_memconfig.h>
+#include "rte_distributor_v20.h"
+
+#define NO_FLAGS 0
+#define RTE_DISTRIB_PREFIX "DT_"
+
+/* we will use the bottom four bits of pointer for flags, shifting out
+ * the top four bits to make room (since a 64-bit pointer actually only uses
+ * 48 bits). An arithmetic-right-shift will then appropriately restore the
+ * original pointer value with proper sign extension into the top bits. */
+#define RTE_DISTRIB_FLAG_BITS 4
+#define RTE_DISTRIB_FLAGS_MASK (0x0F)
+#define RTE_DISTRIB_NO_BUF 0       /**< empty flags: no buffer requested */
+#define RTE_DISTRIB_GET_BUF (1)    /**< worker requests a buffer, returns old */
+#define RTE_DISTRIB_RETURN_BUF (2) /**< worker returns a buffer, no request */
+
+#define RTE_DISTRIB_BACKLOG_SIZE 8
+#define RTE_DISTRIB_BACKLOG_MASK (RTE_DISTRIB_BACKLOG_SIZE - 1)
+
+#define RTE_DISTRIB_MAX_RETURNS 128
+#define RTE_DISTRIB_RETURNS_MASK (RTE_DISTRIB_MAX_RETURNS - 1)
+
+/**
+ * Maximum number of workers allowed.
+ * Be aware of increasing the limit, because it is limited by how we track
+ * in-flight tags. See @in_flight_bitmask and @rte_distributor_process
+ */
+#define RTE_DISTRIB_MAX_WORKERS	64
+
+/**
+ * Buffer structure used to pass the pointer data between cores. This is cache
+ * line aligned, but to improve performance and prevent adjacent cache-line
+ * prefetches of buffers for other workers, e.g. when worker 1's buffer is on
+ * the next cache line to worker 0, we pad this out to three cache lines.
+ * Only 64-bits of the memory is actually used though.
+ */
+union rte_distributor_buffer {
+	volatile int64_t bufptr64;
+	char pad[RTE_CACHE_LINE_SIZE*3];
+} __rte_cache_aligned;
+
+struct rte_distributor_backlog {
+	unsigned start;
+	unsigned count;
+	int64_t pkts[RTE_DISTRIB_BACKLOG_SIZE];
+};
+
+struct rte_distributor_returned_pkts {
+	unsigned start;
+	unsigned count;
+	struct rte_mbuf *mbufs[RTE_DISTRIB_MAX_RETURNS];
+};
+
+struct rte_distributor {
+	TAILQ_ENTRY(rte_distributor) next;    /**< Next in list. */
+
+	char name[RTE_DISTRIBUTOR_NAMESIZE];  /**< Name of the ring. */
+	unsigned num_workers;                 /**< Number of workers polling */
+
+	uint32_t in_flight_tags[RTE_DISTRIB_MAX_WORKERS];
+		/**< Tracks the tag being processed per core */
+	uint64_t in_flight_bitmask;
+		/**< on/off bits for in-flight tags.
+		 * Note that if RTE_DISTRIB_MAX_WORKERS is larger than 64 then
+		 * the bitmask has to expand.
+		 */
+
+	struct rte_distributor_backlog backlog[RTE_DISTRIB_MAX_WORKERS];
+
+	union rte_distributor_buffer bufs[RTE_DISTRIB_MAX_WORKERS];
+
+	struct rte_distributor_returned_pkts returns;
+};
+
+TAILQ_HEAD(rte_distributor_list, rte_distributor);
+
+static struct rte_tailq_elem rte_distributor_tailq = {
+	.name = "RTE_DISTRIBUTOR",
+};
+EAL_REGISTER_TAILQ(rte_distributor_tailq)
+
+/**** APIs called by workers ****/
+
+void
+rte_distributor_request_pkt(struct rte_distributor *d,
+		unsigned worker_id, struct rte_mbuf *oldpkt)
+{
+	union rte_distributor_buffer *buf = &d->bufs[worker_id];
+	int64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS)
+			| RTE_DISTRIB_GET_BUF;
+	while (unlikely(buf->bufptr64 & RTE_DISTRIB_FLAGS_MASK))
+		rte_pause();
+	buf->bufptr64 = req;
+}
+
+struct rte_mbuf *
+rte_distributor_poll_pkt(struct rte_distributor *d,
+		unsigned worker_id)
+{
+	union rte_distributor_buffer *buf = &d->bufs[worker_id];
+	if (buf->bufptr64 & RTE_DISTRIB_GET_BUF)
+		return NULL;
+
+	/* since bufptr64 is signed, this should be an arithmetic shift */
+	int64_t ret = buf->bufptr64 >> RTE_DISTRIB_FLAG_BITS;
+	return (struct rte_mbuf *)((uintptr_t)ret);
+}
+
+struct rte_mbuf *
+rte_distributor_get_pkt(struct rte_distributor *d,
+		unsigned worker_id, struct rte_mbuf *oldpkt)
+{
+	struct rte_mbuf *ret;
+	rte_distributor_request_pkt(d, worker_id, oldpkt);
+	while ((ret = rte_distributor_poll_pkt(d, worker_id)) == NULL)
+		rte_pause();
+	return ret;
+}
+
+int
+rte_distributor_return_pkt(struct rte_distributor *d,
+		unsigned worker_id, struct rte_mbuf *oldpkt)
+{
+	union rte_distributor_buffer *buf = &d->bufs[worker_id];
+	uint64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS)
+			| RTE_DISTRIB_RETURN_BUF;
+	buf->bufptr64 = req;
+	return 0;
+}
+
+/**** APIs called on distributor core ***/
+
+/* as name suggests, adds a packet to the backlog for a particular worker */
+static int
+add_to_backlog(struct rte_distributor_backlog *bl, int64_t item)
+{
+	if (bl->count == RTE_DISTRIB_BACKLOG_SIZE)
+		return -1;
+
+	bl->pkts[(bl->start + bl->count++) & (RTE_DISTRIB_BACKLOG_MASK)]
+			= item;
+	return 0;
+}
+
+/* takes the next packet for a worker off the backlog */
+static int64_t
+backlog_pop(struct rte_distributor_backlog *bl)
+{
+	bl->count--;
+	return bl->pkts[bl->start++ & RTE_DISTRIB_BACKLOG_MASK];
+}
+
+/* stores a packet returned from a worker inside the returns array */
+static inline void
+store_return(uintptr_t oldbuf, struct rte_distributor *d,
+		unsigned *ret_start, unsigned *ret_count)
+{
+	/* store returns in a circular buffer - code is branch-free */
+	d->returns.mbufs[(*ret_start + *ret_count) & RTE_DISTRIB_RETURNS_MASK]
+			= (void *)oldbuf;
+	*ret_start += (*ret_count == RTE_DISTRIB_RETURNS_MASK) & !!(oldbuf);
+	*ret_count += (*ret_count != RTE_DISTRIB_RETURNS_MASK) & !!(oldbuf);
+}
+
+static inline void
+handle_worker_shutdown(struct rte_distributor *d, unsigned wkr)
+{
+	d->in_flight_tags[wkr] = 0;
+	d->in_flight_bitmask &= ~(1UL << wkr);
+	d->bufs[wkr].bufptr64 = 0;
+	if (unlikely(d->backlog[wkr].count != 0)) {
+		/* On return of a packet, we need to move the
+		 * queued packets for this core elsewhere.
+		 * Easiest solution is to set things up for
+		 * a recursive call. That will cause those
+		 * packets to be queued up for the next free
+		 * core, i.e. it will return as soon as a
+		 * core becomes free to accept the first
+		 * packet, as subsequent ones will be added to
+		 * the backlog for that core.
+		 */
+		struct rte_mbuf *pkts[RTE_DISTRIB_BACKLOG_SIZE];
+		unsigned i;
+		struct rte_distributor_backlog *bl = &d->backlog[wkr];
+
+		for (i = 0; i < bl->count; i++) {
+			unsigned idx = (bl->start + i) &
+					RTE_DISTRIB_BACKLOG_MASK;
+			pkts[i] = (void *)((uintptr_t)(bl->pkts[idx] >>
+					RTE_DISTRIB_FLAG_BITS));
+		}
+		/* recursive call.
+		 * Note that the tags were set before first level call
+		 * to rte_distributor_process.
+		 */
+		rte_distributor_process(d, pkts, i);
+		bl->count = bl->start = 0;
+	}
+}
+
+/* this function is called when process() fn is called without any new
+ * packets. It goes through all the workers and clears any returned packets
+ * to do a partial flush.
+ */
+static int
+process_returns(struct rte_distributor *d)
+{
+	unsigned wkr;
+	unsigned flushed = 0;
+	unsigned ret_start = d->returns.start,
+			ret_count = d->returns.count;
+
+	for (wkr = 0; wkr < d->num_workers; wkr++) {
+
+		const int64_t data = d->bufs[wkr].bufptr64;
+		uintptr_t oldbuf = 0;
+
+		if (data & RTE_DISTRIB_GET_BUF) {
+			flushed++;
+			if (d->backlog[wkr].count)
+				d->bufs[wkr].bufptr64 =
+						backlog_pop(&d->backlog[wkr]);
+			else {
+				d->bufs[wkr].bufptr64 = RTE_DISTRIB_GET_BUF;
+				d->in_flight_tags[wkr] = 0;
+				d->in_flight_bitmask &= ~(1UL << wkr);
+			}
+			oldbuf = data >> RTE_DISTRIB_FLAG_BITS;
+		} else if (data & RTE_DISTRIB_RETURN_BUF) {
+			handle_worker_shutdown(d, wkr);
+			oldbuf = data >> RTE_DISTRIB_FLAG_BITS;
+		}
+
+		store_return(oldbuf, d, &ret_start, &ret_count);
+	}
+
+	d->returns.start = ret_start;
+	d->returns.count = ret_count;
+
+	return flushed;
+}
+
+/* process a set of packets to distribute them to workers */
+int
+rte_distributor_process(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned num_mbufs)
+{
+	unsigned next_idx = 0;
+	unsigned wkr = 0;
+	struct rte_mbuf *next_mb = NULL;
+	int64_t next_value = 0;
+	uint32_t new_tag = 0;
+	unsigned ret_start = d->returns.start,
+			ret_count = d->returns.count;
+
+	if (unlikely(num_mbufs == 0))
+		return process_returns(d);
+
+	while (next_idx < num_mbufs || next_mb != NULL) {
+
+		int64_t data = d->bufs[wkr].bufptr64;
+		uintptr_t oldbuf = 0;
+
+		if (!next_mb) {
+			next_mb = mbufs[next_idx++];
+			next_value = (((int64_t)(uintptr_t)next_mb)
+					<< RTE_DISTRIB_FLAG_BITS);
+			/*
+			 * User is advised to set a tag value for each
+			 * mbuf before calling rte_distributor_process.
+			 * User defined tags are used to identify flows,
+			 * or sessions.
+			 */
+			new_tag = next_mb->hash.usr;
+
+			/*
+			 * Note that if RTE_DISTRIB_MAX_WORKERS is larger than 64
+			 * then the size of match has to be expanded.
+			 */
+			uint64_t match = 0;
+			unsigned i;
+			/*
+			 * to scan for a match use "xor" and "not" to get a 0/1
+			 * value, then use shifting to merge to single "match"
+			 * variable, where a one-bit indicates a match for the
+			 * worker given by the bit-position
+			 */
+			for (i = 0; i < d->num_workers; i++)
+				match |= (!(d->in_flight_tags[i] ^ new_tag)
+					<< i);
+
+			/* Only turned-on bits are considered as match */
+			match &= d->in_flight_bitmask;
+
+			if (match) {
+				next_mb = NULL;
+				unsigned worker = __builtin_ctzl(match);
+				if (add_to_backlog(&d->backlog[worker],
+						next_value) < 0)
+					next_idx--;
+			}
+		}
+
+		if ((data & RTE_DISTRIB_GET_BUF) &&
+				(d->backlog[wkr].count || next_mb)) {
+
+			if (d->backlog[wkr].count)
+				d->bufs[wkr].bufptr64 =
+						backlog_pop(&d->backlog[wkr]);
+
+			else {
+				d->bufs[wkr].bufptr64 = next_value;
+				d->in_flight_tags[wkr] = new_tag;
+				d->in_flight_bitmask |= (1UL << wkr);
+				next_mb = NULL;
+			}
+			oldbuf = data >> RTE_DISTRIB_FLAG_BITS;
+		} else if (data & RTE_DISTRIB_RETURN_BUF) {
+			handle_worker_shutdown(d, wkr);
+			oldbuf = data >> RTE_DISTRIB_FLAG_BITS;
+		}
+
+		/* store returns in a circular buffer */
+		store_return(oldbuf, d, &ret_start, &ret_count);
+
+		if (++wkr == d->num_workers)
+			wkr = 0;
+	}
+	/* to finish, check all workers for backlog and schedule work for them
+	 * if they are ready */
+	for (wkr = 0; wkr < d->num_workers; wkr++)
+		if (d->backlog[wkr].count &&
+				(d->bufs[wkr].bufptr64 & RTE_DISTRIB_GET_BUF)) {
+
+			int64_t oldbuf = d->bufs[wkr].bufptr64 >>
+					RTE_DISTRIB_FLAG_BITS;
+			store_return(oldbuf, d, &ret_start, &ret_count);
+
+			d->bufs[wkr].bufptr64 = backlog_pop(&d->backlog[wkr]);
+		}
+
+	d->returns.start = ret_start;
+	d->returns.count = ret_count;
+	return num_mbufs;
+}
+
+/* return to the caller, packets returned from workers */
+int
+rte_distributor_returned_pkts(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned max_mbufs)
+{
+	struct rte_distributor_returned_pkts *returns = &d->returns;
+	unsigned retval = (max_mbufs < returns->count) ?
+			max_mbufs : returns->count;
+	unsigned i;
+
+	for (i = 0; i < retval; i++) {
+		unsigned idx = (returns->start + i) & RTE_DISTRIB_RETURNS_MASK;
+		mbufs[i] = returns->mbufs[idx];
+	}
+	returns->start += i;
+	returns->count -= i;
+
+	return retval;
+}
+
+/* return the number of packets in-flight in a distributor, i.e. packets
+ * being worked on or queued up in a backlog. */
+static inline unsigned
+total_outstanding(const struct rte_distributor *d)
+{
+	unsigned wkr, total_outstanding;
+
+	total_outstanding = __builtin_popcountl(d->in_flight_bitmask);
+
+	for (wkr = 0; wkr < d->num_workers; wkr++)
+		total_outstanding += d->backlog[wkr].count;
+
+	return total_outstanding;
+}
+
+/* flush the distributor, so that there are no outstanding packets in flight or
+ * queued up. */
+int
+rte_distributor_flush(struct rte_distributor *d)
+{
+	const unsigned flushed = total_outstanding(d);
+
+	while (total_outstanding(d) > 0)
+		rte_distributor_process(d, NULL, 0);
+
+	return flushed;
+}
+
+/* clears the internal returns array in the distributor */
+void
+rte_distributor_clear_returns(struct rte_distributor *d)
+{
+	d->returns.start = d->returns.count = 0;
+#ifndef __OPTIMIZE__
+	memset(d->returns.mbufs, 0, sizeof(d->returns.mbufs));
+#endif
+}
+
+/* creates a distributor instance */
+struct rte_distributor *
+rte_distributor_create(const char *name,
+		unsigned socket_id,
+		unsigned num_workers)
+{
+	struct rte_distributor *d;
+	struct rte_distributor_list *distributor_list;
+	char mz_name[RTE_MEMZONE_NAMESIZE];
+	const struct rte_memzone *mz;
+
+	/* compilation-time checks */
+	RTE_BUILD_BUG_ON((sizeof(*d) & RTE_CACHE_LINE_MASK) != 0);
+	RTE_BUILD_BUG_ON((RTE_DISTRIB_MAX_WORKERS & 7) != 0);
+	RTE_BUILD_BUG_ON(RTE_DISTRIB_MAX_WORKERS >
+				sizeof(d->in_flight_bitmask) * CHAR_BIT);
+
+	if (name == NULL || num_workers >= RTE_DISTRIB_MAX_WORKERS) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	snprintf(mz_name, sizeof(mz_name), RTE_DISTRIB_PREFIX"%s", name);
+	mz = rte_memzone_reserve(mz_name, sizeof(*d), socket_id, NO_FLAGS);
+	if (mz == NULL) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+
+	d = mz->addr;
+	snprintf(d->name, sizeof(d->name), "%s", name);
+	d->num_workers = num_workers;
+
+	distributor_list = RTE_TAILQ_CAST(rte_distributor_tailq.head,
+					  rte_distributor_list);
+
+	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+	TAILQ_INSERT_TAIL(distributor_list, d, next);
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+
+	return d;
+}
diff --git a/lib/librte_distributor/rte_distributor_v20.h b/lib/librte_distributor/rte_distributor_v20.h
new file mode 100644
index 0000000..7d36bc8
--- /dev/null
+++ b/lib/librte_distributor/rte_distributor_v20.h
@@ -0,0 +1,247 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_DISTRIBUTE_H_
+#define _RTE_DISTRIBUTE_H_
+
+/**
+ * @file
+ * RTE distributor
+ *
+ * The distributor is a component which is designed to pass packets
+ * one-at-a-time to workers, with dynamic load balancing.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_DISTRIBUTOR_NAMESIZE 32 /**< Length of name for instance */
+
+struct rte_distributor;
+struct rte_mbuf;
+
+/**
+ * Function to create a new distributor instance
+ *
+ * Reserves the memory needed for the distributor operation and
+ * initializes the distributor to work with the configured number of workers.
+ *
+ * @param name
+ *   The name to be given to the distributor instance.
+ * @param socket_id
+ *   The NUMA node on which the memory is to be allocated
+ * @param num_workers
+ *   The maximum number of workers that will request packets from this
+ *   distributor
+ * @return
+ *   The newly created distributor instance
+ */
+struct rte_distributor *
+rte_distributor_create(const char *name, unsigned socket_id,
+		unsigned num_workers);
+
+/*  *** APIS to be called on the distributor lcore ***  */
+/*
+ * The following APIs are the public APIs which are designed for use on a
+ * single lcore which acts as the distributor lcore for a given distributor
+ * instance. These functions cannot be called on multiple cores simultaneously
+ * without using locking to protect access to the internals of the distributor.
+ *
+ * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
+ * for the same distributor instance, otherwise deadlock will result.
+ */
+
+/**
+ * Process a set of packets by distributing them among workers that request
+ * packets. The distributor will ensure that no two packets that have the
+ * same flow ID, or tag, in the mbuf will be processed at the same time.
+ *
+ * The user is advised to set a tag for each mbuf before calling this function.
+ * If the user does not set the tag, its value may vary depending on the
+ * driver implementation and configuration.
+ *
+ * This is not multi-thread safe and should only be called on a single lcore.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param mbufs
+ *   The mbufs to be distributed
+ * @param num_mbufs
+ *   The number of mbufs in the mbufs array
+ * @return
+ *   The number of mbufs processed.
+ */
+int
+rte_distributor_process(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned num_mbufs);
+
+/**
+ * Get a set of mbufs that have been returned to the distributor by workers
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param mbufs
+ *   The mbufs pointer array to be filled in
+ * @param max_mbufs
+ *   The size of the mbufs array
+ * @return
+ *   The number of mbufs returned in the mbufs array.
+ */
+int
+rte_distributor_returned_pkts(struct rte_distributor *d,
+		struct rte_mbuf **mbufs, unsigned max_mbufs);
+
+/**
+ * Flush the distributor component, so that there are no in-flight or
+ * backlogged packets awaiting processing
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @return
+ *   The number of queued/in-flight packets that were completed by this call.
+ */
+int
+rte_distributor_flush(struct rte_distributor *d);
+
+/**
+ * Clears the array of returned packets used as the source for the
+ * rte_distributor_returned_pkts() API call.
+ *
+ * This should only be called on the same lcore as rte_distributor_process()
+ *
+ * @param d
+ *   The distributor instance to be used
+ */
+void
+rte_distributor_clear_returns(struct rte_distributor *d);
+
+/*  *** APIS to be called on the worker lcores ***  */
+/*
+ * The following APIs are the public APIs which are designed for use on
+ * multiple lcores which act as workers for a distributor. Each lcore should use
+ * a unique worker id when requesting packets.
+ *
+ * NOTE: a given lcore cannot act as both a distributor lcore and a worker lcore
+ * for the same distributor instance, otherwise deadlock will result.
+ */
+
+/**
+ * API called by a worker to get a new packet to process. Any previous packet
+ * given to the worker is assumed to have completed processing, and may be
+ * optionally returned to the distributor via the oldpkt parameter.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ * @param oldpkt
+ *   The previous packet, if any, being processed by the worker
+ *
+ * @return
+ *   A new packet to be processed by the worker thread.
+ */
+struct rte_mbuf *
+rte_distributor_get_pkt(struct rte_distributor *d,
+		unsigned worker_id, struct rte_mbuf *oldpkt);
+
+/**
+ * API called by a worker to return a completed packet without requesting a
+ * new packet, for example, because a worker thread is shutting down
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ * @param mbuf
+ *   The previous packet being processed by the worker
+ */
+int
+rte_distributor_return_pkt(struct rte_distributor *d, unsigned worker_id,
+		struct rte_mbuf *mbuf);
+
+/**
+ * API called by a worker to request a new packet to process.
+ * Any previous packet given to the worker is assumed to have completed
+ * processing, and may be optionally returned to the distributor via
+ * the oldpkt parameter.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for a new
+ * packet to be provided by the distributor.
+ *
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packet requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packet.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ * @param oldpkt
+ *   The previous packet, if any, being processed by the worker
+ */
+void
+rte_distributor_request_pkt(struct rte_distributor *d,
+		unsigned worker_id, struct rte_mbuf *oldpkt);
+
+/**
+ * API called by a worker to check for a new packet that was previously
+ * requested by a call to rte_distributor_request_pkt(). It does not wait
+ * for the new packet to be available, but returns NULL if the request has
+ * not yet been fulfilled by the distributor.
+ *
+ * @param d
+ *   The distributor instance to be used
+ * @param worker_id
+ *   The worker instance number to use - must be less than num_workers passed
+ *   at distributor creation time.
+ *
+ * @return
+ *   A new packet to be processed by the worker thread, or NULL if no
+ *   packet is yet available.
+ */
+struct rte_mbuf *
+rte_distributor_poll_pkt(struct rte_distributor *d,
+		unsigned worker_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
2.7.4

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v7 0/17] distributor library performance enhancements
  @ 2017-02-21  3:17  3% ` David Hunt
  2017-02-21  3:17  1%   ` [dpdk-dev] [PATCH v7 01/17] lib: rename legacy distributor lib files David Hunt
  2017-02-24 14:01  0%   ` [dpdk-dev] [PATCH v7 0/17] distributor library performance enhancements Bruce Richardson
  0 siblings, 2 replies; 200+ results
From: David Hunt @ 2017-02-21  3:17 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson

This patch set aims to improve the throughput of the distributor library.

It uses a similar handshake mechanism to the previous version of
the library, in that bits are used to indicate when packets are ready
to be sent to a worker and ready to be returned from a worker. One main
difference is that instead of sending one packet in a cache line, it makes
use of the 7 free spaces in the same cache line in order to send up to
8 packets at a time to/from a worker.
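
A minimal sketch of what such a cache-line-sized burst buffer could look
like (the struct and field names below are illustrative for this cover
letter, not necessarily the exact ones used in the library):

#include <stdint.h>
#include <rte_memory.h>

#define DIST_BURST_SIZE 8

/* 8 x 64-bit slots fill one 64-byte cache line; each slot carries an
 * mbuf pointer with the handshake flag bits packed into its low bits,
 * as in the single-packet scheme. */
struct dist_burst_buf {
	volatile int64_t bufptr64[DIST_BURST_SIZE];
} __rte_cache_aligned;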

The flow matching algorithm has had significant re-work, and now keeps an
array of inflight flows and an array of backlog flows, and matches incoming
flows to the inflight/backlog flows of all workers so that flow pinning to
workers can be maintained.
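
As a rough sketch of the scalar matching step (the array shapes and names
here are assumptions for illustration, not the exact implementation):

#include <stdint.h>

/* For each incoming tag, look for a worker whose in-flight or backlog
 * flows already contain that tag, so the flow stays pinned to it. */
static int
match_flow(uint16_t tag, const uint16_t inflight[][8],
		const uint16_t backlog[][8], unsigned int nb_workers)
{
	unsigned int w, i;

	for (w = 0; w < nb_workers; w++)
		for (i = 0; i < 8; i++)
			if (inflight[w][i] == tag || backlog[w][i] == tag)
				return (int)w; /* keep flow on worker w */

	return -1; /* no match: any free worker may take it */
}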

The flow match algorithm has both scalar and vector versions, and a
function pointer is used to select the most appropriate function at run time,
depending on the presence of the SSE2 CPU flag. On non-x86 platforms, the
scalar match function is selected, which should still give a good boost
in performance over the non-burst API.
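
The run-time selection itself can use the existing EAL CPU-flag check; a
sketch (the two match function names are placeholders, not the real ones):

#include <stdint.h>
#include <rte_cpuflags.h>

typedef void (*match_fn_t)(void *dist, uint16_t *flow_ids);

static void find_match_scalar(void *dist, uint16_t *flow_ids);
static void find_match_vec(void *dist, uint16_t *flow_ids);

static match_fn_t find_match;

static void
select_match_fn(void)
{
#if defined(RTE_ARCH_X86)
	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2))
		find_match = find_match_vec;	/* SSE2 path */
	else
#endif
		find_match = find_match_scalar;	/* portable fallback */
}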

v2 changes:
  * Created a common distributor_priv.h header file with common
    definitions and structures.
  * Added a scalar version so it can be built and used on machines without
    sse2 instruction set
  * Added unit autotests
  * Added perf autotest

v3 changes:
  * Addressed mailing list review comments
  * Test code removal
  * Split out SSE match into separate file to facilitate NEON addition
  * Cleaned up conditional compilation flags for SSE2
  * Addressed c99 style compilation errors
  * rebased on latest head (Jan 2 2017, Happy New Year to all)

v4 changes:
   * fixed issue building shared libraries

v5 changes:
   * Removed some un-needed code around retries in worker API calls
   * Cleanup due to review comments on mailing list
   * Cleanup of non-x86 platform compilation, fallback to scalar match

v6 changes:
   * Fixed intermittent segfault where num pkts not divisible
     by BURST_SIZE
   * Cleanup due to review comments on mailing list
   * Renamed _priv.h to _private.h.

v7 changes:
   * Reorganised patch so there's a more natural progression in the
     changes, and divided them down into easier to review chunks.
   * Previous versions of this patch set were effectively two APIs.
     We now have a single API. Legacy functionality can
     be used by using the rte_distributor_create API call with the
     RTE_DISTRIBUTOR_SINGLE flag when creating a distributor instance.
   * Added symbol versioning for old API so that ABI is preserved.

Notes:
   Apps must now work in bursts, as up to 8 packets are given to a worker
   at a time.
   For performance in matching, flow IDs are 15 bits.
   If 32-bit flow IDs are required, use the packet-at-a-time (SINGLE)
   mode.
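
   For example (illustrative only; flow_id being the app's own flow
   identifier), masking keeps the tag within the range the matcher uses:

	mbufs[i]->hash.usr = flow_id & 0x7fff; /* 15-bit tag, burst mode */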

Performance Gains
   2.2GHz Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
   2 x XL710 40GbE NICS to 2 x 40Gbps traffic generator channels 64b packets
   separate cores for rx, tx, distributor
    1 worker  - 4.8x
    4 workers - 2.9x
    8 workers - 1.8x
   12 workers - 2.1x
   16 workers - 1.8x

[01/17] lib: rename legacy distributor lib files
[02/17] lib: symbol versioning of functions in distributor
[03/17] lib: create rte_distributor_private.h
[04/17] lib: add new burst oriented distributor structs
[05/17] lib: add new distributor code
[06/17] lib: add SIMD flow matching to distributor
[07/17] lib: apply symbol versioning to distributor lib
[08/17] test: change params to distributor autotest
[09/17] test: switch distributor test over to burst API
[10/17] test: test single and burst distributor API
[11/17] test: add perf test for distributor burst mode
[12/17] example: add extra stats to distributor sample
[13/17] sample: distributor: wait for ports to come up
[14/17] sample: switch to new distributor API
[15/17] lib: make v20 header file private
[16/17] doc: distributor library changes for new burst api
[17/17] maintainers: add to distributor lib maintainers

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] maintainers: claim responsability for xen
  @ 2017-02-20 17:36  3%               ` Joao Martins
  0 siblings, 0 replies; 200+ results
From: Joao Martins @ 2017-02-20 17:36 UTC (permalink / raw)
  To: Jan Blunck, Konrad Rzeszutek Wilk
  Cc: Vincent JARDIN, Thomas Monjalon, Tan, Jianfeng,
	Konrad Rzeszutek Wilk, dev, Bruce Richardson, Yuanhan Liu,
	Xen-devel

On 02/20/2017 09:56 AM, Jan Blunck wrote:
> On Fri, Feb 17, 2017 at 5:07 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
>> On Thu, Feb 16, 2017 at 10:51:44PM +0100, Vincent JARDIN wrote:
>>> Le 16/02/2017 à 14:36, Konrad Rzeszutek Wilk a écrit :
>>>>> Is it time now to officially remove Dom0 support?
>>>> So we do have an prototype implementation of netback but it is waiting
>>>> for review of xen-devel to the spec.
>>>>
>>>> And I believe the implementation does utilize some of the dom0
>>>> parts of code in DPDK.
>>>
>>> Please, do you have URLs/pointers about it? It would be interesting to share
>>> it with DPDK community too.
>>
>> Joao, would it be possible to include an tarball of the patches? I know
>> they are no in the right state with the review of the staging
>> grants API - they are incompatible, but it may help folks to get
>> a feel for what DPDK APIs you used?
>>
>> Staging grants API:
>> https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg01878.html
> 
> The topic of the grants API is unrelated to the dom0 memory pool. The
> memory pool which uses xen_create_contiguous_region() is used in cases
> we know that there are no hugepages available.
Correct, I think what Konrad was trying to say was that xen-netback normally
lives in a PV domain which doesn't have superpages, therefore such a driver
would need that memory pool part in order to work. The mentioned spec is a
set of additions to the xen netif ABI that let the backend safely map a fixed
set of grant references (recycled over time, provided by the frontend) with
the purpose of avoiding grant ops - DPDK would be one of the users.

> Joao and I met in Dublin and I whined about not being able to call
> into the grants API from userspace and instead need to kick a kernel
> driver to do the work for every burst. It would be great if that could
> change in the future.
Hm, I recall that discussion. AFAIK you can do both grant alloc/revoke of
pages through the xengntshr_share_pages(...) and xengntshr_unshare(...) APIs
provided by libxengnttab[0] starting with 4.7, or with libxc on older
versions via xc_gntshr_share_pages/xc_gntshr_munmap[2]. For the notification
(or kicks) you can allocate the event channel in the guest with
libxenevtchn[1] starting with 4.7, using xenevtchn_bind_unbound_port(...),
or with libxc on older versions using xc_evtchn_bind_unbound_port(...)[2].
And kick the guest with xenevtchn_notify(...) or xc_evtchn_notify(...) (the
latter on older versions). In short, these APIs are ioctls to /dev/gntdev and
/dev/evtchn. xenstore operations can also be done in userspace with
libxenstore[3].
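
Putting those together, a rough (untested, error handling mostly elided)
sketch of the guest-side calls on >= 4.7 would be:

#include <stdint.h>
#include <xengnttab.h>
#include <xenevtchn.h>

static int
share_page_and_kick(uint32_t backend_domid)
{
	xengntshr_handle *xgs = xengntshr_open(NULL, 0);
	xenevtchn_handle *xce = xenevtchn_open(NULL, 0);
	uint32_t refs[1];
	void *page;
	int port;

	/* share one writable page with the backend domain */
	page = xengntshr_share_pages(xgs, backend_domid, 1, refs, 1);
	if (page == NULL)
		return -1;

	/* allocate an unbound event channel the backend can bind to */
	port = xenevtchn_bind_unbound_port(xce, backend_domid);
	if (port < 0)
		return -1;

	/* publish refs[0] and port via xenstore (omitted), then kick */
	return xenevtchn_notify(xce, port);
}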

To get (similar) behavior to VRING_AVAIL_F_NO_INTERRUPT (i.e. avoiding the
kicks), you "just" don't set rsp_event in the ring (e.g. no calls to
RING_FINAL_CHECK_FOR_RESPONSES), and keep checking for unconsumed Rx/Tx
responses. For guest request notification (to wake up the backend for new
Tx/Rx requests), you depend on whether the backend requests it, since it's
the one setting the req_event index. If it does set it, then you have to use
the evtchn notify described in the previous paragraph.
Hope that helps!

Joao

[0]
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libs/gnttab/include/xengnttab.h;hb=HEAD
[1]
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libs/evtchn/include/xenevtchn.h;hb=HEAD
[2]
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxc/include/xenctrl_compat.h;hb=HEAD
[3]
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/xenstore/include/xenstore.h;hb=HEAD

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2 2/2] kni: remove KNI vhost support
  2017-02-20 14:30  5% ` [dpdk-dev] [PATCH v2 1/2] doc: add removed items section to release notes Ferruh Yigit
@ 2017-02-20 14:30  1%   ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2017-02-20 14:30 UTC (permalink / raw)
  To: Thomas Monjalon, John McNamara; +Cc: dev, Bruce Richardson, Ferruh Yigit

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 config/common_base                             |   3 -
 devtools/test-build.sh                         |   1 -
 doc/guides/prog_guide/index.rst                |   4 -
 doc/guides/prog_guide/kernel_nic_interface.rst | 113 ----
 doc/guides/rel_notes/deprecation.rst           |   6 -
 doc/guides/rel_notes/release_17_05.rst         |   2 +
 lib/librte_eal/linuxapp/kni/Makefile           |   1 -
 lib/librte_eal/linuxapp/kni/kni_dev.h          |  33 -
 lib/librte_eal/linuxapp/kni/kni_fifo.h         |  14 -
 lib/librte_eal/linuxapp/kni/kni_misc.c         |  22 -
 lib/librte_eal/linuxapp/kni/kni_net.c          |  13 -
 lib/librte_eal/linuxapp/kni/kni_vhost.c        | 842 -------------------------
 12 files changed, 2 insertions(+), 1052 deletions(-)
 delete mode 100644 lib/librte_eal/linuxapp/kni/kni_vhost.c

diff --git a/config/common_base b/config/common_base
index 71a4fcb..aeee13e 100644
--- a/config/common_base
+++ b/config/common_base
@@ -584,9 +584,6 @@ CONFIG_RTE_LIBRTE_KNI=n
 CONFIG_RTE_KNI_KMOD=n
 CONFIG_RTE_KNI_KMOD_ETHTOOL=n
 CONFIG_RTE_KNI_PREEMPT_DEFAULT=y
-CONFIG_RTE_KNI_VHOST=n
-CONFIG_RTE_KNI_VHOST_MAX_CACHE_SIZE=1024
-CONFIG_RTE_KNI_VHOST_VNET_HDR_EN=n
 
 #
 # Compile the pdump library
diff --git a/devtools/test-build.sh b/devtools/test-build.sh
index 0f131fc..84d3165 100755
--- a/devtools/test-build.sh
+++ b/devtools/test-build.sh
@@ -194,7 +194,6 @@ config () # <directory> <target> <options>
 		sed -ri        's,(PMD_OPENSSL=)n,\1y,' $1/.config
 		test "$DPDK_DEP_SSL" != y || \
 		sed -ri            's,(PMD_QAT=)n,\1y,' $1/.config
-		sed -ri        's,(KNI_VHOST.*=)n,\1y,' $1/.config
 		sed -ri           's,(SCHED_.*=)n,\1y,' $1/.config
 		build_config_hook $1 $2 $3
 
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 7f825cb..77f427e 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -127,10 +127,6 @@ Programmer's Guide
 
 :numref:`figure_pkt_flow_kni` :ref:`figure_pkt_flow_kni`
 
-:numref:`figure_vhost_net_arch2` :ref:`figure_vhost_net_arch2`
-
-:numref:`figure_kni_traffic_flow` :ref:`figure_kni_traffic_flow`
-
 
 :numref:`figure_pkt_proc_pipeline_qos` :ref:`figure_pkt_proc_pipeline_qos`
 
diff --git a/doc/guides/prog_guide/kernel_nic_interface.rst b/doc/guides/prog_guide/kernel_nic_interface.rst
index 4f25595..6f7fd28 100644
--- a/doc/guides/prog_guide/kernel_nic_interface.rst
+++ b/doc/guides/prog_guide/kernel_nic_interface.rst
@@ -168,116 +168,3 @@ The application handlers can be registered upon interface creation or explicitly
 This provides flexibility in multiprocess scenarios
 (where the KNI is created in the primary process but the callbacks are handled in the secondary one).
 The constraint is that a single process can register and handle the requests.
-
-.. _kni_vhost_backend-label:
-
-KNI Working as a Kernel vHost Backend
--------------------------------------
-
-vHost is a kernel module usually working as the backend of virtio (a para- virtualization driver framework)
-to accelerate the traffic from the guest to the host.
-The DPDK Kernel NIC interface provides the ability to hookup vHost traffic into userspace DPDK application.
-Together with the DPDK PMD virtio, it significantly improves the throughput between guest and host.
-In the scenario where DPDK is running as fast path in the host, kni-vhost is an efficient path for the traffic.
-
-Overview
-~~~~~~~~
-
-vHost-net has three kinds of real backend implementations. They are: 1) tap, 2) macvtap and 3) RAW socket.
-The main idea behind kni-vhost is making the KNI work as a RAW socket, attaching it as the backend instance of vHost-net.
-It is using the existing interface with vHost-net, so it does not require any kernel hacking,
-and is fully-compatible with the kernel vhost module.
-As vHost is still taking responsibility for communicating with the front-end virtio,
-it naturally supports both legacy virtio -net and the DPDK PMD virtio.
-There is a little penalty that comes from the non-polling mode of vhost.
-However, it scales throughput well when using KNI in multi-thread mode.
-
-.. _figure_vhost_net_arch2:
-
-.. figure:: img/vhost_net_arch.*
-
-   vHost-net Architecture Overview
-
-
-Packet Flow
-~~~~~~~~~~~
-
-There is only a minor difference from the original KNI traffic flows.
-On transmit side, vhost kthread calls the RAW socket's ops sendmsg and it puts the packets into the KNI transmit FIFO.
-On the receive side, the kni kthread gets packets from the KNI receive FIFO, puts them into the queue of the raw socket,
-and wakes up the task in vhost kthread to begin receiving.
-All the packet copying, irrespective of whether it is on the transmit or receive side,
-happens in the context of vhost kthread.
-Every vhost-net device is exposed to a front end virtio device in the guest.
-
-.. _figure_kni_traffic_flow:
-
-.. figure:: img/kni_traffic_flow.*
-
-   KNI Traffic Flow
-
-
-Sample Usage
-~~~~~~~~~~~~
-
-Before starting to use KNI as the backend of vhost, the CONFIG_RTE_KNI_VHOST configuration option must be turned on.
-Otherwise, by default, KNI will not enable its backend support capability.
-
-Of course, as a prerequisite, the vhost/vhost-net kernel CONFIG should be chosen before compiling the kernel.
-
-#.  Compile the DPDK and insert uio_pci_generic/igb_uio kernel modules as normal.
-
-#.  Insert the KNI kernel module:
-
-    .. code-block:: console
-
-        insmod ./rte_kni.ko
-
-    If using KNI in multi-thread mode, use the following command line:
-
-    .. code-block:: console
-
-        insmod ./rte_kni.ko kthread_mode=multiple
-
-#.  Running the KNI sample application:
-
-    .. code-block:: console
-
-        examples/kni/build/app/kni -c -0xf0 -n 4 -- -p 0x3 -P --config="(0,4,6),(1,5,7)"
-
-    This command runs the kni sample application with two physical ports.
-    Each port pins two forwarding cores (ingress/egress) in user space.
-
-#.  Assign a raw socket to vhost-net during qemu-kvm startup.
-    The DPDK does not provide a script to do this since it is easy for the user to customize.
-    The following shows the key steps to launch qemu-kvm with kni-vhost:
-
-    .. code-block:: bash
-
-        #!/bin/bash
-        echo 1 > /sys/class/net/vEth0/sock_en
-        fd=`cat /sys/class/net/vEth0/sock_fd`
-        qemu-kvm \
-        -name vm1 -cpu host -m 2048 -smp 1 -hda /opt/vm-fc16.img \
-        -netdev tap,fd=$fd,id=hostnet1,vhost=on \
-        -device virti-net-pci,netdev=hostnet1,id=net1,bus=pci.0,addr=0x4
-
-It is simple to enable raw socket using sysfs sock_en and get raw socket fd using sock_fd under the KNI device node.
-
-Then, using the qemu-kvm command with the -netdev option to assign such raw socket fd as vhost's backend.
-
-.. note::
-
-    The key word tap must exist as qemu-kvm now only supports vhost with a tap backend, so here we cheat qemu-kvm by an existing fd.
-
-Compatibility Configure Option
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-There is a CONFIG_RTE_KNI_VHOST_VNET_HDR_EN configuration option in DPDK configuration file.
-By default, it set to n, which means do not turn on the virtio net header,
-which is used to support additional features (such as, csum offload, vlan offload, generic-segmentation and so on),
-since the kni-vhost does not yet support those features.
-
-Even if the option is turned on, kni-vhost will ignore the information that the header contains.
-When working with legacy virtio on the guest, it is better to turn off unsupported offload features using ethtool -K.
-Otherwise, there may be problems such as an incorrect L4 checksum error.
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 9d4dfcc..66ca596 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -113,12 +113,6 @@ Deprecation Notices
   has different feature set, meaning functions like ``rte_vhost_feature_disable``
   need be changed. Last, file rte_virtio_net.h will be renamed to rte_vhost.h.
 
-* kni: Remove :ref:`kni_vhost_backend-label` feature (KNI_VHOST) in 17.05 release.
-  :doc:`Vhost Library </prog_guide/vhost_lib>` is currently preferred method for
-  guest - host communication. Just for clarification, this is not to remove KNI
-  or VHOST feature, but KNI_VHOST which is a KNI feature enabled via a compile
-  time option, and disabled by default.
-
 * ABI changes are planned for 17.05 in the ``rte_cryptodev_ops`` structure.
   A pointer to a rte_cryptodev_config structure will be added to the
   function prototype ``cryptodev_configure_t``, as a new parameter.
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 59929b0..e25ea9f 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -137,6 +137,8 @@ Removed Items
    Also, make sure to start the actual text at the margin.
    =========================================================
 
+* KNI vhost support removed.
+
 
 Shared Library Versions
 -----------------------
diff --git a/lib/librte_eal/linuxapp/kni/Makefile b/lib/librte_eal/linuxapp/kni/Makefile
index 3c22b63..7864a2a 100644
--- a/lib/librte_eal/linuxapp/kni/Makefile
+++ b/lib/librte_eal/linuxapp/kni/Makefile
@@ -61,7 +61,6 @@ DEPDIRS-y += lib/librte_eal/linuxapp/eal
 #
 SRCS-y := kni_misc.c
 SRCS-y += kni_net.c
-SRCS-$(CONFIG_RTE_KNI_VHOST) += kni_vhost.c
 SRCS-$(CONFIG_RTE_KNI_KMOD_ETHTOOL) += kni_ethtool.c
 
 SRCS-$(CONFIG_RTE_KNI_KMOD_ETHTOOL) += ethtool/ixgbe/ixgbe_main.c
diff --git a/lib/librte_eal/linuxapp/kni/kni_dev.h b/lib/librte_eal/linuxapp/kni/kni_dev.h
index 58cbadd..002e5fa 100644
--- a/lib/librte_eal/linuxapp/kni/kni_dev.h
+++ b/lib/librte_eal/linuxapp/kni/kni_dev.h
@@ -37,10 +37,6 @@
 #include <linux/spinlock.h>
 #include <linux/list.h>
 
-#ifdef RTE_KNI_VHOST
-#include <net/sock.h>
-#endif
-
 #include <exec-env/rte_kni_common.h>
 #define KNI_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
 
@@ -102,15 +98,6 @@ struct kni_dev {
 	/* synchro for request processing */
 	unsigned long synchro;
 
-#ifdef RTE_KNI_VHOST
-	struct kni_vhost_queue *vhost_queue;
-
-	volatile enum {
-		BE_STOP = 0x1,
-		BE_START = 0x2,
-		BE_FINISH = 0x4,
-	} vq_status;
-#endif
 	/* buffers */
 	void *pa[MBUF_BURST_SZ];
 	void *va[MBUF_BURST_SZ];
@@ -118,26 +105,6 @@ struct kni_dev {
 	void *alloc_va[MBUF_BURST_SZ];
 };
 
-#ifdef RTE_KNI_VHOST
-uint32_t
-kni_poll(struct file *file, struct socket *sock, poll_table * wait);
-int kni_chk_vhost_rx(struct kni_dev *kni);
-int kni_vhost_init(struct kni_dev *kni);
-int kni_vhost_backend_release(struct kni_dev *kni);
-
-struct kni_vhost_queue {
-	struct sock sk;
-	struct socket *sock;
-	int vnet_hdr_sz;
-	struct kni_dev *kni;
-	int sockfd;
-	uint32_t flags;
-	struct sk_buff *cache;
-	struct rte_kni_fifo *fifo;
-};
-
-#endif
-
 void kni_net_rx(struct kni_dev *kni);
 void kni_net_init(struct net_device *dev);
 void kni_net_config_lo_mode(char *lo_str);
diff --git a/lib/librte_eal/linuxapp/kni/kni_fifo.h b/lib/librte_eal/linuxapp/kni/kni_fifo.h
index 025ec1c..14f4141 100644
--- a/lib/librte_eal/linuxapp/kni/kni_fifo.h
+++ b/lib/librte_eal/linuxapp/kni/kni_fifo.h
@@ -91,18 +91,4 @@ kni_fifo_free_count(struct rte_kni_fifo *fifo)
 	return (fifo->read - fifo->write - 1) & (fifo->len - 1);
 }
 
-#ifdef RTE_KNI_VHOST
-/**
- * Initializes the kni fifo structure
- */
-static inline void
-kni_fifo_init(struct rte_kni_fifo *fifo, uint32_t size)
-{
-	fifo->write = 0;
-	fifo->read = 0;
-	fifo->len = size;
-	fifo->elem_size = sizeof(void *);
-}
-#endif
-
 #endif /* _KNI_FIFO_H_ */
diff --git a/lib/librte_eal/linuxapp/kni/kni_misc.c b/lib/librte_eal/linuxapp/kni/kni_misc.c
index 33b61f2..f1f6bea 100644
--- a/lib/librte_eal/linuxapp/kni/kni_misc.c
+++ b/lib/librte_eal/linuxapp/kni/kni_misc.c
@@ -140,11 +140,7 @@ kni_thread_single(void *data)
 		down_read(&knet->kni_list_lock);
 		for (j = 0; j < KNI_RX_LOOP_NUM; j++) {
 			list_for_each_entry(dev, &knet->kni_list_head, list) {
-#ifdef RTE_KNI_VHOST
-				kni_chk_vhost_rx(dev);
-#else
 				kni_net_rx(dev);
-#endif
 				kni_net_poll_resp(dev);
 			}
 		}
@@ -167,11 +163,7 @@ kni_thread_multiple(void *param)
 
 	while (!kthread_should_stop()) {
 		for (j = 0; j < KNI_RX_LOOP_NUM; j++) {
-#ifdef RTE_KNI_VHOST
-			kni_chk_vhost_rx(dev);
-#else
 			kni_net_rx(dev);
-#endif
 			kni_net_poll_resp(dev);
 		}
 #ifdef RTE_KNI_PREEMPT_DEFAULT
@@ -248,9 +240,6 @@ kni_release(struct inode *inode, struct file *file)
 			dev->pthread = NULL;
 		}
 
-#ifdef RTE_KNI_VHOST
-		kni_vhost_backend_release(dev);
-#endif
 		kni_dev_remove(dev);
 		list_del(&dev->list);
 	}
@@ -397,10 +386,6 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 	kni->sync_va = dev_info.sync_va;
 	kni->sync_kva = phys_to_virt(dev_info.sync_phys);
 
-#ifdef RTE_KNI_VHOST
-	kni->vhost_queue = NULL;
-	kni->vq_status = BE_STOP;
-#endif
 	kni->mbuf_size = dev_info.mbuf_size;
 
 	pr_debug("tx_phys:      0x%016llx, tx_q addr:      0x%p\n",
@@ -490,10 +475,6 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 		return -ENODEV;
 	}
 
-#ifdef RTE_KNI_VHOST
-	kni_vhost_init(kni);
-#endif
-
 	ret = kni_run_thread(knet, kni, dev_info.force_bind);
 	if (ret != 0)
 		return ret;
@@ -537,9 +518,6 @@ kni_ioctl_release(struct net *net, uint32_t ioctl_num,
 			dev->pthread = NULL;
 		}
 
-#ifdef RTE_KNI_VHOST
-		kni_vhost_backend_release(dev);
-#endif
 		kni_dev_remove(dev);
 		list_del(&dev->list);
 		ret = 0;
diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c b/lib/librte_eal/linuxapp/kni/kni_net.c
index 4ac99cf..db9f489 100644
--- a/lib/librte_eal/linuxapp/kni/kni_net.c
+++ b/lib/librte_eal/linuxapp/kni/kni_net.c
@@ -198,18 +198,6 @@ kni_net_config(struct net_device *dev, struct ifmap *map)
 /*
  * Transmit a packet (called by the kernel)
  */
-#ifdef RTE_KNI_VHOST
-static int
-kni_net_tx(struct sk_buff *skb, struct net_device *dev)
-{
-	struct kni_dev *kni = netdev_priv(dev);
-
-	dev_kfree_skb(skb);
-	kni->stats.tx_dropped++;
-
-	return NETDEV_TX_OK;
-}
-#else
 static int
 kni_net_tx(struct sk_buff *skb, struct net_device *dev)
 {
@@ -289,7 +277,6 @@ kni_net_tx(struct sk_buff *skb, struct net_device *dev)
 
 	return NETDEV_TX_OK;
 }
-#endif
 
 /*
  * RX: normal working mode
diff --git a/lib/librte_eal/linuxapp/kni/kni_vhost.c b/lib/librte_eal/linuxapp/kni/kni_vhost.c
deleted file mode 100644
index f54c34b..0000000
--- a/lib/librte_eal/linuxapp/kni/kni_vhost.c
+++ /dev/null
@@ -1,842 +0,0 @@
-/*-
- * GPL LICENSE SUMMARY
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *
- *   This program is free software; you can redistribute it and/or modify
- *   it under the terms of version 2 of the GNU General Public License as
- *   published by the Free Software Foundation.
- *
- *   This program is distributed in the hope that it will be useful, but
- *   WITHOUT ANY WARRANTY; without even the implied warranty of
- *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- *   General Public License for more details.
- *
- *   You should have received a copy of the GNU General Public License
- *   along with this program; if not, write to the Free Software
- *   Foundation, Inc., 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
- *   The full GNU General Public License is included in this distribution
- *   in the file called LICENSE.GPL.
- *
- *   Contact Information:
- *   Intel Corporation
- */
-
-#include <linux/module.h>
-#include <linux/net.h>
-#include <net/sock.h>
-#include <linux/virtio_net.h>
-#include <linux/wait.h>
-#include <linux/mm.h>
-#include <linux/nsproxy.h>
-#include <linux/sched.h>
-#include <linux/if_tun.h>
-#include <linux/version.h>
-#include <linux/file.h>
-
-#include "compat.h"
-#include "kni_dev.h"
-#include "kni_fifo.h"
-
-#define RX_BURST_SZ 4
-
-#ifdef HAVE_STATIC_SOCK_MAP_FD
-static int kni_sock_map_fd(struct socket *sock)
-{
-	struct file *file;
-	int fd = get_unused_fd_flags(0);
-
-	if (fd < 0)
-		return fd;
-
-	file = sock_alloc_file(sock, 0, NULL);
-	if (IS_ERR(file)) {
-		put_unused_fd(fd);
-		return PTR_ERR(file);
-	}
-	fd_install(fd, file);
-	return fd;
-}
-#endif
-
-static struct proto kni_raw_proto = {
-	.name = "kni_vhost",
-	.owner = THIS_MODULE,
-	.obj_size = sizeof(struct kni_vhost_queue),
-};
-
-static inline int
-kni_vhost_net_tx(struct kni_dev *kni, struct msghdr *m,
-		 uint32_t offset, uint32_t len)
-{
-	struct rte_kni_mbuf *pkt_kva = NULL;
-	struct rte_kni_mbuf *pkt_va = NULL;
-	int ret;
-
-	pr_debug("tx offset=%d, len=%d, iovlen=%d\n",
-#ifdef HAVE_IOV_ITER_MSGHDR
-		   offset, len, (int)m->msg_iter.iov->iov_len);
-#else
-		   offset, len, (int)m->msg_iov->iov_len);
-#endif
-
-	/**
-	 * Check if it has at least one free entry in tx_q and
-	 * one entry in alloc_q.
-	 */
-	if (kni_fifo_free_count(kni->tx_q) == 0 ||
-	    kni_fifo_count(kni->alloc_q) == 0) {
-		/**
-		 * If no free entry in tx_q or no entry in alloc_q,
-		 * drops skb and goes out.
-		 */
-		goto drop;
-	}
-
-	/* dequeue a mbuf from alloc_q */
-	ret = kni_fifo_get(kni->alloc_q, (void **)&pkt_va, 1);
-	if (likely(ret == 1)) {
-		void *data_kva;
-
-		pkt_kva = (void *)pkt_va - kni->mbuf_va + kni->mbuf_kva;
-		data_kva = pkt_kva->buf_addr + pkt_kva->data_off
-			- kni->mbuf_va + kni->mbuf_kva;
-
-#ifdef HAVE_IOV_ITER_MSGHDR
-		copy_from_iter(data_kva, len, &m->msg_iter);
-#else
-		memcpy_fromiovecend(data_kva, m->msg_iov, offset, len);
-#endif
-
-		if (unlikely(len < ETH_ZLEN)) {
-			memset(data_kva + len, 0, ETH_ZLEN - len);
-			len = ETH_ZLEN;
-		}
-		pkt_kva->pkt_len = len;
-		pkt_kva->data_len = len;
-
-		/* enqueue mbuf into tx_q */
-		ret = kni_fifo_put(kni->tx_q, (void **)&pkt_va, 1);
-		if (unlikely(ret != 1)) {
-			/* Failing should not happen */
-			pr_err("Fail to enqueue mbuf into tx_q\n");
-			goto drop;
-		}
-	} else {
-		/* Failing should not happen */
-		pr_err("Fail to dequeue mbuf from alloc_q\n");
-		goto drop;
-	}
-
-	/* update statistics */
-	kni->stats.tx_bytes += len;
-	kni->stats.tx_packets++;
-
-	return 0;
-
-drop:
-	/* update statistics */
-	kni->stats.tx_dropped++;
-
-	return 0;
-}
-
-static inline int
-kni_vhost_net_rx(struct kni_dev *kni, struct msghdr *m,
-		 uint32_t offset, uint32_t len)
-{
-	uint32_t pkt_len;
-	struct rte_kni_mbuf *kva;
-	struct rte_kni_mbuf *va;
-	void *data_kva;
-	struct sk_buff *skb;
-	struct kni_vhost_queue *q = kni->vhost_queue;
-
-	if (unlikely(q == NULL))
-		return 0;
-
-	/* ensure at least one entry in free_q */
-	if (unlikely(kni_fifo_free_count(kni->free_q) == 0))
-		return 0;
-
-	skb = skb_dequeue(&q->sk.sk_receive_queue);
-	if (unlikely(skb == NULL))
-		return 0;
-
-	kva = (struct rte_kni_mbuf *)skb->data;
-
-	/* free skb to cache */
-	skb->data = NULL;
-	if (unlikely(kni_fifo_put(q->fifo, (void **)&skb, 1) != 1))
-		/* Failing should not happen */
-		pr_err("Fail to enqueue entries into rx cache fifo\n");
-
-	pkt_len = kva->data_len;
-	if (unlikely(pkt_len > len))
-		goto drop;
-
-	pr_debug("rx offset=%d, len=%d, pkt_len=%d, iovlen=%d\n",
-#ifdef HAVE_IOV_ITER_MSGHDR
-		   offset, len, pkt_len, (int)m->msg_iter.iov->iov_len);
-#else
-		   offset, len, pkt_len, (int)m->msg_iov->iov_len);
-#endif
-
-	data_kva = kva->buf_addr + kva->data_off - kni->mbuf_va + kni->mbuf_kva;
-#ifdef HAVE_IOV_ITER_MSGHDR
-	if (unlikely(copy_to_iter(data_kva, pkt_len, &m->msg_iter)))
-#else
-	if (unlikely(memcpy_toiovecend(m->msg_iov, data_kva, offset, pkt_len)))
-#endif
-		goto drop;
-
-	/* Update statistics */
-	kni->stats.rx_bytes += pkt_len;
-	kni->stats.rx_packets++;
-
-	/* enqueue mbufs into free_q */
-	va = (void *)kva - kni->mbuf_kva + kni->mbuf_va;
-	if (unlikely(kni_fifo_put(kni->free_q, (void **)&va, 1) != 1))
-		/* Failing should not happen */
-		pr_err("Fail to enqueue entries into free_q\n");
-
-	pr_debug("receive done %d\n", pkt_len);
-
-	return pkt_len;
-
-drop:
-	/* Update drop statistics */
-	kni->stats.rx_dropped++;
-
-	return 0;
-}
-
-static uint32_t
-kni_sock_poll(struct file *file, struct socket *sock, poll_table *wait)
-{
-	struct kni_vhost_queue *q =
-		container_of(sock->sk, struct kni_vhost_queue, sk);
-	struct kni_dev *kni;
-	uint32_t mask = 0;
-
-	if (unlikely(q == NULL || q->kni == NULL))
-		return POLLERR;
-
-	kni = q->kni;
-#ifdef HAVE_SOCKET_WQ
-	pr_debug("start kni_poll on group %d, wq 0x%16llx\n",
-		  kni->group_id, (uint64_t)sock->wq);
-	poll_wait(file, &sock->wq->wait, wait);
-#else
-	pr_debug("start kni_poll on group %d, wait at 0x%16llx\n",
-		  kni->group_id, (uint64_t)&sock->wait);
-	poll_wait(file, &sock->wait, wait);
-#endif
-
-	if (kni_fifo_count(kni->rx_q) > 0)
-		mask |= POLLIN | POLLRDNORM;
-
-	if (sock_writeable(&q->sk) ||
-#ifdef SOCKWQ_ASYNC_NOSPACE
-		(!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &q->sock->flags) &&
-			sock_writeable(&q->sk)))
-#else
-		(!test_and_set_bit(SOCK_ASYNC_NOSPACE, &q->sock->flags) &&
-			sock_writeable(&q->sk)))
-#endif
-		mask |= POLLOUT | POLLWRNORM;
-
-	return mask;
-}
-
-static inline void
-kni_vhost_enqueue(struct kni_dev *kni, struct kni_vhost_queue *q,
-		  struct sk_buff *skb, struct rte_kni_mbuf *va)
-{
-	struct rte_kni_mbuf *kva;
-
-	kva = (void *)(va) - kni->mbuf_va + kni->mbuf_kva;
-	(skb)->data = (unsigned char *)kva;
-	(skb)->len = kva->data_len;
-	skb_queue_tail(&q->sk.sk_receive_queue, skb);
-}
-
-static inline void
-kni_vhost_enqueue_burst(struct kni_dev *kni, struct kni_vhost_queue *q,
-	  struct sk_buff **skb, struct rte_kni_mbuf **va)
-{
-	int i;
-
-	for (i = 0; i < RX_BURST_SZ; skb++, va++, i++)
-		kni_vhost_enqueue(kni, q, *skb, *va);
-}
-
-int
-kni_chk_vhost_rx(struct kni_dev *kni)
-{
-	struct kni_vhost_queue *q = kni->vhost_queue;
-	uint32_t nb_in, nb_mbuf, nb_skb;
-	const uint32_t BURST_MASK = RX_BURST_SZ - 1;
-	uint32_t nb_burst, nb_backlog, i;
-	struct sk_buff *skb[RX_BURST_SZ];
-	struct rte_kni_mbuf *va[RX_BURST_SZ];
-
-	if (unlikely(BE_STOP & kni->vq_status)) {
-		kni->vq_status |= BE_FINISH;
-		return 0;
-	}
-
-	if (unlikely(q == NULL))
-		return 0;
-
-	nb_skb = kni_fifo_count(q->fifo);
-	nb_mbuf = kni_fifo_count(kni->rx_q);
-
-	nb_in = min(nb_mbuf, nb_skb);
-	nb_in = min_t(uint32_t, nb_in, RX_BURST_SZ);
-	nb_burst   = (nb_in & ~BURST_MASK);
-	nb_backlog = (nb_in & BURST_MASK);
-
-	/* enqueue skb_queue per BURST_SIZE bulk */
-	if (nb_burst != 0) {
-		if (unlikely(kni_fifo_get(kni->rx_q, (void **)&va, RX_BURST_SZ)
-				!= RX_BURST_SZ))
-			goto except;
-
-		if (unlikely(kni_fifo_get(q->fifo, (void **)&skb, RX_BURST_SZ)
-				!= RX_BURST_SZ))
-			goto except;
-
-		kni_vhost_enqueue_burst(kni, q, skb, va);
-	}
-
-	/* all leftover, do one by one */
-	for (i = 0; i < nb_backlog; ++i) {
-		if (unlikely(kni_fifo_get(kni->rx_q, (void **)&va, 1) != 1))
-			goto except;
-
-		if (unlikely(kni_fifo_get(q->fifo, (void **)&skb, 1) != 1))
-			goto except;
-
-		kni_vhost_enqueue(kni, q, *skb, *va);
-	}
-
-	/* Ondemand wake up */
-	if ((nb_in == RX_BURST_SZ) || (nb_skb == 0) ||
-	    ((nb_mbuf < RX_BURST_SZ) && (nb_mbuf != 0))) {
-		wake_up_interruptible_poll(sk_sleep(&q->sk),
-				   POLLIN | POLLRDNORM | POLLRDBAND);
-		pr_debug("RX CHK KICK nb_mbuf %d, nb_skb %d, nb_in %d\n",
-			   nb_mbuf, nb_skb, nb_in);
-	}
-
-	return 0;
-
-except:
-	/* Failing should not happen */
-	pr_err("Fail to enqueue fifo, it shouldn't happen\n");
-	BUG_ON(1);
-
-	return 0;
-}
-
-static int
-#ifdef HAVE_KIOCB_MSG_PARAM
-kni_sock_sndmsg(struct kiocb *iocb, struct socket *sock,
-	   struct msghdr *m, size_t total_len)
-#else
-kni_sock_sndmsg(struct socket *sock,
-	   struct msghdr *m, size_t total_len)
-#endif /* HAVE_KIOCB_MSG_PARAM */
-{
-	struct kni_vhost_queue *q =
-		container_of(sock->sk, struct kni_vhost_queue, sk);
-	int vnet_hdr_len = 0;
-	unsigned long len = total_len;
-
-	if (unlikely(q == NULL || q->kni == NULL))
-		return 0;
-
-	pr_debug("kni_sndmsg len %ld, flags 0x%08x, nb_iov %d\n",
-#ifdef HAVE_IOV_ITER_MSGHDR
-		   len, q->flags, (int)m->msg_iter.iov->iov_len);
-#else
-		   len, q->flags, (int)m->msg_iovlen);
-#endif
-
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-	if (likely(q->flags & IFF_VNET_HDR)) {
-		vnet_hdr_len = q->vnet_hdr_sz;
-		if (unlikely(len < vnet_hdr_len))
-			return -EINVAL;
-		len -= vnet_hdr_len;
-	}
-#endif
-
-	if (unlikely(len < ETH_HLEN + q->vnet_hdr_sz))
-		return -EINVAL;
-
-	return kni_vhost_net_tx(q->kni, m, vnet_hdr_len, len);
-}
-
-static int
-#ifdef HAVE_KIOCB_MSG_PARAM
-kni_sock_rcvmsg(struct kiocb *iocb, struct socket *sock,
-	   struct msghdr *m, size_t len, int flags)
-#else
-kni_sock_rcvmsg(struct socket *sock,
-	   struct msghdr *m, size_t len, int flags)
-#endif /* HAVE_KIOCB_MSG_PARAM */
-{
-	int vnet_hdr_len = 0;
-	int pkt_len = 0;
-	struct kni_vhost_queue *q =
-		container_of(sock->sk, struct kni_vhost_queue, sk);
-	static struct virtio_net_hdr
-		__attribute__ ((unused)) vnet_hdr = {
-		.flags = 0,
-		.gso_type = VIRTIO_NET_HDR_GSO_NONE
-	};
-
-	if (unlikely(q == NULL || q->kni == NULL))
-		return 0;
-
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-	if (likely(q->flags & IFF_VNET_HDR)) {
-		vnet_hdr_len = q->vnet_hdr_sz;
-		len -= vnet_hdr_len;
-		if (len < 0)
-			return -EINVAL;
-	}
-#endif
-
-	pkt_len = kni_vhost_net_rx(q->kni, m, vnet_hdr_len, len);
-	if (unlikely(pkt_len == 0))
-		return 0;
-
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-	/* no need to copy hdr when no pkt received */
-#ifdef HAVE_IOV_ITER_MSGHDR
-	if (unlikely(copy_to_iter((void *)&vnet_hdr, vnet_hdr_len,
-		&m->msg_iter)))
-#else
-	if (unlikely(memcpy_toiovecend(m->msg_iov,
-		(void *)&vnet_hdr, 0, vnet_hdr_len)))
-#endif /* HAVE_IOV_ITER_MSGHDR */
-		return -EFAULT;
-#endif /* RTE_KNI_VHOST_VNET_HDR_EN */
-	pr_debug("kni_rcvmsg expect_len %ld, flags 0x%08x, pkt_len %d\n",
-		   (unsigned long)len, q->flags, pkt_len);
-
-	return pkt_len + vnet_hdr_len;
-}
-
-/* dummy tap like ioctl */
-static int
-kni_sock_ioctl(struct socket *sock, uint32_t cmd, unsigned long arg)
-{
-	void __user *argp = (void __user *)arg;
-	struct ifreq __user *ifr = argp;
-	uint32_t __user *up = argp;
-	struct kni_vhost_queue *q =
-		container_of(sock->sk, struct kni_vhost_queue, sk);
-	struct kni_dev *kni;
-	uint32_t u;
-	int __user *sp = argp;
-	int s;
-	int ret;
-
-	pr_debug("tap ioctl cmd 0x%08x\n", cmd);
-
-	switch (cmd) {
-	case TUNSETIFF:
-		pr_debug("TUNSETIFF\n");
-		/* ignore the name, just look at flags */
-		if (get_user(u, &ifr->ifr_flags))
-			return -EFAULT;
-
-		ret = 0;
-		if ((u & ~IFF_VNET_HDR) != (IFF_NO_PI | IFF_TAP))
-			ret = -EINVAL;
-		else
-			q->flags = u;
-
-		return ret;
-
-	case TUNGETIFF:
-		pr_debug("TUNGETIFF\n");
-		rcu_read_lock_bh();
-		kni = rcu_dereference_bh(q->kni);
-		if (kni)
-			dev_hold(kni->net_dev);
-		rcu_read_unlock_bh();
-
-		if (!kni)
-			return -ENOLINK;
-
-		ret = 0;
-		if (copy_to_user(&ifr->ifr_name, kni->net_dev->name, IFNAMSIZ)
-				|| put_user(q->flags, &ifr->ifr_flags))
-			ret = -EFAULT;
-		dev_put(kni->net_dev);
-		return ret;
-
-	case TUNGETFEATURES:
-		pr_debug("TUNGETFEATURES\n");
-		u = IFF_TAP | IFF_NO_PI;
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-		u |= IFF_VNET_HDR;
-#endif
-		if (put_user(u, up))
-			return -EFAULT;
-		return 0;
-
-	case TUNSETSNDBUF:
-		pr_debug("TUNSETSNDBUF\n");
-		if (get_user(u, up))
-			return -EFAULT;
-
-		q->sk.sk_sndbuf = u;
-		return 0;
-
-	case TUNGETVNETHDRSZ:
-		s = q->vnet_hdr_sz;
-		if (put_user(s, sp))
-			return -EFAULT;
-		pr_debug("TUNGETVNETHDRSZ %d\n", s);
-		return 0;
-
-	case TUNSETVNETHDRSZ:
-		if (get_user(s, sp))
-			return -EFAULT;
-		if (s < (int)sizeof(struct virtio_net_hdr))
-			return -EINVAL;
-
-		pr_debug("TUNSETVNETHDRSZ %d\n", s);
-		q->vnet_hdr_sz = s;
-		return 0;
-
-	case TUNSETOFFLOAD:
-		pr_debug("TUNSETOFFLOAD %lx\n", arg);
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-		/* not support any offload yet */
-		if (!(q->flags & IFF_VNET_HDR))
-			return  -EINVAL;
-
-		return 0;
-#else
-		return -EINVAL;
-#endif
-
-	default:
-		pr_debug("NOT SUPPORT\n");
-		return -EINVAL;
-	}
-}
-
-static int
-kni_sock_compat_ioctl(struct socket *sock, uint32_t cmd,
-		     unsigned long arg)
-{
-	/* 32 bits app on 64 bits OS to be supported later */
-	pr_debug("Not implemented.\n");
-
-	return -EINVAL;
-}
-
-#define KNI_VHOST_WAIT_WQ_SAFE()                        \
-do {							\
-	while ((BE_FINISH | BE_STOP) == kni->vq_status) \
-		msleep(1);				\
-} while (0)						\
-
-
-static int
-kni_sock_release(struct socket *sock)
-{
-	struct kni_vhost_queue *q =
-		container_of(sock->sk, struct kni_vhost_queue, sk);
-	struct kni_dev *kni;
-
-	if (q == NULL)
-		return 0;
-
-	kni = q->kni;
-	if (kni != NULL) {
-		kni->vq_status = BE_STOP;
-		KNI_VHOST_WAIT_WQ_SAFE();
-		kni->vhost_queue = NULL;
-		q->kni = NULL;
-	}
-
-	if (q->sockfd != -1)
-		q->sockfd = -1;
-
-	sk_set_socket(&q->sk, NULL);
-	sock->sk = NULL;
-
-	sock_put(&q->sk);
-
-	pr_debug("dummy sock release done\n");
-
-	return 0;
-}
-
-int
-kni_sock_getname(struct socket *sock, struct sockaddr *addr,
-		int *sockaddr_len, int peer)
-{
-	pr_debug("dummy sock getname\n");
-	((struct sockaddr_ll *)addr)->sll_family = AF_PACKET;
-	return 0;
-}
-
-static const struct proto_ops kni_socket_ops = {
-	.getname = kni_sock_getname,
-	.sendmsg = kni_sock_sndmsg,
-	.recvmsg = kni_sock_rcvmsg,
-	.release = kni_sock_release,
-	.poll    = kni_sock_poll,
-	.ioctl   = kni_sock_ioctl,
-	.compat_ioctl = kni_sock_compat_ioctl,
-};
-
-static void
-kni_sk_write_space(struct sock *sk)
-{
-	wait_queue_head_t *wqueue;
-
-	if (!sock_writeable(sk) ||
-#ifdef SOCKWQ_ASYNC_NOSPACE
-	    !test_and_clear_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags))
-#else
-	    !test_and_clear_bit(SOCK_ASYNC_NOSPACE, &sk->sk_socket->flags))
-#endif
-		return;
-	wqueue = sk_sleep(sk);
-	if (wqueue && waitqueue_active(wqueue))
-		wake_up_interruptible_poll(
-			wqueue, POLLOUT | POLLWRNORM | POLLWRBAND);
-}
-
-static void
-kni_sk_destruct(struct sock *sk)
-{
-	struct kni_vhost_queue *q =
-		container_of(sk, struct kni_vhost_queue, sk);
-
-	if (!q)
-		return;
-
-	/* make sure there's no packet in buffer */
-	while (skb_dequeue(&sk->sk_receive_queue) != NULL)
-		;
-
-	mb();
-
-	if (q->fifo != NULL) {
-		kfree(q->fifo);
-		q->fifo = NULL;
-	}
-
-	if (q->cache != NULL) {
-		kfree(q->cache);
-		q->cache = NULL;
-	}
-}
-
-static int
-kni_vhost_backend_init(struct kni_dev *kni)
-{
-	struct kni_vhost_queue *q;
-	struct net *net = current->nsproxy->net_ns;
-	int err, i, sockfd;
-	struct rte_kni_fifo *fifo;
-	struct sk_buff *elem;
-
-	if (kni->vhost_queue != NULL)
-		return -1;
-
-#ifdef HAVE_SK_ALLOC_KERN_PARAM
-	q = (struct kni_vhost_queue *)sk_alloc(net, AF_UNSPEC, GFP_KERNEL,
-			&kni_raw_proto, 0);
-#else
-	q = (struct kni_vhost_queue *)sk_alloc(net, AF_UNSPEC, GFP_KERNEL,
-			&kni_raw_proto);
-#endif
-	if (!q)
-		return -ENOMEM;
-
-	err = sock_create_lite(AF_UNSPEC, SOCK_RAW, IPPROTO_RAW, &q->sock);
-	if (err)
-		goto free_sk;
-
-	sockfd = kni_sock_map_fd(q->sock);
-	if (sockfd < 0) {
-		err = sockfd;
-		goto free_sock;
-	}
-
-	/* cache init */
-	q->cache = kzalloc(
-		RTE_KNI_VHOST_MAX_CACHE_SIZE * sizeof(struct sk_buff),
-		GFP_KERNEL);
-	if (!q->cache)
-		goto free_fd;
-
-	fifo = kzalloc(RTE_KNI_VHOST_MAX_CACHE_SIZE * sizeof(void *)
-			+ sizeof(struct rte_kni_fifo), GFP_KERNEL);
-	if (!fifo)
-		goto free_cache;
-
-	kni_fifo_init(fifo, RTE_KNI_VHOST_MAX_CACHE_SIZE);
-
-	for (i = 0; i < RTE_KNI_VHOST_MAX_CACHE_SIZE; i++) {
-		elem = &q->cache[i];
-		kni_fifo_put(fifo, (void **)&elem, 1);
-	}
-	q->fifo = fifo;
-
-	/* store sockfd in vhost_queue */
-	q->sockfd = sockfd;
-
-	/* init socket */
-	q->sock->type = SOCK_RAW;
-	q->sock->state = SS_CONNECTED;
-	q->sock->ops = &kni_socket_ops;
-	sock_init_data(q->sock, &q->sk);
-
-	/* init sock data */
-	q->sk.sk_write_space = kni_sk_write_space;
-	q->sk.sk_destruct = kni_sk_destruct;
-	q->flags = IFF_NO_PI | IFF_TAP;
-	q->vnet_hdr_sz = sizeof(struct virtio_net_hdr);
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-	q->flags |= IFF_VNET_HDR;
-#endif
-
-	/* bind kni_dev with vhost_queue */
-	q->kni = kni;
-	kni->vhost_queue = q;
-
-	wmb();
-
-	kni->vq_status = BE_START;
-
-#ifdef HAVE_SOCKET_WQ
-	pr_debug("backend init sockfd=%d, sock->wq=0x%16llx,sk->sk_wq=0x%16llx",
-		  q->sockfd, (uint64_t)q->sock->wq,
-		  (uint64_t)q->sk.sk_wq);
-#else
-	pr_debug("backend init sockfd=%d, sock->wait at 0x%16llx,sk->sk_sleep=0x%16llx",
-		  q->sockfd, (uint64_t)&q->sock->wait,
-		  (uint64_t)q->sk.sk_sleep);
-#endif
-
-	return 0;
-
-free_cache:
-	kfree(q->cache);
-	q->cache = NULL;
-
-free_fd:
-	put_unused_fd(sockfd);
-
-free_sock:
-	q->kni = NULL;
-	kni->vhost_queue = NULL;
-	kni->vq_status |= BE_FINISH;
-	sock_release(q->sock);
-	q->sock->ops = NULL;
-	q->sock = NULL;
-
-free_sk:
-	sk_free((struct sock *)q);
-
-	return err;
-}
-
-/* kni vhost sock sysfs */
-static ssize_t
-show_sock_fd(struct device *dev, struct device_attribute *attr,
-	     char *buf)
-{
-	struct net_device *net_dev = container_of(dev, struct net_device, dev);
-	struct kni_dev *kni = netdev_priv(net_dev);
-	int sockfd = -1;
-
-	if (kni->vhost_queue != NULL)
-		sockfd = kni->vhost_queue->sockfd;
-	return snprintf(buf, 10, "%d\n", sockfd);
-}
-
-static ssize_t
-show_sock_en(struct device *dev, struct device_attribute *attr,
-	     char *buf)
-{
-	struct net_device *net_dev = container_of(dev, struct net_device, dev);
-	struct kni_dev *kni = netdev_priv(net_dev);
-
-	return snprintf(buf, 10, "%u\n", (kni->vhost_queue == NULL ? 0 : 1));
-}
-
-static ssize_t
-set_sock_en(struct device *dev, struct device_attribute *attr,
-	      const char *buf, size_t count)
-{
-	struct net_device *net_dev = container_of(dev, struct net_device, dev);
-	struct kni_dev *kni = netdev_priv(net_dev);
-	unsigned long en;
-	int err = 0;
-
-	if (kstrtoul(buf, 0, &en) != 0)
-		return -EINVAL;
-
-	if (en)
-		err = kni_vhost_backend_init(kni);
-
-	return err ? err : count;
-}
-
-static DEVICE_ATTR(sock_fd, S_IRUGO | S_IRUSR, show_sock_fd, NULL);
-static DEVICE_ATTR(sock_en, S_IRUGO | S_IWUSR, show_sock_en, set_sock_en);
-static struct attribute *dev_attrs[] = {
-	&dev_attr_sock_fd.attr,
-	&dev_attr_sock_en.attr,
-	NULL,
-};
-
-static const struct attribute_group dev_attr_grp = {
-	.attrs = dev_attrs,
-};
-
-int
-kni_vhost_backend_release(struct kni_dev *kni)
-{
-	struct kni_vhost_queue *q = kni->vhost_queue;
-
-	if (q == NULL)
-		return 0;
-
-	/* dettach from kni */
-	q->kni = NULL;
-
-	pr_debug("release backend done\n");
-
-	return 0;
-}
-
-int
-kni_vhost_init(struct kni_dev *kni)
-{
-	struct net_device *dev = kni->net_dev;
-
-	if (sysfs_create_group(&dev->dev.kobj, &dev_attr_grp))
-		sysfs_remove_group(&dev->dev.kobj, &dev_attr_grp);
-
-	kni->vq_status = BE_STOP;
-
-	pr_debug("kni_vhost_init done\n");
-
-	return 0;
-}
-- 
2.9.3

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v2 1/2] doc: add removed items section to release notes
  2017-02-15 13:15  1% [dpdk-dev] [PATCH] kni: remove KNI vhost support Ferruh Yigit
@ 2017-02-20 14:30  5% ` Ferruh Yigit
  2017-02-20 14:30  1%   ` [dpdk-dev] [PATCH v2 2/2] kni: remove KNI vhost support Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2017-02-20 14:30 UTC (permalink / raw)
  To: Thomas Monjalon, John McNamara; +Cc: dev, Bruce Richardson, Ferruh Yigit

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 doc/guides/rel_notes/release_17_05.rst | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 48fb5bd..59929b0 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -125,6 +125,18 @@ ABI Changes
    =========================================================
 
 
+Removed Items
+-------------
+
+.. This section should contain removed items in this release. Sample format:
+
+   * Add a short 1-2 sentence description of the removed item in the past
+     tense.
+
+   This section is a comment. do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =========================================================
+
 
 Shared Library Versions
 -----------------------
-- 
2.9.3

^ permalink raw reply	[relevance 5%]

* [dpdk-dev] [PATCH] lpm: extend IPv6 next hop field
@ 2017-02-19 17:14  4% Vladyslav Buslov
  2017-02-21 14:46  4% ` [dpdk-dev] [PATCH v2] " Vladyslav Buslov
  0 siblings, 1 reply; 200+ results
From: Vladyslav Buslov @ 2017-02-19 17:14 UTC (permalink / raw)
  To: bruce.richardson; +Cc: dev

This patch extends the next_hop field from 8 bits to 21 bits in the LPM
library for IPv6.

Versioning symbols were added to the affected functions, and the library
and applications that depend on the LPM library were updated.

Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com>
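
For reference, the versioning pattern used in this patch (condensed from the
rte_lpm6.c hunks below; the macros come from rte_compat.h) keeps the old
8-bit ABI alive while newly built binaries get the 21-bit variant:

    /* old binaries keep resolving rte_lpm6_lookup to the v2.0 symbol */
    int rte_lpm6_lookup_v20(const struct rte_lpm6 *lpm, uint8_t *ip,
            uint8_t *next_hop);
    VERSION_SYMBOL(rte_lpm6_lookup, _v20, 2.0);

    /* newly linked code gets the 21-bit-capable variant by default */
    int rte_lpm6_lookup_v1705(const struct rte_lpm6 *lpm, uint8_t *ip,
            uint32_t *next_hop);
    BIND_DEFAULT_SYMBOL(rte_lpm6_lookup, _v1705, 17.05);
    MAP_STATIC_SYMBOL(int rte_lpm6_lookup(const struct rte_lpm6 *lpm,
            uint8_t *ip, uint32_t *next_hop), rte_lpm6_lookup_v1705);
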
---
 app/test/test_lpm6.c                            | 114 ++++++++++++++------
 app/test/test_lpm6_perf.c                       |   4 +-
 doc/guides/prog_guide/lpm6_lib.rst              |   2 +-
 doc/guides/rel_notes/release_17_05.rst          |   5 +
 examples/ip_fragmentation/main.c                |  16 +--
 examples/ip_reassembly/main.c                   |  16 +--
 examples/ipsec-secgw/ipsec-secgw.c              |   2 +-
 examples/l3fwd/l3fwd_lpm_sse.h                  |  20 ++--
 examples/performance-thread/l3fwd-thread/main.c |   9 +-
 lib/librte_lpm/rte_lpm6.c                       | 133 +++++++++++++++++++++---
 lib/librte_lpm/rte_lpm6.h                       |  29 +++++-
 lib/librte_lpm/rte_lpm_version.map              |  10 ++
 lib/librte_table/rte_table_lpm_ipv6.c           |   9 +-
 13 files changed, 282 insertions(+), 87 deletions(-)

diff --git a/app/test/test_lpm6.c b/app/test/test_lpm6.c
index 61134f7..2950aae 100644
--- a/app/test/test_lpm6.c
+++ b/app/test/test_lpm6.c
@@ -79,6 +79,7 @@ static int32_t test24(void);
 static int32_t test25(void);
 static int32_t test26(void);
 static int32_t test27(void);
+static int32_t test28(void);
 
 rte_lpm6_test tests6[] = {
 /* Test Cases */
@@ -110,6 +111,7 @@ rte_lpm6_test tests6[] = {
 	test25,
 	test26,
 	test27,
+	test28,
 };
 
 #define NUM_LPM6_TESTS                (sizeof(tests6)/sizeof(tests6[0]))
@@ -354,7 +356,7 @@ test6(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t next_hop_return = 0;
+	uint32_t next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -392,7 +394,7 @@ test7(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[10][16];
-	int16_t next_hop_return[10];
+	int32_t next_hop_return[10];
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -469,7 +471,8 @@ test9(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 16, next_hop_add = 100, next_hop_return = 0;
+	uint8_t depth = 16;
+	uint32_t next_hop_add = 100, next_hop_return = 0;
 	int32_t status = 0;
 	uint8_t i;
 
@@ -513,7 +516,8 @@ test10(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 	int i;
 
@@ -557,7 +561,8 @@ test11(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -617,7 +622,8 @@ test12(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -655,7 +661,8 @@ test13(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add = 100;
+	uint8_t depth;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 
 	config.max_rules = 2;
@@ -702,7 +709,8 @@ test14(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 25, next_hop_add = 100;
+	uint8_t depth = 25;
+	uint32_t next_hop_add = 100;
 	int32_t status = 0;
 	int i;
 
@@ -748,7 +756,8 @@ test15(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 24, next_hop_add = 100, next_hop_return = 0;
+	uint8_t depth = 24;
+	uint32_t next_hop_add = 100, next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -784,7 +793,8 @@ test16(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[] = {12,12,1,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth = 128, next_hop_add = 100, next_hop_return = 0;
+	uint8_t depth = 128;
+	uint32_t next_hop_add = 100, next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -828,7 +838,8 @@ test17(void)
 	uint8_t ip1[] = {127,255,255,255,255,255,255,255,255,
 			255,255,255,255,255,255,255};
 	uint8_t ip2[] = {128,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -857,7 +868,7 @@ test17(void)
 
 	/* Loop with rte_lpm6_delete. */
 	for (depth = 16; depth >= 1; depth--) {
-		next_hop_add = (uint8_t) (depth - 1);
+		next_hop_add = (depth - 1);
 
 		status = rte_lpm6_delete(lpm, ip2, depth);
 		TEST_LPM_ASSERT(status == 0);
@@ -893,8 +904,9 @@ test18(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[16], ip_1[16], ip_2[16];
-	uint8_t depth, depth_1, depth_2, next_hop_add, next_hop_add_1,
-		next_hop_add_2, next_hop_return;
+	uint8_t depth, depth_1, depth_2;
+	uint32_t next_hop_add, next_hop_add_1,
+			next_hop_add_2, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1055,7 +1067,8 @@ test19(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[16];
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1253,7 +1266,8 @@ test20(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip[16];
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1320,8 +1334,9 @@ test21(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip_batch[4][16];
-	uint8_t depth, next_hop_add;
-	int16_t next_hop_return[4];
+	uint8_t depth;
+	uint32_t next_hop_add;
+	int32_t next_hop_return[4];
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1378,8 +1393,9 @@ test22(void)
 	struct rte_lpm6 *lpm = NULL;
 	struct rte_lpm6_config config;
 	uint8_t ip_batch[5][16];
-	uint8_t depth[5], next_hop_add;
-	int16_t next_hop_return[5];
+	uint8_t depth[5];
+	uint32_t next_hop_add;
+	int32_t next_hop_return[5];
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1495,7 +1511,8 @@ test23(void)
 	struct rte_lpm6_config config;
 	uint32_t i;
 	uint8_t ip[16];
-	uint8_t depth, next_hop_add, next_hop_return;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1579,7 +1596,8 @@ test25(void)
 	struct rte_lpm6_config config;
 	uint8_t ip[16];
 	uint32_t i;
-	uint8_t depth, next_hop_add, next_hop_return, next_hop_expected;
+	uint8_t depth;
+	uint32_t next_hop_add, next_hop_return, next_hop_expected;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1632,10 +1650,10 @@ test26(void)
 	uint8_t d_ip_10_32 = 32;
 	uint8_t	d_ip_10_24 = 24;
 	uint8_t	d_ip_20_25 = 25;
-	uint8_t next_hop_ip_10_32 = 100;
-	uint8_t	next_hop_ip_10_24 = 105;
-	uint8_t	next_hop_ip_20_25 = 111;
-	uint8_t next_hop_return = 0;
+	uint32_t next_hop_ip_10_32 = 100;
+	uint32_t next_hop_ip_10_24 = 105;
+	uint32_t next_hop_ip_20_25 = 111;
+	uint32_t next_hop_return = 0;
 	int32_t status = 0;
 
 	config.max_rules = MAX_RULES;
@@ -1650,7 +1668,7 @@ test26(void)
 		return -1;
 
 	status = rte_lpm6_lookup(lpm, ip_10_32, &next_hop_return);
-	uint8_t test_hop_10_32 = next_hop_return;
+	uint32_t test_hop_10_32 = next_hop_return;
 	TEST_LPM_ASSERT(status == 0);
 	TEST_LPM_ASSERT(next_hop_return == next_hop_ip_10_32);
 
@@ -1659,7 +1677,7 @@ test26(void)
 			return -1;
 
 	status = rte_lpm6_lookup(lpm, ip_10_24, &next_hop_return);
-	uint8_t test_hop_10_24 = next_hop_return;
+	uint32_t test_hop_10_24 = next_hop_return;
 	TEST_LPM_ASSERT(status == 0);
 	TEST_LPM_ASSERT(next_hop_return == next_hop_ip_10_24);
 
@@ -1668,7 +1686,7 @@ test26(void)
 		return -1;
 
 	status = rte_lpm6_lookup(lpm, ip_20_25, &next_hop_return);
-	uint8_t test_hop_20_25 = next_hop_return;
+	uint32_t test_hop_20_25 = next_hop_return;
 	TEST_LPM_ASSERT(status == 0);
 	TEST_LPM_ASSERT(next_hop_return == next_hop_ip_20_25);
 
@@ -1707,7 +1725,8 @@ test27(void)
 		struct rte_lpm6 *lpm = NULL;
 		struct rte_lpm6_config config;
 		uint8_t ip[] = {128,128,128,128,128,128,128,128,128,128,128,128,128,128,0,0};
-		uint8_t depth = 128, next_hop_add = 100, next_hop_return;
+		uint8_t depth = 128;
+		uint32_t next_hop_add = 100, next_hop_return;
 		int32_t status = 0;
 		int i, j;
 
@@ -1746,6 +1765,41 @@ test27(void)
 }
 
 /*
+ * Call add, lookup and delete for a single rule with the maximum 21-bit next_hop size.
+ * Check that the next_hop returned from lookup is equal to the provisioned value.
+ * Delete the rule and check that the same lookup returns a miss.
+ */
+int32_t
+test28(void)
+{
+	struct rte_lpm6 *lpm = NULL;
+	struct rte_lpm6_config config;
+	uint8_t ip[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
+	uint8_t depth = 16;
+	uint32_t next_hop_add = 0x001FFFFF, next_hop_return = 0;
+	int32_t status = 0;
+
+	config.max_rules = MAX_RULES;
+	config.number_tbl8s = NUMBER_TBL8S;
+	config.flags = 0;
+
+	lpm = rte_lpm6_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_LPM_ASSERT(lpm != NULL);
+
+	status = rte_lpm6_add(lpm, ip, depth, next_hop_add);
+	TEST_LPM_ASSERT(status == 0);
+
+	status = rte_lpm6_lookup(lpm, ip, &next_hop_return);
+	TEST_LPM_ASSERT((status == 0) && (next_hop_return == next_hop_add));
+
+	status = rte_lpm6_delete(lpm, ip, depth);
+	TEST_LPM_ASSERT(status == 0);
+	rte_lpm6_free(lpm);
+
+	return PASS;
+}
+
+/*
  * Do all unit tests.
  */
 static int
diff --git a/app/test/test_lpm6_perf.c b/app/test/test_lpm6_perf.c
index 0723081..30be430 100644
--- a/app/test/test_lpm6_perf.c
+++ b/app/test/test_lpm6_perf.c
@@ -86,7 +86,7 @@ test_lpm6_perf(void)
 	struct rte_lpm6_config config;
 	uint64_t begin, total_time;
 	unsigned i, j;
-	uint8_t next_hop_add = 0xAA, next_hop_return = 0;
+	uint32_t next_hop_add = 0xAA, next_hop_return = 0;
 	int status = 0;
 	int64_t count = 0;
 
@@ -148,7 +148,7 @@ test_lpm6_perf(void)
 	count = 0;
 
 	uint8_t ip_batch[NUM_IPS_ENTRIES][16];
-	int16_t next_hops[NUM_IPS_ENTRIES];
+	int32_t next_hops[NUM_IPS_ENTRIES];
 
 	for (i = 0; i < NUM_IPS_ENTRIES; i++)
 		memcpy(ip_batch[i], large_ips_table[i].ip, 16);
diff --git a/doc/guides/prog_guide/lpm6_lib.rst b/doc/guides/prog_guide/lpm6_lib.rst
index 0aea5c5..f791507 100644
--- a/doc/guides/prog_guide/lpm6_lib.rst
+++ b/doc/guides/prog_guide/lpm6_lib.rst
@@ -53,7 +53,7 @@ several thousand IPv6 rules, but the number can vary depending on the case.
 An LPM prefix is represented by a pair of parameters (128-bit key, depth), with depth in the range of 1 to 128.
 An LPM rule is represented by an LPM prefix and some user data associated with the prefix.
 The prefix serves as the unique identifier for the LPM rule.
-In this implementation, the user data is 1-byte long and is called "next hop",
+In this implementation, the user data is 21 bits long and is called "next hop",
 which corresponds to its main use of storing the ID of the next hop in a routing table entry.
 
 The main methods exported for the LPM component are:
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 48fb5bd..723e085 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -41,6 +41,9 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Increased number of next hops for LPM IPv6 to 2^21.**
+
+  The next_hop field is extended from 8 bits to 21 bits for IPv6.
 
 Resolved Issues
 ---------------
@@ -110,6 +113,8 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =========================================================
 
+* The LPM ``next_hop`` field is extended from 8 bits to 21 bits for IPv6
+  while keeping ABI compatibility.
 
 ABI Changes
 -----------
diff --git a/examples/ip_fragmentation/main.c b/examples/ip_fragmentation/main.c
index e1e32c6..51035f5 100644
--- a/examples/ip_fragmentation/main.c
+++ b/examples/ip_fragmentation/main.c
@@ -265,8 +265,8 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 		uint8_t queueid, uint8_t port_in)
 {
 	struct rx_queue *rxq;
-	uint32_t i, len, next_hop_ipv4;
-	uint8_t next_hop_ipv6, port_out, ipv6;
+	uint32_t i, len, next_hop;
+	uint8_t port_out, ipv6;
 	int32_t len2;
 
 	ipv6 = 0;
@@ -290,9 +290,9 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 		ip_dst = rte_be_to_cpu_32(ip_hdr->dst_addr);
 
 		/* Find destination port */
-		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop_ipv4) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv4) != 0) {
-			port_out = next_hop_ipv4;
+		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			port_out = next_hop;
 
 			/* Build transmission burst for new port */
 			len = qconf->tx_mbufs[port_out].len;
@@ -326,9 +326,9 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 		ip_hdr = rte_pktmbuf_mtod(m, struct ipv6_hdr *);
 
 		/* Find destination port */
-		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr, &next_hop_ipv6) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv6) != 0) {
-			port_out = next_hop_ipv6;
+		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr, &next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			port_out = next_hop;
 
 			/* Build transmission burst for new port */
 			len = qconf->tx_mbufs[port_out].len;
diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
index 50fe422..50730a2 100644
--- a/examples/ip_reassembly/main.c
+++ b/examples/ip_reassembly/main.c
@@ -346,8 +346,8 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 	struct rte_ip_frag_death_row *dr;
 	struct rx_queue *rxq;
 	void *d_addr_bytes;
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6, dst_port;
+	uint32_t next_hop;
+	uint8_t dst_port;
 
 	rxq = &qconf->rx_queue_list[queue];
 
@@ -390,9 +390,9 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 		ip_dst = rte_be_to_cpu_32(ip_hdr->dst_addr);
 
 		/* Find destination port */
-		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop_ipv4) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv4) != 0) {
-			dst_port = next_hop_ipv4;
+		if (rte_lpm_lookup(rxq->lpm, ip_dst, &next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			dst_port = next_hop;
 		}
 
 		eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv4);
@@ -427,9 +427,9 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 		}
 
 		/* Find destination port */
-		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr, &next_hop_ipv6) == 0 &&
-				(enabled_port_mask & 1 << next_hop_ipv6) != 0) {
-			dst_port = next_hop_ipv6;
+		if (rte_lpm6_lookup(rxq->lpm6, ip_hdr->dst_addr, &next_hop) == 0 &&
+				(enabled_port_mask & 1 << next_hop) != 0) {
+			dst_port = next_hop;
 		}
 
 		eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv6);
diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c
index 5a4c9b7..5744c46 100644
--- a/examples/ipsec-secgw/ipsec-secgw.c
+++ b/examples/ipsec-secgw/ipsec-secgw.c
@@ -618,7 +618,7 @@ route4_pkts(struct rt_ctx *rt_ctx, struct rte_mbuf *pkts[], uint8_t nb_pkts)
 static inline void
 route6_pkts(struct rt_ctx *rt_ctx, struct rte_mbuf *pkts[], uint8_t nb_pkts)
 {
-	int16_t hop[MAX_PKT_BURST * 2];
+	int32_t hop[MAX_PKT_BURST * 2];
 	uint8_t dst_ip[MAX_PKT_BURST * 2][16];
 	uint8_t *ip6_dst;
 	uint16_t i, offset;
diff --git a/examples/l3fwd/l3fwd_lpm_sse.h b/examples/l3fwd/l3fwd_lpm_sse.h
index 538fe3d..1ef70d3 100644
--- a/examples/l3fwd/l3fwd_lpm_sse.h
+++ b/examples/l3fwd/l3fwd_lpm_sse.h
@@ -40,8 +40,7 @@ static inline __attribute__((always_inline)) uint16_t
 lpm_get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		uint8_t portid)
 {
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6;
+	uint32_t next_hop;
 	struct ipv6_hdr *ipv6_hdr;
 	struct ipv4_hdr *ipv4_hdr;
 	struct ether_hdr *eth_hdr;
@@ -52,8 +51,8 @@ lpm_get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
 
 		return (uint16_t) ((rte_lpm_lookup(qconf->ipv4_lookup_struct,
-				rte_be_to_cpu_32(ipv4_hdr->dst_addr), &next_hop_ipv4) == 0) ?
-						next_hop_ipv4 : portid);
+				rte_be_to_cpu_32(ipv4_hdr->dst_addr), &next_hop) == 0) ?
+						next_hop : portid);
 
 	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
 
@@ -61,8 +60,8 @@ lpm_get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		ipv6_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
 
 		return (uint16_t) ((rte_lpm6_lookup(qconf->ipv6_lookup_struct,
-				ipv6_hdr->dst_addr, &next_hop_ipv6) == 0)
-				? next_hop_ipv6 : portid);
+				ipv6_hdr->dst_addr, &next_hop) == 0)
+				? next_hop : portid);
 
 	}
 
@@ -78,14 +77,13 @@ static inline __attribute__((always_inline)) uint16_t
 lpm_get_dst_port_with_ipv4(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 	uint32_t dst_ipv4, uint8_t portid)
 {
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6;
+	uint32_t next_hop;
 	struct ipv6_hdr *ipv6_hdr;
 	struct ether_hdr *eth_hdr;
 
 	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
 		return (uint16_t) ((rte_lpm_lookup(qconf->ipv4_lookup_struct, dst_ipv4,
-			&next_hop_ipv4) == 0) ? next_hop_ipv4 : portid);
+			&next_hop) == 0) ? next_hop : portid);
 
 	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
 
@@ -93,8 +91,8 @@ lpm_get_dst_port_with_ipv4(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 		ipv6_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
 
 		return (uint16_t) ((rte_lpm6_lookup(qconf->ipv6_lookup_struct,
-				ipv6_hdr->dst_addr, &next_hop_ipv6) == 0)
-				? next_hop_ipv6 : portid);
+				ipv6_hdr->dst_addr, &next_hop) == 0)
+				? next_hop : portid);
 
 	}
 
diff --git a/examples/performance-thread/l3fwd-thread/main.c b/examples/performance-thread/l3fwd-thread/main.c
index 53083df..510d6e8 100644
--- a/examples/performance-thread/l3fwd-thread/main.c
+++ b/examples/performance-thread/l3fwd-thread/main.c
@@ -909,7 +909,7 @@ static inline uint8_t
 get_ipv6_dst_port(void *ipv6_hdr,  uint8_t portid,
 		lookup6_struct_t *ipv6_l3fwd_lookup_struct)
 {
-	uint8_t next_hop;
+	uint32_t next_hop;
 
 	return (uint8_t) ((rte_lpm6_lookup(ipv6_l3fwd_lookup_struct,
 			((struct ipv6_hdr *)ipv6_hdr)->dst_addr, &next_hop) == 0) ?
@@ -1396,15 +1396,14 @@ rfc1812_process(struct ipv4_hdr *ipv4_hdr, uint16_t *dp, uint32_t ptype)
 static inline __attribute__((always_inline)) uint16_t
 get_dst_port(struct rte_mbuf *pkt, uint32_t dst_ipv4, uint8_t portid)
 {
-	uint32_t next_hop_ipv4;
-	uint8_t next_hop_ipv6;
+	uint32_t next_hop;
 	struct ipv6_hdr *ipv6_hdr;
 	struct ether_hdr *eth_hdr;
 
 	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
 		return (uint16_t) ((rte_lpm_lookup(
 				RTE_PER_LCORE(lcore_conf)->ipv4_lookup_struct, dst_ipv4,
-				&next_hop_ipv4) == 0) ? next_hop_ipv4 : portid);
+				&next_hop) == 0) ? next_hop : portid);
 
 	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
 
@@ -1413,7 +1412,7 @@ get_dst_port(struct rte_mbuf *pkt, uint32_t dst_ipv4, uint8_t portid)
 
 		return (uint16_t) ((rte_lpm6_lookup(
 				RTE_PER_LCORE(lcore_conf)->ipv6_lookup_struct,
-				ipv6_hdr->dst_addr, &next_hop_ipv6) == 0) ? next_hop_ipv6 :
+				ipv6_hdr->dst_addr, &next_hop) == 0) ? next_hop :
 						portid);
 
 	}
diff --git a/lib/librte_lpm/rte_lpm6.c b/lib/librte_lpm/rte_lpm6.c
index 32fdba0..8915fff 100644
--- a/lib/librte_lpm/rte_lpm6.c
+++ b/lib/librte_lpm/rte_lpm6.c
@@ -97,7 +97,7 @@ struct rte_lpm6_tbl_entry {
 /** Rules tbl entry structure. */
 struct rte_lpm6_rule {
 	uint8_t ip[RTE_LPM6_IPV6_ADDR_SIZE]; /**< Rule IP address. */
-	uint8_t next_hop; /**< Rule next hop. */
+	uint32_t next_hop; /**< Rule next hop. */
 	uint8_t depth; /**< Rule depth. */
 };
 
@@ -297,7 +297,7 @@ rte_lpm6_free(struct rte_lpm6 *lpm)
  * the nexthop if so. Otherwise it adds a new rule if enough space is available.
  */
 static inline int32_t
-rule_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t next_hop, uint8_t depth)
+rule_add(struct rte_lpm6 *lpm, uint8_t *ip, uint32_t next_hop, uint8_t depth)
 {
 	uint32_t rule_index;
 
@@ -340,7 +340,7 @@ rule_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t next_hop, uint8_t depth)
  */
 static void
 expand_rule(struct rte_lpm6 *lpm, uint32_t tbl8_gindex, uint8_t depth,
-		uint8_t next_hop)
+		uint32_t next_hop)
 {
 	uint32_t tbl8_group_end, tbl8_gindex_next, j;
 
@@ -377,7 +377,7 @@ expand_rule(struct rte_lpm6 *lpm, uint32_t tbl8_gindex, uint8_t depth,
 static inline int
 add_step(struct rte_lpm6 *lpm, struct rte_lpm6_tbl_entry *tbl,
 		struct rte_lpm6_tbl_entry **tbl_next, uint8_t *ip, uint8_t bytes,
-		uint8_t first_byte, uint8_t depth, uint8_t next_hop)
+		uint8_t first_byte, uint8_t depth, uint32_t next_hop)
 {
 	uint32_t tbl_index, tbl_range, tbl8_group_start, tbl8_group_end, i;
 	int32_t tbl8_gindex;
@@ -507,9 +507,17 @@ add_step(struct rte_lpm6 *lpm, struct rte_lpm6_tbl_entry *tbl,
  * Add a route
  */
 int
-rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+rte_lpm6_add_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 		uint8_t next_hop)
 {
+	return rte_lpm6_add_v1705(lpm, ip, depth, next_hop);
+}
+VERSION_SYMBOL(rte_lpm6_add, _v20, 2.0);
+
+int
+rte_lpm6_add_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t next_hop)
+{
 	struct rte_lpm6_tbl_entry *tbl;
 	struct rte_lpm6_tbl_entry *tbl_next;
 	int32_t rule_index;
@@ -560,6 +568,9 @@ rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 
 	return status;
 }
+BIND_DEFAULT_SYMBOL(rte_lpm6_add, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+				uint32_t next_hop), rte_lpm6_add_v1705);
 
 /*
  * Takes a pointer to a table entry and inspect one level.
@@ -569,7 +580,7 @@ rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 static inline int
 lookup_step(const struct rte_lpm6 *lpm, const struct rte_lpm6_tbl_entry *tbl,
 		const struct rte_lpm6_tbl_entry **tbl_next, uint8_t *ip,
-		uint8_t first_byte, uint8_t *next_hop)
+		uint8_t first_byte, uint32_t *next_hop)
 {
 	uint32_t tbl8_index, tbl_entry;
 
@@ -589,7 +600,7 @@ lookup_step(const struct rte_lpm6 *lpm, const struct rte_lpm6_tbl_entry *tbl,
 		return 1;
 	} else {
 		/* If not extended then we can have a match. */
-		*next_hop = (uint8_t)tbl_entry;
+		*next_hop = ((uint32_t)tbl_entry & RTE_LPM6_TBL8_BITMASK);
 		return (tbl_entry & RTE_LPM6_LOOKUP_SUCCESS) ? 0 : -ENOENT;
 	}
 }
@@ -598,7 +609,26 @@ lookup_step(const struct rte_lpm6 *lpm, const struct rte_lpm6_tbl_entry *tbl,
  * Looks up an IP
  */
 int
-rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop)
+rte_lpm6_lookup_v20(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop)
+{
+	uint32_t next_hop32 = 0;
+	int32_t status;
+
+	/* DEBUG: Check user input arguments. */
+	if (next_hop == NULL) {
+		return -EINVAL;
+	}
+
+	status = rte_lpm6_lookup_v1705(lpm, ip, &next_hop32);
+	if (status == 0)
+		*next_hop = (uint8_t)next_hop32;
+
+	return status;
+}
+VERSION_SYMBOL(rte_lpm6_lookup, _v20, 2.0);
+
+int
+rte_lpm6_lookup_v1705(const struct rte_lpm6 *lpm, uint8_t *ip, uint32_t *next_hop)
 {
 	const struct rte_lpm6_tbl_entry *tbl;
 	const struct rte_lpm6_tbl_entry *tbl_next = NULL;
@@ -625,20 +655,23 @@ rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop)
 
 	return status;
 }
+BIND_DEFAULT_SYMBOL(rte_lpm6_lookup, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip,
+				uint32_t *next_hop), rte_lpm6_lookup_v1705);
 
 /*
  * Looks up a group of IP addresses
  */
 int
-rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
+rte_lpm6_lookup_bulk_func_v20(const struct rte_lpm6 *lpm,
 		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
 		int16_t * next_hops, unsigned n)
 {
 	unsigned i;
 	const struct rte_lpm6_tbl_entry *tbl;
 	const struct rte_lpm6_tbl_entry *tbl_next = NULL;
-	uint32_t tbl24_index;
-	uint8_t first_byte, next_hop;
+	uint32_t tbl24_index, next_hop;
+	uint8_t first_byte;
 	int status;
 
 	/* DEBUG: Check user input arguments. */
@@ -664,11 +697,58 @@ rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
 		if (status < 0)
 			next_hops[i] = -1;
 		else
-			next_hops[i] = next_hop;
+			next_hops[i] = (int16_t)next_hop;
 	}
 
 	return 0;
 }
+VERSION_SYMBOL(rte_lpm6_lookup_bulk_func, _v20, 2.0);
+
+int
+rte_lpm6_lookup_bulk_func_v1705(const struct rte_lpm6 *lpm,
+		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+		int32_t * next_hops, unsigned n)
+{
+	unsigned i;
+	const struct rte_lpm6_tbl_entry *tbl;
+	const struct rte_lpm6_tbl_entry *tbl_next = NULL;
+	uint32_t tbl24_index, next_hop;
+	uint8_t first_byte;
+	int status;
+
+	/* DEBUG: Check user input arguments. */
+	if ((lpm == NULL) || (ips == NULL) || (next_hops == NULL)) {
+		return -EINVAL;
+	}
+
+	for (i = 0; i < n; i++) {
+		first_byte = LOOKUP_FIRST_BYTE;
+		tbl24_index = (ips[i][0] << BYTES2_SIZE) |
+				(ips[i][1] << BYTE_SIZE) | ips[i][2];
+
+		/* Calculate pointer to the first entry to be inspected */
+		tbl = &lpm->tbl24[tbl24_index];
+
+		do {
+			/* Continue inspecting following levels until success or failure */
+			status = lookup_step(lpm, tbl, &tbl_next, ips[i], first_byte++,
+					&next_hop);
+			tbl = tbl_next;
+		} while (status == 1);
+
+		if (status < 0)
+			next_hops[i] = -1;
+		else
+			next_hops[i] = (int32_t)next_hop;
+	}
+
+	return 0;
+}
+BIND_DEFAULT_SYMBOL(rte_lpm6_lookup_bulk_func, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
+				uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+				int32_t * next_hops, unsigned n),
+		rte_lpm6_lookup_bulk_func_v1705);
 
 /*
  * Finds a rule in rule table.
@@ -698,8 +778,29 @@ rule_find(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth)
  * Look for a rule in the high-level rules table
  */
 int
-rte_lpm6_is_rule_present(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
-uint8_t *next_hop)
+rte_lpm6_is_rule_present_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint8_t *next_hop)
+{
+	uint32_t next_hop32 = 0;
+	int32_t status;
+
+	/* DEBUG: Check user input arguments. */
+	if (next_hop == NULL) {
+		return -EINVAL;
+	}
+
+	status = rte_lpm6_is_rule_present_v1705(lpm, ip, depth, &next_hop32);
+	if (status > 0)
+		*next_hop = (uint8_t)next_hop32;
+
+	return status;
+
+}
+VERSION_SYMBOL(rte_lpm6_is_rule_present, _v20, 2.0);
+
+int
+rte_lpm6_is_rule_present_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t *next_hop)
 {
 	uint8_t ip_masked[RTE_LPM6_IPV6_ADDR_SIZE];
 	int32_t rule_index;
@@ -724,6 +825,10 @@ uint8_t *next_hop)
 	/* If rule is not found return 0. */
 	return 0;
 }
+BIND_DEFAULT_SYMBOL(rte_lpm6_is_rule_present, _v1705, 17.05);
+MAP_STATIC_SYMBOL(int rte_lpm6_is_rule_present(struct rte_lpm6 *lpm, uint8_t *ip,
+				uint8_t depth, uint32_t *next_hop),
+		rte_lpm6_is_rule_present_v1705);
 
 /*
  * Delete a rule from the rule table.
diff --git a/lib/librte_lpm/rte_lpm6.h b/lib/librte_lpm/rte_lpm6.h
index 13d027f..0ab54d4 100644
--- a/lib/librte_lpm/rte_lpm6.h
+++ b/lib/librte_lpm/rte_lpm6.h
@@ -39,6 +39,7 @@
  */
 
 #include <stdint.h>
+#include <rte_compat.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -123,7 +124,13 @@ rte_lpm6_free(struct rte_lpm6 *lpm);
  */
 int
 rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t next_hop);
+int
+rte_lpm6_add_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
 		uint8_t next_hop);
+int
+rte_lpm6_add_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t next_hop);
 
 /**
  * Check if a rule is present in the LPM table,
@@ -142,7 +149,13 @@ rte_lpm6_add(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
  */
 int
 rte_lpm6_is_rule_present(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
-uint8_t *next_hop);
+		uint32_t *next_hop);
+int
+rte_lpm6_is_rule_present_v20(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint8_t *next_hop);
+int
+rte_lpm6_is_rule_present_v1705(struct rte_lpm6 *lpm, uint8_t *ip, uint8_t depth,
+		uint32_t *next_hop);
 
 /**
  * Delete a rule from the LPM table.
@@ -199,7 +212,11 @@ rte_lpm6_delete_all(struct rte_lpm6 *lpm);
  *   -EINVAL for incorrect arguments, -ENOENT on lookup miss, 0 on lookup hit
  */
 int
-rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop);
+rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint32_t *next_hop);
+int
+rte_lpm6_lookup_v20(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop);
+int
+rte_lpm6_lookup_v1705(const struct rte_lpm6 *lpm, uint8_t *ip, uint32_t *next_hop);
 
 /**
  * Lookup multiple IP addresses in an LPM table.
@@ -220,7 +237,15 @@ rte_lpm6_lookup(const struct rte_lpm6 *lpm, uint8_t *ip, uint8_t *next_hop);
 int
 rte_lpm6_lookup_bulk_func(const struct rte_lpm6 *lpm,
 		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+		int32_t * next_hops, unsigned n);
+int
+rte_lpm6_lookup_bulk_func_v20(const struct rte_lpm6 *lpm,
+		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
 		int16_t * next_hops, unsigned n);
+int
+rte_lpm6_lookup_bulk_func_v1705(const struct rte_lpm6 *lpm,
+		uint8_t ips[][RTE_LPM6_IPV6_ADDR_SIZE],
+		int32_t * next_hops, unsigned n);
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_lpm/rte_lpm_version.map b/lib/librte_lpm/rte_lpm_version.map
index 239b371..90beac8 100644
--- a/lib/librte_lpm/rte_lpm_version.map
+++ b/lib/librte_lpm/rte_lpm_version.map
@@ -34,3 +34,13 @@ DPDK_16.04 {
 	rte_lpm_delete_all;
 
 } DPDK_2.0;
+
+DPDK_17.05 {
+	global:
+
+	rte_lpm6_add;
+	rte_lpm6_is_rule_present;
+	rte_lpm6_lookup;
+	rte_lpm6_lookup_bulk_func;
+
+} DPDK_16.04;
diff --git a/lib/librte_table/rte_table_lpm_ipv6.c b/lib/librte_table/rte_table_lpm_ipv6.c
index 836f4cf..1e1a173 100644
--- a/lib/librte_table/rte_table_lpm_ipv6.c
+++ b/lib/librte_table/rte_table_lpm_ipv6.c
@@ -211,9 +211,8 @@ rte_table_lpm_ipv6_entry_add(
 	struct rte_table_lpm_ipv6 *lpm = (struct rte_table_lpm_ipv6 *) table;
 	struct rte_table_lpm_ipv6_key *ip_prefix =
 		(struct rte_table_lpm_ipv6_key *) key;
-	uint32_t nht_pos, nht_pos0_valid;
+	uint32_t nht_pos, nht_pos0, nht_pos0_valid;
 	int status;
-	uint8_t nht_pos0;
 
 	/* Check input parameters */
 	if (lpm == NULL) {
@@ -256,7 +255,7 @@ rte_table_lpm_ipv6_entry_add(
 
 	/* Add rule to low level LPM table */
 	if (rte_lpm6_add(lpm->lpm, ip_prefix->ip, ip_prefix->depth,
-		(uint8_t) nht_pos) < 0) {
+		nht_pos) < 0) {
 		RTE_LOG(ERR, TABLE, "%s: LPM IPv6 rule add failed\n", __func__);
 		return -1;
 	}
@@ -280,7 +279,7 @@ rte_table_lpm_ipv6_entry_delete(
 	struct rte_table_lpm_ipv6 *lpm = (struct rte_table_lpm_ipv6 *) table;
 	struct rte_table_lpm_ipv6_key *ip_prefix =
 		(struct rte_table_lpm_ipv6_key *) key;
-	uint8_t nht_pos;
+	uint32_t nht_pos;
 	int status;
 
 	/* Check input parameters */
@@ -356,7 +355,7 @@ rte_table_lpm_ipv6_lookup(
 			uint8_t *ip = RTE_MBUF_METADATA_UINT8_PTR(pkt,
 				lpm->offset);
 			int status;
-			uint8_t nht_pos;
+			uint32_t nht_pos;
 
 			status = rte_lpm6_lookup(lpm->lpm, ip, &nht_pos);
 			if (status == 0) {
-- 
2.1.4

^ permalink raw reply	[relevance 4%]
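
A note on the versioning macros used in the patch above: VERSION_SYMBOL,
BIND_DEFAULT_SYMBOL and MAP_STATIC_SYMBOL come from rte_compat.h and let a
single shared object export both the old and the new binding of a function.
The sketch below shows the pattern on a hypothetical function demo_get; the
name, its context argument and the wrapper logic are illustrative only,
while the macro usage mirrors what the patch does for rte_lpm6_lookup.

    #include <stdint.h>
    #include <errno.h>
    #include <rte_compat.h>

    int demo_get_v20(void *ctx, uint8_t *out);
    int demo_get_v1705(void *ctx, uint32_t *out);

    /* Old ABI: a thin wrapper that narrows the wider result for
     * binaries built against the uint8_t prototype. */
    int
    demo_get_v20(void *ctx, uint8_t *out)
    {
            uint32_t wide = 0;
            int ret;

            if (out == NULL)
                    return -EINVAL;

            ret = demo_get_v1705(ctx, &wide);
            if (ret == 0)
                    *out = (uint8_t)wide;
            return ret;
    }
    /* Exported as demo_get@DPDK_2.0 for already-linked binaries. */
    VERSION_SYMBOL(demo_get, _v20, 2.0);

    /* New ABI: the real implementation with the widened type. */
    int
    demo_get_v1705(void *ctx, uint32_t *out)
    {
            (void)ctx;
            *out = 0;       /* a real lookup would fill this in */
            return 0;
    }
    /* demo_get@@DPDK_17.05 becomes the default for new links. */
    BIND_DEFAULT_SYMBOL(demo_get, _v1705, 17.05);
    MAP_STATIC_SYMBOL(int demo_get(void *ctx, uint32_t *out),
                    demo_get_v1705);

As in the patch, the versioned function also has to appear in a new
DPDK_17.05 node of the library's version map, otherwise the linker will
not emit the versioned symbols.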

* [dpdk-dev] [PATCH 3/3] doc: remove deprecation notice
  @ 2017-02-17 12:01  5% ` Fan Zhang
  0 siblings, 0 replies; 200+ results
From: Fan Zhang @ 2017-02-17 12:01 UTC (permalink / raw)
  To: dev; +Cc: pablo.de.lara.guarch

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 doc/guides/rel_notes/deprecation.rst | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 9d4dfcc..3e17b20 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -119,10 +119,6 @@ Deprecation Notices
   or VHOST feature, but KNI_VHOST which is a KNI feature enabled via a compile
   time option, and disabled by default.
 
-* ABI changes are planned for 17.05 in the ``rte_cryptodev_ops`` structure.
-  A pointer to a rte_cryptodev_config structure will be added to the
-  function prototype ``cryptodev_configure_t``, as a new parameter.
-
 * cryptodev: A new parameter ``max_nb_sessions_per_qp`` will be added to
   ``rte_cryptodev_info.sym``. Some drivers may support limited number of
   sessions per queue_pair. With this new parameter application will know
-- 
2.7.4

^ permalink raw reply	[relevance 5%]
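
For context, the ops change the removed notice covered had roughly the
following shape; the prototypes below are a sketch reconstructed from the
wording of the notice, not copied verbatim from the cryptodev headers.

    struct rte_cryptodev;
    struct rte_cryptodev_config;

    /* Before the change, the configure callback received only the
     * device:
     *
     *     typedef int (*cryptodev_configure_t)(struct rte_cryptodev *dev);
     */

    /* After, as announced: a pointer to the device configuration is
     * passed in as a new parameter, which changes the layout consumers
     * of struct rte_cryptodev_ops were built against - hence the ABI
     * notice. */
    typedef int (*cryptodev_configure_t)(struct rte_cryptodev *dev,
                    struct rte_cryptodev_config *config);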

* [dpdk-dev] [PATCH] kni: remove KNI vhost support
@ 2017-02-15 13:15  1% Ferruh Yigit
  2017-02-20 14:30  5% ` [dpdk-dev] [PATCH v2 1/2] doc: add removed items section to release notes Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2017-02-15 13:15 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Ferruh Yigit

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 config/common_base                             |   3 -
 devtools/test-build.sh                         |   1 -
 doc/guides/prog_guide/index.rst                |   4 -
 doc/guides/prog_guide/kernel_nic_interface.rst | 113 ----
 doc/guides/rel_notes/deprecation.rst           |   6 -
 lib/librte_eal/linuxapp/kni/Makefile           |   1 -
 lib/librte_eal/linuxapp/kni/kni_dev.h          |  33 -
 lib/librte_eal/linuxapp/kni/kni_fifo.h         |  14 -
 lib/librte_eal/linuxapp/kni/kni_misc.c         |  22 -
 lib/librte_eal/linuxapp/kni/kni_net.c          |  13 -
 lib/librte_eal/linuxapp/kni/kni_vhost.c        | 842 -------------------------
 11 files changed, 1052 deletions(-)
 delete mode 100644 lib/librte_eal/linuxapp/kni/kni_vhost.c

diff --git a/config/common_base b/config/common_base
index 71a4fcb..aeee13e 100644
--- a/config/common_base
+++ b/config/common_base
@@ -584,9 +584,6 @@ CONFIG_RTE_LIBRTE_KNI=n
 CONFIG_RTE_KNI_KMOD=n
 CONFIG_RTE_KNI_KMOD_ETHTOOL=n
 CONFIG_RTE_KNI_PREEMPT_DEFAULT=y
-CONFIG_RTE_KNI_VHOST=n
-CONFIG_RTE_KNI_VHOST_MAX_CACHE_SIZE=1024
-CONFIG_RTE_KNI_VHOST_VNET_HDR_EN=n
 
 #
 # Compile the pdump library
diff --git a/devtools/test-build.sh b/devtools/test-build.sh
index 0f131fc..84d3165 100755
--- a/devtools/test-build.sh
+++ b/devtools/test-build.sh
@@ -194,7 +194,6 @@ config () # <directory> <target> <options>
 		sed -ri        's,(PMD_OPENSSL=)n,\1y,' $1/.config
 		test "$DPDK_DEP_SSL" != y || \
 		sed -ri            's,(PMD_QAT=)n,\1y,' $1/.config
-		sed -ri        's,(KNI_VHOST.*=)n,\1y,' $1/.config
 		sed -ri           's,(SCHED_.*=)n,\1y,' $1/.config
 		build_config_hook $1 $2 $3
 
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 7f825cb..77f427e 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -127,10 +127,6 @@ Programmer's Guide
 
 :numref:`figure_pkt_flow_kni` :ref:`figure_pkt_flow_kni`
 
-:numref:`figure_vhost_net_arch2` :ref:`figure_vhost_net_arch2`
-
-:numref:`figure_kni_traffic_flow` :ref:`figure_kni_traffic_flow`
-
 
 :numref:`figure_pkt_proc_pipeline_qos` :ref:`figure_pkt_proc_pipeline_qos`
 
diff --git a/doc/guides/prog_guide/kernel_nic_interface.rst b/doc/guides/prog_guide/kernel_nic_interface.rst
index 4f25595..6f7fd28 100644
--- a/doc/guides/prog_guide/kernel_nic_interface.rst
+++ b/doc/guides/prog_guide/kernel_nic_interface.rst
@@ -168,116 +168,3 @@ The application handlers can be registered upon interface creation or explicitly
 This provides flexibility in multiprocess scenarios
 (where the KNI is created in the primary process but the callbacks are handled in the secondary one).
 The constraint is that a single process can register and handle the requests.
-
-.. _kni_vhost_backend-label:
-
-KNI Working as a Kernel vHost Backend
--------------------------------------
-
-vHost is a kernel module usually working as the backend of virtio (a para- virtualization driver framework)
-to accelerate the traffic from the guest to the host.
-The DPDK Kernel NIC interface provides the ability to hookup vHost traffic into userspace DPDK application.
-Together with the DPDK PMD virtio, it significantly improves the throughput between guest and host.
-In the scenario where DPDK is running as fast path in the host, kni-vhost is an efficient path for the traffic.
-
-Overview
-~~~~~~~~
-
-vHost-net has three kinds of real backend implementations. They are: 1) tap, 2) macvtap and 3) RAW socket.
-The main idea behind kni-vhost is making the KNI work as a RAW socket, attaching it as the backend instance of vHost-net.
-It is using the existing interface with vHost-net, so it does not require any kernel hacking,
-and is fully-compatible with the kernel vhost module.
-As vHost is still taking responsibility for communicating with the front-end virtio,
-it naturally supports both legacy virtio -net and the DPDK PMD virtio.
-There is a little penalty that comes from the non-polling mode of vhost.
-However, it scales throughput well when using KNI in multi-thread mode.
-
-.. _figure_vhost_net_arch2:
-
-.. figure:: img/vhost_net_arch.*
-
-   vHost-net Architecture Overview
-
-
-Packet Flow
-~~~~~~~~~~~
-
-There is only a minor difference from the original KNI traffic flows.
-On transmit side, vhost kthread calls the RAW socket's ops sendmsg and it puts the packets into the KNI transmit FIFO.
-On the receive side, the kni kthread gets packets from the KNI receive FIFO, puts them into the queue of the raw socket,
-and wakes up the task in vhost kthread to begin receiving.
-All the packet copying, irrespective of whether it is on the transmit or receive side,
-happens in the context of vhost kthread.
-Every vhost-net device is exposed to a front end virtio device in the guest.
-
-.. _figure_kni_traffic_flow:
-
-.. figure:: img/kni_traffic_flow.*
-
-   KNI Traffic Flow
-
-
-Sample Usage
-~~~~~~~~~~~~
-
-Before starting to use KNI as the backend of vhost, the CONFIG_RTE_KNI_VHOST configuration option must be turned on.
-Otherwise, by default, KNI will not enable its backend support capability.
-
-Of course, as a prerequisite, the vhost/vhost-net kernel CONFIG should be chosen before compiling the kernel.
-
-#.  Compile the DPDK and insert uio_pci_generic/igb_uio kernel modules as normal.
-
-#.  Insert the KNI kernel module:
-
-    .. code-block:: console
-
-        insmod ./rte_kni.ko
-
-    If using KNI in multi-thread mode, use the following command line:
-
-    .. code-block:: console
-
-        insmod ./rte_kni.ko kthread_mode=multiple
-
-#.  Running the KNI sample application:
-
-    .. code-block:: console
-
-        examples/kni/build/app/kni -c -0xf0 -n 4 -- -p 0x3 -P --config="(0,4,6),(1,5,7)"
-
-    This command runs the kni sample application with two physical ports.
-    Each port pins two forwarding cores (ingress/egress) in user space.
-
-#.  Assign a raw socket to vhost-net during qemu-kvm startup.
-    The DPDK does not provide a script to do this since it is easy for the user to customize.
-    The following shows the key steps to launch qemu-kvm with kni-vhost:
-
-    .. code-block:: bash
-
-        #!/bin/bash
-        echo 1 > /sys/class/net/vEth0/sock_en
-        fd=`cat /sys/class/net/vEth0/sock_fd`
-        qemu-kvm \
-        -name vm1 -cpu host -m 2048 -smp 1 -hda /opt/vm-fc16.img \
-        -netdev tap,fd=$fd,id=hostnet1,vhost=on \
-        -device virti-net-pci,netdev=hostnet1,id=net1,bus=pci.0,addr=0x4
-
-It is simple to enable raw socket using sysfs sock_en and get raw socket fd using sock_fd under the KNI device node.
-
-Then, using the qemu-kvm command with the -netdev option to assign such raw socket fd as vhost's backend.
-
-.. note::
-
-    The key word tap must exist as qemu-kvm now only supports vhost with a tap backend, so here we cheat qemu-kvm by an existing fd.
-
-Compatibility Configure Option
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-There is a CONFIG_RTE_KNI_VHOST_VNET_HDR_EN configuration option in DPDK configuration file.
-By default, it set to n, which means do not turn on the virtio net header,
-which is used to support additional features (such as, csum offload, vlan offload, generic-segmentation and so on),
-since the kni-vhost does not yet support those features.
-
-Even if the option is turned on, kni-vhost will ignore the information that the header contains.
-When working with legacy virtio on the guest, it is better to turn off unsupported offload features using ethtool -K.
-Otherwise, there may be problems such as an incorrect L4 checksum error.
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 9d4dfcc..66ca596 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -113,12 +113,6 @@ Deprecation Notices
   has different feature set, meaning functions like ``rte_vhost_feature_disable``
   need be changed. Last, file rte_virtio_net.h will be renamed to rte_vhost.h.
 
-* kni: Remove :ref:`kni_vhost_backend-label` feature (KNI_VHOST) in 17.05 release.
-  :doc:`Vhost Library </prog_guide/vhost_lib>` is currently preferred method for
-  guest - host communication. Just for clarification, this is not to remove KNI
-  or VHOST feature, but KNI_VHOST which is a KNI feature enabled via a compile
-  time option, and disabled by default.
-
 * ABI changes are planned for 17.05 in the ``rte_cryptodev_ops`` structure.
   A pointer to a rte_cryptodev_config structure will be added to the
   function prototype ``cryptodev_configure_t``, as a new parameter.
diff --git a/lib/librte_eal/linuxapp/kni/Makefile b/lib/librte_eal/linuxapp/kni/Makefile
index 3c22b63..7864a2a 100644
--- a/lib/librte_eal/linuxapp/kni/Makefile
+++ b/lib/librte_eal/linuxapp/kni/Makefile
@@ -61,7 +61,6 @@ DEPDIRS-y += lib/librte_eal/linuxapp/eal
 #
 SRCS-y := kni_misc.c
 SRCS-y += kni_net.c
-SRCS-$(CONFIG_RTE_KNI_VHOST) += kni_vhost.c
 SRCS-$(CONFIG_RTE_KNI_KMOD_ETHTOOL) += kni_ethtool.c
 
 SRCS-$(CONFIG_RTE_KNI_KMOD_ETHTOOL) += ethtool/ixgbe/ixgbe_main.c
diff --git a/lib/librte_eal/linuxapp/kni/kni_dev.h b/lib/librte_eal/linuxapp/kni/kni_dev.h
index 58cbadd..002e5fa 100644
--- a/lib/librte_eal/linuxapp/kni/kni_dev.h
+++ b/lib/librte_eal/linuxapp/kni/kni_dev.h
@@ -37,10 +37,6 @@
 #include <linux/spinlock.h>
 #include <linux/list.h>
 
-#ifdef RTE_KNI_VHOST
-#include <net/sock.h>
-#endif
-
 #include <exec-env/rte_kni_common.h>
 #define KNI_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
 
@@ -102,15 +98,6 @@ struct kni_dev {
 	/* synchro for request processing */
 	unsigned long synchro;
 
-#ifdef RTE_KNI_VHOST
-	struct kni_vhost_queue *vhost_queue;
-
-	volatile enum {
-		BE_STOP = 0x1,
-		BE_START = 0x2,
-		BE_FINISH = 0x4,
-	} vq_status;
-#endif
 	/* buffers */
 	void *pa[MBUF_BURST_SZ];
 	void *va[MBUF_BURST_SZ];
@@ -118,26 +105,6 @@ struct kni_dev {
 	void *alloc_va[MBUF_BURST_SZ];
 };
 
-#ifdef RTE_KNI_VHOST
-uint32_t
-kni_poll(struct file *file, struct socket *sock, poll_table * wait);
-int kni_chk_vhost_rx(struct kni_dev *kni);
-int kni_vhost_init(struct kni_dev *kni);
-int kni_vhost_backend_release(struct kni_dev *kni);
-
-struct kni_vhost_queue {
-	struct sock sk;
-	struct socket *sock;
-	int vnet_hdr_sz;
-	struct kni_dev *kni;
-	int sockfd;
-	uint32_t flags;
-	struct sk_buff *cache;
-	struct rte_kni_fifo *fifo;
-};
-
-#endif
-
 void kni_net_rx(struct kni_dev *kni);
 void kni_net_init(struct net_device *dev);
 void kni_net_config_lo_mode(char *lo_str);
diff --git a/lib/librte_eal/linuxapp/kni/kni_fifo.h b/lib/librte_eal/linuxapp/kni/kni_fifo.h
index 025ec1c..14f4141 100644
--- a/lib/librte_eal/linuxapp/kni/kni_fifo.h
+++ b/lib/librte_eal/linuxapp/kni/kni_fifo.h
@@ -91,18 +91,4 @@ kni_fifo_free_count(struct rte_kni_fifo *fifo)
 	return (fifo->read - fifo->write - 1) & (fifo->len - 1);
 }
 
-#ifdef RTE_KNI_VHOST
-/**
- * Initializes the kni fifo structure
- */
-static inline void
-kni_fifo_init(struct rte_kni_fifo *fifo, uint32_t size)
-{
-	fifo->write = 0;
-	fifo->read = 0;
-	fifo->len = size;
-	fifo->elem_size = sizeof(void *);
-}
-#endif
-
 #endif /* _KNI_FIFO_H_ */
diff --git a/lib/librte_eal/linuxapp/kni/kni_misc.c b/lib/librte_eal/linuxapp/kni/kni_misc.c
index 33b61f2..f1f6bea 100644
--- a/lib/librte_eal/linuxapp/kni/kni_misc.c
+++ b/lib/librte_eal/linuxapp/kni/kni_misc.c
@@ -140,11 +140,7 @@ kni_thread_single(void *data)
 		down_read(&knet->kni_list_lock);
 		for (j = 0; j < KNI_RX_LOOP_NUM; j++) {
 			list_for_each_entry(dev, &knet->kni_list_head, list) {
-#ifdef RTE_KNI_VHOST
-				kni_chk_vhost_rx(dev);
-#else
 				kni_net_rx(dev);
-#endif
 				kni_net_poll_resp(dev);
 			}
 		}
@@ -167,11 +163,7 @@ kni_thread_multiple(void *param)
 
 	while (!kthread_should_stop()) {
 		for (j = 0; j < KNI_RX_LOOP_NUM; j++) {
-#ifdef RTE_KNI_VHOST
-			kni_chk_vhost_rx(dev);
-#else
 			kni_net_rx(dev);
-#endif
 			kni_net_poll_resp(dev);
 		}
 #ifdef RTE_KNI_PREEMPT_DEFAULT
@@ -248,9 +240,6 @@ kni_release(struct inode *inode, struct file *file)
 			dev->pthread = NULL;
 		}
 
-#ifdef RTE_KNI_VHOST
-		kni_vhost_backend_release(dev);
-#endif
 		kni_dev_remove(dev);
 		list_del(&dev->list);
 	}
@@ -397,10 +386,6 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 	kni->sync_va = dev_info.sync_va;
 	kni->sync_kva = phys_to_virt(dev_info.sync_phys);
 
-#ifdef RTE_KNI_VHOST
-	kni->vhost_queue = NULL;
-	kni->vq_status = BE_STOP;
-#endif
 	kni->mbuf_size = dev_info.mbuf_size;
 
 	pr_debug("tx_phys:      0x%016llx, tx_q addr:      0x%p\n",
@@ -490,10 +475,6 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 		return -ENODEV;
 	}
 
-#ifdef RTE_KNI_VHOST
-	kni_vhost_init(kni);
-#endif
-
 	ret = kni_run_thread(knet, kni, dev_info.force_bind);
 	if (ret != 0)
 		return ret;
@@ -537,9 +518,6 @@ kni_ioctl_release(struct net *net, uint32_t ioctl_num,
 			dev->pthread = NULL;
 		}
 
-#ifdef RTE_KNI_VHOST
-		kni_vhost_backend_release(dev);
-#endif
 		kni_dev_remove(dev);
 		list_del(&dev->list);
 		ret = 0;
diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c b/lib/librte_eal/linuxapp/kni/kni_net.c
index 4ac99cf..db9f489 100644
--- a/lib/librte_eal/linuxapp/kni/kni_net.c
+++ b/lib/librte_eal/linuxapp/kni/kni_net.c
@@ -198,18 +198,6 @@ kni_net_config(struct net_device *dev, struct ifmap *map)
 /*
  * Transmit a packet (called by the kernel)
  */
-#ifdef RTE_KNI_VHOST
-static int
-kni_net_tx(struct sk_buff *skb, struct net_device *dev)
-{
-	struct kni_dev *kni = netdev_priv(dev);
-
-	dev_kfree_skb(skb);
-	kni->stats.tx_dropped++;
-
-	return NETDEV_TX_OK;
-}
-#else
 static int
 kni_net_tx(struct sk_buff *skb, struct net_device *dev)
 {
@@ -289,7 +277,6 @@ kni_net_tx(struct sk_buff *skb, struct net_device *dev)
 
 	return NETDEV_TX_OK;
 }
-#endif
 
 /*
  * RX: normal working mode
diff --git a/lib/librte_eal/linuxapp/kni/kni_vhost.c b/lib/librte_eal/linuxapp/kni/kni_vhost.c
deleted file mode 100644
index f54c34b..0000000
--- a/lib/librte_eal/linuxapp/kni/kni_vhost.c
+++ /dev/null
@@ -1,842 +0,0 @@
-/*-
- * GPL LICENSE SUMMARY
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *
- *   This program is free software; you can redistribute it and/or modify
- *   it under the terms of version 2 of the GNU General Public License as
- *   published by the Free Software Foundation.
- *
- *   This program is distributed in the hope that it will be useful, but
- *   WITHOUT ANY WARRANTY; without even the implied warranty of
- *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- *   General Public License for more details.
- *
- *   You should have received a copy of the GNU General Public License
- *   along with this program; if not, write to the Free Software
- *   Foundation, Inc., 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
- *   The full GNU General Public License is included in this distribution
- *   in the file called LICENSE.GPL.
- *
- *   Contact Information:
- *   Intel Corporation
- */
-
-#include <linux/module.h>
-#include <linux/net.h>
-#include <net/sock.h>
-#include <linux/virtio_net.h>
-#include <linux/wait.h>
-#include <linux/mm.h>
-#include <linux/nsproxy.h>
-#include <linux/sched.h>
-#include <linux/if_tun.h>
-#include <linux/version.h>
-#include <linux/file.h>
-
-#include "compat.h"
-#include "kni_dev.h"
-#include "kni_fifo.h"
-
-#define RX_BURST_SZ 4
-
-#ifdef HAVE_STATIC_SOCK_MAP_FD
-static int kni_sock_map_fd(struct socket *sock)
-{
-	struct file *file;
-	int fd = get_unused_fd_flags(0);
-
-	if (fd < 0)
-		return fd;
-
-	file = sock_alloc_file(sock, 0, NULL);
-	if (IS_ERR(file)) {
-		put_unused_fd(fd);
-		return PTR_ERR(file);
-	}
-	fd_install(fd, file);
-	return fd;
-}
-#endif
-
-static struct proto kni_raw_proto = {
-	.name = "kni_vhost",
-	.owner = THIS_MODULE,
-	.obj_size = sizeof(struct kni_vhost_queue),
-};
-
-static inline int
-kni_vhost_net_tx(struct kni_dev *kni, struct msghdr *m,
-		 uint32_t offset, uint32_t len)
-{
-	struct rte_kni_mbuf *pkt_kva = NULL;
-	struct rte_kni_mbuf *pkt_va = NULL;
-	int ret;
-
-	pr_debug("tx offset=%d, len=%d, iovlen=%d\n",
-#ifdef HAVE_IOV_ITER_MSGHDR
-		   offset, len, (int)m->msg_iter.iov->iov_len);
-#else
-		   offset, len, (int)m->msg_iov->iov_len);
-#endif
-
-	/**
-	 * Check if it has at least one free entry in tx_q and
-	 * one entry in alloc_q.
-	 */
-	if (kni_fifo_free_count(kni->tx_q) == 0 ||
-	    kni_fifo_count(kni->alloc_q) == 0) {
-		/**
-		 * If no free entry in tx_q or no entry in alloc_q,
-		 * drops skb and goes out.
-		 */
-		goto drop;
-	}
-
-	/* dequeue a mbuf from alloc_q */
-	ret = kni_fifo_get(kni->alloc_q, (void **)&pkt_va, 1);
-	if (likely(ret == 1)) {
-		void *data_kva;
-
-		pkt_kva = (void *)pkt_va - kni->mbuf_va + kni->mbuf_kva;
-		data_kva = pkt_kva->buf_addr + pkt_kva->data_off
-			- kni->mbuf_va + kni->mbuf_kva;
-
-#ifdef HAVE_IOV_ITER_MSGHDR
-		copy_from_iter(data_kva, len, &m->msg_iter);
-#else
-		memcpy_fromiovecend(data_kva, m->msg_iov, offset, len);
-#endif
-
-		if (unlikely(len < ETH_ZLEN)) {
-			memset(data_kva + len, 0, ETH_ZLEN - len);
-			len = ETH_ZLEN;
-		}
-		pkt_kva->pkt_len = len;
-		pkt_kva->data_len = len;
-
-		/* enqueue mbuf into tx_q */
-		ret = kni_fifo_put(kni->tx_q, (void **)&pkt_va, 1);
-		if (unlikely(ret != 1)) {
-			/* Failing should not happen */
-			pr_err("Fail to enqueue mbuf into tx_q\n");
-			goto drop;
-		}
-	} else {
-		/* Failing should not happen */
-		pr_err("Fail to dequeue mbuf from alloc_q\n");
-		goto drop;
-	}
-
-	/* update statistics */
-	kni->stats.tx_bytes += len;
-	kni->stats.tx_packets++;
-
-	return 0;
-
-drop:
-	/* update statistics */
-	kni->stats.tx_dropped++;
-
-	return 0;
-}
-
-static inline int
-kni_vhost_net_rx(struct kni_dev *kni, struct msghdr *m,
-		 uint32_t offset, uint32_t len)
-{
-	uint32_t pkt_len;
-	struct rte_kni_mbuf *kva;
-	struct rte_kni_mbuf *va;
-	void *data_kva;
-	struct sk_buff *skb;
-	struct kni_vhost_queue *q = kni->vhost_queue;
-
-	if (unlikely(q == NULL))
-		return 0;
-
-	/* ensure at least one entry in free_q */
-	if (unlikely(kni_fifo_free_count(kni->free_q) == 0))
-		return 0;
-
-	skb = skb_dequeue(&q->sk.sk_receive_queue);
-	if (unlikely(skb == NULL))
-		return 0;
-
-	kva = (struct rte_kni_mbuf *)skb->data;
-
-	/* free skb to cache */
-	skb->data = NULL;
-	if (unlikely(kni_fifo_put(q->fifo, (void **)&skb, 1) != 1))
-		/* Failing should not happen */
-		pr_err("Fail to enqueue entries into rx cache fifo\n");
-
-	pkt_len = kva->data_len;
-	if (unlikely(pkt_len > len))
-		goto drop;
-
-	pr_debug("rx offset=%d, len=%d, pkt_len=%d, iovlen=%d\n",
-#ifdef HAVE_IOV_ITER_MSGHDR
-		   offset, len, pkt_len, (int)m->msg_iter.iov->iov_len);
-#else
-		   offset, len, pkt_len, (int)m->msg_iov->iov_len);
-#endif
-
-	data_kva = kva->buf_addr + kva->data_off - kni->mbuf_va + kni->mbuf_kva;
-#ifdef HAVE_IOV_ITER_MSGHDR
-	if (unlikely(copy_to_iter(data_kva, pkt_len, &m->msg_iter)))
-#else
-	if (unlikely(memcpy_toiovecend(m->msg_iov, data_kva, offset, pkt_len)))
-#endif
-		goto drop;
-
-	/* Update statistics */
-	kni->stats.rx_bytes += pkt_len;
-	kni->stats.rx_packets++;
-
-	/* enqueue mbufs into free_q */
-	va = (void *)kva - kni->mbuf_kva + kni->mbuf_va;
-	if (unlikely(kni_fifo_put(kni->free_q, (void **)&va, 1) != 1))
-		/* Failing should not happen */
-		pr_err("Fail to enqueue entries into free_q\n");
-
-	pr_debug("receive done %d\n", pkt_len);
-
-	return pkt_len;
-
-drop:
-	/* Update drop statistics */
-	kni->stats.rx_dropped++;
-
-	return 0;
-}
-
-static uint32_t
-kni_sock_poll(struct file *file, struct socket *sock, poll_table *wait)
-{
-	struct kni_vhost_queue *q =
-		container_of(sock->sk, struct kni_vhost_queue, sk);
-	struct kni_dev *kni;
-	uint32_t mask = 0;
-
-	if (unlikely(q == NULL || q->kni == NULL))
-		return POLLERR;
-
-	kni = q->kni;
-#ifdef HAVE_SOCKET_WQ
-	pr_debug("start kni_poll on group %d, wq 0x%16llx\n",
-		  kni->group_id, (uint64_t)sock->wq);
-	poll_wait(file, &sock->wq->wait, wait);
-#else
-	pr_debug("start kni_poll on group %d, wait at 0x%16llx\n",
-		  kni->group_id, (uint64_t)&sock->wait);
-	poll_wait(file, &sock->wait, wait);
-#endif
-
-	if (kni_fifo_count(kni->rx_q) > 0)
-		mask |= POLLIN | POLLRDNORM;
-
-	if (sock_writeable(&q->sk) ||
-#ifdef SOCKWQ_ASYNC_NOSPACE
-		(!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &q->sock->flags) &&
-			sock_writeable(&q->sk)))
-#else
-		(!test_and_set_bit(SOCK_ASYNC_NOSPACE, &q->sock->flags) &&
-			sock_writeable(&q->sk)))
-#endif
-		mask |= POLLOUT | POLLWRNORM;
-
-	return mask;
-}
-
-static inline void
-kni_vhost_enqueue(struct kni_dev *kni, struct kni_vhost_queue *q,
-		  struct sk_buff *skb, struct rte_kni_mbuf *va)
-{
-	struct rte_kni_mbuf *kva;
-
-	kva = (void *)(va) - kni->mbuf_va + kni->mbuf_kva;
-	(skb)->data = (unsigned char *)kva;
-	(skb)->len = kva->data_len;
-	skb_queue_tail(&q->sk.sk_receive_queue, skb);
-}
-
-static inline void
-kni_vhost_enqueue_burst(struct kni_dev *kni, struct kni_vhost_queue *q,
-	  struct sk_buff **skb, struct rte_kni_mbuf **va)
-{
-	int i;
-
-	for (i = 0; i < RX_BURST_SZ; skb++, va++, i++)
-		kni_vhost_enqueue(kni, q, *skb, *va);
-}
-
-int
-kni_chk_vhost_rx(struct kni_dev *kni)
-{
-	struct kni_vhost_queue *q = kni->vhost_queue;
-	uint32_t nb_in, nb_mbuf, nb_skb;
-	const uint32_t BURST_MASK = RX_BURST_SZ - 1;
-	uint32_t nb_burst, nb_backlog, i;
-	struct sk_buff *skb[RX_BURST_SZ];
-	struct rte_kni_mbuf *va[RX_BURST_SZ];
-
-	if (unlikely(BE_STOP & kni->vq_status)) {
-		kni->vq_status |= BE_FINISH;
-		return 0;
-	}
-
-	if (unlikely(q == NULL))
-		return 0;
-
-	nb_skb = kni_fifo_count(q->fifo);
-	nb_mbuf = kni_fifo_count(kni->rx_q);
-
-	nb_in = min(nb_mbuf, nb_skb);
-	nb_in = min_t(uint32_t, nb_in, RX_BURST_SZ);
-	nb_burst   = (nb_in & ~BURST_MASK);
-	nb_backlog = (nb_in & BURST_MASK);
-
-	/* enqueue skb_queue per BURST_SIZE bulk */
-	if (nb_burst != 0) {
-		if (unlikely(kni_fifo_get(kni->rx_q, (void **)&va, RX_BURST_SZ)
-				!= RX_BURST_SZ))
-			goto except;
-
-		if (unlikely(kni_fifo_get(q->fifo, (void **)&skb, RX_BURST_SZ)
-				!= RX_BURST_SZ))
-			goto except;
-
-		kni_vhost_enqueue_burst(kni, q, skb, va);
-	}
-
-	/* all leftover, do one by one */
-	for (i = 0; i < nb_backlog; ++i) {
-		if (unlikely(kni_fifo_get(kni->rx_q, (void **)&va, 1) != 1))
-			goto except;
-
-		if (unlikely(kni_fifo_get(q->fifo, (void **)&skb, 1) != 1))
-			goto except;
-
-		kni_vhost_enqueue(kni, q, *skb, *va);
-	}
-
-	/* Ondemand wake up */
-	if ((nb_in == RX_BURST_SZ) || (nb_skb == 0) ||
-	    ((nb_mbuf < RX_BURST_SZ) && (nb_mbuf != 0))) {
-		wake_up_interruptible_poll(sk_sleep(&q->sk),
-				   POLLIN | POLLRDNORM | POLLRDBAND);
-		pr_debug("RX CHK KICK nb_mbuf %d, nb_skb %d, nb_in %d\n",
-			   nb_mbuf, nb_skb, nb_in);
-	}
-
-	return 0;
-
-except:
-	/* Failing should not happen */
-	pr_err("Fail to enqueue fifo, it shouldn't happen\n");
-	BUG_ON(1);
-
-	return 0;
-}
-
-static int
-#ifdef HAVE_KIOCB_MSG_PARAM
-kni_sock_sndmsg(struct kiocb *iocb, struct socket *sock,
-	   struct msghdr *m, size_t total_len)
-#else
-kni_sock_sndmsg(struct socket *sock,
-	   struct msghdr *m, size_t total_len)
-#endif /* HAVE_KIOCB_MSG_PARAM */
-{
-	struct kni_vhost_queue *q =
-		container_of(sock->sk, struct kni_vhost_queue, sk);
-	int vnet_hdr_len = 0;
-	unsigned long len = total_len;
-
-	if (unlikely(q == NULL || q->kni == NULL))
-		return 0;
-
-	pr_debug("kni_sndmsg len %ld, flags 0x%08x, nb_iov %d\n",
-#ifdef HAVE_IOV_ITER_MSGHDR
-		   len, q->flags, (int)m->msg_iter.iov->iov_len);
-#else
-		   len, q->flags, (int)m->msg_iovlen);
-#endif
-
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-	if (likely(q->flags & IFF_VNET_HDR)) {
-		vnet_hdr_len = q->vnet_hdr_sz;
-		if (unlikely(len < vnet_hdr_len))
-			return -EINVAL;
-		len -= vnet_hdr_len;
-	}
-#endif
-
-	if (unlikely(len < ETH_HLEN + q->vnet_hdr_sz))
-		return -EINVAL;
-
-	return kni_vhost_net_tx(q->kni, m, vnet_hdr_len, len);
-}
-
-static int
-#ifdef HAVE_KIOCB_MSG_PARAM
-kni_sock_rcvmsg(struct kiocb *iocb, struct socket *sock,
-	   struct msghdr *m, size_t len, int flags)
-#else
-kni_sock_rcvmsg(struct socket *sock,
-	   struct msghdr *m, size_t len, int flags)
-#endif /* HAVE_KIOCB_MSG_PARAM */
-{
-	int vnet_hdr_len = 0;
-	int pkt_len = 0;
-	struct kni_vhost_queue *q =
-		container_of(sock->sk, struct kni_vhost_queue, sk);
-	static struct virtio_net_hdr
-		__attribute__ ((unused)) vnet_hdr = {
-		.flags = 0,
-		.gso_type = VIRTIO_NET_HDR_GSO_NONE
-	};
-
-	if (unlikely(q == NULL || q->kni == NULL))
-		return 0;
-
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-	if (likely(q->flags & IFF_VNET_HDR)) {
-		vnet_hdr_len = q->vnet_hdr_sz;
-		len -= vnet_hdr_len;
-		if (len < 0)
-			return -EINVAL;
-	}
-#endif
-
-	pkt_len = kni_vhost_net_rx(q->kni, m, vnet_hdr_len, len);
-	if (unlikely(pkt_len == 0))
-		return 0;
-
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-	/* no need to copy hdr when no pkt received */
-#ifdef HAVE_IOV_ITER_MSGHDR
-	if (unlikely(copy_to_iter((void *)&vnet_hdr, vnet_hdr_len,
-		&m->msg_iter)))
-#else
-	if (unlikely(memcpy_toiovecend(m->msg_iov,
-		(void *)&vnet_hdr, 0, vnet_hdr_len)))
-#endif /* HAVE_IOV_ITER_MSGHDR */
-		return -EFAULT;
-#endif /* RTE_KNI_VHOST_VNET_HDR_EN */
-	pr_debug("kni_rcvmsg expect_len %ld, flags 0x%08x, pkt_len %d\n",
-		   (unsigned long)len, q->flags, pkt_len);
-
-	return pkt_len + vnet_hdr_len;
-}
-
-/* dummy tap like ioctl */
-static int
-kni_sock_ioctl(struct socket *sock, uint32_t cmd, unsigned long arg)
-{
-	void __user *argp = (void __user *)arg;
-	struct ifreq __user *ifr = argp;
-	uint32_t __user *up = argp;
-	struct kni_vhost_queue *q =
-		container_of(sock->sk, struct kni_vhost_queue, sk);
-	struct kni_dev *kni;
-	uint32_t u;
-	int __user *sp = argp;
-	int s;
-	int ret;
-
-	pr_debug("tap ioctl cmd 0x%08x\n", cmd);
-
-	switch (cmd) {
-	case TUNSETIFF:
-		pr_debug("TUNSETIFF\n");
-		/* ignore the name, just look at flags */
-		if (get_user(u, &ifr->ifr_flags))
-			return -EFAULT;
-
-		ret = 0;
-		if ((u & ~IFF_VNET_HDR) != (IFF_NO_PI | IFF_TAP))
-			ret = -EINVAL;
-		else
-			q->flags = u;
-
-		return ret;
-
-	case TUNGETIFF:
-		pr_debug("TUNGETIFF\n");
-		rcu_read_lock_bh();
-		kni = rcu_dereference_bh(q->kni);
-		if (kni)
-			dev_hold(kni->net_dev);
-		rcu_read_unlock_bh();
-
-		if (!kni)
-			return -ENOLINK;
-
-		ret = 0;
-		if (copy_to_user(&ifr->ifr_name, kni->net_dev->name, IFNAMSIZ)
-				|| put_user(q->flags, &ifr->ifr_flags))
-			ret = -EFAULT;
-		dev_put(kni->net_dev);
-		return ret;
-
-	case TUNGETFEATURES:
-		pr_debug("TUNGETFEATURES\n");
-		u = IFF_TAP | IFF_NO_PI;
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-		u |= IFF_VNET_HDR;
-#endif
-		if (put_user(u, up))
-			return -EFAULT;
-		return 0;
-
-	case TUNSETSNDBUF:
-		pr_debug("TUNSETSNDBUF\n");
-		if (get_user(u, up))
-			return -EFAULT;
-
-		q->sk.sk_sndbuf = u;
-		return 0;
-
-	case TUNGETVNETHDRSZ:
-		s = q->vnet_hdr_sz;
-		if (put_user(s, sp))
-			return -EFAULT;
-		pr_debug("TUNGETVNETHDRSZ %d\n", s);
-		return 0;
-
-	case TUNSETVNETHDRSZ:
-		if (get_user(s, sp))
-			return -EFAULT;
-		if (s < (int)sizeof(struct virtio_net_hdr))
-			return -EINVAL;
-
-		pr_debug("TUNSETVNETHDRSZ %d\n", s);
-		q->vnet_hdr_sz = s;
-		return 0;
-
-	case TUNSETOFFLOAD:
-		pr_debug("TUNSETOFFLOAD %lx\n", arg);
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-		/* not support any offload yet */
-		if (!(q->flags & IFF_VNET_HDR))
-			return  -EINVAL;
-
-		return 0;
-#else
-		return -EINVAL;
-#endif
-
-	default:
-		pr_debug("NOT SUPPORT\n");
-		return -EINVAL;
-	}
-}
-
-static int
-kni_sock_compat_ioctl(struct socket *sock, uint32_t cmd,
-		     unsigned long arg)
-{
-	/* 32 bits app on 64 bits OS to be supported later */
-	pr_debug("Not implemented.\n");
-
-	return -EINVAL;
-}
-
-#define KNI_VHOST_WAIT_WQ_SAFE()                        \
-do {							\
-	while ((BE_FINISH | BE_STOP) == kni->vq_status) \
-		msleep(1);				\
-} while (0)						\
-
-
-static int
-kni_sock_release(struct socket *sock)
-{
-	struct kni_vhost_queue *q =
-		container_of(sock->sk, struct kni_vhost_queue, sk);
-	struct kni_dev *kni;
-
-	if (q == NULL)
-		return 0;
-
-	kni = q->kni;
-	if (kni != NULL) {
-		kni->vq_status = BE_STOP;
-		KNI_VHOST_WAIT_WQ_SAFE();
-		kni->vhost_queue = NULL;
-		q->kni = NULL;
-	}
-
-	if (q->sockfd != -1)
-		q->sockfd = -1;
-
-	sk_set_socket(&q->sk, NULL);
-	sock->sk = NULL;
-
-	sock_put(&q->sk);
-
-	pr_debug("dummy sock release done\n");
-
-	return 0;
-}
-
-int
-kni_sock_getname(struct socket *sock, struct sockaddr *addr,
-		int *sockaddr_len, int peer)
-{
-	pr_debug("dummy sock getname\n");
-	((struct sockaddr_ll *)addr)->sll_family = AF_PACKET;
-	return 0;
-}
-
-static const struct proto_ops kni_socket_ops = {
-	.getname = kni_sock_getname,
-	.sendmsg = kni_sock_sndmsg,
-	.recvmsg = kni_sock_rcvmsg,
-	.release = kni_sock_release,
-	.poll    = kni_sock_poll,
-	.ioctl   = kni_sock_ioctl,
-	.compat_ioctl = kni_sock_compat_ioctl,
-};
-
-static void
-kni_sk_write_space(struct sock *sk)
-{
-	wait_queue_head_t *wqueue;
-
-	if (!sock_writeable(sk) ||
-#ifdef SOCKWQ_ASYNC_NOSPACE
-	    !test_and_clear_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags))
-#else
-	    !test_and_clear_bit(SOCK_ASYNC_NOSPACE, &sk->sk_socket->flags))
-#endif
-		return;
-	wqueue = sk_sleep(sk);
-	if (wqueue && waitqueue_active(wqueue))
-		wake_up_interruptible_poll(
-			wqueue, POLLOUT | POLLWRNORM | POLLWRBAND);
-}
-
-static void
-kni_sk_destruct(struct sock *sk)
-{
-	struct kni_vhost_queue *q =
-		container_of(sk, struct kni_vhost_queue, sk);
-
-	if (!q)
-		return;
-
-	/* make sure there's no packet in buffer */
-	while (skb_dequeue(&sk->sk_receive_queue) != NULL)
-		;
-
-	mb();
-
-	if (q->fifo != NULL) {
-		kfree(q->fifo);
-		q->fifo = NULL;
-	}
-
-	if (q->cache != NULL) {
-		kfree(q->cache);
-		q->cache = NULL;
-	}
-}
-
-static int
-kni_vhost_backend_init(struct kni_dev *kni)
-{
-	struct kni_vhost_queue *q;
-	struct net *net = current->nsproxy->net_ns;
-	int err, i, sockfd;
-	struct rte_kni_fifo *fifo;
-	struct sk_buff *elem;
-
-	if (kni->vhost_queue != NULL)
-		return -1;
-
-#ifdef HAVE_SK_ALLOC_KERN_PARAM
-	q = (struct kni_vhost_queue *)sk_alloc(net, AF_UNSPEC, GFP_KERNEL,
-			&kni_raw_proto, 0);
-#else
-	q = (struct kni_vhost_queue *)sk_alloc(net, AF_UNSPEC, GFP_KERNEL,
-			&kni_raw_proto);
-#endif
-	if (!q)
-		return -ENOMEM;
-
-	err = sock_create_lite(AF_UNSPEC, SOCK_RAW, IPPROTO_RAW, &q->sock);
-	if (err)
-		goto free_sk;
-
-	sockfd = kni_sock_map_fd(q->sock);
-	if (sockfd < 0) {
-		err = sockfd;
-		goto free_sock;
-	}
-
-	/* cache init */
-	q->cache = kzalloc(
-		RTE_KNI_VHOST_MAX_CACHE_SIZE * sizeof(struct sk_buff),
-		GFP_KERNEL);
-	if (!q->cache)
-		goto free_fd;
-
-	fifo = kzalloc(RTE_KNI_VHOST_MAX_CACHE_SIZE * sizeof(void *)
-			+ sizeof(struct rte_kni_fifo), GFP_KERNEL);
-	if (!fifo)
-		goto free_cache;
-
-	kni_fifo_init(fifo, RTE_KNI_VHOST_MAX_CACHE_SIZE);
-
-	for (i = 0; i < RTE_KNI_VHOST_MAX_CACHE_SIZE; i++) {
-		elem = &q->cache[i];
-		kni_fifo_put(fifo, (void **)&elem, 1);
-	}
-	q->fifo = fifo;
-
-	/* store sockfd in vhost_queue */
-	q->sockfd = sockfd;
-
-	/* init socket */
-	q->sock->type = SOCK_RAW;
-	q->sock->state = SS_CONNECTED;
-	q->sock->ops = &kni_socket_ops;
-	sock_init_data(q->sock, &q->sk);
-
-	/* init sock data */
-	q->sk.sk_write_space = kni_sk_write_space;
-	q->sk.sk_destruct = kni_sk_destruct;
-	q->flags = IFF_NO_PI | IFF_TAP;
-	q->vnet_hdr_sz = sizeof(struct virtio_net_hdr);
-#ifdef RTE_KNI_VHOST_VNET_HDR_EN
-	q->flags |= IFF_VNET_HDR;
-#endif
-
-	/* bind kni_dev with vhost_queue */
-	q->kni = kni;
-	kni->vhost_queue = q;
-
-	wmb();
-
-	kni->vq_status = BE_START;
-
-#ifdef HAVE_SOCKET_WQ
-	pr_debug("backend init sockfd=%d, sock->wq=0x%16llx,sk->sk_wq=0x%16llx",
-		  q->sockfd, (uint64_t)q->sock->wq,
-		  (uint64_t)q->sk.sk_wq);
-#else
-	pr_debug("backend init sockfd=%d, sock->wait at 0x%16llx,sk->sk_sleep=0x%16llx",
-		  q->sockfd, (uint64_t)&q->sock->wait,
-		  (uint64_t)q->sk.sk_sleep);
-#endif
-
-	return 0;
-
-free_cache:
-	kfree(q->cache);
-	q->cache = NULL;
-
-free_fd:
-	put_unused_fd(sockfd);
-
-free_sock:
-	q->kni = NULL;
-	kni->vhost_queue = NULL;
-	kni->vq_status |= BE_FINISH;
-	sock_release(q->sock);
-	q->sock->ops = NULL;
-	q->sock = NULL;
-
-free_sk:
-	sk_free((struct sock *)q);
-
-	return err;
-}
-
-/* kni vhost sock sysfs */
-static ssize_t
-show_sock_fd(struct device *dev, struct device_attribute *attr,
-	     char *buf)
-{
-	struct net_device *net_dev = container_of(dev, struct net_device, dev);
-	struct kni_dev *kni = netdev_priv(net_dev);
-	int sockfd = -1;
-
-	if (kni->vhost_queue != NULL)
-		sockfd = kni->vhost_queue->sockfd;
-	return snprintf(buf, 10, "%d\n", sockfd);
-}
-
-static ssize_t
-show_sock_en(struct device *dev, struct device_attribute *attr,
-	     char *buf)
-{
-	struct net_device *net_dev = container_of(dev, struct net_device, dev);
-	struct kni_dev *kni = netdev_priv(net_dev);
-
-	return snprintf(buf, 10, "%u\n", (kni->vhost_queue == NULL ? 0 : 1));
-}
-
-static ssize_t
-set_sock_en(struct device *dev, struct device_attribute *attr,
-	      const char *buf, size_t count)
-{
-	struct net_device *net_dev = container_of(dev, struct net_device, dev);
-	struct kni_dev *kni = netdev_priv(net_dev);
-	unsigned long en;
-	int err = 0;
-
-	if (kstrtoul(buf, 0, &en) != 0)
-		return -EINVAL;
-
-	if (en)
-		err = kni_vhost_backend_init(kni);
-
-	return err ? err : count;
-}
-
-static DEVICE_ATTR(sock_fd, S_IRUGO | S_IRUSR, show_sock_fd, NULL);
-static DEVICE_ATTR(sock_en, S_IRUGO | S_IWUSR, show_sock_en, set_sock_en);
-static struct attribute *dev_attrs[] = {
-	&dev_attr_sock_fd.attr,
-	&dev_attr_sock_en.attr,
-	NULL,
-};
-
-static const struct attribute_group dev_attr_grp = {
-	.attrs = dev_attrs,
-};
-
-int
-kni_vhost_backend_release(struct kni_dev *kni)
-{
-	struct kni_vhost_queue *q = kni->vhost_queue;
-
-	if (q == NULL)
-		return 0;
-
-	/* dettach from kni */
-	q->kni = NULL;
-
-	pr_debug("release backend done\n");
-
-	return 0;
-}
-
-int
-kni_vhost_init(struct kni_dev *kni)
-{
-	struct net_device *dev = kni->net_dev;
-
-	if (sysfs_create_group(&dev->dev.kobj, &dev_attr_grp))
-		sysfs_remove_group(&dev->dev.kobj, &dev_attr_grp);
-
-	kni->vq_status = BE_STOP;
-
-	pr_debug("kni_vhost_init done\n");
-
-	return 0;
-}
-- 
2.9.3

^ permalink raw reply	[relevance 1%]
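
A side note on the FIFO helpers this removal deletes from kni_fifo.h: they
depend on the ring length being a power of two, so wrap-around reduces to a
bitmask instead of a modulo, and one slot is deliberately left unused so a
full ring stays distinguishable from an empty one. A standalone sketch of
that arithmetic follows; the struct and function names are illustrative,
not the kernel module's.

    #include <stdint.h>

    struct demo_fifo {
            volatile uint32_t write;  /* next slot the producer fills */
            volatile uint32_t read;   /* next slot the consumer drains */
            uint32_t len;             /* number of slots, power of two */
    };

    /* Entries currently queued; the mask handles index wrap-around. */
    static inline uint32_t
    demo_fifo_count(const struct demo_fifo *f)
    {
            return (f->write - f->read) & (f->len - 1);
    }

    /* Free slots; the "- 1" keeps one slot empty so that a full ring
     * is never mistaken for an empty one. */
    static inline uint32_t
    demo_fifo_free_count(const struct demo_fifo *f)
    {
            return (f->read - f->write - 1) & (f->len - 1);
    }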

* [dpdk-dev] [PATCH v1] doc: add template release notes for 17.05
@ 2017-02-15 12:38  6% John McNamara
  0 siblings, 0 replies; 200+ results
From: John McNamara @ 2017-02-15 12:38 UTC (permalink / raw)
  To: dev; +Cc: John McNamara

Add template release notes for DPDK 17.05 with inline
comments and explanations of the various sections.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/rel_notes/index.rst         |   1 +
 doc/guides/rel_notes/release_17_05.rst | 195 +++++++++++++++++++++++++++++++++
 2 files changed, 196 insertions(+)
 create mode 100644 doc/guides/rel_notes/release_17_05.rst

diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst
index cf8f167..c4d243c 100644
--- a/doc/guides/rel_notes/index.rst
+++ b/doc/guides/rel_notes/index.rst
@@ -36,6 +36,7 @@ Release Notes
     :numbered:
 
     rel_description
+    release_17_05
     release_17_02
     release_16_11
     release_16_07
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
new file mode 100644
index 0000000..e5a0a9e
--- /dev/null
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -0,0 +1,195 @@
+DPDK Release 17.05
+==================
+
+.. **Read this first.**
+
+   The text in the sections below explains how to update the release notes.
+
+   Use proper spelling, capitalization and punctuation in all sections.
+
+   Variable and config names should be quoted as fixed width text:
+   ``LIKE_THIS``.
+
+   Build the docs and view the output file to ensure the changes are correct::
+
+      make doc-guides-html
+
+      firefox build/doc/html/guides/rel_notes/release_17_05.html
+
+
+New Features
+------------
+
+.. This section should contain new features added in this release. Sample
+   format:
+
+   * **Add a title in the past tense with a full stop.**
+
+     Add a short 1-2 sentence description in the past tense. The description
+     should be enough to allow someone scanning the release notes to
+     understand the new feature.
+
+     If the feature adds a lot of sub-features you can use a bullet list like
+     this:
+
+     * Added feature foo to do something.
+     * Enhanced feature bar to do something else.
+
+     Refer to the previous release notes for examples.
+
+     This section is a comment. Do not overwrite or remove it.
+     Also, make sure to start the actual text at the margin.
+     =========================================================
+
+
+Resolved Issues
+---------------
+
+.. This section should contain bug fixes added to the relevant
+   sections. Sample format:
+
+   * **code/section Fixed issue in the past tense with a full stop.**
+
+     Add a short 1-2 sentence description of the resolved issue in the past
+     tense.
+
+     The title should contain the code/lib section like a commit message.
+
+     Add the entries in alphabetic order in the relevant sections below.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =========================================================
+
+
+EAL
+~~~
+
+
+Drivers
+~~~~~~~
+
+
+Libraries
+~~~~~~~~~
+
+
+Examples
+~~~~~~~~
+
+
+Other
+~~~~~
+
+
+Known Issues
+------------
+
+.. This section should contain new known issues in this release. Sample format:
+
+   * **Add title in present tense with full stop.**
+
+     Add a short 1-2 sentence description of the known issue in the present
+     tense. Add information on any known workarounds.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =========================================================
+
+
+API Changes
+-----------
+
+.. This section should contain API changes. Sample format:
+
+   * Add a short 1-2 sentence description of the API change. Use fixed width
+     quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past
+     tense.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =========================================================
+
+
+ABI Changes
+-----------
+
+.. This section should contain ABI changes. Sample format:
+
+   * Add a short 1-2 sentence description of the ABI change that was announced
+     in the previous releases and made in this release. Use fixed width quotes
+     for ``rte_function_names`` or ``rte_struct_names``. Use the past tense.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =========================================================
+
+
+
+Shared Library Versions
+-----------------------
+
+.. Update any library version updated in this release and prepend with a ``+``
+   sign, like this:
+
+     librte_acl.so.2
+   + librte_cfgfile.so.2
+     librte_cmdline.so.2
+
+   This section is a comment. Do not overwrite or remove it.
+   =========================================================
+
+
+The libraries prepended with a plus sign were incremented in this version.
+
+.. code-block:: diff
+
+     librte_acl.so.2
+     librte_cfgfile.so.2
+     librte_cmdline.so.2
+     librte_cryptodev.so.2
+     librte_distributor.so.1
+     librte_eal.so.3
+     librte_ethdev.so.6
+     librte_hash.so.2
+     librte_ip_frag.so.1
+     librte_jobstats.so.1
+     librte_kni.so.2
+     librte_kvargs.so.1
+     librte_lpm.so.2
+     librte_mbuf.so.2
+     librte_mempool.so.2
+     librte_meter.so.1
+     librte_net.so.1
+     librte_pdump.so.1
+     librte_pipeline.so.3
+     librte_pmd_bond.so.1
+     librte_pmd_ring.so.2
+     librte_port.so.3
+     librte_power.so.1
+     librte_reorder.so.1
+     librte_ring.so.1
+     librte_sched.so.1
+     librte_table.so.2
+     librte_timer.so.1
+     librte_vhost.so.3
+
+
+Tested Platforms
+----------------
+
+.. This section should contain a list of platforms that were tested with this
+   release.
+
+   The format is:
+
+   * <vendor> platform with <vendor> <type of devices> combinations
+
+     * List of CPU
+     * List of OS
+     * List of devices
+     * Other relevant details...
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =========================================================
-- 
2.7.4

^ permalink raw reply	[relevance 6%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for cryptodev ops structure
  2017-02-14 10:41  9% ` [dpdk-dev] [PATCH v2] " Fan Zhang
  2017-02-14 10:48  4%   ` Doherty, Declan
@ 2017-02-14 20:37  4%   ` Thomas Monjalon
  1 sibling, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-14 20:37 UTC (permalink / raw)
  To: Fan Zhang; +Cc: dev

Applied

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] Further fun with ABI tracking
  2017-02-14 10:52  8% [dpdk-dev] Further fun with ABI tracking Christian Ehrhardt
  2017-02-14 16:19  4% ` Bruce Richardson
@ 2017-02-14 20:31  9% ` Jan Blunck
  2017-02-22 13:12  7%   ` Christian Ehrhardt
  1 sibling, 1 reply; 200+ results
From: Jan Blunck @ 2017-02-14 20:31 UTC (permalink / raw)
  To: Christian Ehrhardt; +Cc: dev, cjcollier, ricardo.salveti, Luca Boccassi

On Tue, Feb 14, 2017 at 11:52 AM, Christian Ehrhardt
<christian.ehrhardt@canonical.com> wrote:
> Hi,
> when moving to DPDK 16.11, Debian/Ubuntu packaging of DPDK hit a new
> twist on the (seemingly recurring) topic of DPDK ABI tracking.
>
> I have found ... well, I don't want to call it a solution ... let's say a
> crutch to get around it for the moment. But I wanted to use the example I
> had to share a few thoughts on it and to kick off a wider discussion.
>
>
> *## In library cross-dependencies plus partial ABI bumps ##*
>
> Since the move away from the combined shared library we have had several
> improvements in tracking the ABI versions. These days [1] we have LIBABIVER
> per library, and it gets bumped to reflect that a library breaks with former
> versions, e.g. by removing symbols.
>
> Now in the 16.11 release the ABIs for cryptodev, eal and ethdev got bumped
> by [2] and [3].
>
> OTOH please remember that, in the usual sense, two versions of a shared
> library are meant to be able to coexist on a system without hurting each
> other. I picked a random one on my system.
> Package              Library
> libisc-export160: /lib/x86_64-linux-gnu/libisc-export.so.160
> libisc-export160: /lib/x86_64-linux-gnu/libisc-export.so.160.0.0
> libisc-export95: /lib/x86_64-linux-gnu/libisc-export.so.95
> libisc-export95: /lib/x86_64-linux-gnu/libisc-export.so.95.5.0
> Some link against the new, some against the old library - all fine.
> Most programs can simply be rebuilt against the new library, and after
> some time the old one can be dropped. That mechanism gives downstream
> distributions a way to handle transitions and to accommodate consumers of
> libraries which might not all be ready for the same version at any one time.
> And with the per-library versioning via LIBABIVER and the version maps we
> are good - in fact we qualify for all common cases on [4].
>
> Now in DPDK, of the libraries that got an ABI bump, eal and ethdev are
> among those which most of us consider "core libraries", and most other
> libs and PMDs link to them.
> And here DPDK continues to be special: due to that inter-dependency, with
> old and new libraries installed on the same system the following happens
> to openvswitch built for an older version of DPDK:
> ovs-vswitchd-dpdk
>     librte_eal.so.2 => /usr/lib/x86_64-linux-gnu/librte_eal.so.2
>     librte_pdump.so.1 => /usr/lib/x86_64-linux-gnu/librte_pdump.so.1
>         librte_eal.so.3 => /usr/lib/x86_64-linux-gnu/librte_eal.so.3
>
> You can see that openvswitch itself depends on the "old" librte_eal.so.2.
> But because librte_pdump.so.1 did not get an ABI bump, it got upgraded to
> the newer version from DPDK 16.11.
> And since the "new" pdump was built with the new DPDK 16.11, it depends on
> the "new" librte_eal.so.3.
> Having both in the same executable's address space at the same time causes
> segfaults and pain.
>
> As I said, for now I have worked around the issue with a crutch that I'm
> not proud of and would like to avoid in the future. For that reason I'm
> reaching out to you with several suggestions to discuss.
>
>
> *## Thoughts ##*
> None of these seems like a perfect solution to me yet, but they are
> clearly good starting points for discussion.
>
> Options that were in discussion so far and that we might adopt next cycle
> (some of these are upstream changes, some downstream, some require both to
> change - but any of them should have an ack upstream so that we agree
> on how to proceed with those cases).
>
> 1. Downstreams to insert Major version into soname
> Distributions could insert the DPDK major version (like 16.11) into the
> soname and package names. A common example of this is libboost [5].
> That would perfectly allow 16.07.<LIBABIVER> to coexist with
> 16.11.<LIBABIVER> even if for a given library LIBABIVER did not change.
> Yet it would mean that anything depending on the old library will have to
> be recompiled to pick up the new code, even if it depends on an ABI that is
> still present in the new release.
> Also - not a technical reason - but it is clearly more work to force update
> all dependencies and clean out old packages for every release.

Actually this isn't exactly what I proposed during the summit. Just
keep it simple and fix the ABI version of all libraries at 16.11.0.
This is a proven approach and has been used for years with different
libraries. You could easily do this independently of us upstream
fixing the ABI problems.


> 2. ABI Ranges

ABI is either backwards compatible (same major) or not. A range
doesn't solve the problem.

>
> 3. A lot of conflicts
>

This doesn't allow us to have multiple versions of the library
available at runtime. So in the end it doesn't solve the problem for
the distro either.


>
> 4. ABI bump is infectious
>
> 5. back to single ABI
>

This is very similar to approach 1. It just uses up a lot more ABI versions.


> 6. More
> I'm sure there are more approaches to this, feel free to come up with more.
>

The problem is that we do not detect and fix the ABI changes that
"shine through" the dependencies of our libraries. We need to work on
them and fix them one by one. Long-term we need to invest in keeping
the API/ABI stable, adding backward compatible symbols, and making
structures opaque.
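
For reference, keeping a backward compatible symbol in a DPDK library
roughly follows the pattern below. This is a minimal sketch for
shared-library builds using the VERSION_SYMBOL and BIND_DEFAULT_SYMBOL
macros from rte_compat.h; the function, struct and version nodes are
made up for illustration:

    #include <stdint.h>
    #include <rte_compat.h>

    struct foo_params { uint32_t flags; };

    /* Old implementation: binaries linked against the DPDK_2.0 node of
     * the version map keep resolving rte_foo_init to this code. */
    int
    rte_foo_init_v20(struct foo_params *p)
    {
            return p->flags ? 0 : -1;
    }
    VERSION_SYMBOL(rte_foo_init, _v20, 2.0);

    /* New implementation: the default for newly linked binaries. */
    int
    rte_foo_init_v1611(struct foo_params *p)
    {
            /* new behaviour; the exported name stays rte_foo_init */
            return p->flags ? 0 : -1;
    }
    BIND_DEFAULT_SYMBOL(rte_foo_init, _v1611, 16.11);

The library's version map then lists rte_foo_init under both the
DPDK_2.0 and DPDK_16.11 nodes, so old binaries keep resolving the old
code while newly linked ones pick up the new implementation.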



> I'm sure my five suggestions alone will make the thread messy. Maybe we do
> this in two rounds, sorting out the insane and identifying the preferred
> ones to then in a second run focus on discussing and maybe implementing the
> details of what we like.
>
>
> [1]: http://dpdk.org/browse/dpdk/tree/doc/guides/contributing/versioning.rst
> [2]: http://dpdk.org/browse/dpdk/commit/?id=d7e61ad3ae36
> [3]: http://dpdk.org/browse/dpdk/commit/?id=6ba1affa54108
> [4]: https://wiki.debian.org/TransitionBestPractices
> [5]: https://packages.debian.org/sid/libboost1.62-dev
> [6]:
> https://www.debian.org/doc/debian-policy/ch-relationships.html#s-conflicts
> [7]: https://wiki.ubuntu.com/ProposedMigration
>
> P.S. I beg a pardon for the wall of text
>
> --
> Christian Ehrhardt
> Software Engineer, Ubuntu Server
> Canonical Ltd

^ permalink raw reply	[relevance 9%]

* Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for vhost
    2017-02-13 18:02  4% ` Thomas Monjalon
  2017-02-14 13:54  4% ` Maxime Coquelin
@ 2017-02-14 20:28  4% ` Thomas Monjalon
  2 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-14 20:28 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: dev

Applied

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: announce API and ABI change for ethdev
    2017-02-13 17:57  4%   ` Thomas Monjalon
@ 2017-02-14 19:37  4%   ` Thomas Monjalon
  1 sibling, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-14 19:37 UTC (permalink / raw)
  To: Bernard Iremonger; +Cc: dev

Applied

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] doc: add ABI change notification for ring library
  2017-02-13 17:38  9% [dpdk-dev] [PATCH] doc: add ABI change notification for ring library Bruce Richardson
                   ` (2 preceding siblings ...)
  2017-02-14  8:33  4% ` Olivier Matz
@ 2017-02-14 18:42  4% ` Thomas Monjalon
  3 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-14 18:42 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

Applied

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2] doc: update release notes for 17.02
  2017-02-14 15:32  4% [dpdk-dev] [PATCH v1] doc: update release notes for 17.02 John McNamara
@ 2017-02-14 16:26  2% ` John McNamara
  0 siblings, 0 replies; 200+ results
From: John McNamara @ 2017-02-14 16:26 UTC (permalink / raw)
  To: dev; +Cc: John McNamara


Fix grammar, spelling and formatting of DPDK 17.02 release notes.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/rel_notes/release_17_02.rst | 241 +++++++++++++++------------------
 1 file changed, 111 insertions(+), 130 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 6420a87..357965a 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -40,46 +40,47 @@ New Features
 
 * **Added support for representing buses in EAL**
 
-  A new structure ``rte_bus`` is introduced in EAL. This allows for devices to
-  be represented by buses they are connected to. A new bus can be added to
-  DPDK by extending the ``rte_bus`` structure and implementing the scan and
-  probe functions. Once a new bus is registered using provided APIs, new
-  devices can be detected and initialized using bus scan and probe callbacks.
+  The ``rte_bus`` structure was introduced into the EAL. This allows for
+  devices to be represented by buses they are connected to. A new bus can be
+  added to DPDK by extending the ``rte_bus`` structure and implementing the
+  scan and probe functions. Once a new bus is registered using the provided
+  APIs, new devices can be detected and initialized using bus scan and probe
+  callbacks.
 
-  With this change, devices other than PCI or VDEV type can also be represented
-  in DPDK framework.
+  With this change, devices other than PCI or VDEV type can be represented
+  in the DPDK framework.
 
 * **Added generic EAL API for I/O device memory read/write operations.**
 
-  This API introduces 8-bit, 16-bit, 32bit, 64bit I/O device
-  memory read/write operations along with the relaxed versions.
+  This API introduces 8 bit, 16 bit, 32 bit and 64 bit I/O device
+  memory read/write operations along with "relaxed" versions.
 
-  The weakly-ordered machine like ARM needs additional I/O barrier for
-  device memory read/write access over PCI bus.
-  By introducing the EAL abstraction for I/O device memory read/write access,
-  The drivers can access I/O device memory in architecture-agnostic manner.
-  The relaxed version does not have additional I/O memory barrier, useful in
-  accessing the device registers of integrated controllers which
-  implicitly strongly ordered with respect to memory access.
+  Weakly-ordered architectures like ARM need an additional I/O barrier for
+  device memory read/write access over PCI bus. By introducing the EAL
+  abstraction for I/O device memory read/write access, the drivers can access
+  I/O device memory in an architecture-agnostic manner. The relaxed version
+  does not have an additional I/O memory barrier, which is useful in accessing
+  the device registers of integrated controllers which is implicitly strongly
+  ordered with respect to memory access.
 
 * **Added generic flow API (rte_flow).**
 
   This API provides a generic means to configure hardware to match specific
-  ingress or egress traffic, alter its fate and query related counters
+  ingress or egress traffic, alter its behavior and query related counters
   according to any number of user-defined rules.
 
-  It is slightly higher-level than the legacy filtering framework which it
-  encompasses and supersedes (including all functions and filter types) in
-  order to expose a single interface with an unambiguous behavior that is
-  common to all poll-mode drivers (PMDs).
+  In order to expose a single interface with an unambiguous behavior that is
+  common to all poll-mode drivers (PMDs) the ``rte_flow`` API is slightly
+  higher-level than the legacy filtering framework, which it encompasses and
+  supersedes (including all functions and filter types) .
 
   See the :ref:`Generic flow API <Generic_flow_API>` documentation for more
   information.
 
 * **Added firmware version get API.**
 
-  Added a new function ``rte_eth_dev_fw_version_get()`` to fetch firmware
-  version by a given device.
+  Added a new function ``rte_eth_dev_fw_version_get()`` to fetch the firmware
+  version for a given device.
 
 * **Added APIs for MACsec offload support to the ixgbe PMD.**
 
@@ -90,54 +91,58 @@ New Features
 
   Added support for I219 Intel 1GbE NICs.
 
-* **Added VF Daemon (VFD) on i40e. - EXPERIMENTAL**
-
-  This's an EXPERIMENTAL feature to enhance the capability of DPDK PF as many
-  VF management features are not supported by kernel PF driver.
-  Some new private APIs are implemented in PMD without abstrction layer.
-  They can be used directly by some users who have the need.
-
-  The new APIs to control VFs directly from PF include,
-  1) set VF MAC anti-spoofing
-  2) set VF VLAN anti-spoofing
-  3) set TX loopback
-  4) set VF unicast promiscuous mode
-  5) set VF multicast promiscuous mode
-  6) set VF MTU
-  7) get/reset VF stats
-  8) set VF MAC address
-  9) set VF VLAN stripping
-  10) VF VLAN insertion
-  12) set VF broadcast mode
-  13) set VF VLAN tag
-  14) set VF VLAN filter
-  VFD also includes VF to PF mailbox message management by APP.
-  When PF receives mailbox messages from VF, PF should call the callback
-  provided by APP to know if they're permitted to be processed.
-
-  As an EXPERIMENTAL feature, please aware it can be changed or even
+* **Added VF Daemon (VFD) for i40e. - EXPERIMENTAL**
+
+  This is an EXPERIMENTAL feature to enhance the capability of the DPDK PF as
+  many VF management features are not currently supported by the kernel PF
+  driver. Some new private APIs are implemented directly in the PMD without an
+  abstraction layer. They can be used directly by some users who have the
+  need.
+
+  The new APIs to control VFs directly from PF include:
+
+  * Set VF MAC anti-spoofing.
+  * Set VF VLAN anti-spoofing.
+  * Set TX loopback.
+  * Set VF unicast promiscuous mode.
+  * Set VF multicast promiscuous mode.
+  * Set VF MTU.
+  * Get/reset VF stats.
+  * Set VF MAC address.
+  * Set VF VLAN stripping.
+  * Vf VLAN insertion.
+  * Set VF broadcast mode.
+  * Set VF VLAN tag.
+  * Set VF VLAN filter.
+
+  VFD also includes VF to PF mailbox message management from an application.
+  When the PF receives mailbox messages from the VF the PF should call the
+  callback provided by the application to know if they're permitted to be
+  processed.
+
+  As an EXPERIMENTAL feature, please be aware it can be changed or even
   removed without prior notice.
 
 * **Updated the i40e base driver.**
 
-  updated the i40e base driver, including the following changes:
+  Updated the i40e base driver, including the following changes:
 
-  * replace existing legacy memcpy() calls with i40e_memcpy() calls.
-  * use BIT() macro instead of bit fields
-  * add clear all WoL filters implementation
-  * add broadcast promiscuous control per VLAN
-  * remove unused X722_SUPPORT and I40E_NDIS_SUPPORT MARCOs
+  * Replace existing legacy ``memcpy()`` calls with ``i40e_memcpy()`` calls.
+  * Use ``BIT()`` macro instead of bit fields.
+  * Add clear all WoL filters implementation.
+  * Add broadcast promiscuous control per VLAN.
+  * Remove unused ``X722_SUPPORT`` and ``I40E_NDIS_SUPPORT`` macros.
 
 * **Updated the enic driver.**
 
-  * Set new Rx checksum flags in mbufs to indicate unknown, good or bad.
+  * Set new Rx checksum flags in mbufs to indicate unknown, good or bad checksums.
   * Fix set/remove of MAC addresses. Allow up to 64 addresses per device.
   * Enable TSO on outer headers.
 
 * **Added Solarflare libefx-based network PMD.**
 
-  A new network PMD which supports Solarflare SFN7xxx and SFN8xxx family
-  of 10/40 Gbps adapters has been added.
+  Added a new network PMD which supports Solarflare SFN7xxx and SFN8xxx family
+  of 10/40 Gbps adapters.
 
 * **Updated the mlx4 driver.**
 
@@ -145,8 +150,8 @@ New Features
 
 * **Added support for Mellanox ConnectX-5 adapters (mlx5).**
 
-  Support for Mellanox ConnectX-5 family of 10/25/40/50/100 Gbps adapters
-  has been added to the existing mlx5 PMD.
+  Added support for Mellanox ConnectX-5 family of 10/25/40/50/100 Gbps
+  adapters to the existing mlx5 PMD.
 
 * **Updated the mlx5 driver.**
 
@@ -161,47 +166,47 @@ New Features
 
 * **virtio-user with vhost-kernel as another exceptional path.**
 
-  Previously, we upstreamed a virtual device, virtio-user with vhost-user
-  as the backend, as a way for IPC (Inter-Process Communication) and user
+  Previously, we upstreamed a virtual device, virtio-user with vhost-user as
+  the backend as a way of enabling IPC (Inter-Process Communication) and user
   space container networking.
 
-  Virtio-user with vhost-kernel as the backend is a solution for exceptional
-  path, such as KNI, which exchanges packets with kernel networking stack.
+  Virtio-user with vhost-kernel as the backend is a solution for the exception
+  path, such as KNI, which exchanges packets with the kernel networking stack.
   This solution is very promising in:
 
-  * maintenance: vhost and vhost-net (kernel) is upstreamed and extensively
+  * Maintenance: vhost and vhost-net (kernel) is an upstreamed and extensively
     used kernel module.
-  * features: vhost-net is born to be a networking solution, which has
+  * Features: vhost-net is designed to be a networking solution, which has
     lots of networking related features, like multi-queue, TSO, multi-seg
     mbuf, etc.
-  * performance: similar to KNI, this solution would use one or more
+  * Performance: similar to KNI, this solution would use one or more
     kthreads to send/receive packets from user space DPDK applications,
     which has little impact on user space polling thread (except that
     it might enter into kernel space to wake up those kthreads if
     necessary).
 
-* **Added virtio Rx interrupt suppprt.**
+* **Added virtio Rx interrupt support.**
 
-  This feature enables Rx interrupt mode for virtio pci net devices as
-  binded to VFIO (noiommu mode) and drived by virtio PMD.
+  Added a feature to enable Rx interrupt mode for virtio pci net devices as
+  bound to VFIO (noiommu mode) and driven by virtio PMD.
 
-  With this feature, virtio PMD can switch between polling mode and
+  With this feature, the virtio PMD can switch between polling mode and
   interrupt mode, to achieve best performance, and at the same time save
-  power. It can work on both legacy and modern virtio devices. At this mode,
-  each rxq is mapped with an exluded MSIx interrupt.
+  power. It can work on both legacy and modern virtio devices. In this mode,
+  each ``rxq`` is mapped with an excluded MSIx interrupt.
 
   See the :ref:`Virtio Interrupt Mode <virtio_interrupt_mode>` documentation
   for more information.
 
 * **Added ARMv8 crypto PMD.**
 
-  A new crypto PMD has been added, which provides combined mode cryptografic
+  A new crypto PMD has been added, which provides combined mode cryptographic
   operations optimized for ARMv8 processors. The driver can be used to enhance
   performance in processing chained operations such as cipher + HMAC.
 
 * **Updated the QAT PMD.**
 
-  The QAT PMD was updated with additional support for:
+  The QAT PMD has been updated with additional support for:
 
   * DES algorithm.
   * Scatter-gather list (SGL) support.
@@ -210,35 +215,37 @@ New Features
 
   * The Intel(R) Multi Buffer Crypto for IPsec library used in
     AESNI MB PMD has been moved to a new repository, in GitHub.
-  * Support for single operations (cipher only and authentication only).
+  * Support has been added for single operations (cipher only and
+    authentication only).
 
 * **Updated the AES-NI GCM PMD.**
 
-  The AES-NI GCM PMD was migrated from MB library to ISA-L library.
-  The migration entailed the following additional support for:
+  The AES-NI GCM PMD was migrated from the Multi Buffer library to the ISA-L
+  library. The migration entailed adding additional support for:
 
   * GMAC algorithm.
   * 256-bit cipher key.
   * Session-less mode.
   * Out-of place processing
-  * Scatter-gatter support for chained mbufs (only out-of place and destination
+  * Scatter-gather support for chained mbufs (only out-of place and destination
     mbuf must be contiguous)
 
 * **Added crypto performance test application.**
 
-  A new performance test application allows measuring performance parameters
-  of PMDs available in crypto tree.
+  Added a new performance test application for measuring performance
+  parameters of PMDs available in the crypto tree.
 
 * **Added Elastic Flow Distributor library (rte_efd).**
 
-  This new library uses perfect hashing to determine a target/value for a
-  given incoming flow key.
+  Added a new library which uses perfect hashing to determine a target/value
+  for a given incoming flow key.
 
-  It does not store the key itself for lookup operations, and therefore,
-  lookup performance is not dependent on the key size. Also, the target/value
-  can be any arbitrary value (8 bits by default). Finally, the storage requirement
-  is much smaller than a hash-based flow table and therefore, it can better fit for
-  CPU cache, being able to scale to millions of flow keys.
+  The library does not store the key itself for lookup operations, and
+  therefore, lookup performance is not dependent on the key size. Also, the
+  target/value can be any arbitrary value (8 bits by default). Finally, the
+  storage requirement is much smaller than a hash-based flow table and
+  therefore, it can better fit in CPU cache and scale to millions of flow
+  keys.
 
   See the :ref:`Elastic Flow Distributor Library <Efd_Library>` documentation in
   the Programmers Guide document, for more information.
@@ -259,51 +266,24 @@ Resolved Issues
    Also, make sure to start the actual text at the margin.
    =========================================================
 
-
-EAL
-~~~
-
-
 Drivers
 ~~~~~~~
 
 * **net/virtio: Fixed multiple process support.**
 
-  Fixed few regressions introduced in recent releases that break the virtio
+  Fixed a few regressions introduced in recent releases that break the virtio
   multiple process support.
 
 
-Libraries
-~~~~~~~~~
-
-
 Examples
 ~~~~~~~~
 
 * **examples/ethtool: Fixed crash with non-PCI devices.**
 
-  Querying a non-PCI device was dereferencing non-existent PCI data
-  resulting in a segmentation fault.
+  Fixed issue where querying a non-PCI device was dereferencing non-existent
+  PCI data resulting in a segmentation fault.
 
 
-Other
-~~~~~
-
-
-Known Issues
-------------
-
-.. This section should contain new known issues in this release. Sample format:
-
-   * **Add title in present tense with full stop.**
-
-     Add a short 1-2 sentence description of the known issue in the present
-     tense. Add information on any known workarounds.
-
-   This section is a comment. do not overwrite or remove it.
-   Also, make sure to start the actual text at the margin.
-   =========================================================
-
 
 API Changes
 -----------
@@ -319,25 +299,26 @@ API Changes
 
 * **Moved five APIs for VF management from the ethdev to the ixgbe PMD.**
 
-  The following five APIs for VF management from the PF have been removed from the ethdev,
-  renamed and added to the ixgbe PMD::
+  The following five APIs for VF management from the PF have been removed from
+  the ethdev, renamed, and added to the ixgbe PMD::
 
-    rte_eth_dev_set_vf_rate_limit
-    rte_eth_dev_set_vf_rx
-    rte_eth_dev_set_vf_rxmode
-    rte_eth_dev_set_vf_tx
-    rte_eth_dev_set_vf_vlan_filter
+     rte_eth_dev_set_vf_rate_limit()
+     rte_eth_dev_set_vf_rx()
+     rte_eth_dev_set_vf_rxmode()
+     rte_eth_dev_set_vf_tx()
+     rte_eth_dev_set_vf_vlan_filter()
 
   The API's have been renamed to the following::
 
-    rte_pmd_ixgbe_set_vf_rate_limit
-    rte_pmd_ixgbe_set_vf_rx
-    rte_pmd_ixgbe_set_vf_rxmode
-    rte_pmd_ixgbe_set_vf_tx
-    rte_pmd_ixgbe_set_vf_vlan_filter
+     rte_pmd_ixgbe_set_vf_rate_limit()
+     rte_pmd_ixgbe_set_vf_rx()
+     rte_pmd_ixgbe_set_vf_rxmode()
+     rte_pmd_ixgbe_set_vf_tx()
+     rte_pmd_ixgbe_set_vf_vlan_filter()
 
   The declarations for the API’s can be found in ``rte_pmd_ixgbe.h``.
 
+
 ABI Changes
 -----------
 
-- 
2.7.4

^ permalink raw reply	[relevance 2%]

* Re: [dpdk-dev] Further fun with ABI tracking
  2017-02-14 10:52  8% [dpdk-dev] Further fun with ABI tracking Christian Ehrhardt
@ 2017-02-14 16:19  4% ` Bruce Richardson
  2017-02-14 20:31  9% ` Jan Blunck
  1 sibling, 0 replies; 200+ results
From: Bruce Richardson @ 2017-02-14 16:19 UTC (permalink / raw)
  To: Christian Ehrhardt; +Cc: dev, cjcollier, ricardo.salveti, Luca Boccassi

On Tue, Feb 14, 2017 at 11:52:00AM +0100, Christian Ehrhardt wrote:
> Hi,
> when moving to DPDK 16.11, Debian/Ubuntu packaging of DPDK has hit a new
> twist on the (it seems recurring) topic of DPDK ABI tracking.
> 
> I have found, ... well I don't want to call it a solution ..., let's say a
> crutch to get around it for the moment. But I wanted to use the example I
> had to share a few thoughts on it and to kick off a wider discussion.
> 
> 
> *## In library cross-dependencies plus partial ABI bumps ##*
> 
> Since the day we moved away from the combined shared library we have had
> several improvements in tracking ABI versions. These days [1] we have
> LIBABIVER per library and it gets bumped to reflect that it breaks with
> former versions, e.g. by removing symbols.
> 
> Now in the 16.11 release the ABIs for cryptodev, eal and ethdev got bumped
> by [2] and [3].
> 
> OTOH please remember that in general two versions of a shared library in
> the usual sense are meant to be able to stay alongside on a system without
> hurting each other. I picked a random one on my system.
> Package              Library
> libisc-export160: /lib/x86_64-linux-gnu/libisc-export.so.160
> libisc-export160: /lib/x86_64-linux-gnu/libisc-export.so.160.0.0
> libisc-export95: /lib/x86_64-linux-gnu/libisc-export.so.95
> libisc-export95: /lib/x86_64-linux-gnu/libisc-export.so.95.5.0
> Some link against the new, some against the old library - all fine.
> Usually most programs can just be rebuilt against the new library and after
> some time the old one can be dropped. That mechanism gives downstream
> distributions a way to handle transitions and consumers of libraries which
> might not all be ready for the same version every time.
> And since we have per-library versioning with LIBABIVER and the version
> maps we are good - in fact we qualify for all common cases on [4].
> 
> Now in DPDK, of those libraries that got an ABI bump, eal and ethdev are
> among those which most of us consider "core libraries", and most other libs
> and pmds link to them.
> And here DPDK continues to be special: due to that inter-dependency, with
> old and new libraries installed on the same system the following happens
> with openvswitch built for an older version of DPDK:
> ovs-vswitchd-dpdk
>     librte_eal.so.2 => /usr/lib/x86_64-linux-gnu/librte_eal.so.2
>     librte_pdump.so.1 => /usr/lib/x86_64-linux-gnu/librte_pdump.so.1
>         librte_eal.so.3 => /usr/lib/x86_64-linux-gnu/librte_eal.so.3
> 
> You can see that Openvswitch itself depends on the "old" librte_eal.so.2.
> But because librte_pdump.so.1 did not get an ABI bump, it got upgraded to
> the newer version from DPDK 16.11.
> But since the "new" pdump got built with the new DPDK 16.11 it depends on
> the "new" librte_eal.so.3.
> And having both in the same executable space at the same time causes
> segfaults and pain.
> 
> As I said, for now I have gotten past the issue with a crutch that I'm not
> proud of and that I'd like to avoid in the future. For that reason I'm
> reaching out to you with several suggestions to discuss.
> 
> 
> *## Thoughts ##*
> None of these seems like a perfect solution to me yet, but clearly good to
> start discussions on them.
> 
> Options that were in discussion so far and that we might adopt next cycle
> (some of these are upstream changes, some downstream, some require both to
> change - but any of them should have an ack upstream so that we agree
> on how to proceed with those cases).
> 
> 1. Downstreams to insert Major version into soname
> Distributions could insert the DPDK major version (like 16.11) into the
> soname and package names. A common example of this is libboost [5].
> That would perfectly allow 16.07.<LIBABIVER> to coexist with
> 16.11.<LIBABIVER> even if for a given library LIBABIVER did not change.
> Yet it would mean that anything depending on the old library will have to
> be recompiled to pick up the new code, even if it depends on an ABI that is
> still present in the new release.
> Also - not a technical reason - but it is clearly more work to force update
> all dependencies and clean out old packages for every release.
> 
> 
> 2. ABI Ranges
> One could argue that due to the detailed tracking of functions DPDK is
> already close to track not ABI levels but actually ABI ranges. DPDK could
> track LIBABIVERMIN and LIBABIVER.
> Every time functionality is added LIBABIVER would get bumped, but
> LIBABIVERMIN only gets moved to the OLDEST still supported ABI when things
> are dropped.
> So on a given library librte_foo you could have LIBABIVER=5 and
> LIBABIVERMIN=3. The make install would then install the shared lib as:
> librte_foo.so.5
> and additionally links for all compatible versions:
> librte_foo.so.3 -> librte_foo.so.5
> librte_foo.so.4 -> librte_foo.so.5
> Yet, while it has some nice attributes, this might make DPDK even more
> special and cause ABI level proliferation over time.
> Also even with this in place, changes moving LIBABIVERMIN "too fast" (too
> fast is different for each downstream) could still cause an issue like the
> one I initially described.
> 
> 
> 3. A lot of conflicts
> In packaging one can declare a package to conflict with another package [6].
> Now we could declare e.g. librte_eal3 to conflict with librte_eal2 (and the
> same for all other bumps).
> That would make them not co-installable, and working on a new release would
> mean that all former consumers would become not installable as well and
> have to be rebuilt before they all could migrate [7] together.
> That "works" in some sense, but it denies the whole purpose of versioned
> library packages (to be co-installable, to allow different library
> consumers to depend on different versions).
> 
> 
> 4. ABI bump is infectious
> Another way might be to also bump any dependent DPDK library.
> So when core libs like eal are ABI bumped, likely all libs would get a bump.
> If only e.g. mempool gets a bump, only those other parts using it would be
> bumped as well.
> To some extent this might still proliferate ABI versions more than one
> would like.
> Also it surely is hard to track if not automated - think of dependencies
> that exist only in certain config cases.
> 
> 5. back to single ABI
> For the sake of giving everybody a chance to re-open old wounds I wanted to
> mention that DPDK could also decide to go back to a single ABI again.
> This could (but doesn't have to!) be combined with having a single .so file
> again.
> Deciding on this might be a much cleaner and easier-to-track alternative to #4.
> 
> 6. More
> I'm sure there are more approaches to this, feel free to come up with more.
> 
> I'm sure my five suggestions alone will make the thread messy. Maybe we do
> this in two rounds, sorting out the insane and identifying the preferred
> ones to then in a second run focus on discussing and maybe implementing the
> details of what we like.
> 
> 

Of the 5 options you propose, No. 4 looks the most appealing to me. If it
does cause problems with different config cases, then that looks like a good
reason to cut down on the allowed configs. :-)

/Bruce

> [1]: http://dpdk.org/browse/dpdk/tree/doc/guides/contributing/versioning.rst
> [2]: http://dpdk.org/browse/dpdk/commit/?id=d7e61ad3ae36
> [3]: http://dpdk.org/browse/dpdk/commit/?id=6ba1affa54108
> [4]: https://wiki.debian.org/TransitionBestPractices
> [5]: https://packages.debian.org/sid/libboost1.62-dev
> [6]:
> https://www.debian.org/doc/debian-policy/ch-relationships.html#s-conflicts
> [7]: https://wiki.ubuntu.com/ProposedMigration
> 
> P.S. I beg a pardon for the wall of text
> 
> -- 
> Christian Ehrhardt
> Software Engineer, Ubuntu Server
> Canonical Ltd

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] crypto drivers in the API
  2017-02-14 14:46  4%     ` Doherty, Declan
@ 2017-02-14 15:47  0%       ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-14 15:47 UTC (permalink / raw)
  To: Doherty, Declan; +Cc: dev, Pablo DeLara Guarch

2017-02-14 14:46, Doherty, Declan:
> On 14/02/2017 11:04 AM, Thomas Monjalon wrote:
> > 2017-02-14 10:44, Doherty, Declan:
> >> On 13/02/2017 1:25 PM, Thomas Monjalon wrote:
> >>> In the crypto API, the drivers are listed.
> >>> In my opinion, it is a wrong design and these lists should be removed.
> >>> Do we need a deprecation notice to plan this removal in 17.05, while
> >>> working on bus abstraction?
> >>>
> >> ...
> >>>
> ...
> >
> > Yes
> > If you were planning to do this, you should have sent a deprecation notice
> > a few weeks ago.
> > Please send it now and we'll see if we have enough supporters shortly.
> >
> 
> Thomas, there are a couple of other changes we are looking at in the
> cryptodev which would require API changes as well as break the ABI,
> including adding support for multi-device sessions, and changes to the
> crypto operation layout and fields for performance, but these will
> require RFCs or at least more discussion of the proposals. Given the
> time constraints for the V1 deadline for 17.05 I would prefer to work on
> the RFCs and get them out as soon as possible over the next few weeks
> and then make all the ABI breaking changes in R17.08 in a single release.
> 
> Otherwise we will end up breaking the ABI 2 releases in a row, which I
> would like to avoid if possible.

OK, seems good. Thanks

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v1] doc: update release notes for 17.02
@ 2017-02-14 15:32  4% John McNamara
  2017-02-14 16:26  2% ` [dpdk-dev] [PATCH v2] " John McNamara
  0 siblings, 1 reply; 200+ results
From: John McNamara @ 2017-02-14 15:32 UTC (permalink / raw)
  To: dev; +Cc: John McNamara


Fix grammar, spelling and formatting of DPDK 17.02 release notes.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
---

Note: The "ABI Changes" section is currently empty.


 doc/guides/rel_notes/release_17_02.rst | 255 ++++++++++++++-------------------
 1 file changed, 111 insertions(+), 144 deletions(-)

diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 6420a87..b7c188a 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -40,46 +40,47 @@ New Features
 
 * **Added support for representing buses in EAL**
 
-  A new structure ``rte_bus`` is introduced in EAL. This allows for devices to
-  be represented by buses they are connected to. A new bus can be added to
-  DPDK by extending the ``rte_bus`` structure and implementing the scan and
-  probe functions. Once a new bus is registered using provided APIs, new
-  devices can be detected and initialized using bus scan and probe callbacks.
+  The ``rte_bus`` structure was introduced into the EAL. This allows for
+  devices to be represented by buses they are connected to. A new bus can be
+  added to DPDK by extending the ``rte_bus`` structure and implementing the
+  scan and probe functions. Once a new bus is registered using the provided
+  APIs, new devices can be detected and initialized using bus scan and probe
+  callbacks.
 
-  With this change, devices other than PCI or VDEV type can also be represented
-  in DPDK framework.
+  With this change, devices other than PCI or VDEV type can be represented
+  in the DPDK framework.
 
 * **Added generic EAL API for I/O device memory read/write operations.**
 
-  This API introduces 8-bit, 16-bit, 32bit, 64bit I/O device
-  memory read/write operations along with the relaxed versions.
+  This API introduces 8 bit, 16 bit, 32 bit and 64 bit I/O device
+  memory read/write operations along with "relaxed" versions.
 
-  The weakly-ordered machine like ARM needs additional I/O barrier for
-  device memory read/write access over PCI bus.
-  By introducing the EAL abstraction for I/O device memory read/write access,
-  The drivers can access I/O device memory in architecture-agnostic manner.
-  The relaxed version does not have additional I/O memory barrier, useful in
-  accessing the device registers of integrated controllers which
-  implicitly strongly ordered with respect to memory access.
+  Weakly-ordered architectures like ARM need an additional I/O barrier for
+  device memory read/write access over PCI bus. By introducing the EAL
+  abstraction for I/O device memory read/write access, the drivers can access
+  I/O device memory in an architecture-agnostic manner. The relaxed version
+  does not have an additional I/O memory barrier, which is useful in accessing
+  the device registers of integrated controllers which is implicitly strongly
+  ordered with respect to memory access.
 
 * **Added generic flow API (rte_flow).**
 
   This API provides a generic means to configure hardware to match specific
-  ingress or egress traffic, alter its fate and query related counters
+  ingress or egress traffic, alter its behavior and query related counters
   according to any number of user-defined rules.
 
-  It is slightly higher-level than the legacy filtering framework which it
-  encompasses and supersedes (including all functions and filter types) in
-  order to expose a single interface with an unambiguous behavior that is
-  common to all poll-mode drivers (PMDs).
+  In order to expose a single interface with an unambiguous behavior that is
+  common to all poll-mode drivers (PMDs) the ``rte_flow`` API is slightly
+  higher-level than the legacy filtering framework, which it encompasses and
+  supersedes (including all functions and filter types) .
 
   See the :ref:`Generic flow API <Generic_flow_API>` documentation for more
   information.
 
 * **Added firmware version get API.**
 
-  Added a new function ``rte_eth_dev_fw_version_get()`` to fetch firmware
-  version by a given device.
+  Added a new function ``rte_eth_dev_fw_version_get()`` to fetch the firmware
+  version for a given device.
 
 * **Added APIs for MACsec offload support to the ixgbe PMD.**
 
@@ -90,54 +91,58 @@ New Features
 
   Added support for I219 Intel 1GbE NICs.
 
-* **Added VF Daemon (VFD) on i40e. - EXPERIMENTAL**
-
-  This's an EXPERIMENTAL feature to enhance the capability of DPDK PF as many
-  VF management features are not supported by kernel PF driver.
-  Some new private APIs are implemented in PMD without abstrction layer.
-  They can be used directly by some users who have the need.
-
-  The new APIs to control VFs directly from PF include,
-  1) set VF MAC anti-spoofing
-  2) set VF VLAN anti-spoofing
-  3) set TX loopback
-  4) set VF unicast promiscuous mode
-  5) set VF multicast promiscuous mode
-  6) set VF MTU
-  7) get/reset VF stats
-  8) set VF MAC address
-  9) set VF VLAN stripping
-  10) VF VLAN insertion
-  12) set VF broadcast mode
-  13) set VF VLAN tag
-  14) set VF VLAN filter
-  VFD also includes VF to PF mailbox message management by APP.
-  When PF receives mailbox messages from VF, PF should call the callback
-  provided by APP to know if they're permitted to be processed.
-
-  As an EXPERIMENTAL feature, please aware it can be changed or even
+* **Added VF Daemon (VFD) for i40e. - EXPERIMENTAL**
+
+  This is an EXPERIMENTAL feature to enhance the capability of the DPDK PF as
+  many VF management features are not currently supported by the kernel PF
+  driver. Some new private APIs are implemented directly in the PMD without an
+  abstraction layer. They can be used directly by some users who have the
+  need.
+
+  The new APIs to control VFs directly from PF include:
+
+  * Set VF MAC anti-spoofing.
+  * Set VF VLAN anti-spoofing.
+  * Set TX loopback.
+  * Set VF unicast promiscuous mode.
+  * Set VF multicast promiscuous mode.
+  * Set VF MTU.
+  * Get/reset VF stats.
+  * Set VF MAC address.
+  * Set VF VLAN stripping.
+  * Vf VLAN insertion.
+  * Set VF broadcast mode.
+  * Set VF VLAN tag.
+  * Set VF VLAN filter.
+
+  VFD also includes VF to PF mailbox message management from an application.
+  When the PF receives mailbox messages from the VF the PF should call the
+  callback provided by the application to know if they're permitted to be
+  processed.
+
+  As an EXPERIMENTAL feature, please be aware it can be changed or even
   removed without prior notice.
 
 * **Updated the i40e base driver.**
 
-  updated the i40e base driver, including the following changes:
+  Updated the i40e base driver, including the following changes:
 
-  * replace existing legacy memcpy() calls with i40e_memcpy() calls.
-  * use BIT() macro instead of bit fields
-  * add clear all WoL filters implementation
-  * add broadcast promiscuous control per VLAN
-  * remove unused X722_SUPPORT and I40E_NDIS_SUPPORT MARCOs
+  * Replace existing legacy ``memcpy()`` calls with ``i40e_memcpy()`` calls.
+  * Use ``BIT()`` macro instead of bit fields.
+  * Add clear all WoL filters implementation.
+  * Add broadcast promiscuous control per VLAN.
+  * Remove unused ``X722_SUPPORT`` and ``I40E_NDIS_SUPPORT`` macros.
 
 * **Updated the enic driver.**
 
-  * Set new Rx checksum flags in mbufs to indicate unknown, good or bad.
+  * Set new Rx checksum flags in mbufs to indicate unknown, good or bad checksums.
   * Fix set/remove of MAC addresses. Allow up to 64 addresses per device.
   * Enable TSO on outer headers.
 
 * **Added Solarflare libefx-based network PMD.**
 
-  A new network PMD which supports Solarflare SFN7xxx and SFN8xxx family
-  of 10/40 Gbps adapters has been added.
+  Added a new network PMD which supports Solarflare SFN7xxx and SFN8xxx family
+  of 10/40 Gbps adapters.
 
 * **Updated the mlx4 driver.**
 
@@ -145,8 +150,8 @@ New Features
 
 * **Added support for Mellanox ConnectX-5 adapters (mlx5).**
 
-  Support for Mellanox ConnectX-5 family of 10/25/40/50/100 Gbps adapters
-  has been added to the existing mlx5 PMD.
+  Added support for Mellanox ConnectX-5 family of 10/25/40/50/100 Gbps
+  adapters to the existing mlx5 PMD.
 
 * **Updated the mlx5 driver.**
 
@@ -161,47 +166,47 @@ New Features
 
 * **virtio-user with vhost-kernel as another exceptional path.**
 
-  Previously, we upstreamed a virtual device, virtio-user with vhost-user
-  as the backend, as a way for IPC (Inter-Process Communication) and user
+  Previously, we upstreamed a virtual device, virtio-user with vhost-user as
+  the backend as a way of enabling IPC (Inter-Process Communication) and user
   space container networking.
 
-  Virtio-user with vhost-kernel as the backend is a solution for exceptional
-  path, such as KNI, which exchanges packets with kernel networking stack.
+  Virtio-user with vhost-kernel as the backend is a solution for the exception
+  path, such as KNI, which exchanges packets with the kernel networking stack.
   This solution is very promising in:
 
-  * maintenance: vhost and vhost-net (kernel) is upstreamed and extensively
+  * Maintenance: vhost and vhost-net (kernel) is an upstreamed and extensively
     used kernel module.
-  * features: vhost-net is born to be a networking solution, which has
+  * Features: vhost-net is designed to be a networking solution, which has
     lots of networking related features, like multi-queue, TSO, multi-seg
     mbuf, etc.
-  * performance: similar to KNI, this solution would use one or more
+  * Performance: similar to KNI, this solution would use one or more
     kthreads to send/receive packets from user space DPDK applications,
     which has little impact on user space polling thread (except that
     it might enter into kernel space to wake up those kthreads if
     necessary).
 
-* **Added virtio Rx interrupt suppprt.**
+* **Added virtio Rx interrupt support.**
 
-  This feature enables Rx interrupt mode for virtio pci net devices as
-  binded to VFIO (noiommu mode) and drived by virtio PMD.
+  Added a feature to enable Rx interrupt mode for virtio pci net devices as
+  bound to VFIO (noiommu mode) and driven by virtio PMD.
 
-  With this feature, virtio PMD can switch between polling mode and
+  With this feature, the virtio PMD can switch between polling mode and
   interrupt mode, to achieve best performance, and at the same time save
-  power. It can work on both legacy and modern virtio devices. At this mode,
-  each rxq is mapped with an exluded MSIx interrupt.
+  power. It can work on both legacy and modern virtio devices. In this mode,
+  each ``rxq`` is mapped with an excluded MSIx interrupt.
 
   See the :ref:`Virtio Interrupt Mode <virtio_interrupt_mode>` documentation
   for more information.
 
 * **Added ARMv8 crypto PMD.**
 
-  A new crypto PMD has been added, which provides combined mode cryptografic
+  A new crypto PMD has been added, which provides combined mode cryptographic
   operations optimized for ARMv8 processors. The driver can be used to enhance
   performance in processing chained operations such as cipher + HMAC.
 
 * **Updated the QAT PMD.**
 
-  The QAT PMD was updated with additional support for:
+  The QAT PMD has been updated with additional support for:
 
   * DES algorithm.
   * Scatter-gather list (SGL) support.
@@ -210,100 +215,61 @@ New Features
 
   * The Intel(R) Multi Buffer Crypto for IPsec library used in
     AESNI MB PMD has been moved to a new repository, in GitHub.
-  * Support for single operations (cipher only and authentication only).
+  * Support has been added for single operations (cipher only and
+    authentication only).
 
 * **Updated the AES-NI GCM PMD.**
 
-  The AES-NI GCM PMD was migrated from MB library to ISA-L library.
-  The migration entailed the following additional support for:
+  The AES-NI GCM PMD was migrated from the Multi Buffer library to the ISA-L
+  library. The migration entailed adding additional support for:
 
   * GMAC algorithm.
   * 256-bit cipher key.
   * Session-less mode.
   * Out-of place processing
-  * Scatter-gatter support for chained mbufs (only out-of place and destination
+  * Scatter-gather support for chained mbufs (only out-of place and destination
     mbuf must be contiguous)
 
 * **Added crypto performance test application.**
 
-  A new performance test application allows measuring performance parameters
-  of PMDs available in crypto tree.
+  Added a new performance test application for measuring performance
+  parameters of PMDs available in the crypto tree.
 
 * **Added Elastic Flow Distributor library (rte_efd).**
 
-  This new library uses perfect hashing to determine a target/value for a
-  given incoming flow key.
+  Added a new library which uses perfect hashing to determine a target/value
+  for a given incoming flow key.
 
-  It does not store the key itself for lookup operations, and therefore,
-  lookup performance is not dependent on the key size. Also, the target/value
-  can be any arbitrary value (8 bits by default). Finally, the storage requirement
-  is much smaller than a hash-based flow table and therefore, it can better fit for
-  CPU cache, being able to scale to millions of flow keys.
+  The library does not store the key itself for lookup operations, and
+  therefore, lookup performance is not dependent on the key size. Also, the
+  target/value can be any arbitrary value (8 bits by default). Finally, the
+  storage requirement is much smaller than a hash-based flow table and
+  therefore, it can better fit in CPU cache and scale to millions of flow
+  keys.
 
   See the :ref:`Elastic Flow Distributor Library <Efd_Library>` documentation in
   the Programmers Guide document, for more information.
 
 
-Resolved Issues
----------------
-
-.. This section should contain bug fixes added to the relevant sections. Sample format:
-
-   * **code/section Fixed issue in the past tense with a full stop.**
-
-     Add a short 1-2 sentence description of the resolved issue in the past tense.
-     The title should contain the code/lib section like a commit message.
-     Add the entries in alphabetic order in the relevant sections below.
-
-   This section is a comment. do not overwrite or remove it.
-   Also, make sure to start the actual text at the margin.
-   =========================================================
-
-
-EAL
-~~~
-
 
 Drivers
 ~~~~~~~
 
 * **net/virtio: Fixed multiple process support.**
 
-  Fixed few regressions introduced in recent releases that break the virtio
+  Fixed a few regressions introduced in recent releases that break the virtio
   multiple process support.
 
 
-Libraries
-~~~~~~~~~
-
-
 Examples
 ~~~~~~~~
 
 * **examples/ethtool: Fixed crash with non-PCI devices.**
 
-  Querying a non-PCI device was dereferencing non-existent PCI data
-  resulting in a segmentation fault.
+  Fixed issue where querying a non-PCI device was dereferencing non-existent
+  PCI data resulting in a segmentation fault.
 
 
-Other
-~~~~~
-
-
-Known Issues
-------------
-
-.. This section should contain new known issues in this release. Sample format:
-
-   * **Add title in present tense with full stop.**
-
-     Add a short 1-2 sentence description of the known issue in the present
-     tense. Add information on any known workarounds.
-
-   This section is a comment. do not overwrite or remove it.
-   Also, make sure to start the actual text at the margin.
-   =========================================================
-
 
 API Changes
 -----------
@@ -319,25 +285,26 @@ API Changes
 
 * **Moved five APIs for VF management from the ethdev to the ixgbe PMD.**
 
-  The following five APIs for VF management from the PF have been removed from the ethdev,
-  renamed and added to the ixgbe PMD::
+  The following five APIs for VF management from the PF have been removed from
+  the ethdev, renamed, and added to the ixgbe PMD::
 
-    rte_eth_dev_set_vf_rate_limit
-    rte_eth_dev_set_vf_rx
-    rte_eth_dev_set_vf_rxmode
-    rte_eth_dev_set_vf_tx
-    rte_eth_dev_set_vf_vlan_filter
+     rte_eth_dev_set_vf_rate_limit()
+     rte_eth_dev_set_vf_rx()
+     rte_eth_dev_set_vf_rxmode()
+     rte_eth_dev_set_vf_tx()
+     rte_eth_dev_set_vf_vlan_filter()
 
   The API's have been renamed to the following::
 
-    rte_pmd_ixgbe_set_vf_rate_limit
-    rte_pmd_ixgbe_set_vf_rx
-    rte_pmd_ixgbe_set_vf_rxmode
-    rte_pmd_ixgbe_set_vf_tx
-    rte_pmd_ixgbe_set_vf_vlan_filter
+     rte_pmd_ixgbe_set_vf_rate_limit()
+     rte_pmd_ixgbe_set_vf_rx()
+     rte_pmd_ixgbe_set_vf_rxmode()
+     rte_pmd_ixgbe_set_vf_tx()
+     rte_pmd_ixgbe_set_vf_vlan_filter()
 
   The declarations for the API’s can be found in ``rte_pmd_ixgbe.h``.
 
+
 ABI Changes
 -----------
 
-- 
2.7.4

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] crypto drivers in the API
  2017-02-14 11:04  0%   ` Thomas Monjalon
@ 2017-02-14 14:46  4%     ` Doherty, Declan
  2017-02-14 15:47  0%       ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Doherty, Declan @ 2017-02-14 14:46 UTC (permalink / raw)
  To: Thomas Monjalon, dev, Pablo DeLara Guarch

On 14/02/2017 11:04 AM, Thomas Monjalon wrote:
> 2017-02-14 10:44, Doherty, Declan:
>> On 13/02/2017 1:25 PM, Thomas Monjalon wrote:
>>> In the crypto API, the drivers are listed.
>>> In my opinion, it is a wrong design and these lists should be removed.
>>> Do we need a deprecation notice to plan this removal in 17.05, while
>>> working on bus abstraction?
>>>
>> ...
>>>
...
>
> Yes
> If you were planning to do this, you should have sent a deprecation notice
> a few weeks ago.
> Please send it now and we'll see if we have enough supporters shortly.
>

Thomas, there are a couple of other changes we are looking at in the
cryptodev which would require API changes as well as break the ABI,
including adding support for multi-device sessions, and changes to the
crypto operation layout and fields for performance, but these will
require RFCs or at least more discussion of the proposals. Given the
time constraints for the V1 deadline for 17.05 I would prefer to work on
the RFCs and get them out as soon as possible over the next few weeks
and then make all the ABI breaking changes in R17.08 in a single release.

Otherwise we will end up breaking the ABI 2 releases in a row, which I
would like to avoid if possible.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for vhost
    2017-02-13 18:02  4% ` Thomas Monjalon
@ 2017-02-14 13:54  4% ` Maxime Coquelin
  2017-02-14 20:28  4% ` Thomas Monjalon
  2 siblings, 0 replies; 200+ results
From: Maxime Coquelin @ 2017-02-14 13:54 UTC (permalink / raw)
  To: Yuanhan Liu, dev; +Cc: Thomas Monjalon, John McNamara

Hi Yuanhan,

On 01/23/2017 02:04 PM, Yuanhan Liu wrote:
> I made a vhost ABI/API refactoring in v16.04, meant to avoid such issues
> forever. Well, apparently, I lied.
>
> People are looking for more vhost-user options nowadays, other than
> vhost-user net only. For example, SPDK (Storage Performance Development
> Kit) is looking at the chance of vhost-user SCSI and vhost-user block.
>
> Apparently, they also need a vhost-user backend; since DPDK already
> has a (mature enough) backend, they don't want to implement it again
> from scratch. They want to leverage the one DPDK provides.
>
> However, the last refactoring hasn't done that right, at least it's
> not friendly for extending vhost-user to add support for more devices.
> For example, different virtio devices have their own feature sets, while
> APIs like rte_vhost_feature_disable(feature_mask) have no option to
> tell the device type. Thus, a more proper API should look like:
>
>     rte_vhost_feature_disable(device_type, feature_mask);

I wonder if we could also change it to be per-instance, instead of
disabling features globally:
rte_vhost_feature_disable(vid, device_type, feature_mask);

It could be useful for live-migration with different backend versions on
the hosts, as it would allow running instances with different compat
modes (like running vhost's DPDK v17.08 with v17.05-only supported
features).
I made a proposal about cross-version migration, but we are far from a
conclusion on the design.
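
For clarity, the three shapes under discussion would look roughly like
the declarations below. This is only a sketch: the suffixed names are
hypothetical, none of this is a committed API, and the return types are
illustrative (the 16.11 prototype lives in rte_virtio_net.h):

    #include <stdint.h>

    /* 16.11 API: one global feature mask, implicitly virtio-net only. */
    int rte_vhost_feature_disable(uint64_t feature_mask);

    /* Shape proposed in the patch: qualify by device type so net, SCSI
     * and block backends can carry distinct feature sets. */
    int rte_vhost_feature_disable_by_type(int device_type,
                                          uint64_t feature_mask);

    /* Per-instance variant suggested above: scope the mask to one vid,
     * e.g. to run instances in different compat modes for migration. */
    int rte_vhost_feature_disable_by_vid(int vid, int device_type,
                                         uint64_t feature_mask);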

>
> Besides that, a few public files and structures should be renamed, so
> that they are not tied to virtio-net. Specifically, they are:
>
> - virtio_net_device_ops --> vhost_device_ops
> - rte_virtio_net.h      --> rte_vhost.h
>
> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>

Anyway, the change you propose is necessary:
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: add ABI change notification for ring library
  2017-02-14  8:33  4% ` Olivier Matz
@ 2017-02-14 11:43  4%   ` Hemant Agrawal
  0 siblings, 0 replies; 200+ results
From: Hemant Agrawal @ 2017-02-14 11:43 UTC (permalink / raw)
  To: Olivier Matz, Bruce Richardson; +Cc: dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz
> Sent: Tuesday, February 14, 2017 2:34 AM
> To: Bruce Richardson <bruce.richardson@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] doc: add ABI change notification for ring
> library
> 
> On Mon, 13 Feb 2017 17:38:30 +0000, Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> > Document proposed changes for the rings code in the next release.
> >
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> 
> Acked-by: Olivier Matz <olivier.matz@6wind.com>

Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] crypto drivers in the API
  2017-02-14 10:44  4% ` Doherty, Declan
@ 2017-02-14 11:04  0%   ` Thomas Monjalon
  2017-02-14 14:46  4%     ` Doherty, Declan
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2017-02-14 11:04 UTC (permalink / raw)
  To: Doherty, Declan, dev

2017-02-14 10:44, Doherty, Declan:
> On 13/02/2017 1:25 PM, Thomas Monjalon wrote:
> > In the crypto API, the drivers are listed.
> > In my opinion, it is a wrong design and these lists should be removed.
> > Do we need a deprecation notice to plan this removal in 17.05, while
> > working on bus abstraction?
> >
> ...
> >
> 
> Hey Thomas,
> I agree that these need to be removed, and I had planned on doing this 
> for 17.05, but I have a concern about the requirements for ABI breakage in
> relation to this. This enum is unfortunately used in both the
> rte_cryptodev and rte_crypto_sym_session structures, which are part of
> the library's public API. I don't think it would be feasible to maintain
> a set of 17.02-compatible APIs with the changes this would introduce, as
> it would require a large number of functions to have two versions. Is it
> OK to break the ABI for this case?

Yes
If you were planning to do this, you should have sent a deprecation notice
a few weeks ago.
Please send it now and we'll see if we have enough supporters shortly.
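
For the record, the change under discussion is roughly the following
sketch. Only the 16.11 enum is real (and abridged here); the
replacement function names are made up for illustration:

    #include <stdint.h>

    /* 16.11: rte_cryptodev.h hard-codes every known driver, so each new
     * PMD grows this enum and the structures that embed it (abridged). */
    enum rte_cryptodev_type {
            RTE_CRYPTODEV_NULL_PMD = 1,
            RTE_CRYPTODEV_AESNI_MB_PMD,
            /* ... one entry per driver ... */
    };

    /* One possible replacement: drivers register a name at load time
     * and receive an id at runtime, so adding a PMD no longer touches
     * the public headers (hypothetical names). */
    uint8_t rte_cryptodev_allocate_driver_id(const char *driver_name);
    int rte_cryptodev_driver_id_by_name(const char *driver_name);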

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2] doc: annouce ABI change for cryptodev ops structure
  2017-02-14 10:48  4%   ` Doherty, Declan
@ 2017-02-14 11:03  4%     ` De Lara Guarch, Pablo
  0 siblings, 0 replies; 200+ results
From: De Lara Guarch, Pablo @ 2017-02-14 11:03 UTC (permalink / raw)
  To: Doherty, Declan, Zhang, Roy Fan, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Doherty, Declan
> Sent: Tuesday, February 14, 2017 10:48 AM
> To: Zhang, Roy Fan; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] doc: annouce ABI change for cryptodev
> ops structure
> 
> On 14/02/2017 10:41 AM, Fan Zhang wrote:
> > Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> > Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> > ---
> ...
> >
> 
> Acked-by: Declan Doherty <declan.doherty@intel.com>

Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] Further fun with ABI tracking
@ 2017-02-14 10:52  8% Christian Ehrhardt
  2017-02-14 16:19  4% ` Bruce Richardson
  2017-02-14 20:31  9% ` Jan Blunck
  0 siblings, 2 replies; 200+ results
From: Christian Ehrhardt @ 2017-02-14 10:52 UTC (permalink / raw)
  To: dev; +Cc: cjcollier, ricardo.salveti, Luca Boccassi

Hi,
when moving to DPDK 16.11, Debian/Ubuntu packaging of DPDK has hit a new
twist on the (it seems recurring) topic of DPDK ABI tracking.

I have found, ... well I don't want to call it a solution ..., let's say a
crutch to get around it for the moment. But I wanted to use the example I
had to share a few thoughts on it and to kick off a wider discussion.


*## In library cross-dependencies plus partial ABI bumps ##*

Since the day we moved away from the combined shared library we have had
several improvements in tracking ABI versions. These days [1] we have
LIBABIVER per library and it gets bumped to reflect that it breaks with
former versions, e.g. by removing symbols.

Now in the 16.11 release the ABIs for cryptodev, eal and ethdev got bumped
by [2] and [3].

OTOH please remember that in general two versions of a shared library in
the usual sense are meant to be able to stay alongside on a system without
hurting each other. I picked a random one on my system.
Package              Library
libisc-export160: /lib/x86_64-linux-gnu/libisc-export.so.160
libisc-export160: /lib/x86_64-linux-gnu/libisc-export.so.160.0.0
libisc-export95: /lib/x86_64-linux-gnu/libisc-export.so.95
libisc-export95: /lib/x86_64-linux-gnu/libisc-export.so.95.5.0
Some link against the new, some against the old library - all fine.
Usually most programs can just be rebuilt against the new library and after
some time the old one can be dropped. That mechanism gives downstream
distributions a way to handle transitions and consumers of libraries which
might not all be ready for the same version every time.
And since we have per-library versioning with LIBABIVER and the version
maps we are good - in fact we qualify for all common cases on [4].

Now in DPDK, of those libraries that got an ABI bump, eal and ethdev are
among those which most of us consider "core libraries", and most other libs
and pmds link to them.
And here DPDK continues to be special: due to that inter-dependency, with
old and new libraries installed on the same system the following happens
with openvswitch built for an older version of DPDK:
ovs-vswitchd-dpdk
    librte_eal.so.2 => /usr/lib/x86_64-linux-gnu/librte_eal.so.2
    librte_pdump.so.1 => /usr/lib/x86_64-linux-gnu/librte_pdump.so.1
        librte_eal.so.3 => /usr/lib/x86_64-linux-gnu/librte_eal.so.3

You can see that openvswitch itself depends on the "old" librte_eal.so.2.
But because librte_pdump.so.1 did not get an ABI bump, it got upgraded to
the newer version from DPDK 16.11.
But since the "new" pdump got built with the new DPDK 16.11 it depends on
the "new" librte_eal.so.3.
And having both in the same executable space at the same time causes
segfaults and pain.

As I said, for now I have worked around the issue with a crutch that I'm
not proud of and would like to avoid in the future. For that reason I'm
reaching out to you with several suggestions to discuss.


*## Thoughts ##*
None of these seems like a perfect solution to me yet, but they are
clearly good starting points for discussion.

Options that were in discussion so far and that we might adopt next cycle
(some of these are upstream changes, some downstream, some require both to
change - but any of them should have an ack upstream so that we are
agreeing how to proceed with those cases).

1. Downstreams to insert Major version into soname
Distributions could insert the DPDK major version (like 16.11) into the
soname and package names. A common example of this is libboost [5].
That would perfectly allow 16.07.<LIBABIVER> to coexist with
16.11.<LIBABIVER> even if for a given library LIBABIVER did not change.
Yet it would mean that anything depending on the old library will have to
be recompiled to pick up the new code, even if it depends on an ABI that is
still present in the new release.
Also - not a technical reason - it is clearly more work to force-update
all dependencies and clean out old packages for every release.


2. ABI Ranges
One could argue that, due to the detailed tracking of functions, DPDK is
already close to tracking not ABI levels but actually ABI ranges. DPDK
could track LIBABIVERMIN and LIBABIVER.
Every time functionality is added LIBABIVER would get bumped, but
LIBABIVERMIN only gets moved to the OLDEST still supported ABI when things
are dropped.
So on a given library librte_foo you could have LIBABIVER=5 and
LIBABIVERMIN=3. The make install would then install the shared lib as:
librte_foo.so.5
and additionally links for all compatible versions:
librte_foo.so.3 -> librte_foo.so.5
librte_foo.so.4 -> librte_foo.so.5
Yet, while it has some nice attributes, this might make DPDK even more
special and cause ABI level proliferation over time.
Also even with this in place, changes moving LIBABIVERMIN "too fast" (too
fast is different for each downstream) could still cause an issue like the
one I initially described.


3. A lot of conflicts
In packaging one can declare a package to conflict with another package [6].
Now we could declare e.g. librte_eal3 to conflict with librte_eal2 (and the
same for all other bumps).
That would make them not co-installable, and working on a new release
would mean that all former consumers would become uninstallable as well
and have to be rebuilt before they all could migrate [7] together.
That "works" in some sense, but it defeats the whole purpose of versioned
library packages (to be co-installable, to allow different library
consumers to depend on different versions).


4. ABI bump is infecting
Another way might be to also bump any dependent DPDK library.
So when core libs like eal are ABI bumped likely all libs would get a bump.
If only e.g. mempool gets a bump only those other parts using it would be
bumped as well.
To some extent this might still proliferate ABI versions more than one
would like.
Also it surely is hard to track if not automated - think of dependencies
that exist only in certain config cases.

5. back to single ABI
For the sake of giving everybody a chance to re-open old wounds I wanted to
mention that DPDK could also decide to go back to a single ABI again.
This could (but doesn't have to!) be combined with having a single .so file
again.
Deciding on this might be a much cleaner and easier-to-track alternative to #4.

6. More
I'm sure there are more approaches to this; feel free to come up with more.

I'm sure my five suggestions alone will make the thread messy. Maybe we
should do this in two rounds: first sorting out the insane ones and
identifying the preferred ones, then in a second round focusing on
discussing and maybe implementing the details of what we like.


[1]: http://dpdk.org/browse/dpdk/tree/doc/guides/contributing/versioning.rst
[2]: http://dpdk.org/browse/dpdk/commit/?id=d7e61ad3ae36
[3]: http://dpdk.org/browse/dpdk/commit/?id=6ba1affa54108
[4]: https://wiki.debian.org/TransitionBestPractices
[5]: https://packages.debian.org/sid/libboost1.62-dev
[6]:
https://www.debian.org/doc/debian-policy/ch-relationships.html#s-conflicts
[7]: https://wiki.ubuntu.com/ProposedMigration

P.S. I beg your pardon for the wall of text

-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for cryptodev ops structure
  2017-02-14 10:41  9% ` [dpdk-dev] [PATCH v2] " Fan Zhang
@ 2017-02-14 10:48  4%   ` Doherty, Declan
  2017-02-14 11:03  4%     ` De Lara Guarch, Pablo
  2017-02-14 20:37  4%   ` Thomas Monjalon
  1 sibling, 1 reply; 200+ results
From: Doherty, Declan @ 2017-02-14 10:48 UTC (permalink / raw)
  To: Fan Zhang, dev

On 14/02/2017 10:41 AM, Fan Zhang wrote:
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> ---
...
>

Acked-by: Declan Doherty <declan.doherty@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] crypto drivers in the API
  @ 2017-02-14 10:44  4% ` Doherty, Declan
  2017-02-14 11:04  0%   ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Doherty, Declan @ 2017-02-14 10:44 UTC (permalink / raw)
  To: Thomas Monjalon, Declan Doherty; +Cc: dev

On 13/02/2017 1:25 PM, Thomas Monjalon wrote:
> In the crypto API, the drivers are listed.
> In my opinion, it is a wrong design and these lists should be removed.
> Do we need a deprecation notice to plan this removal in 17.05, while
> working on bus abstraction?
>
...
>

Hey Thomas,
I agree that these need to be removed, and I had planned on doing this 
for 17.05 but I have a concern on the requirements for ABI breakage in 
relation to this. This enum is unfortunately used in both the 
rte_cryptodev and rte_crypto_sym_session structures which are part of 
the library's public API. I don't think it would be feasible to maintain
a set of 17.02-compatible APIs with the changes this would introduce, as
it would require a large number of functions to have 2 versions. Is it
OK to break the ABI for this case?

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2] doc: announce ABI change for cryptodev ops structure
  2017-02-10 11:39  9% [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops structure Fan Zhang
  2017-02-10 13:59  4% ` Trahe, Fiona
@ 2017-02-14 10:41  9% ` Fan Zhang
  2017-02-14 10:48  4%   ` Doherty, Declan
  2017-02-14 20:37  4%   ` Thomas Monjalon
  1 sibling, 2 replies; 200+ results
From: Fan Zhang @ 2017-02-14 10:41 UTC (permalink / raw)
  To: dev; +Cc: declan.doherty

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
v2:
Rework the grammar

 doc/guides/rel_notes/deprecation.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index b49e0a0..d64858f 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -62,3 +62,7 @@ Deprecation Notices
   PMDs that implement the latter.
   Target release for removal of the legacy API will be defined once most
   PMDs have switched to rte_flow.
+
+* ABI changes are planned for 17.05 in the ``rte_cryptodev_ops`` structure.
+  A pointer to an ``rte_cryptodev_config`` structure will be added to the
+  function prototype ``cryptodev_configure_t``, as a new parameter.
-- 
2.7.4

^ permalink raw reply	[relevance 9%]

* Re: [dpdk-dev] [PATCH v2] doc: announce API and ABI change for ethdev
  2017-02-14  3:17  4%     ` Jerin Jacob
@ 2017-02-14 10:33  4%       ` Iremonger, Bernard
  0 siblings, 0 replies; 200+ results
From: Iremonger, Bernard @ 2017-02-14 10:33 UTC (permalink / raw)
  To: Jerin Jacob, Thomas Monjalon; +Cc: dev, Mcnamara, John



> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Tuesday, February 14, 2017 3:17 AM
> To: Thomas Monjalon <thomas.monjalon@6wind.com>
> Cc: Iremonger, Bernard <bernard.iremonger@intel.com>; dev@dpdk.org;
> Mcnamara, John <john.mcnamara@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2] doc: announce API and ABI change for
> ethdev
> 
> On Mon, Feb 13, 2017 at 06:57:20PM +0100, Thomas Monjalon wrote:
> > 2017-01-05 15:25, Bernard Iremonger:
> > > In 17.05 nine rte_eth_dev_* functions will be removed from
> > > librte_ether, renamed and moved to the ixgbe PMD.
> > >
> > > Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
> >
> > "ixgbe bypass" should be in the title and the description.
> > I'll reword to:
> >
> > doc: announce move of ethdev bypass function to ixgbe API
> >
> > In 17.05, nine rte_eth_dev_* functions for bypass control, and
> > implemented only in ixgbe, will be removed from ethdev, renamed and
> > moved to the ixgbe PMD-specific API.
> >
> > Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> 
> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH RFCv3 00/19] ring cleanup and generalization
  2017-02-14  8:32  3%   ` Olivier Matz
@ 2017-02-14  9:39  0%     ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-02-14  9:39 UTC (permalink / raw)
  To: Olivier Matz
  Cc: thomas.monjalon, keith.wiles, konstantin.ananyev, stephen, dev

On Tue, Feb 14, 2017 at 09:32:20AM +0100, Olivier Matz wrote:
> Hi Bruce,
> 
> On Tue,  7 Feb 2017 14:12:38 +0000, Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> > This patchset makes a set of, sometimes non-backward compatible,
> > cleanup changes to the rte_ring code in order to improve it. The
> > resulting code is shorter*, since the existing functions are
> > restructured to reduce code duplication, as well as being more
> > consistent in behaviour. The specific changes made are explained in
> > each patch which makes that change.
> > 
> > Key incompatibilities:
> > * The biggest, and probably most controversial change is that to the
> >   enqueue and dequeue APIs. The enqueue/deq burst and bulk functions
> > have their function prototypes changed so that they all return an
> > additional parameter, indicating the size of the next call which is
> > guaranteed to succeed. In the case of enq, this is the number of
> > available slots on the ring, and in case of deq, it is the number of
> > objects which can be pulled. As well as this, the return values from
> > the bulk functions have been changed to make them compatible with the
> > burst functions. In all cases, the functions to enq/deq a set of objs
> > now return the number of objects processed, 0 or N, in the case of
> > bulk functions, 0, N or any value in between in the case of the burst
> > ones. [Due to the extra parameter, the compiler will flag all
> > instances of the function to allow the user to also change the return
> > value logic at the same time]
> > * The parameters to the single object enq/deq functions have not been 
> >   changed. Because of that, the return value is also unmodified - as
> > the compiler cannot automatically flag this to the user.
> > 
> > Potential further cleanups:
> > * To a certain extent the rte_ring structure has gone from being a
> > whole ring structure, including a "ring" element itself, to just
> > being a header which can be reused, along with the head/tail update
> > functions to create new rings. For now, the enqueue code works by
> > assuming that the ring data goes immediately after the header, but
> > that can be changed to allow specialised ring implementations to put
> > additional metadata of their own after the ring header. I didn't see
> > this as being needed right now, but it may be worth considering for a
> > V1 patchset.
> > * There are 9 enqueue functions and 9 dequeue functions in
> > rte_ring.h. I suspect not all of those are used, so personally I
> > would consider dropping the functions to enqueue/dequeue a single
> > value using single or multi semantics, i.e. drop 
> >     rte_ring_sp_enqueue
> >     rte_ring_mp_enqueue
> >     rte_ring_sc_dequeue
> >     rte_ring_mc_dequeue
> >   That would still leave a single enqueue and dequeue function for
> > working with a single object at a time.
> > * It should be possible to merge the head update code for enqueue and
> >   dequeue into a single function. The key difference between the two
> > is the calculation of how far the index can be moved. I felt that the
> >   functions for moving the head index are sufficiently complicated
> > with many parameters to them already, that trying to merge in more
> > code would impede readability. However, if so desired this change can
> > be made at a later stage without affecting ABI or API.
> > 
> > PERFORMANCE:
> > I've run performance autotests on a couple of (Intel) platforms.
> > Looking particularly at the core-2-core results, which I expect are
> > the main ones of interest, the performance after this patchset is a
> > few cycles per packet faster in my testing. I'm hoping it should be
> > at least neutral perf-wise.
> > 
> > REQUEST FOR FEEDBACK:
> > * Are all of these changes worth making?
> 
> I've quickly browsed all the patches. I think yes, we should do it: it
> brings a good cleanup, removing features we don't need, restructuring
> the code, and also adding the feature you need :)
> 
> 
> > * Should they be made in existing ring code, or do we look to provide
> > a new fifo library to completely replace the ring one?
> 
> I think it's ok to have it in the existing code. Breaking the ABI
> is never suitable, but I think having 2 libs would be even more
> confusing.
> 
> 
> > * How does the implementation of new ring types using this code
> > compare vs that of the previous RFCs?
> 
> I prefer this version, especially compared to the first RFC.
> 
> 
> Thanks for this big rework. I'll dive into the patches and do a more
> exhaustive review soon.
> 
Great, thanks. I'm aware of a few things that already need to be cleaned
up for V1, e.g. comments are not always correctly updated on functions.

/Bruce

^ permalink raw reply	[relevance 0%]
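
To make the prototype change discussed in the cover letter concrete, here is a
sketch of a caller after the rework; the 4-argument signatures are assumed
from the description above and could still change before V1:

#include <rte_ring.h>

/* Burst enqueue with the proposed extra out-parameter: the return value is
 * the number of objects actually enqueued, and free_space reports the room
 * left in the ring, which is what makes the built-in watermarks unneeded. */
static unsigned int
forward_burst(struct rte_ring *r, void **objs, unsigned int n)
{
	unsigned int free_space;
	unsigned int sent;

	sent = rte_ring_enqueue_burst(r, objs, n, &free_space);
	if (free_space < 32) {
		/* caller-side "watermark": e.g. signal backpressure here */
	}
	return sent;
}

Under the same scheme the bulk variants would return 0 or n through an
identical prototype, which is the burst/bulk consistency mentioned above.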

* Re: [dpdk-dev] [PATCH] doc: add ABI change notification for ring library
  2017-02-13 17:38  9% [dpdk-dev] [PATCH] doc: add ABI change notification for ring library Bruce Richardson
  2017-02-14  0:32  4% ` Mcnamara, John
  2017-02-14  3:25  4% ` Jerin Jacob
@ 2017-02-14  8:33  4% ` Olivier Matz
  2017-02-14 11:43  4%   ` Hemant Agrawal
  2017-02-14 18:42  4% ` [dpdk-dev] " Thomas Monjalon
  3 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2017-02-14  8:33 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Mon, 13 Feb 2017 17:38:30 +0000, Bruce Richardson
<bruce.richardson@intel.com> wrote:
> Document proposed changes for the rings code in the next release.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH RFCv3 00/19] ring cleanup and generalization
  2017-02-07 14:12  2% ` [dpdk-dev] [PATCH RFCv3 00/19] ring cleanup and generalization Bruce Richardson
@ 2017-02-14  8:32  3%   ` Olivier Matz
  2017-02-14  9:39  0%     ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2017-02-14  8:32 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: thomas.monjalon, keith.wiles, konstantin.ananyev, stephen, dev

Hi Bruce,

On Tue,  7 Feb 2017 14:12:38 +0000, Bruce Richardson
<bruce.richardson@intel.com> wrote:
> This patchset makes a set of, sometimes non-backward compatible,
> cleanup changes to the rte_ring code in order to improve it. The
> resulting code is shorter*, since the existing functions are
> restructured to reduce code duplication, as well as being more
> consistent in behaviour. The specific changes made are explained in
> each patch which makes that change.
> 
> Key incompatibilities:
> * The biggest, and probably most controversial change is that to the
>   enqueue and dequeue APIs. The enqueue/deq burst and bulk functions
> have their function prototypes changed so that they all return an
> additional parameter, indicating the size of the next call which is
> guaranteed to succeed. In the case of enq, this is the number of
> available slots on the ring, and in case of deq, it is the number of
> objects which can be pulled. As well as this, the return values from
> the bulk functions have been changed to make them compatible with the
> burst functions. In all cases, the functions to enq/deq a set of objs
> now return the number of objects processed, 0 or N, in the case of
> bulk functions, 0, N or any value in between in the case of the burst
> ones. [Due to the extra parameter, the compiler will flag all
> instances of the function to allow the user to also change the return
> value logic at the same time]
> * The parameters to the single object enq/deq functions have not been 
>   changed. Because of that, the return value is also unmodified - as
> the compiler cannot automatically flag this to the user.
> 
> Potential further cleanups:
> * To a certain extent the rte_ring structure has gone from being a
> whole ring structure, including a "ring" element itself, to just
> being a header which can be reused, along with the head/tail update
> functions to create new rings. For now, the enqueue code works by
> assuming that the ring data goes immediately after the header, but
> that can be changed to allow specialised ring implementations to put
> additional metadata of their own after the ring header. I didn't see
> this as being needed right now, but it may be worth considering for a
> V1 patchset.
> * There are 9 enqueue functions and 9 dequeue functions in
> rte_ring.h. I suspect not all of those are used, so personally I
> would consider dropping the functions to enqueue/dequeue a single
> value using single or multi semantics, i.e. drop 
>     rte_ring_sp_enqueue
>     rte_ring_mp_enqueue
>     rte_ring_sc_dequeue
>     rte_ring_mc_dequeue
>   That would still leave a single enqueue and dequeue function for
> working with a single object at a time.
> * It should be possible to merge the head update code for enqueue and
>   dequeue into a single function. The key difference between the two
> is the calculation of how far the index can be moved. I felt that the
>   functions for moving the head index are sufficiently complicated
> with many parameters to them already, that trying to merge in more
> code would impede readability. However, if so desired this change can
> be made at a later stage without affecting ABI or API.
> 
> PERFORMANCE:
> I've run performance autotests on a couple of (Intel) platforms.
> Looking particularly at the core-2-core results, which I expect are
> the main ones of interest, the performance after this patchset is a
> few cycles per packet faster in my testing. I'm hoping it should be
> at least neutral perf-wise.
> 
> REQUEST FOR FEEDBACK:
> * Are all of these changes worth making?

I've quickly browsed all the patches. I think yes, we should do it: it
brings a good cleanup, removing features we don't need, restructuring
the code, and also adding the feature you need :)


> * Should they be made in existing ring code, or do we look to provide
> a new fifo library to completely replace the ring one?

I think it's ok to have it in the existing code. Breaking the ABI
is never suitable, but I think having 2 libs would be even more
confusing.


> * How does the implementation of new ring types using this code
> compare vs that of the previous RFCs?

I prefer this version, especially compared to the first RFC.


Thanks for this big rework. I'll dive into the patches and do a more
exhaustive review soon.

Regards,
Olivier

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] doc: add deprecation note for rework of PCI in EAL
  2017-02-13 21:56  0%   ` Jan Blunck
@ 2017-02-14  5:18  0%     ` Shreyansh Jain
  0 siblings, 0 replies; 200+ results
From: Shreyansh Jain @ 2017-02-14  5:18 UTC (permalink / raw)
  To: Jan Blunck; +Cc: dev, nhorman, Thomas Monjalon

On Tuesday 14 February 2017 03:26 AM, Jan Blunck wrote:
> On Mon, Feb 13, 2017 at 1:00 PM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
>> On Monday 13 February 2017 05:25 PM, Shreyansh Jain wrote:
>>>
>>> EAL PCI layer is planned to be restructured in 17.05 to unlink it from
>>> generic structures like eth_driver, rte_cryptodev_driver, and also move
>>> it into a PCI Bus.
>>>
>>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>>> ---
>>>  doc/guides/rel_notes/deprecation.rst | 12 ++++++++----
>>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/doc/guides/rel_notes/deprecation.rst
>>> b/doc/guides/rel_notes/deprecation.rst
>>> index fbe2fcb..b12d435 100644
>>> --- a/doc/guides/rel_notes/deprecation.rst
>>> +++ b/doc/guides/rel_notes/deprecation.rst
>>> @@ -13,10 +13,14 @@ Deprecation Notices
>>>    has exposed, like the way we have done with uio-pci-generic. This
>>> change
>>>    targets release 17.05.
>>>
>>> -* ``eth_driver`` is planned to be removed in 17.02. This currently serves
>>> as
>>> -  a placeholder for PMDs to register themselves. Changes for ``rte_bus``
>>> will
>>> -  provide a way to handle device initialization currently being done in
>>> -  ``eth_driver``.
>>
>>
>> Just to highlight, above statement was added by me in 16.11.
>> As of now I plan to work on removing rte_pci_driver from eth_driver,
>> rather than removing eth_driver altogether (which, probably, was the
>> better idea).
>> If someone still wishes to work on its complete removal, we can keep
>> the above. (and probably remove the below).
>>
>
> There is no benefit in keeping eth_driver and removing rte_pci_driver
> from it. Technically it isn't even needed today.

I agree with you.
I stopped working on it because I realized that removing it means making
pci_probe call the eth_dev_init handlers directly, or restructuring the
whole pci probe stack - which, because of the pending PCI bus
implementation, was slightly tentative.

Changes are already expected in the EAL PCI code for the bus movement;
probably this task can be combined with that.

>
>>
>>> +* ABI/API changes are planned for 17.05 for PCI subsystem. This is to
>>> +  unlink EAL dependency on PCI and to move PCI devices to a PCI specific
>>> +  bus.
>>> +
>>> +* ``rte_pci_driver`` is planned to be removed from ``eth_driver`` in
>>> 17.05.
>>> +  This is to unlink the ethernet driver from PCI dependencies.
>>> +  Similarly, ``rte_pci_driver`` is planned to be removed from
>>> +  ``rte_cryptodev_driver`` in 17.05.
>>>
>>>  * In 17.02 ABI changes are planned: the ``rte_eth_dev`` structure will be
>>>    extended with new function pointer ``tx_pkt_prepare`` allowing
>>> verification
>>>
>>
>

^ permalink raw reply	[relevance 0%]
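
For readers without the source at hand, the coupling being discussed is the
first member of the 16.11/17.02 eth_driver. The sketch below is reproduced
from memory with the typedefs expanded, so details may differ slightly from
rte_ethdev.h:

#include <rte_pci.h>

struct rte_eth_dev;

/* sketch of the 16.11-era layout, not a literal copy */
struct eth_driver {
	struct rte_pci_driver pci_drv;               /* embedded PCI driver: the link to cut */
	int (*eth_dev_init)(struct rte_eth_dev *);   /* per-device init hook */
	int (*eth_dev_uninit)(struct rte_eth_dev *); /* per-device teardown hook */
	unsigned int dev_private_size;               /* size of driver-private data */
};

Because pci_drv is embedded as the first member, every ethdev PMD is a PCI
driver by construction; moving scan/probe onto a PCI bus means the
eth_dev_init-style hooks must become reachable without going through
rte_pci_driver, which is the restructuring discussed here.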

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops structure
  2017-02-14  0:21  4%   ` Hemant Agrawal
@ 2017-02-14  5:11  4%     ` Hemant Agrawal
  0 siblings, 0 replies; 200+ results
From: Hemant Agrawal @ 2017-02-14  5:11 UTC (permalink / raw)
  To: Hemant Agrawal, Trahe, Fiona, Zhang, Roy Fan, dev; +Cc: De Lara Guarch, Pablo


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Hemant Agrawal
> Sent: Monday, February 13, 2017 6:21 PM
> To: Trahe, Fiona <fiona.trahe@intel.com>; Zhang, Roy Fan
> <roy.fan.zhang@intel.com>; dev@dpdk.org
> Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops
> structure
> 
> On 2/10/2017 7:59 AM, Trahe, Fiona wrote:
> > Hi Fan,
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Fan Zhang
> >> Sent: Friday, February 10, 2017 11:39 AM
> >> To: dev@dpdk.org
> >> Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> >> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops
> >> structure
> >>
> >> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> >> ---
> >>  doc/guides/rel_notes/deprecation.rst | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/doc/guides/rel_notes/deprecation.rst
> >> b/doc/guides/rel_notes/deprecation.rst
> >> index 755dc65..564d93a 100644
> >> --- a/doc/guides/rel_notes/deprecation.rst
> >> +++ b/doc/guides/rel_notes/deprecation.rst
> >> @@ -62,3 +62,7 @@ Deprecation Notices
> >>    PMDs that implement the latter.
> >>    Target release for removal of the legacy API will be defined once most
> >>    PMDs have switched to rte_flow.
> >> +
> >> +* ABI changes are planned for 17.05 in the ``rte_cryptodev_ops``
> structure.
> >> +  The field ``cryptodev_configure_t`` function prototype will be
> >> +added a
> >> +  parameter of a struct rte_cryptodev_config type pointer.
> >> --
> >> 2.7.4
> >
> > Can you fix the grammar here please. I'm not sure what the change is?
> >
> I also find it hard to understand it first. Not perfect, but I tried to reword it.
> 
> A new parameter ``struct rte_cryptodev_config *config`` will be added to the
> ``cryptodev_configure_t`` function pointer field.
> 

In any case,
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: add ABI change notification for ring library
  2017-02-13 17:38  9% [dpdk-dev] [PATCH] doc: add ABI change notification for ring library Bruce Richardson
  2017-02-14  0:32  4% ` Mcnamara, John
@ 2017-02-14  3:25  4% ` Jerin Jacob
  2017-02-14  8:33  4% ` Olivier Matz
  2017-02-14 18:42  4% ` [dpdk-dev] " Thomas Monjalon
  3 siblings, 0 replies; 200+ results
From: Jerin Jacob @ 2017-02-14  3:25 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Mon, Feb 13, 2017 at 05:38:30PM +0000, Bruce Richardson wrote:
> Document proposed changes for the rings code in the next release.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  doc/guides/rel_notes/deprecation.rst | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index b49e0a0..e715fc7 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -8,6 +8,25 @@ API and ABI deprecation notices are to be posted here.
>  Deprecation Notices
>  -------------------
>  
> +* ring: Changes are planned to rte_ring APIs in release 17.05. Proposed
> +  changes include:
> +    - Removing build time options for the ring:
> +      CONFIG_RTE_RING_SPLIT_PROD_CONS
> +      CONFIG_RTE_RING_PAUSE_REP_COUNT
> +    - Adding an additional parameter to enqueue functions to return the
> +      amount of free space in the ring
> +    - Adding an additional parameter to dequeue functions to return the
> +      number of remaining elements in the ring
> +    - Removing direct support for watermarks in the rings, since the
> +      additional return value from the enqueue function makes it
> +      unneeded
> +    - Adjusting the return values of the bulk() enq/deq functions to
> +      make them consistent with the burst() equivalents. [Note, parameter
> +      to these functions are changing too, per points above, so compiler
> +      will flag them as needing update in legacy code]
> +    - Updates to some library functions e.g. rte_ring_get_memsize() to
> +      allow for variably-sized ring elements.
> +
>  * igb_uio: iomem mapping and sysfs files created for iomem and ioport in
>    igb_uio will be removed, because we are able to detect these from what Linux
>    has exposed, like the way we have done with uio-pci-generic. This change

Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for vhost
  2017-02-13 18:02  4% ` Thomas Monjalon
@ 2017-02-14  3:21  4%   ` Jerin Jacob
  0 siblings, 0 replies; 200+ results
From: Jerin Jacob @ 2017-02-14  3:21 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Yuanhan Liu, dev, Maxime Coquelin, John McNamara, Ben Walker

On Mon, Feb 13, 2017 at 07:02:56PM +0100, Thomas Monjalon wrote:
> 2017-01-23 21:04, Yuanhan Liu:
> > I made a vhost ABI/API refactoring at v16.04, meant to avoid such issues
> > forever. Well, apparently, I lied.
> > 
> > People are looking for more vhost-user options nowadays, other than
> > vhost-user net only. For example, SPDK (Storage Performance Development
> > Kit) is looking at the chance of vhost-user SCSI and vhost-user block.
> > 
> > Apparently, they also need a vhost-user backend; since DPDK already
> > has a (mature enough) backend, they don't want to implement it again
> > from scratch. They want to leverage the one DPDK provides.
> > 
> > However, the last refactoring hasn't done that right, at least it's
> > not friendly for extending vhost-user to add support for more devices.
> > For example, different virtio devices have their own feature sets, while
> > APIs like rte_vhost_feature_disable(feature_mask) have no option to
> > tell the device type. Thus, a more proper API should look like:
> > 
> >     rte_vhost_feature_disable(device_type, feature_mask);
> > 
> > Besides that, a few public files and structures should be renamed, to
> > not let it bind to virtio-net. Specifically, they are:
> > 
> > - virtio_net_device_ops --> vhost_device_ops
> > - rte_virtio_net.h      --> rte_vhost.h
> > 
> > Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
> 
> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>

Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

^ permalink raw reply	[relevance 4%]
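
A sketch of the API shape the refactoring implies; the 16.11 prototype is
quoted from memory, and the device-type enum with its values is an assumption
for illustration, not the final 17.05 naming:

#include <stdint.h>

/* 16.11-era shape (rte_virtio_net.h, from memory): one global mask,
 * implicitly virtio-net only:
 *
 *   int rte_vhost_feature_disable(uint64_t feature_mask);
 */

/* Proposed shape: the device type becomes an explicit argument. */
enum rte_vhost_dev_type {
	RTE_VHOST_DEV_NET,	/* assumed name */
	RTE_VHOST_DEV_SCSI,	/* assumed name */
	RTE_VHOST_DEV_BLK,	/* assumed name */
};

int rte_vhost_feature_disable(enum rte_vhost_dev_type type,
		uint64_t feature_mask);

The rename of rte_virtio_net.h to rte_vhost.h follows the same logic: the
backend stops being virtio-net specific, so the public header and the ops
structure drop "net" from their names.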

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for cloud filter
  @ 2017-02-14  3:19  4%       ` Jerin Jacob
  0 siblings, 0 replies; 200+ results
From: Jerin Jacob @ 2017-02-14  3:19 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Lu, Wenzhuo, Adrien Mazarguil, Liu, Yong, dev

On Fri, Jan 20, 2017 at 03:57:28PM +0100, Thomas Monjalon wrote:
> 2017-01-20 02:14, Lu, Wenzhuo:
> > Hi Adrien, Thomas, Yong,
> > 
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> > > Sent: Friday, January 20, 2017 2:46 AM
> > > To: Thomas Monjalon
> > > Cc: Liu, Yong; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change for cloud filter
> > > 
> > > On Thu, Jan 19, 2017 at 10:06:34AM +0100, Thomas Monjalon wrote:
> > > > 2017-01-19 13:34, Yong Liu:
> > > > > +* ABI changes are planned for 17.05: structure
> > > > > +``rte_eth_tunnel_filter_conf``
> > > > > +  will be extended with a new member ``vf_id`` in order to enable
> > > > > +cloud filter
> > > > > +  on VF device.
> > > >
> > > > I think we should stop rely on this API, and migrate to rte_flow instead.
> > > > Adrien any thought?
> > > 
> > > I'm all for using rte_flow in any case. I've already documented an approach to
> > > convert TUNNEL filter rules to rte_flow rules [1], although it may be
> > > incomplete due to my limited experience with this filter type. We already
> > > know several tunnel item types must be added (currently only VXLAN is
> > > defined).
> > > 
> > > I understand ixgbe/i40e currently map rte_flow on top of the legacy
> > > framework, therefore extending this structure might still be needed in the
> > > meantime. Not sure we should prevent this change as long as such rules can be
> > > configured through rte_flow as well.
> > > 
> > > [1] http://dpdk.org/doc/guides/prog_guide/rte_flow.html#tunnel-to-eth-ipv4-
> > > ipv6-vxlan-or-other-queue
> > The problem is we haven't finished transferring all the functions from the regular filters to the generic filters. 
> > For example, igb, fm10k and enic do not support generic filters yet. Ixgbe and i40e have supported the basic functions, but some advanced features are not transferred to generic filters yet.
> > It seems it's not the time to remove the regular filters. Yong, I suggest supporting both the generic filter and the regular filter in parallel.
> > So, we need to announce an ABI change for the regular filter, until someday we remove the regular filter API.
> 
> I disagree.
> There is a new API framework (rte_flow) and we must focus on this transition.
> It means we must stop any work on the legacy API.

I agree with Thomas here.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: announce API and ABI change for ethdev
  2017-02-13 17:57  4%   ` Thomas Monjalon
@ 2017-02-14  3:17  4%     ` Jerin Jacob
  2017-02-14 10:33  4%       ` Iremonger, Bernard
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2017-02-14  3:17 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Bernard Iremonger, dev, john.mcnamara

On Mon, Feb 13, 2017 at 06:57:20PM +0100, Thomas Monjalon wrote:
> 2017-01-05 15:25, Bernard Iremonger:
> > In 17.05 nine rte_eth_dev_* functions will be removed from
> > librte_ether, renamed and moved to the ixgbe PMD.
> > 
> > Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
> 
> "ixgbe bypass" should be in the title and the description.
> I'll reword to:
> 
> doc: announce move of ethdev bypass function to ixgbe API
> 
> In 17.05, nine rte_eth_dev_* functions for bypass control,
> and implemented only in ixgbe, will be removed from ethdev,
> renamed and moved to the ixgbe PMD-specific API.
> 
> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>

Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: add ABI change notification for ring library
  2017-02-13 17:38  9% [dpdk-dev] [PATCH] doc: add ABI change notification for ring library Bruce Richardson
@ 2017-02-14  0:32  4% ` Mcnamara, John
  2017-02-14  3:25  4% ` Jerin Jacob
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Mcnamara, John @ 2017-02-14  0:32 UTC (permalink / raw)
  To: Richardson, Bruce, dev; +Cc: Richardson, Bruce



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Monday, February 13, 2017 5:39 PM
> To: dev@dpdk.org
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: [dpdk-dev] [PATCH] doc: add ABI change notification for ring
> library
> 
> Document proposed changes for the rings code in the next release.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Acked-by: John McNamara <john.mcnamara@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops structure
  2017-02-10 13:59  4% ` Trahe, Fiona
  2017-02-13 16:07  7%   ` Zhang, Roy Fan
@ 2017-02-14  0:21  4%   ` Hemant Agrawal
  2017-02-14  5:11  4%     ` Hemant Agrawal
  1 sibling, 1 reply; 200+ results
From: Hemant Agrawal @ 2017-02-14  0:21 UTC (permalink / raw)
  To: Trahe, Fiona, Zhang, Roy Fan, dev; +Cc: De Lara Guarch, Pablo

On 2/10/2017 7:59 AM, Trahe, Fiona wrote:
> Hi Fan,
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Fan Zhang
>> Sent: Friday, February 10, 2017 11:39 AM
>> To: dev@dpdk.org
>> Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
>> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops
>> structure
>>
>> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
>> ---
>>  doc/guides/rel_notes/deprecation.rst | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/doc/guides/rel_notes/deprecation.rst
>> b/doc/guides/rel_notes/deprecation.rst
>> index 755dc65..564d93a 100644
>> --- a/doc/guides/rel_notes/deprecation.rst
>> +++ b/doc/guides/rel_notes/deprecation.rst
>> @@ -62,3 +62,7 @@ Deprecation Notices
>>    PMDs that implement the latter.
>>    Target release for removal of the legacy API will be defined once most
>>    PMDs have switched to rte_flow.
>> +
>> +* ABI changes are planned for 17.05 in the ``rte_cryptodev_ops`` structure.
>> +  The field ``cryptodev_configure_t`` function prototype will be added a
>> +  parameter of a struct rte_cryptodev_config type pointer.
>> --
>> 2.7.4
>
> Can you fix the grammar here please. I'm not sure what the change is?
>
I also found it hard to understand at first. Not perfect, but I tried to
reword it.

A new parameter ``struct rte_cryptodev_config *config`` will be added to 
the ``cryptodev_configure_t`` function pointer field.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: add deprecation note for rework of PCI in EAL
  2017-02-13 12:00  0% ` Shreyansh Jain
  2017-02-13 14:44  0%   ` Thomas Monjalon
@ 2017-02-13 21:56  0%   ` Jan Blunck
  2017-02-14  5:18  0%     ` Shreyansh Jain
  1 sibling, 1 reply; 200+ results
From: Jan Blunck @ 2017-02-13 21:56 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: dev, nhorman, Thomas Monjalon

On Mon, Feb 13, 2017 at 1:00 PM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
> On Monday 13 February 2017 05:25 PM, Shreyansh Jain wrote:
>>
>> EAL PCI layer is planned to be restructured in 17.05 to unlink it from
>> generic structures like eth_driver, rte_cryptodev_driver, and also move
>> it into a PCI Bus.
>>
>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>> ---
>>  doc/guides/rel_notes/deprecation.rst | 12 ++++++++----
>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/doc/guides/rel_notes/deprecation.rst
>> b/doc/guides/rel_notes/deprecation.rst
>> index fbe2fcb..b12d435 100644
>> --- a/doc/guides/rel_notes/deprecation.rst
>> +++ b/doc/guides/rel_notes/deprecation.rst
>> @@ -13,10 +13,14 @@ Deprecation Notices
>>    has exposed, like the way we have done with uio-pci-generic. This
>> change
>>    targets release 17.05.
>>
>> -* ``eth_driver`` is planned to be removed in 17.02. This currently serves
>> as
>> -  a placeholder for PMDs to register themselves. Changes for ``rte_bus``
>> will
>> -  provide a way to handle device initialization currently being done in
>> -  ``eth_driver``.
>
>
> Just to highlight, above statement was added by me in 16.11.
> As of now I plan to work on removing rte_pci_driver from eth_driver,
> rather than removing eth_driver altogether (which, probably, was the
> better idea).
> If someone still wishes to work on its complete removal, we can keep
> the above. (and probably remove the below).
>

There is no benefit in keeping eth_driver and removing rte_pci_driver
from it. Technically it isn't even needed today.

>
>> +* ABI/API changes are planned for 17.05 for PCI subsystem. This is to
>> +  unlink EAL dependency on PCI and to move PCI devices to a PCI specific
>> +  bus.
>> +
>> +* ``rte_pci_driver`` is planned to be removed from ``eth_driver`` in
>> 17.05.
>> +  This is to unlink the ethernet driver from PCI dependencies.
>> +  Similarly, ``rte_pci_driver`` is planned to be removed from
>> +  ``rte_cryptodev_driver`` in 17.05.
>>
>>  * In 17.02 ABI changes are planned: the ``rte_eth_dev`` structure will be
>>    extended with new function pointer ``tx_pkt_prepare`` allowing
>> verification
>>
>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for vhost
  @ 2017-02-13 18:02  4% ` Thomas Monjalon
  2017-02-14  3:21  4%   ` Jerin Jacob
  2017-02-14 13:54  4% ` Maxime Coquelin
  2017-02-14 20:28  4% ` Thomas Monjalon
  2 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2017-02-13 18:02 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: dev, Maxime Coquelin, John McNamara, Ben Walker

2017-01-23 21:04, Yuanhan Liu:
> I made a vhost ABI/API refactoring at v16.04, meant to avoid such issues
> forever. Well, apparently, I lied.
> 
> People are looking for more vhost-user options nowadays, other than
> vhost-user net only. For example, SPDK (Storage Performance Development
> Kit) is looking at the chance of vhost-user SCSI and vhost-user block.
> 
> Apparently, they also need a vhost-user backend; since DPDK already
> has a (mature enough) backend, they don't want to implement it again
> from scratch. They want to leverage the one DPDK provides.
> 
> However, the last refactoring hasn't done that right, at least it's
> not friendly for extending vhost-user to add support for more devices.
> For example, different virtio devices have their own feature sets, while
> APIs like rte_vhost_feature_disable(feature_mask) have no option to
> tell the device type. Thus, a more proper API should look like:
> 
>     rte_vhost_feature_disable(device_type, feature_mask);
> 
> Besides that, a few public files and structures should be renamed, to
> not let it bind to virtio-net. Specifically, they are:
> 
> - virtio_net_device_ops --> vhost_device_ops
> - rte_virtio_net.h      --> rte_vhost.h
> 
> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>

Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: announce API and ABI change for ethdev
  @ 2017-02-13 17:57  4%   ` Thomas Monjalon
  2017-02-14  3:17  4%     ` Jerin Jacob
  2017-02-14 19:37  4%   ` Thomas Monjalon
  1 sibling, 1 reply; 200+ results
From: Thomas Monjalon @ 2017-02-13 17:57 UTC (permalink / raw)
  To: Bernard Iremonger; +Cc: dev, john.mcnamara

2017-01-05 15:25, Bernard Iremonger:
> In 17.05 nine rte_eth_dev_* functions will be removed from
> librte_ether, renamed and moved to the ixgbe PMD.
> 
> Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>

"ixgbe bypass" should be in the title and the description.
I'll reword to:

doc: announce move of ethdev bypass function to ixgbe API

In 17.05, nine rte_eth_dev_* functions for bypass control,
and implemented only in ixgbe, will be removed from ethdev,
renamed and moved to the ixgbe PMD-specific API.

Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>

^ permalink raw reply	[relevance 4%]
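
The practical effect on application code, sketched below; the 17.02 calls are
the existing ethdev bypass API (built only when RTE_NIC_BYPASS is enabled),
while the rte_pmd_ixgbe_* spellings are assumptions, since the final names
were not fixed at the time of this notice:

#include <stdint.h>
#include <rte_ethdev.h>

static void
bypass_setup(uint8_t port_id, uint32_t new_state)
{
	/* 17.02: generic ethdev calls, implemented only by ixgbe */
	rte_eth_dev_bypass_init(port_id);
	rte_eth_dev_bypass_state_set(port_id, &new_state);

	/* 17.05 (assumed): same operations as an ixgbe-specific API,
	 * e.g. in a header like rte_pmd_ixgbe.h:
	 *
	 *   rte_pmd_ixgbe_bypass_init(port_id);
	 *   rte_pmd_ixgbe_bypass_state_set(port_id, &new_state);
	 */
}

Applications using bypass would then include the PMD-specific header and link
against the ixgbe driver explicitly, instead of relying on generic ethdev.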

* [dpdk-dev] [PATCH] doc: add ABI change notification for ring library
@ 2017-02-13 17:38  9% Bruce Richardson
  2017-02-14  0:32  4% ` Mcnamara, John
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Bruce Richardson @ 2017-02-13 17:38 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson

Document proposed changes for the rings code in the next release.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/rel_notes/deprecation.rst | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index b49e0a0..e715fc7 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -8,6 +8,25 @@ API and ABI deprecation notices are to be posted here.
 Deprecation Notices
 -------------------
 
+* ring: Changes are planned to rte_ring APIs in release 17.05. Proposed
+  changes include:
+    - Removing build time options for the ring:
+      CONFIG_RTE_RING_SPLIT_PROD_CONS
+      CONFIG_RTE_RING_PAUSE_REP_COUNT
+    - Adding an additional parameter to enqueue functions to return the
+      amount of free space in the ring
+    - Adding an additional parameter to dequeue functions to return the
+      number of remaining elements in the ring
+    - Removing direct support for watermarks in the rings, since the
+      additional return value from the enqueue function makes it
+      unneeded
+    - Adjusting the return values of the bulk() enq/deq functions to
+      make them consistent with the burst() equivalents. [Note, parameter
+      to these functions are changing too, per points above, so compiler
+      will flag them as needing update in legacy code]
+    - Updates to some library functions e.g. rte_ring_get_memsize() to
+      allow for variably-sized ring elements.
+
 * igb_uio: iomem mapping and sysfs files created for iomem and ioport in
   igb_uio will be removed, because we are able to detect these from what Linux
   has exposed, like the way we have done with uio-pci-generic. This change
-- 
2.9.3

^ permalink raw reply	[relevance 9%]

* Re: [dpdk-dev] doc: deprecation notice for ethdev ops?
  2017-02-13 16:46  4%   ` Ferruh Yigit
  2017-02-13 17:21  0%     ` Dumitrescu, Cristian
@ 2017-02-13 17:38  3%     ` Thomas Monjalon
  1 sibling, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-13 17:38 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: Dumitrescu, Cristian, dev, Richardson, Bruce, Wiles, Keith

2017-02-13 16:46, Ferruh Yigit:
> On 2/13/2017 4:09 PM, Thomas Monjalon wrote:
> > 2017-02-13 16:02, Dumitrescu, Cristian:
> >> Hi Thomas,
> >>
> >> When a new member (function pointer) is added to struct eth_dev_ops (as the last member), does it need to go through the ABI change process (e.g. change notice one release before)?
> >>
> >> IMO the answer is no: struct eth_dev_ops is marked as internal and its instances are only accessed through pointers, so the rte_eth_devices array should not be impacted by the ops structure expanding at its end. Unless there is something that I am missing?
> > 
> > You are right, it is an internal struct.
> > So no need of a deprecation notice.
> 
> When DPDK is compiled as a dynamic library, the application will load
> PMDs dynamically as plugins.
> Does this use case cause an ABI compatibility issue?
> 
> I think the drivers <--> libraries interface can cause ABI breakages in
> the dynamic library case, although I am not sure how common this use case is.

Yes, it is a problem for the drivers/library interface.
It is not an ABI, which is an application/library interface.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] doc: deprecation notice for ethdev ops?
  2017-02-13 17:21  0%     ` Dumitrescu, Cristian
@ 2017-02-13 17:36  0%       ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2017-02-13 17:36 UTC (permalink / raw)
  To: Dumitrescu, Cristian, Thomas Monjalon
  Cc: dev, Richardson, Bruce, Wiles, Keith

On 2/13/2017 5:21 PM, Dumitrescu, Cristian wrote:
> 
> 
>> -----Original Message-----
>> From: Yigit, Ferruh
>> Sent: Monday, February 13, 2017 4:46 PM
>> To: Thomas Monjalon <thomas.monjalon@6wind.com>; Dumitrescu, Cristian
>> <cristian.dumitrescu@intel.com>
>> Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; Wiles,
>> Keith <keith.wiles@intel.com>
>> Subject: Re: [dpdk-dev] doc: deprecation notice for ethdev ops?
>>
>> On 2/13/2017 4:09 PM, Thomas Monjalon wrote:
>>> 2017-02-13 16:02, Dumitrescu, Cristian:
>>>> Hi Thomas,
>>>>
>>>> When a new member (function pointer) is added to struct eth_dev_ops
>> (as the last member), does it need to go through the ABI change process (e.g.
>> change notice one release before)?
>>>>
>>>> IMO the answer is no: struct eth_dev_ops is marked as internal and its
>> instances are only accessed through pointers, so the rte_eth_devices array
>> should not be impacted by the ops structure expanding at its end. Unless
>> there is something that I am missing?
>>>
>>> You are right, it is an internal struct.
>>> So no need of a deprecation notice.
>>
>> When DPDK is compiled as a dynamic library, the application will load
>> PMDs dynamically as plugins.
>> Does this use case cause an ABI compatibility issue?
>>
>> I think the drivers <--> libraries interface can cause ABI breakages in
>> the dynamic library case, although I am not sure how common this use case is.
>>
> 
> Do you have a specific example that might cause an issue when adding a new function at the end of the ethdev ops structure? I cannot think of any, given that the ops structure is marked as internal and it is only accessed through pointers.

Adding at the end of the struct is probably safe.

> 
>>
>>>
>>> We must clearly separate API and internal code in ethdev.
>>>
>>>> My question is in the context of this patch under review for 17.5 release:
>> http://www.dpdk.org/ml/archives/dev/2017-February/057367.html.
>>>
>>> I did not look at it yet. Will do after the release.
>>>
>>>
> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops structure
  2017-02-13 16:07  7%   ` Zhang, Roy Fan
@ 2017-02-13 17:34  4%     ` Trahe, Fiona
  0 siblings, 0 replies; 200+ results
From: Trahe, Fiona @ 2017-02-13 17:34 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev; +Cc: De Lara Guarch, Pablo

Thanks Fan, now it makes sense.

> -----Original Message-----
> From: Zhang, Roy Fan
> Sent: Monday, February 13, 2017 4:07 PM
> To: Trahe, Fiona <fiona.trahe@intel.com>; dev@dpdk.org
> Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Subject: RE: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops
> structure
> 
> Hi Fiona,
> 
> Sorry for my bad English, I will try to explain better here.
> 
> "cryptodev_configure_t" is a function prototype with only "rte_cryptodev
> *dev"
> as its sole parameter. The structure ``rte_cryptodev_ops`` holds one
> function pointer of this type, "dev_configure".
> 
> The patch announces the addition of a "struct rte_cryptodev_config"
> pointer parameter, so that the function prototype would look
> like:
> 
> typedef int (*cryptodev_configure_t)(struct rte_cryptodev *dev, struct
> rte_cryptodev_config *config);
> 
> Without this parameter, a specific crypto PMD may not have enough
> information to configure itself. This may not be a big problem for other
> cryptodevs, as all configuration is done in rte_cryptodev_configure(),
> but it is important for the scheduler PMD, as it needs this parameter to
> configure all its slaves. Currently the user has to configure every
> slave one by one.
> 
> The problem is that, although I only want to change the API of the
> function prototype "cryptodev_configure_t", in order to do that I have
> to break the ABI of the structure "rte_cryptodev_ops". Any help on the
> grammar for stating this more nicely would be appreciated.
> 
> Best regards,
> Fan
> 
> 
> 
> 
> > -----Original Message-----
> > From: Trahe, Fiona
> > Sent: Friday, February 10, 2017 2:00 PM
> > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Trahe, Fiona
> > <fiona.trahe@intel.com>
> > Subject: RE: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops
> > structure
> >
> > Hi Fan,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Fan Zhang
> > > Sent: Friday, February 10, 2017 11:39 AM
> > > To: dev@dpdk.org
> > > Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> > > Subject: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops
> > > structure
> > >
> > > Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> > > ---
> > >  doc/guides/rel_notes/deprecation.rst | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/doc/guides/rel_notes/deprecation.rst
> > > b/doc/guides/rel_notes/deprecation.rst
> > > index 755dc65..564d93a 100644
> > > --- a/doc/guides/rel_notes/deprecation.rst
> > > +++ b/doc/guides/rel_notes/deprecation.rst
> > > @@ -62,3 +62,7 @@ Deprecation Notices
> > >    PMDs that implement the latter.
> > >    Target release for removal of the legacy API will be defined once most
> > >    PMDs have switched to rte_flow.
> > > +
> > > +* ABI changes are planned for 17.05 in the ``rte_cryptodev_ops``
> structure.
> > > +  The field ``cryptodev_configure_t`` function prototype will be
> > > +added a
> > > +  parameter of a struct rte_cryptodev_config type pointer.
> > > --
> > > 2.7.4
> >
> > Can you fix the grammar here please. I'm not sure what the change is?

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] doc: deprecation notice for ethdev ops?
  2017-02-13 16:46  4%   ` Ferruh Yigit
@ 2017-02-13 17:21  0%     ` Dumitrescu, Cristian
  2017-02-13 17:36  0%       ` Ferruh Yigit
  2017-02-13 17:38  3%     ` Thomas Monjalon
  1 sibling, 1 reply; 200+ results
From: Dumitrescu, Cristian @ 2017-02-13 17:21 UTC (permalink / raw)
  To: Yigit, Ferruh, Thomas Monjalon; +Cc: dev, Richardson, Bruce, Wiles, Keith



> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Monday, February 13, 2017 4:46 PM
> To: Thomas Monjalon <thomas.monjalon@6wind.com>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>
> Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; Wiles,
> Keith <keith.wiles@intel.com>
> Subject: Re: [dpdk-dev] doc: deprecation notice for ethdev ops?
> 
> On 2/13/2017 4:09 PM, Thomas Monjalon wrote:
> > 2017-02-13 16:02, Dumitrescu, Cristian:
> >> Hi Thomas,
> >>
> >> When a new member (function pointer) is added to struct eth_dev_ops
> (as the last member), does it need to go through the ABI change process (e.g.
> change notice one release before)?
> >>
> >> IMO the answer is no: struct eth_dev_ops is marked as internal and its
> instances are only accessed through pointers, so the rte_eth_devices array
> should not be impacted by the ops structure expanding at its end. Unless
> there is something that I am missing?
> >
> > You are right, it is an internal struct.
> > So no need of a deprecation notice.
> 
> When DPDK is compiled as a dynamic library, the application will load
> PMDs dynamically as plugins.
> Does this use case cause an ABI compatibility issue?
> 
> I think the drivers <--> libraries interface can cause ABI breakages in
> the dynamic library case, although I am not sure how common this use case is.
> 

Do you have a specific example that might cause an issue when adding a new function at the end of the ethdev ops structure? I cannot think of any, given that the ops structure is marked as internal and it is only accessed through pointers.

> 
> >
> > We must clearly separate API and internal code in ethdev.
> >
> >> My question is in the context of this patch under review for the 17.05 release:
> http://www.dpdk.org/ml/archives/dev/2017-February/057367.html.
> >
> > I did not look at it yet. Will do after the release.
> >
> >

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] doc: deprecation notice for ethdev ops?
  2017-02-13 16:09  0% ` Thomas Monjalon
@ 2017-02-13 16:46  4%   ` Ferruh Yigit
  2017-02-13 17:21  0%     ` Dumitrescu, Cristian
  2017-02-13 17:38  3%     ` Thomas Monjalon
  0 siblings, 2 replies; 200+ results
From: Ferruh Yigit @ 2017-02-13 16:46 UTC (permalink / raw)
  To: Thomas Monjalon, Dumitrescu, Cristian
  Cc: dev, Richardson, Bruce, Wiles, Keith

On 2/13/2017 4:09 PM, Thomas Monjalon wrote:
> 2017-02-13 16:02, Dumitrescu, Cristian:
>> Hi Thomas,
>>
>> When a new member (function pointer) is added to struct eth_dev_ops (as the last member), does it need to go through the ABI change process (e.g. change notice one release before)?
>>
>> IMO the answer is no: struct eth_dev_ops is marked as internal and its instances are only accessed through pointers, so the rte_eth_devices array should not be impacted by the ops structure expanding at its end. Unless there is something that I am missing?
> 
> You are right, it is an internal struct.
> So no need of a deprecation notice.

When DPDK is compiled as a dynamic library, the application will load
PMDs dynamically as plugins.
Does this use case cause an ABI compatibility issue?

I think the drivers <--> libraries interface can cause ABI breakages in
the dynamic library case, although I am not sure how common this use case is.


> 
> We must clearly separate API and internal code in ethdev.
> 
>> My question is in the context of this patch under review for the 17.05 release: http://www.dpdk.org/ml/archives/dev/2017-February/057367.html.
> 
> I did not look at it yet. Will do after the release.
> 
> 

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] doc: deprecation notice for ethdev ops?
  2017-02-13 16:02  3% [dpdk-dev] doc: deprecation notice for ethdev ops? Dumitrescu, Cristian
@ 2017-02-13 16:09  0% ` Thomas Monjalon
  2017-02-13 16:46  4%   ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2017-02-13 16:09 UTC (permalink / raw)
  To: Dumitrescu, Cristian; +Cc: dev, Richardson, Bruce, Yigit, Ferruh, Wiles, Keith

2017-02-13 16:02, Dumitrescu, Cristian:
> Hi Thomas,
> 
> When a new member (function pointer) is added to struct eth_dev_ops (as the last member), does it need to go through the ABI change process (e.g. change notice one release before)?
> 
> IMO the answer is no: struct eth_dev_ops is marked as internal and its instances are only accessed through pointers, so the rte_eth_devices array should not be impacted by the ops structure expanding at its end. Unless there is something that I am missing?

You are right, it is an internal struct.
So no need of a deprecation notice.

We must clearly separate API and internal code in ethdev.

> My question is in the context of this patch under review for 17.5 release: http://www.dpdk.org/ml/archives/dev/2017-February/057367.html.

I did not look at it yet. Will do after the release.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops structure
  2017-02-10 13:59  4% ` Trahe, Fiona
@ 2017-02-13 16:07  7%   ` Zhang, Roy Fan
  2017-02-13 17:34  4%     ` Trahe, Fiona
  2017-02-14  0:21  4%   ` Hemant Agrawal
  1 sibling, 1 reply; 200+ results
From: Zhang, Roy Fan @ 2017-02-13 16:07 UTC (permalink / raw)
  To: Trahe, Fiona, dev; +Cc: De Lara Guarch, Pablo

Hi Fiona,

Sorry for my bad English, I will try to explain better here.

"cryptodev_configure_t" is a function prototype with only "rte_cryptodev *dev"
as sole parameter. Structure ``rte_cryptodev_ops`` holds one function pointer
"dev_configure" of it. 

The patch announces the addition of a "struct rte_cryptodev_config" pointer
parameter, so that the function prototype would look like:

typedef int (*cryptodev_configure_t)(struct rte_cryptodev *dev, struct rte_cryptodev_config *config);

Without this parameter, a specific crypto PMD may not have enough information to
configure itself. This may not be a big problem for other cryptodevs, as all
configuration is done in rte_cryptodev_configure(), but it is important for the
scheduler PMD as it needs this parameter to configure all its slaves. Currently
the user has to configure every slave one by one.
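
As a hypothetical sketch (the context struct and field names below are
invented for illustration), the scheduler PMD could then propagate the
configuration to its slaves like this:

	/* hypothetical slave bookkeeping, for illustration only */
	struct scheduler_ctx {
		uint8_t slave_dev_ids[8];
		uint32_t nb_slaves;
	};

	static int
	scheduler_pmd_config(struct rte_cryptodev *dev,
			struct rte_cryptodev_config *config)
	{
		struct scheduler_ctx *ctx = dev->data->dev_private;
		uint32_t i;
		int ret;

		/* forward the device configuration to every slave */
		for (i = 0; i < ctx->nb_slaves; i++) {
			ret = rte_cryptodev_configure(
					ctx->slave_dev_ids[i], config);
			if (ret < 0)
				return ret;
		}
		return 0;
	}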

The problem is that although I only want to change the API of the function
prototype "cryptodev_configure_t", in order to do that I have to break the ABI
of the structure "rte_cryptodev_ops". Any help with the grammar for stating
this more clearly would be appreciated.

Best regards,
Fan




> -----Original Message-----
> From: Trahe, Fiona
> Sent: Friday, February 10, 2017 2:00 PM
> To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Trahe, Fiona
> <fiona.trahe@intel.com>
> Subject: RE: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops
> structure
> 
> Hi Fan,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Fan Zhang
> > Sent: Friday, February 10, 2017 11:39 AM
> > To: dev@dpdk.org
> > Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> > Subject: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops
> > structure
> >
> > Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index 755dc65..564d93a 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -62,3 +62,7 @@ Deprecation Notices
> >    PMDs that implement the latter.
> >    Target release for removal of the legacy API will be defined once most
> >    PMDs have switched to rte_flow.
> > +
> > +* ABI changes are planned for 17.05 in the ``rte_cryptodev_ops`` structure.
> > +  The field ``cryptodev_configure_t`` function prototype will be
> > +added a
> > +  parameter of a struct rte_cryptodev_config type pointer.
> > --
> > 2.7.4
> 
> Can you fix the grammar here please. I'm not sure what the change is?

^ permalink raw reply	[relevance 7%]

* [dpdk-dev] doc: deprecation notice for ethdev ops?
@ 2017-02-13 16:02  3% Dumitrescu, Cristian
  2017-02-13 16:09  0% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Dumitrescu, Cristian @ 2017-02-13 16:02 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Richardson, Bruce, Yigit, Ferruh, Wiles, Keith

Hi Thomas,

When a new member (function pointer) is added to struct eth_dev_ops (as the last member), does it need to go through the ABI change process (e.g. change notice one release before)?

IMO the answer is no: struct eth_dev_ops is marked as internal and its instances are only accessed through pointers, so the rte_eth_devices array should not be impacted by the ops structure expanding at its end. Unless there is something that I am missing?

My question is in the context of this patch under review for 17.5 release: http://www.dpdk.org/ml/archives/dev/2017-February/057367.html.

Thanks,
Cristian

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] cryptodev - Session and queue pair relationship
  @ 2017-02-13 15:09  3%       ` Trahe, Fiona
  0 siblings, 0 replies; 200+ results
From: Trahe, Fiona @ 2017-02-13 15:09 UTC (permalink / raw)
  To: Akhil Goyal, Doherty, Declan, dev, De Lara Guarch, Pablo, Jain, Deepak K
  Cc: hemant.agrawal, Trahe, Fiona

Hi Akhil, 

> -----Original Message-----
> From: Trahe, Fiona
> Sent: Monday, February 13, 2017 2:45 PM
> To: Akhil Goyal <akhil.goyal@nxp.com>; Doherty, Declan
> <declan.doherty@intel.com>; dev@dpdk.org; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Jain, Deepak K <deepak.k.jain@intel.com>
> Cc: hemant.agrawal@nxp.com; Trahe, Fiona <fiona.trahe@intel.com>
> Subject: RE: cryptodev - Session and queue pair relationship
> 
> Hi Akhil
> 
> > -----Original Message-----
> > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > Sent: Monday, February 13, 2017 2:39 PM
> > To: Doherty, Declan <declan.doherty@intel.com>; dev@dpdk.org; De Lara
> > Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Jain, Deepak K
> > <deepak.k.jain@intel.com>
> > Cc: hemant.agrawal@nxp.com; Trahe, Fiona <fiona.trahe@intel.com>
> > Subject: Re: cryptodev - Session and queue pair relationship
> >
> > On 2/8/2017 2:22 AM, Declan Doherty wrote:
> > > On 06/02/17 13:35, Akhil Goyal wrote:
> > >> Hi,
> > >>
> > > Hey Akhil, see my thoughts inline
> > >
> > >> I have some issues w.r.t. the mapping of sessions and queue pairs.
> > >>
> > >> As per my understanding:
> > >> - Number of sessions may be large - they are independent of number of
> > >> queue pairs
> > >
> > > Yes, cryptodev assumes no implicit connection between sessions and
> > > queue pairs, the current PMDs just use the crypto session to store the
> > > immutable data (keys etc) for a particular crypto transform or chain of
> > > transforms in a format specific to that PMD with no stateful information.
> > >
> > >> - Queue pairs are L-core specific
> > >
> > > Not exactly, queue pairs like ethdev queues are not thread safe, so we
> > > assume that only a single l-core will be using a queue pair at any time
> > > unless the application layer has introduced a locking mechanism to
> > > provide thread safety.
> > >
> > >> - Depending on the implementation, one queue pair can be mapped to many
> > >> sessions. Or, only one queue pair for every session - especially in
> > >> systems having a large number of (hw) queues.
> > >
> > > Currently none of the software crypto PMDs or Intel QuickAssist hardware
> > > accelerated PMD make any assumptions regarding coupling/mapping of
> > > sessions to queue pairs, so today a user could freely change the queue
> > > pair on which a session is processed, or even go as far as using the same
> > > session for processing on different queues simultaneously, as the sessions
> > > are stateless; obviously this could introduce issues for stateful
> > > higher-level protocols using the cryptodev PMD service, but the cryptodev
> > > API doesn't prohibit this usage model.
> > >
> > >
> > >> - Sessions can be created on the fly - typical rekeying use-cases.
> > >> Generally done by the control threads.
> > >>
> > >
> > > Sure, there is no restriction on session creation other than an element
> > > being free in the mempool which the session is being created on.
> > >
> > >> There seems to be no straightforward way for the underlying driver
> > >> implementation to know which sessions are mapped to a particular
> > >> queue pair. The session and queue pair information is first exposed
> > >> in the enqueue command.
> > >>
> > >> One of the NXP Crypto Hardware drivers uses per session data structures
> > >> (descriptors) which need to be configured for hardware queues.  Though
> > >> this information can be extracted from the first enqueue command for a
> > >> particular session, it will add checks in the data path. Also, it will
> > >> bring down the connection setup rate.
> > >
> > > We haven't had to support this model of coupling sessions to queue pairs
> > > in any PMDs before. If I understand correctly, in the hardware model you
> > > need to support, a queue pair can only be configured to support the
> > > processing of a single session at any one time and it only supports that
> > > session until it is reconfigured, is this correct? So if a session needs
> > > to be re-keyed the queue pair would need to be reconfigured?
> > yes it is correct.
> > >
> > >>
> > >> In the API rte_cryptodev_sym_session_create(), we create a session on a
> > >> particular device, but no information about the queue pair is
> > >> shared.
> > >>
> > >> 1. We want to propose to change the session create/config API to also
> > >> take queue pair id as argument.
> > >> struct rte_cryptodev_sym_session *
> > >> rte_cryptodev_sym_session_create(uint8_t dev_id,
> > >>                               struct rte_crypto_sym_xform *xform) to
> > >> also take "uint16_t qp;"
> > >>
> > >> This will also return an "in-use" error, if the underlying hardware only
> > >> supports 1 session/descriptor per qp.
> > >
> > > In my mind the idea of coupling the session_create function to the queue
> > > pair of a device doesn't feel right, as it would certainly put an
> > > unnecessary constraint on all existing PMDs' queue pairs.
> > >
> > > One possible approach would be to extend the queue_pair_setup
> > > function to take an opaque parameter, which would allow you to pass a
> > > session through and would be an approach more in keeping with the
> > > current cryptodev model, but you would then still need to verify that
> > > the operations being enqueued have the same session as the configured
> > > device, assuming that the packets are being enqueued from the host.
> > >
> > > If you need to re-key or change the session you could re-initialize the
> > > queue pair while the device is still active, by stopping the queue pair.
> > >
> > > Following a sequence something like:
> > > stop_qp()
> > > setup_qp()
> > > start_qp()
> > >
> > >
> > > Another option Fiona suggested would be to add 2 new APIs
> > >
> > >
> > > rte_cryptodev_queue_pair_attach_sym_session /
> > > rte_cryptodev_queue_pair_detach_sym_session; this
> > > would allow dynamic attaching of one or more sessions to a device if it
> > > supported this sort of static mapping of sessions to queue pairs.
> > >
> > >
> > >>
> > >> 2. Currently the application configures the *nb_descriptors* in the
> > >> *rte_cryptodev_queue_pair_setup*. Should we add the queue pair
> > >> capability API?
> > >>
> > >
> > > Regarding capabilities, I think this should be just propagated through
> > > the device capabilities, something like a max number of sessions mapped
> > > per queue pair, which would be zero for all/most current devices, and
> > > could be 1 or greater for your device. This is assuming that all queue
> > > pairs support the same crypto transform capabilities; different queue
> > > pairs having different capabilities could get very messy to discover.
> > >
> > >>
> > >> Please share your feedback, I will submit the patch accordingly.
> > >>
> > >> Regards,
> > >> Akhil
> > >>
> > >>
> > >>
> > >
> > >
> > Thanks for your feedback Declan,
> > The suggestion from Fiona looks good. Should I send the patch for this
> > or is it already in discussion in some different thread?
> 
> No, it's not under discussion in any other thread that I'm aware of.
> Go ahead and send it.

It may be useful to add max_nb_sessions_per_qp to
struct rte_cryptodev_info.sym.
I'm assuming that where there is a limit, it would be the same for all qps on the device?
0 would mean unlimited, >0 limited to that number.
This could be used by the application to know whether it needs to use the attach API or not.
This will cause an ABI breakage, so it must be flagged before being changed.
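
A hedged sketch of the two suggested APIs (names taken from the discussion
above; the exact signatures are assumptions):

	int rte_cryptodev_queue_pair_attach_sym_session(uint16_t qp_id,
		struct rte_cryptodev_sym_session *session);
	int rte_cryptodev_queue_pair_detach_sym_session(uint16_t qp_id,
		struct rte_cryptodev_sym_session *session);

With the proposed info field, an application could then decide whether the
attach call is needed at all (dev_id, qp_id and sess assumed from the
application):

	struct rte_cryptodev_info info;

	rte_cryptodev_info_get(dev_id, &info);
	if (info.sym.max_nb_sessions_per_qp > 0)	/* 0 = unlimited */
		rte_cryptodev_queue_pair_attach_sym_session(qp_id, sess);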

> 
> >
> > Also, if this new API is added, there would be a corresponding change in
> > the ipsec-secgw application as well.
> > This API should be optional, and the underlying implementation may or may
> > not implement it.
> >
> > Regards,
> > Akhil
> >

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] doc: add deprecation note for rework of PCI in EAL
  2017-02-13 12:00  0% ` Shreyansh Jain
@ 2017-02-13 14:44  0%   ` Thomas Monjalon
  2017-02-13 21:56  0%   ` Jan Blunck
  1 sibling, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-13 14:44 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: dev, Jan Blunck, Stephen Hemminger

2017-02-13 17:30, Shreyansh Jain:
> On Monday 13 February 2017 05:25 PM, Shreyansh Jain wrote:
> > The EAL PCI layer is planned to be restructured in 17.05 to unlink it from
> > generic structures like eth_driver and rte_cryptodev_driver, and also to
> > move it into a PCI bus.
> >
> > Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 12 ++++++++----
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> > index fbe2fcb..b12d435 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -13,10 +13,14 @@ Deprecation Notices
> >    has exposed, like the way we have done with uio-pci-generic. This change
> >    targets release 17.05.
> >
> > -* ``eth_driver`` is planned to be removed in 17.02. This currently serves as
> > -  a placeholder for PMDs to register themselves. Changes for ``rte_bus`` will
> > -  provide a way to handle device initialization currently being done in
> > -  ``eth_driver``.
> 
> Just to highlight, the above statement was added by me in 16.11.
> As of now I plan to work on removing rte_pci_driver from eth_driver,
> rather than removing eth_driver altogether (which was probably the
> better idea).
> If someone still wishes to work on its complete removal, we can keep
> the above (and probably remove the below).

Yes I think we should keep the original idea.
I will work on it with Jan Blunck and Stephen Hemminger I think.

> > +* ABI/API changes are planned for 17.05 for PCI subsystem. This is to
> > +  unlink EAL dependency on PCI and to move PCI devices to a PCI specific
> > +  bus.
> > +
> > +* ``rte_pci_driver`` is planned to be removed from ``eth_driver`` in 17.05.
> > +  This is to unlink the ethernet driver from PCI dependencies.
> > +  Similarly, ``rte_pci_driver`` is planned to be removed from
> > +  ``rte_cryptodev_driver`` in 17.05.

I am going to reword it in a v2.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: remove deprecation notice for rte_bus
  2017-02-13 11:55  5% [dpdk-dev] [PATCH] doc: remove deprecation notice for rte_bus Shreyansh Jain
@ 2017-02-13 14:36  0% ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-13 14:36 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: dev

2017-02-13 17:25, Shreyansh Jain:
> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> ---
> -* ABI/API changes are planned for 17.02: ``rte_device``, ``rte_driver`` will be
> -  impacted because of introduction of a new ``rte_bus`` hierarchy. This would
> -  also impact the way devices are identified by EAL. A bus-device-driver model
> -  will be introduced providing a hierarchical view of devices.

Applied, thanks

rte_device/rte_driver have not been impacted and should not be when implementing
the buses.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] doc: postpone API change in ethdev
@ 2017-02-13 14:26  4% Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-13 14:26 UTC (permalink / raw)
  To: Bernard Iremonger; +Cc: dev

The change of _rte_eth_dev_callback_process has not been done in 17.02.
Let's postpone to 17.05.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
 doc/guides/rel_notes/deprecation.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 3d72241..6532482 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -23,8 +23,8 @@ Deprecation Notices
   provide a way to handle device initialization currently being done in
   ``eth_driver``.
 
-* ethdev: an API change is planned for 17.02 for the function
-  ``_rte_eth_dev_callback_process``. In 17.02 the function will return an ``int``
+* ethdev: an API change is planned for 17.05 for the function
+  ``_rte_eth_dev_callback_process``. In 17.05 the function will return an ``int``
   instead of ``void`` and a fourth parameter ``void *ret_param`` will be added.
 
 * ABI changes are planned for 17.05 in the ``rte_mbuf`` structure: some fields
-- 
2.7.0
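
For clarity, the before/after prototypes implied by the notice (the current
three-parameter form is inferred from the "fourth parameter" wording and may
differ in detail):

	/* 17.02 */
	void _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
		enum rte_eth_event_type event, void *cb_arg);

	/* planned for 17.05 */
	int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
		enum rte_eth_event_type event, void *cb_arg,
		void *ret_param);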

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: remove announce of Tx preparation
  2017-02-13 10:56  9% [dpdk-dev] [PATCH] doc: remove announce of Tx preparation Thomas Monjalon
@ 2017-02-13 14:22  0% ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-13 14:22 UTC (permalink / raw)
  To: Tomasz Kulasek; +Cc: dev

2017-02-13 11:56, Thomas Monjalon:
> The feature is part of 17.02, so the ABI change notice can be removed.
> 
> Fixes: 4fb7e803eb1a ("ethdev: add Tx preparation")
> 
> Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>

Applied

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: postpone ABI changes to 17.05
  2017-02-13 11:05 19% [dpdk-dev] [PATCH] doc: postpone ABI changes to 17.05 Olivier Matz
@ 2017-02-13 14:21  4% ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-13 14:21 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, john.mcnamara

2017-02-13 12:05, Olivier Matz:
> Postpone the ABI changes for mempool and mbuf that were planned
> for 17.02 to 17.05.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>

Applied, thanks

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: add deprecation note for rework of PCI in EAL
  2017-02-13 11:55  9% [dpdk-dev] [PATCH] doc: add deprecation note for rework of PCI in EAL Shreyansh Jain
@ 2017-02-13 12:00  0% ` Shreyansh Jain
  2017-02-13 14:44  0%   ` Thomas Monjalon
  2017-02-13 21:56  0%   ` Jan Blunck
  0 siblings, 2 replies; 200+ results
From: Shreyansh Jain @ 2017-02-13 12:00 UTC (permalink / raw)
  To: dev; +Cc: nhorman, thomas.monjalon

On Monday 13 February 2017 05:25 PM, Shreyansh Jain wrote:
> The EAL PCI layer is planned to be restructured in 17.05 to unlink it from
> generic structures like eth_driver and rte_cryptodev_driver, and also to
> move it into a PCI bus.
>
> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> ---
>  doc/guides/rel_notes/deprecation.rst | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index fbe2fcb..b12d435 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -13,10 +13,14 @@ Deprecation Notices
>    has exposed, like the way we have done with uio-pci-generic. This change
>    targets release 17.05.
>
> -* ``eth_driver`` is planned to be removed in 17.02. This currently serves as
> -  a placeholder for PMDs to register themselves. Changes for ``rte_bus`` will
> -  provide a way to handle device initialization currently being done in
> -  ``eth_driver``.

Just to highlight, the above statement was added by me in 16.11.
As of now I plan to work on removing rte_pci_driver from eth_driver,
rather than removing eth_driver altogether (which was probably the
better idea).
If someone still wishes to work on its complete removal, we can keep
the above (and probably remove the below).

> +* ABI/API changes are planned for 17.05 for PCI subsystem. This is to
> +  unlink EAL dependency on PCI and to move PCI devices to a PCI specific
> +  bus.
> +
> +* ``rte_pci_driver`` is planned to be removed from ``eth_driver`` in 17.05.
> +  This is to unlink the ethernet driver from PCI dependencies.
> +  Similarly, ``rte_pci_driver`` is planned to be removed from
> +  ``rte_cryptodev_driver`` in 17.05.
>
>  * In 17.02 ABI changes are planned: the ``rte_eth_dev`` structure will be
>    extended with new function pointer ``tx_pkt_prepare`` allowing verification
>

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] doc: remove deprecation notice for rte_bus
@ 2017-02-13 11:55  5% Shreyansh Jain
  2017-02-13 14:36  0% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Shreyansh Jain @ 2017-02-13 11:55 UTC (permalink / raw)
  To: dev; +Cc: nhorman, thomas.monjalon, Shreyansh Jain

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 doc/guides/rel_notes/deprecation.rst | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index b49e0a0..fbe2fcb 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -13,11 +13,6 @@ Deprecation Notices
   has exposed, like the way we have done with uio-pci-generic. This change
   targets release 17.05.
 
-* ABI/API changes are planned for 17.02: ``rte_device``, ``rte_driver`` will be
-  impacted because of introduction of a new ``rte_bus`` hierarchy. This would
-  also impact the way devices are identified by EAL. A bus-device-driver model
-  will be introduced providing a hierarchical view of devices.
-
 * ``eth_driver`` is planned to be removed in 17.02. This currently serves as
   a placeholder for PMDs to register themselves. Changes for ``rte_bus`` will
   provide a way to handle device initialization currently being done in
-- 
2.7.4

^ permalink raw reply	[relevance 5%]

* [dpdk-dev] [PATCH] doc: add deprecation note for rework of PCI in EAL
@ 2017-02-13 11:55  9% Shreyansh Jain
  2017-02-13 12:00  0% ` Shreyansh Jain
  0 siblings, 1 reply; 200+ results
From: Shreyansh Jain @ 2017-02-13 11:55 UTC (permalink / raw)
  To: dev; +Cc: nhorman, thomas.monjalon, Shreyansh Jain

The EAL PCI layer is planned to be restructured in 17.05 to unlink it from
generic structures like eth_driver and rte_cryptodev_driver, and also to
move it into a PCI bus.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 doc/guides/rel_notes/deprecation.rst | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index fbe2fcb..b12d435 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -13,10 +13,14 @@ Deprecation Notices
   has exposed, like the way we have done with uio-pci-generic. This change
   targets release 17.05.
 
-* ``eth_driver`` is planned to be removed in 17.02. This currently serves as
-  a placeholder for PMDs to register themselves. Changes for ``rte_bus`` will
-  provide a way to handle device initialization currently being done in
-  ``eth_driver``.
+* ABI/API changes are planned for 17.05 for PCI subsystem. This is to
+  unlink EAL dependency on PCI and to move PCI devices to a PCI specific
+  bus.
+
+* ``rte_pci_driver`` is planned to be removed from ``eth_driver`` in 17.05.
+  This is to unlink the ethernet driver from PCI dependencies.
+  Similarly, ``rte_pci_driver`` is planned to be removed from
+  ``rte_cryptodev_driver`` in 17.05.
 
 * In 17.02 ABI changes are planned: the ``rte_eth_dev`` structure will be
   extended with new function pointer ``tx_pkt_prepare`` allowing verification
-- 
2.7.4

^ permalink raw reply	[relevance 9%]

* [dpdk-dev] [PATCH] doc: postpone ABI changes to 17.05
@ 2017-02-13 11:05 19% Olivier Matz
  2017-02-13 14:21  4% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2017-02-13 11:05 UTC (permalink / raw)
  To: dev, john.mcnamara, thomas.monjalon

Postpone the ABI changes for mempool and mbuf that were planned
for 17.02 to 17.05.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/rel_notes/deprecation.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index b49e0a0..9d01e86 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -34,7 +34,7 @@ Deprecation Notices
   ``_rte_eth_dev_callback_process``. In 17.02 the function will return an ``int``
   instead of ``void`` and a fourth parameter ``void *ret_param`` will be added.
 
-* ABI changes are planned for 17.02 in the ``rte_mbuf`` structure: some fields
+* ABI changes are planned for 17.05 in the ``rte_mbuf`` structure: some fields
   may be reordered to facilitate the writing of ``data_off``, ``refcnt``, and
   ``nb_segs`` in one operation, because some platforms have an overhead if the
   store address is not naturally aligned. Other mbuf fields, such as the
@@ -44,15 +44,15 @@ Deprecation Notices
 * The mbuf flags PKT_RX_VLAN_PKT and PKT_RX_QINQ_PKT are deprecated and
   are respectively replaced by PKT_RX_VLAN_STRIPPED and
   PKT_RX_QINQ_STRIPPED, that are better described. The old flags and
-  their behavior will be kept until 16.11 and will be removed in 17.02.
+  their behavior will be kept until 17.02 and will be removed in 17.05.
 
 * mempool: The functions ``rte_mempool_count`` and ``rte_mempool_free_count``
-  will be removed in 17.02.
+  will be removed in 17.05.
   They are replaced by ``rte_mempool_avail_count`` and
   ``rte_mempool_in_use_count`` respectively.
 
 * mempool: The functions for single/multi producer/consumer are deprecated
-  and will be removed in 17.02.
+  and will be removed in 17.05.
   It is replaced by ``rte_mempool_generic_get/put`` functions.
 
 * ethdev: the legacy filter API, including
-- 
2.8.1
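
For reference, a minimal migration sketch for the mempool accessors named in
the notice (mp being any valid struct rte_mempool pointer):

	unsigned int avail, used;

	/* deprecated, to be removed in 17.05 */
	avail = rte_mempool_count(mp);
	used = rte_mempool_free_count(mp);

	/* replacements */
	avail = rte_mempool_avail_count(mp);
	used = rte_mempool_in_use_count(mp);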

^ permalink raw reply	[relevance 19%]

* [dpdk-dev] [PATCH] doc: remove announce of Tx preparation
@ 2017-02-13 10:56  9% Thomas Monjalon
  2017-02-13 14:22  0% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2017-02-13 10:56 UTC (permalink / raw)
  To: Tomasz Kulasek; +Cc: dev

The feature is part of 17.02, so the ABI change notice can be removed.

Fixes: 4fb7e803eb1a ("ethdev: add Tx preparation")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
 doc/guides/rel_notes/deprecation.rst | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index b49e0a0..326fde4 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -23,13 +23,6 @@ Deprecation Notices
   provide a way to handle device initialization currently being done in
   ``eth_driver``.
 
-* In 17.02 ABI changes are planned: the ``rte_eth_dev`` structure will be
-  extended with new function pointer ``tx_pkt_prepare`` allowing verification
-  and processing of packet burst to meet HW specific requirements before
-  transmit. Also new fields will be added to the ``rte_eth_desc_lim`` structure:
-  ``nb_seg_max`` and ``nb_mtu_seg_max`` providing information about number of
-  segments limit to be transmitted by device for TSO/non-TSO packets.
-
 * ethdev: an API change is planned for 17.02 for the function
   ``_rte_eth_dev_callback_process``. In 17.02 the function will return an ``int``
   instead of ``void`` and a fourth parameter ``void *ret_param`` will be added.
-- 
2.7.0
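
Since the feature is in 17.02, a typical TX path can now call the prepare
step before the burst; a minimal sketch (error handling omitted; port_id,
queue_id, pkts and nb_pkts assumed from the application):

	uint16_t nb_prep, nb_sent;

	nb_prep = rte_eth_tx_prepare(port_id, queue_id, pkts, nb_pkts);
	/* packets beyond nb_prep failed the checks and should not be sent */
	nb_sent = rte_eth_tx_burst(port_id, queue_id, pkts, nb_prep);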

^ permalink raw reply	[relevance 9%]

* [dpdk-dev] [PATCH 2/2] ethdev: add hierarchical scheduler API
  @ 2017-02-10 14:05  1% ` Cristian Dumitrescu
  2017-02-21 10:35  0%   ` Hemant Agrawal
  0 siblings, 1 reply; 200+ results
From: Cristian Dumitrescu @ 2017-02-10 14:05 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, jerin.jacob, hemant.agrawal

This patch introduces the generic ethdev API for the hierarchical scheduler
capability.

Main features:
- Exposed as ethdev plugin capability (similar to rte_flow approach)
- Capability query API per port and per hierarchy node
- Scheduling algorithms: strict priority (SP), Weighted Fair Queuing (WFQ),
  Weighted Round Robin (WRR)
- Traffic shaping: single/dual rate, private (per node) and shared (by multiple
  nodes) shapers
- Congestion management for hierarchy leaf nodes: algorithms of tail drop,
  head drop, WRED; private (per node) and shared (by multiple nodes) WRED
  contexts
- Packet marking: IEEE 802.1q (VLAN DEI), IETF RFC 3168 (IPv4/IPv6 ECN for
  TCP and SCTP), IETF RFC 2597 (IPv4 / IPv6 DSCP)

Changes since RFC [1]:
- Implemented as ethdev plugin (similar to rte_flow) as opposed to more
  monolithic additions to ethdev itself
- Implemented feedback from Jerin [2] and Hemant [3]. Implemented all the
  suggested items with only one exception; see the long list below (hopefully
  nothing was forgotten).
    - The item not done (hopefully for a good reason): driver-generated object
      IDs. IMO the choice to have application-generated object IDs adds marginal
      complexity to the driver (a search-by-ID function is required), but it
      provides a huge simplification for the application. The app does not need
      to worry about building & managing a tree-like structure for storing
      driver-generated object IDs; the app can use its own convention for node
      IDs depending on the specific hierarchy that it needs. Trivial example:
      identify all level-2 nodes with IDs like 100, 200, 300, … and the level-3
      nodes based on their level-2 parents: 110, 120, 130, 140, …, 210, 220,
      230, 240, …, 310, 320, 330, … and level-4 nodes based on their level-3
      parents: 111, 112, 113, 114, …, 121, 122, 123, 124, … Moreover, see the
      change log for the other related simplification that was implemented:
      leaf nodes now have predefined IDs that are the same as their Ethernet TX
      queue ID (therefore no translation is required for leaf nodes); a usage
      sketch of this convention follows the list below.
- Capability API. Done per port and per node as well.
- Dual rate shapers
- Added configuration of private shaper (per node) directly from the shaper
  profile as part of node API (no shaper ID needed for private shapers), while
  the shared shapers are configured outside of the node API using shaper profile
  and communicated to the node using shared shaper ID. So there is no
  configuration overhead for shared shapers if the app does not use any of them.
- Leaf nodes now have predefined IDs that are the same as their Ethernet TX
  queue ID (therefore no translation is required for leaf nodes). This is also
  used to differentiate between a leaf node and a non-leaf node.
- Domain-specific errors to give a precise indication of the error cause (same
  as done by rte_flow)
- Packet marking API
- Optional packet length adjustment for shapers, positive (e.g. for adding
  Ethernet framing overhead of 20 bytes) or negative (e.g. for rate limiting
  based on IP packet bytes)
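
A short usage sketch of the node ID convention above, building a trivial
two-level hierarchy (port_id and nb_txq are assumed application variables,
priority 0 and weight 1 are arbitrary, and the node parameter fields are
left zeroed for brevity; see rte_scheddev.h below for the full structures):

	struct rte_scheddev_error error;
	struct rte_scheddev_node_params np;
	uint32_t root_id = 100;	/* application-chosen non-leaf node ID */
	uint16_t q;

	memset(&np, 0, sizeof(np));	/* fill real fields as needed */
	rte_scheddev_node_add(port_id, root_id, RTE_SCHEDDEV_ROOT_NODE_ID,
		0, 1, &np, &error);
	for (q = 0; q < nb_txq; q++)
		/* leaf node IDs are predefined as the Ethernet TX queue IDs */
		rte_scheddev_node_add(port_id, q, root_id, 0, 1, &np, &error);
	rte_scheddev_hierarchy_set(port_id, 1 /* clear_on_fail */, &error);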

Next steps:
- SW fallback based on the librte_sched library (to be introduced later by a
  standalone patch set)

[1] RFC: http://dpdk.org/ml/archives/dev/2016-November/050956.html
[2] Jerin’s feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054484.html
[3] Hemant’s feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054866.html

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 MAINTAINERS                            |    4 +
 lib/librte_ether/Makefile              |    5 +-
 lib/librte_ether/rte_ether_version.map |   30 +
 lib/librte_ether/rte_scheddev.c        |  790 ++++++++++++++++++++
 lib/librte_ether/rte_scheddev.h        | 1273 ++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_scheddev_driver.h |  374 ++++++++++
 6 files changed, 2475 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_ether/rte_scheddev.c
 create mode 100644 lib/librte_ether/rte_scheddev.h
 create mode 100644 lib/librte_ether/rte_scheddev_driver.h

diff --git a/MAINTAINERS b/MAINTAINERS
index cc3bf98..666931d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -247,6 +247,10 @@ Flow API
 M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
 F: lib/librte_ether/rte_flow*
 
+SchedDev API
+M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
+F: lib/librte_ether/rte_scheddev*
+
 Crypto API
 M: Declan Doherty <declan.doherty@intel.com>
 F: lib/librte_cryptodev/
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index 1d095a9..7e0527f 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -45,6 +45,7 @@ LIBABIVER := 6
 
 SRCS-y += rte_ethdev.c
 SRCS-y += rte_flow.c
+SRCS-y += rte_scheddev.c
 
 #
 # Export include files
@@ -54,6 +55,8 @@ SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h
 SYMLINK-y-include += rte_flow.h
 SYMLINK-y-include += rte_flow_driver.h
+SYMLINK-y-include += rte_scheddev.h
+SYMLINK-y-include += rte_scheddev_driver.h
 
 # this lib depends upon:
 DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index d00cb5c..6b3c84f 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -159,5 +159,35 @@ DPDK_17.05 {
 	global:
 
 	rte_eth_dev_capability_control;
+	rte_scheddev_capabilities_get;
+	rte_scheddev_node_capabilities_get;
+	rte_scheddev_wred_profile_add;
+	rte_scheddev_wred_profile_delete;
+	rte_scheddev_shared_wred_context_add_update;
+	rte_scheddev_shared_wred_context_delete;
+	rte_scheddev_shaper_profile_add;
+	rte_scheddev_shaper_profile_delete;
+	rte_scheddev_shared_shaper_add_update;
+	rte_scheddev_shared_shaper_delete;
+	rte_scheddev_node_add;
+	rte_scheddev_node_delete;
+	rte_scheddev_node_suspend;
+	rte_scheddev_node_resume;
+	rte_scheddev_hierarchy_set;
+	rte_scheddev_node_parent_update;
+	rte_scheddev_node_shaper_update;
+	rte_scheddev_node_shared_shaper_update;
+	rte_scheddev_node_scheduling_mode_update;
+	rte_scheddev_node_cman_update;
+	rte_scheddev_node_wred_context_update;
+	rte_scheddev_node_shared_wred_context_update;
+	rte_scheddev_mark_vlan_dei;
+	rte_scheddev_mark_ip_ecn;
+	rte_scheddev_mark_ip_dscp;
+	rte_scheddev_stats_get_enabled;
+	rte_scheddev_stats_enable;
+	rte_scheddev_node_stats_get_enabled;
+	rte_scheddev_node_stats_enable;
+	rte_scheddev_node_stats_read;
 
 } DPDK_17.02;
diff --git a/lib/librte_ether/rte_scheddev.c b/lib/librte_ether/rte_scheddev.c
new file mode 100644
index 0000000..679a22d
--- /dev/null
+++ b/lib/librte_ether/rte_scheddev.c
@@ -0,0 +1,790 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include "rte_ethdev.h"
+#include "rte_scheddev_driver.h"
+#include "rte_scheddev.h"
+
+/* Get generic scheduler operations structure from a port. */
+const struct rte_scheddev_ops *
+rte_scheddev_ops_get(uint8_t port_id, struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		rte_scheddev_error_set(error,
+			ENODEV,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENODEV));
+		return NULL;
+	}
+
+	if ((dev->dev_ops->cap_ctrl == NULL) ||
+		dev->dev_ops->cap_ctrl(dev, RTE_ETH_CAPABILITY_SCHED, &ops) ||
+		(ops == NULL)) {
+		rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+		return NULL;
+	}
+
+	return ops;
+}
+
+/* Get capabilities */
+int rte_scheddev_capabilities_get(uint8_t port_id,
+	struct rte_scheddev_capabilities *cap,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->capabilities_get == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->capabilities_get(dev, cap, error);
+}
+
+/* Get node capabilities */
+int rte_scheddev_node_capabilities_get(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_node_capabilities *cap,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_capabilities_get == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_capabilities_get(dev, node_id, cap, error);
+}
+
+/* Add WRED profile */
+int rte_scheddev_wred_profile_add(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_wred_params *profile,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->wred_profile_add == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->wred_profile_add(dev, wred_profile_id, profile, error);
+}
+
+/* Delete WRED profile */
+int rte_scheddev_wred_profile_delete(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->wred_profile_delete == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->wred_profile_delete(dev, wred_profile_id, error);
+}
+
+/* Add/update shared WRED context */
+int rte_scheddev_shared_wred_context_add_update(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shared_wred_context_add_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shared_wred_context_add_update(dev, shared_wred_context_id,
+		wred_profile_id, error);
+}
+
+/* Delete shared WRED context */
+int rte_scheddev_shared_wred_context_delete(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shared_wred_context_delete == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shared_wred_context_delete(dev, shared_wred_context_id,
+		error);
+}
+
+/* Add shaper profile */
+int rte_scheddev_shaper_profile_add(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_shaper_params *profile,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shaper_profile_add == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shaper_profile_add(dev, shaper_profile_id, profile, error);
+}
+
+/* Delete shaper profile */
+int rte_scheddev_shaper_profile_delete(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shaper_profile_delete == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shaper_profile_delete(dev, shaper_profile_id, error);
+}
+
+/* Add/update shared shaper */
+int rte_scheddev_shared_shaper_add_update(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shared_shaper_add_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shared_shaper_add_update(dev, shared_shaper_id,
+		shaper_profile_id, error);
+}
+
+/* Delete shared shaper */
+int rte_scheddev_shared_shaper_delete(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->shared_shaper_delete == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->shared_shaper_delete(dev, shared_shaper_id, error);
+}
+
+/* Add node to port scheduler hierarchy */
+int rte_scheddev_node_add(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_node_params *params,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_add == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_add(dev, node_id, parent_node_id, priority, weight,
+		params, error);
+}
+
+/* Delete node from scheduler hierarchy */
+int rte_scheddev_node_delete(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_delete == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_delete(dev, node_id, error);
+}
+
+/* Suspend node */
+int rte_scheddev_node_suspend(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_suspend == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_suspend(dev, node_id, error);
+}
+
+/* Resume node */
+int rte_scheddev_node_resume(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_resume == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_resume(dev, node_id, error);
+}
+
+/* Set the initial port scheduler hierarchy */
+int rte_scheddev_hierarchy_set(uint8_t port_id,
+	int clear_on_fail,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->hierarchy_set == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->hierarchy_set(dev, clear_on_fail, error);
+}
+
+/* Update node parent */
+int rte_scheddev_node_parent_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_parent_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_parent_update(dev, node_id, parent_node_id, priority,
+		weight, error);
+}
+
+/* Update node private shaper */
+int rte_scheddev_node_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_shaper_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_shaper_update(dev, node_id, shaper_profile_id,
+		error);
+}
+
+/* Update node shared shapers */
+int rte_scheddev_node_shared_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_shaper_id,
+	int add,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_shared_shaper_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_shared_shaper_update(dev, node_id, shared_shaper_id,
+		add, error);
+}
+
+/* Update scheduling mode */
+int rte_scheddev_node_scheduling_mode_update(uint8_t port_id,
+	uint32_t node_id,
+	int *scheduling_mode_per_priority,
+	uint32_t n_priorities,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_scheduling_mode_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_scheduling_mode_update(dev, node_id,
+		scheduling_mode_per_priority, n_priorities, error);
+}
+
+/* Update node congestion management mode */
+int rte_scheddev_node_cman_update(uint8_t port_id,
+	uint32_t node_id,
+	enum rte_scheddev_cman_mode cman,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_cman_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_cman_update(dev, node_id, cman, error);
+}
+
+/* Update node private WRED context */
+int rte_scheddev_node_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_wred_context_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_wred_context_update(dev, node_id, wred_profile_id,
+		error);
+}
+
+/* Update node shared WRED context */
+int rte_scheddev_node_shared_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_wred_context_id,
+	int add,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_shared_wred_context_update == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_shared_wred_context_update(dev, node_id,
+		shared_wred_context_id, add, error);
+}
+
+/* Packet marking - VLAN DEI */
+int rte_scheddev_mark_vlan_dei(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->mark_vlan_dei == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->mark_vlan_dei(dev, mark_green, mark_yellow, mark_red,
+		error);
+}
+
+/* Packet marking - IPv4/IPv6 ECN */
+int rte_scheddev_mark_ip_ecn(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->mark_ip_ecn == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->mark_ip_ecn(dev, mark_green, mark_yellow, mark_red, error);
+}
+
+/* Packet marking - IPv4/IPv6 DSCP */
+int rte_scheddev_mark_ip_dscp(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->mark_ip_dscp == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->mark_ip_dscp(dev, mark_green, mark_yellow, mark_red,
+		error);
+}
+
+/* Get set of stats counter types currently enabled for all nodes */
+int rte_scheddev_stats_get_enabled(uint8_t port_id,
+	uint64_t *nonleaf_node_capability_stats_mask,
+	uint64_t *nonleaf_node_enabled_stats_mask,
+	uint64_t *leaf_node_capability_stats_mask,
+	uint64_t *leaf_node_enabled_stats_mask,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->stats_get_enabled == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->stats_get_enabled(dev,
+		nonleaf_node_capability_stats_mask,
+		nonleaf_node_enabled_stats_mask,
+		leaf_node_capability_stats_mask,
+		leaf_node_enabled_stats_mask,
+		error);
+}
+
+/* Enable specified set of stats counter types for all nodes */
+int rte_scheddev_stats_enable(uint8_t port_id,
+	uint64_t nonleaf_node_enabled_stats_mask,
+	uint64_t leaf_node_enabled_stats_mask,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->stats_enable == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->stats_enable(dev,
+		nonleaf_node_enabled_stats_mask,
+		leaf_node_enabled_stats_mask,
+		error);
+}
+
+/* Get set of stats counter types currently enabled for specific node */
+int rte_scheddev_node_stats_get_enabled(uint8_t port_id,
+	uint32_t node_id,
+	uint64_t *capability_stats_mask,
+	uint64_t *enabled_stats_mask,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_stats_get_enabled == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_stats_get_enabled(dev,
+		node_id,
+		capability_stats_mask,
+		enabled_stats_mask,
+		error);
+}
+
+/* Enable specified set of stats counter types for specific node */
+int rte_scheddev_node_stats_enable(uint8_t port_id,
+	uint32_t node_id,
+	uint64_t enabled_stats_mask,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_stats_enable == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_stats_enable(dev, node_id, enabled_stats_mask, error);
+}
+
+/* Read and/or clear stats counters for specific node */
+int rte_scheddev_node_stats_read(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_node_stats *stats,
+	int clear,
+	struct rte_scheddev_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_scheddev_ops *ops =
+		rte_scheddev_ops_get(port_id, error);
+
+	if (ops == NULL)
+		return -rte_errno;
+
+	if (ops->node_stats_read == NULL)
+		return -rte_scheddev_error_set(error,
+			ENOSYS,
+			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
+			NULL,
+			rte_strerror(ENOSYS));
+
+	return ops->node_stats_read(dev, node_id, stats, clear, error);
+}
diff --git a/lib/librte_ether/rte_scheddev.h b/lib/librte_ether/rte_scheddev.h
new file mode 100644
index 0000000..fed3df2
--- /dev/null
+++ b/lib/librte_ether/rte_scheddev.h
@@ -0,0 +1,1386 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_SCHEDDEV_H__
+#define __INCLUDE_RTE_SCHEDDEV_H__
+
+/**
+ * @file
+ * RTE Generic Hierarchical Scheduler API
+ *
+ * This interface provides the ability to configure the hierarchical scheduler
+ * feature in a generic way.
+ */
+
+#include <stdint.h>
+
+#include <rte_red.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/** Ethernet framing overhead
+  *
+  * Overhead fields per Ethernet frame:
+  * 1. Preamble:                                            7 bytes;
+  * 2. Start of Frame Delimiter (SFD):                      1 byte;
+  * 3. Inter-Frame Gap (IFG):                              12 bytes.
+  */
+#define RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD                  20
+
+/**
+  * Ethernet framing overhead plus Frame Check Sequence (FCS). Useful when FCS
+  * is generated and added at the end of the Ethernet frame on TX side without
+  * any SW intervention.
+  */
+#define RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD_FCS              24
+
+/** Invalid WRED profile ID */
+#define RTE_SCHEDDEV_WRED_PROFILE_ID_NONE                  UINT32_MAX
+
+/** Invalid shaper profile ID */
+#define RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE                UINT32_MAX
+
+/** Scheduler hierarchy root node ID */
+#define RTE_SCHEDDEV_ROOT_NODE_ID                          UINT32_MAX
+
+
+/**
+  * Scheduler node capabilities
+  */
+struct rte_scheddev_node_capabilities {
+	/**< Private shaper support. */
+	int shaper_private_supported;
+
+	/**< Dual rate shaping support for private shaper. Valid only when
+	 * private shaper is supported.
+	 */
+	int shaper_private_dual_rate_supported;
+
+	/**< Minimum committed/peak rate (bytes per second) for private
+	 * shaper. Valid only when private shaper is supported.
+	 */
+	uint64_t shaper_private_rate_min;
+
+	/**< Maximum committed/peak rate (bytes per second) for private
+	 * shaper. Valid only when private shaper is supported.
+	 */
+	uint64_t shaper_private_rate_max;
+
+	/**< Maximum number of supported shared shapers. The value of zero
+	 * indicates that shared shapers are not supported.
+	 */
+	uint32_t shaper_shared_n_max;
+
+	/**< Items valid only for non-leaf nodes. */
+	struct {
+		/**< Maximum number of children nodes. */
+		uint32_t n_children_max;
+
+		/**< Lowest priority supported. The value of 1 indicates that
+		 * only priority 0 is supported, which essentially means that
+		 * Strict Priority (SP) algorithm is not supported.
+		 */
+		uint32_t sp_priority_min;
+
+		/**< Maximum number of sibling nodes that can have the same
+		 * priority at any given time. When equal to one (1), it
+		 * indicates that WFQ/WRR algorithms are not supported.
+		 */
+		uint32_t sp_n_children_max;
+
+		/**< WFQ algorithm support. */
+		int scheduling_wfq_supported;
+
+		/**< WRR algorithm support. */
+		int scheduling_wrr_supported;
+
+		/**< Maximum WFQ/WRR weight. */
+		uint32_t scheduling_wfq_wrr_weight_max;
+	} nonleaf;
+
+	/**< Items valid only for leaf nodes. */
+	struct {
+		/**< Head drop algorithm support. */
+		int cman_head_drop_supported;
+
+		/**< Private WRED context support. */
+		int cman_wred_context_private_supported;
+
+		/**< Maximum number of shared WRED contexts supported. The value
+		 * of zero indicates that shared WRED contexts are not
+		 * supported.
+		 */
+		uint32_t cman_wred_context_shared_n_max;
+	} leaf;
+};
+
+/**
+  * Scheduler capabilities
+  */
+struct rte_scheddev_capabilities {
+	/**< Maximum number of nodes. */
+	uint32_t n_nodes_max;
+
+	/**< Maximum number of levels (i.e. number of nodes connecting the root
+	 * node with any leaf node, including the root and the leaf).
+	 */
+	uint32_t n_levels_max;
+
+	/**< Maximum number of shapers, either private or shared. In case the
+	 * implementation does not share any resource between private and
+	 * shared shapers, it is typically equal to the sum between
+	 * *shaper_private_n_max* and *shaper_shared_n_max*.
+	 */
+	uint32_t shaper_n_max;
+
+	/**< Maximum number of private shapers. Indicates the maximum number of
+	 * nodes that can concurrently have the private shaper enabled.
+	 */
+	uint32_t shaper_private_n_max;
+
+	/**< Maximum number of shared shapers. The value of zero indicates that
+	 * shared shapers are not supported.
+	 */
+	uint32_t shaper_shared_n_max;
+
+	/**< Maximum number of nodes that can share the same shared shaper.
+	 * Only valid when shared shapers are supported.
+	 */
+	uint32_t shaper_shared_n_nodes_max;
+
+	/**< Maximum number of shared shapers that can be configured with dual
+	 * rate shaping. The value of zero indicates that dual rate shaping
+	 * support is not available for shared shapers.
+	 */
+	uint32_t shaper_shared_dual_rate_n_max;
+
+	/**< Minimum committed/peak rate (bytes per second) for shared
+	 * shapers. Only valid when shared shapers are supported.
+	 */
+	uint64_t shaper_shared_rate_min;
+
+	/**< Maximum committed/peak rate (bytes per second) for shared
+	 * shapers. Only valid when shared shapers are supported.
+	 */
+	uint64_t shaper_shared_rate_max;
+
+	/**< Minimum value allowed for packet length adjustment for
+	 * private/shared shapers.
+	 */
+	int shaper_pkt_length_adjust_min;
+
+	/**< Maximum value allowed for packet length adjustment for
+	 * private/shared shapers.
+	 */
+	int shaper_pkt_length_adjust_max;
+
+	/**< Maximum number of WRED contexts. */
+	uint32_t cman_wred_context_n_max;
+
+	/**< Maximum number of private WRED contexts. Indicates the maximum
+	 * number of leaf nodes that can concurrently have the private WRED
+	 * context enabled.
+	 */
+	uint32_t cman_wred_context_private_n_max;
+
+	/**< Maximum number of shared WRED contexts. The value of zero
+	 * indicates that shared WRED contexts are not supported.
+	 */
+	uint32_t cman_wred_context_shared_n_max;
+
+	/**< Maximum number of leaf nodes that can share the same WRED context.
+	 * Only valid when shared WRED contexts are supported.
+	 */
+	uint32_t cman_wred_context_shared_n_nodes_max;
+
+	/**< Support for VLAN DEI packet marking. */
+	int mark_vlan_dei_supported;
+
+	/**< Support for IPv4/IPv6 ECN marking of TCP packets. */
+	int mark_ip_ecn_tcp_supported;
+
+	/**< Support for IPv4/IPv6 ECN marking of SCTP packets. */
+	int mark_ip_ecn_sctp_supported;
+
+	/**< Support for IPv4/IPv6 DSCP packet marking. */
+	int mark_ip_dscp_supported;
+
+	/**< Summary of node-level capabilities across all nodes. */
+	struct rte_scheddev_node_capabilities node;
+};
+
+/**
+  * Congestion management (CMAN) mode
+  *
+  * This is used for controlling the admission of packets into a packet queue or
+  * group of packet queues on congestion. On request of writing a new packet
+  * into the current queue while the queue is full, the *tail drop* algorithm
+  * drops the new packet while leaving the queue unmodified, as opposed to *head
+  * drop* algorithm, which drops the packet at the head of the queue (the oldest
+  * packet waiting in the queue) and admits the new packet at the tail of the
+  * queue.
+  *
+  * The *Random Early Detection (RED)* algorithm works by proactively dropping
+  * more and more input packets as the queue occupancy builds up. When the queue
+  * is full or almost full, RED effectively works as *tail drop*. The *Weighted
+  * RED* algorithm uses a separate set of RED thresholds for each packet color.
+  */
+enum rte_scheddev_cman_mode {
+	RTE_SCHEDDEV_CMAN_TAIL_DROP = 0, /**< Tail drop */
+	RTE_SCHEDDEV_CMAN_HEAD_DROP, /**< Head drop */
+	RTE_SCHEDDEV_CMAN_WRED, /**< Weighted Random Early Detection (WRED) */
+};
+
+/**
+  * Color
+  */
+enum rte_scheddev_color {
+	e_RTE_SCHEDDEV_GREEN = 0, /**< Green */
+	e_RTE_SCHEDDEV_YELLOW,    /**< Yellow */
+	e_RTE_SCHEDDEV_RED,       /**< Red */
+	e_RTE_SCHEDDEV_COLORS     /**< Number of colors */
+};
+
+/**
+  * WRED profile
+  */
+struct rte_scheddev_wred_params {
+	/**< One set of RED parameters per packet color */
+	struct rte_red_params red_params[e_RTE_SCHEDDEV_COLORS];
+};
+
+/**
+  * Token bucket
+  */
+struct rte_scheddev_token_bucket {
+	/**< Token bucket rate (bytes per second) */
+	uint64_t rate;
+
+	/**< Token bucket size (bytes), a.k.a. max burst size */
+	uint64_t size;
+};
+
+/**
+  * Shaper (rate limiter) profile
+  *
+  * Multiple shaper instances can share the same shaper profile. Each node has
+  * zero or one private shaper (only one node using it) and/or zero, one or
+  * several shared shapers (multiple nodes use the same shaper instance).
+  *
+  * Single rate shapers use a single token bucket. A single rate shaper can be
+  * configured by setting the rate of the committed bucket to zero, which
+  * effectively disables this bucket. The peak bucket is used to limit the rate
+  * and the burst size for the current shaper.
+  *
+  * Dual rate shapers use both the committed and the peak token buckets. The
+  * rate of the committed bucket has to be less than or equal to the rate of the
+  * peak bucket.
+  */
+struct rte_scheddev_shaper_params {
+	/**< Committed token bucket */
+	struct rte_scheddev_token_bucket committed;
+
+	/**< Peak token bucket */
+	struct rte_scheddev_token_bucket peak;
+
+	/**< Signed value to be added to the length of each packet for the
+	 * purpose of shaping. Can be used to correct the packet length with
+	 * the framing overhead bytes that are also consumed on the wire (e.g.
+	 * RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD_FCS).
+	 */
+	int32_t pkt_length_adjust;
+};
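+
+/*
+ * Example (informative): a single rate shaper profile can be built by
+ * disabling the committed bucket (rate and size set to zero), leaving
+ * only the peak bucket active. The rate, burst size and profile ID 5
+ * below are arbitrary application-chosen values:
+ *
+ *    struct rte_scheddev_shaper_params params = {
+ *        .committed = {.rate = 0, .size = 0},
+ *        .peak = {.rate = 10000000, .size = 4096},
+ *        .pkt_length_adjust = RTE_SCHEDDEV_ETH_FRAMING_OVERHEAD_FCS,
+ *    };
+ *
+ *    status = rte_scheddev_shaper_profile_add(port_id, 5, &params, &error);
+ */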
+
+/**
+  * Node parameters
+  *
+  * Each scheduler hierarchy node has multiple inputs (children nodes of the
+  * current parent node) and a single output (which is input to its parent
+  * node). The current node arbitrates its inputs using Strict Priority (SP),
+  * Weighted Fair Queuing (WFQ) and Weighted Round Robin (WRR) algorithms to
+  * schedule input packets on its output while observing its shaping (rate
+  * limiting) constraints.
+  *
+  * Algorithms such as byte-level WRR, Deficit WRR (DWRR), etc. are considered
+  * approximations of ideal WFQ and are treated as WFQ here, although an
+  * implementation-dependent trade-off on accuracy, performance and resource
+  * usage might exist.
+  *
+  * Children nodes with different priorities are scheduled using the SP
+  * algorithm, based on their priority, with zero (0) as the highest priority.
+  * Children with same priority are scheduled using the WFQ or WRR algorithm,
+  * based on their weight, which is relative to the sum of the weights of all
+  * siblings with same priority, with one (1) as the lowest weight.
+  *
+  * Each leaf node sits on top of a TX queue of the current Ethernet port.
+  * Therefore, the leaf nodes are predefined with the node IDs of 0 .. (N-1),
+  * where N is the number of TX queues configured for the current Ethernet port.
+  * The non-leaf nodes have their IDs generated by the application.
+  */
+struct rte_scheddev_node_params {
+	/**< Shaper profile for the private shaper. The absence of the private
+	 * shaper for the current node is indicated by setting this parameter
+	 * to RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE.
+	 */
+	uint32_t shaper_profile_id;
+
+	/**< User allocated array of valid shared shaper IDs. */
+	uint32_t *shared_shaper_id;
+
+	/**< Number of shared shaper IDs in the *shared_shaper_id* array. */
+	uint32_t n_shared_shapers;
+
+	union {
+		/**< Parameters only valid for non-leaf nodes. */
+		struct {
+			/**< For each priority, indicates whether the children
+			 * nodes sharing the same priority are to be scheduled
+			 * by WFQ or by WRR. When NULL, it indicates that WFQ
+			 * is to be used for all priorities. When non-NULL, it
+			 * points to a pre-allocated array of *n_priority*
+			 * elements, with a non-zero value element indicating
+			 * WFQ and a zero value element for WRR.
+			 */
+			int *scheduling_mode_per_priority;
+
+			/**< Number of priorities. */
+			uint32_t n_priorities;
+		} nonleaf;
+
+		/**< Parameters only valid for leaf nodes. */
+		struct {
+			/**< Congestion management mode */
+			enum rte_scheddev_cman_mode cman;
+
+			/**< WRED parameters (valid when *cman* is WRED). */
+			struct {
+				/**< WRED profile for the private WRED
+				 * context. The absence of a private WRED
+				 * context for the current leaf node is
+				 * indicated by setting this parameter to
+				 * RTE_SCHEDDEV_WRED_PROFILE_ID_NONE.
+				 */
+				uint32_t wred_profile_id;
+
+				/**< User allocated array of valid shared WRED
+				 * context IDs.
+				 */
+				uint32_t *shared_wred_context_id;
+
+				/**< Number of shared WRED context IDs in the
+				 * *shared_wred_context_id* array.
+				 */
+				uint32_t n_shared_wred_contexts;
+			} wred;
+		} leaf;
+	};
+};
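+
+/*
+ * Example (informative): parameters for a leaf node with tail drop
+ * congestion management and no private or shared shapers:
+ *
+ *    struct rte_scheddev_node_params np = {
+ *        .shaper_profile_id = RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE,
+ *        .shared_shaper_id = NULL,
+ *        .n_shared_shapers = 0,
+ *        .leaf = {.cman = RTE_SCHEDDEV_CMAN_TAIL_DROP},
+ *    };
+ */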
+
+/**
+  * Node statistics counter type
+  */
+enum rte_scheddev_stats_counter {
+	/**< Number of packets scheduled from current node. */
+	RTE_SCHEDDEV_STATS_COUNTER_N_PKTS = 1 << 0,
+
+	/**< Number of bytes scheduled from current node. */
+	RTE_SCHEDDEV_STATS_COUNTER_N_BYTES = 1 << 1,
+
+	/**< Number of packets dropped by current node.  */
+	RTE_SCHEDDEV_STATS_COUNTER_N_PKTS_DROPPED = 1 << 2,
+
+	/**< Number of bytes dropped by current node.  */
+	RTE_SCHEDDEV_STATS_COUNTER_N_BYTES_DROPPED = 1 << 3,
+
+	/**< Number of packets currently waiting in the packet queue of current
+	 * leaf node.
+	 */
+	RTE_SCHEDDEV_STATS_COUNTER_N_PKTS_QUEUED = 1 << 4,
+
+	/**< Number of bytes currently waiting in the packet queue of current
+	 * leaf node.
+	 */
+	RTE_SCHEDDEV_STATS_COUNTER_N_BYTES_QUEUED = 1 << 5,
+};
+
+/**
+  * Node statistics counters
+  */
+struct rte_scheddev_node_stats {
+	/**< Number of packets scheduled from current node. */
+	uint64_t n_pkts;
+
+	/**< Number of bytes scheduled from current node. */
+	uint64_t n_bytes;
+
+	/**< Statistics counters for leaf nodes only. */
+	struct {
+		/**< Number of packets dropped by current leaf node. */
+		uint64_t n_pkts_dropped;
+
+		/**< Number of bytes dropped by current leaf node. */
+		uint64_t n_bytes_dropped;
+
+		/**< Number of packets currently waiting in the packet queue of
+		 * current leaf node.
+		 */
+		uint64_t n_pkts_queued;
+
+		/**< Number of bytes currently waiting in the packet queue of
+		 * current leaf node.
+		 */
+		uint64_t n_bytes_queued;
+	} leaf;
+};
+
+/**
+ * Verbose error types.
+ *
+ * Most of them provide the type of the object referenced by struct
+ * rte_scheddev_error::cause.
+ */
+enum rte_scheddev_error_type {
+	RTE_SCHEDDEV_ERROR_TYPE_NONE, /**< No error. */
+	RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE,
+	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_GREEN,
+	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_YELLOW,
+	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_RED,
+	RTE_SCHEDDEV_ERROR_TYPE_WRED_PROFILE_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_SHARED_WRED_CONTEXT_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_SHAPER_PROFILE,
+	RTE_SCHEDDEV_ERROR_TYPE_SHARED_SHAPER_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_PARENT_NODE_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_PRIORITY,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_WEIGHT,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SCHEDULING_MODE,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_SHARED_SHAPER_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF_CMAN,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF_WRED_PROFILE_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_PARAMS_LEAF_SHARED_WRED_CONTEXT_ID,
+	RTE_SCHEDDEV_ERROR_TYPE_NODE_ID,
+};
+
+/**
+ * Verbose error structure definition.
+ *
+ * This object is normally allocated by applications and set by PMDs. The
+ * message points to a constant string which does not need to be freed by
+ * the application; however, its pointer can be considered valid only as long
+ * as its associated DPDK port remains configured. Closing the underlying
+ * device or unloading the PMD invalidates it.
+ *
+ * Both cause and message may be NULL regardless of the error type.
+ */
+struct rte_scheddev_error {
+	enum rte_scheddev_error_type type; /**< Cause field and error type. */
+	const void *cause; /**< Object responsible for the error. */
+	const char *message; /**< Human-readable error message. */
+};
+
+/**
+ * Scheduler capabilities get
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param cap
+ *   Scheduler capabilities. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_capabilities_get(uint8_t port_id,
+	struct rte_scheddev_capabilities *cap,
+	struct rte_scheddev_error *error);
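+
+/*
+ * Example (informative): querying the port capabilities before building
+ * a hierarchy; the error handling shown is a minimal sketch:
+ *
+ *    struct rte_scheddev_capabilities cap;
+ *    struct rte_scheddev_error error;
+ *
+ *    if (rte_scheddev_capabilities_get(port_id, &cap, &error) != 0)
+ *        printf("scheddev: %s\n",
+ *            error.message ? error.message : rte_strerror(rte_errno));
+ */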
+
+/**
+ * Scheduler node capabilities get
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param cap
+ *   Scheduler node capabilities. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_capabilities_get(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_node_capabilities *cap,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler WRED profile add
+ *
+ * Create a new WRED profile with ID set to *wred_profile_id*. The new profile
+ * is used to create one or several WRED contexts.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param wred_profile_id
+ *   WRED profile ID for the new profile. Needs to be unused.
+ * @param profile
+ *   WRED profile parameters. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_wred_profile_add(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_wred_params *profile,
+	struct rte_scheddev_error *error);
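+
+/*
+ * Example (informative): a WRED profile using the same RED parameters
+ * for all packet colors; the threshold and weight values below are
+ * illustrative only (see rte_red.h for their exact semantics):
+ *
+ *    struct rte_scheddev_wred_params wp;
+ *    uint32_t i;
+ *
+ *    for (i = 0; i < e_RTE_SCHEDDEV_COLORS; i++) {
+ *        wp.red_params[i].min_th = 16;
+ *        wp.red_params[i].max_th = 32;
+ *        wp.red_params[i].maxp_inv = 10;
+ *        wp.red_params[i].wq_log2 = 9;
+ *    }
+ *    status = rte_scheddev_wred_profile_add(port_id, 1, &wp, &error);
+ */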
+
+/**
+ * Scheduler WRED profile delete
+ *
+ * Delete an existing WRED profile. This operation fails when there is currently
+ * at least one user (i.e. WRED context) of this WRED profile.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param wred_profile_id
+ *   WRED profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_wred_profile_delete(uint8_t port_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler shared WRED context add or update
+ *
+ * When *shared_wred_context_id* is invalid, a new WRED context with this ID is
+ * created by using the WRED profile identified by *wred_profile_id*.
+ *
+ * When *shared_wred_context_id* is valid, this WRED context is no longer using
+ * the profile previously assigned to it and is updated to use the profile
+ * identified by *wred_profile_id*.
+ *
+ * A valid shared WRED context can be assigned to several scheduler hierarchy
+ * leaf nodes configured to use WRED as the congestion management mode.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_wred_context_id
+ *   Shared WRED context ID
+ * @param wred_profile_id
+ *   WRED profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_shared_wred_context_add_update(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler shared WRED context delete
+ *
+ * Delete an existing shared WRED context. This operation fails when there is
+ * currently at least one user (i.e. scheduler hierarchy leaf node) of this
+ * shared WRED context.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_wred_context_id
+ *   Shared WRED context ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_shared_wred_context_delete(uint8_t port_id,
+	uint32_t shared_wred_context_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler shaper profile add
+ *
+ * Create a new shaper profile with ID set to *shaper_profile_id*. The new
+ * shaper profile is used to create one or several shapers.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shaper_profile_id
+ *   Shaper profile ID for the new profile. Needs to be unused.
+ * @param profile
+ *   Shaper profile parameters. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_shaper_profile_add(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_shaper_params *profile,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler shaper profile delete
+ *
+ * Delete an existing shaper profile. This operation fails when there is
+ * currently at least one user (i.e. shaper) of this shaper profile.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shaper_profile_id
+ *   Shaper profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_shaper_profile_delete(uint8_t port_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler shared shaper add or update
+ *
+ * When *shared_shaper_id* is not a valid shared shaper ID, a new shared shaper
+ * with this ID is created using the shaper profile identified by
+ * *shaper_profile_id*.
+ *
+ * When *shared_shaper_id* is a valid shared shaper ID, this shared shaper is no
+ * longer using the shaper profile previously assigned to it and is updated to
+ * use the shaper profile identified by *shaper_profile_id*.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_shaper_id
+ *   Shared shaper ID
+ * @param shaper_profile_id
+ *   Shaper profile ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_shared_shaper_add_update(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
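+
+/*
+ * Example (informative): creating shared shaper 0 from an existing
+ * shaper profile, e.g. to rate limit a group of nodes in aggregate:
+ *
+ *    status = rte_scheddev_shared_shaper_add_update(port_id, 0,
+ *        shaper_profile_id, &error);
+ */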
+
+/**
+ * Scheduler shared shaper delete
+ *
+ * Delete an existing shared shaper. This operation fails when there is
+ * currently at least one user (i.e. scheduler hierarchy node) of this shared
+ * shaper.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param shared_shaper_id
+ *   Shared shaper ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_shared_shaper_delete(uint8_t port_id,
+	uint32_t shared_shaper_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node add
+ *
+ * When *node_id* is not a valid node ID, a new node with this ID is created and
+ * connected as child to the existing node identified by *parent_node_id*.
+ *
+ * When *node_id* is a valid node ID, this node is disconnected from its current
+ * parent and connected as child to another existing node identified by
+ * *parent_node_id*.
+ *
+ * This function can be called during port initialization phase (before the
+ * Ethernet port is started) for building the scheduler start-up hierarchy.
+ * Subject to the specific Ethernet port supporting on-the-fly scheduler
+ * hierarchy updates, this function can also be called during run-time (after
+ * the Ethernet port is started).
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID
+ * @param parent_node_id
+ *   Parent node ID. Needs to be valid.
+ * @param priority
+ *   Node priority. The highest node priority is zero. Used by the SP algorithm
+ *   running on the parent of the current node for scheduling this child node.
+ * @param weight
+ *   Node weight. The node weight is relative to the weight sum of all siblings
+ *   that have the same priority. The lowest weight is one. Used by the WFQ/WRR
+ *   algorithm running on the parent of the current node for scheduling this
+ *   child node.
+ * @param params
+ *   Node parameters. Needs to be pre-allocated and valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_add(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_node_params *params,
+	struct rte_scheddev_error *error);
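+
+/*
+ * Example (informative): creating non-leaf node 1000 (an arbitrary ID
+ * outside the 0 .. N-1 leaf range) under the root, then connecting
+ * predefined leaf node 0 (TX queue 0) under it; *nonleaf_params* and
+ * *leaf_params* are application-built node parameters:
+ *
+ *    status = rte_scheddev_node_add(port_id, 1000,
+ *        RTE_SCHEDDEV_ROOT_NODE_ID, 0, 1, &nonleaf_params, &error);
+ *    if (status == 0)
+ *        status = rte_scheddev_node_add(port_id, 0, 1000, 0, 1,
+ *            &leaf_params, &error);
+ */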
+
+/**
+ * Scheduler node delete
+ *
+ * Delete an existing node. This operation fails when this node currently has at
+ * least one user (i.e. child node).
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_delete(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node suspend
+ *
+ * Suspend an existing node.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_suspend(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node resume
+ *
+ * Resume an existing node that was previously suspended.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_resume(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler hierarchy set
+ *
+ * This function is called during the port initialization phase (before the
+ * Ethernet port is started) to freeze the scheduler start-up hierarchy.
+ *
+ * This function fails when the currently configured scheduler hierarchy is not
+ * supported by the Ethernet port, in which case the user can abort or try out
+ * another hierarchy configuration (e.g. a hierarchy with fewer leaf nodes),
+ * which can be built from scratch (when *clear_on_fail* is enabled) or by
+ * modifying the existing hierarchy configuration (when *clear_on_fail* is
+ * disabled).
+ *
+ * Note that, even when the configured scheduler hierarchy is supported (so this
+ * function is successful), the Ethernet port start might still fail due to e.g.
+ * not enough memory being available in the system, etc.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param clear_on_fail
+ *   On function call failure, hierarchy is cleared when this parameter is
+ *   non-zero and preserved when this parameter is equal to zero.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_hierarchy_set(uint8_t port_id,
+	int clear_on_fail,
+	struct rte_scheddev_error *error);
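+
+/*
+ * Example (informative): freezing the start-up hierarchy and retrying
+ * with a simpler hierarchy on failure; build_hierarchy() and
+ * build_fallback_hierarchy() are hypothetical application helpers:
+ *
+ *    build_hierarchy(port_id);
+ *    if (rte_scheddev_hierarchy_set(port_id, 1, &error) != 0) {
+ *        build_fallback_hierarchy(port_id);
+ *        status = rte_scheddev_hierarchy_set(port_id, 0, &error);
+ *    }
+ */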
+
+/**
+ * Scheduler node parent update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param parent_node_id
+ *   Node ID for the new parent. Needs to be valid.
+ * @param priority
+ *   Node priority. The highest node priority is zero. Used by the SP algorithm
+ *   running on the parent of the current node for scheduling this child node.
+ * @param weight
+ *   Node weight. The node weight is relative to the weight sum of all siblings
+ *   that have the same priority. The lowest weight is one. Used by the WFQ/WRR
+ *   algorithm running on the parent of the current node for scheduling this
+ *   child node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_parent_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node private shaper update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param shaper_profile_id
+ *   Shaper profile ID for the private shaper of the current node. Needs to be
+ *   either valid shaper profile ID or RTE_SCHEDDEV_SHAPER_PROFILE_ID_NONE, with
+ *   the latter disabling the private shaper of the current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node shared shapers update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param shared_shaper_id
+ *   Shared shaper ID. Needs to be valid.
+ * @param add
+ *   Set to non-zero value to add this shared shaper to current node or to zero
+ *   to delete this shared shaper from current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_shared_shaper_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_shaper_id,
+	int add,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node scheduling mode update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid non-leaf node ID.
+ * @param scheduling_mode_per_priority
+ *   For each priority, indicates whether the children nodes sharing the same
+ *   priority are to be scheduled by WFQ or by WRR. When NULL, it indicates that
+ *   WFQ is to be used for all priorities. When non-NULL, it points to a
+ *   pre-allocated array of *n_priority* elements, with a non-zero value element
+ *   indicating WFQ and a zero value element for WRR.
+ * @param n_priorities
+ *   Number of priorities.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_scheduling_mode_update(uint8_t port_id,
+	uint32_t node_id,
+	int *scheduling_mode_per_priority,
+	uint32_t n_priorities,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node congestion management mode update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid leaf node ID.
+ * @param cman
+ *   Congestion management mode.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_cman_update(uint8_t port_id,
+	uint32_t node_id,
+	enum rte_scheddev_cman_mode cman,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node private WRED context update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid leaf node ID.
+ * @param wred_profile_id
+ *   WRED profile ID for the private WRED context of the current node. Needs to
+ *   be either valid WRED profile ID or RTE_SCHEDDEV_WRED_PROFILE_ID_NONE, with
+ *   the latter disabling the private WRED context of the current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node shared WRED context update
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid leaf node ID.
+ * @param shared_wred_context_id
+ *   Shared WRED context ID. Needs to be valid.
+ * @param add
+ *   Set to non-zero value to add this shared WRED context to current node or to
+ *   zero to delete this shared WRED context from current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_shared_wred_context_update(uint8_t port_id,
+	uint32_t node_id,
+	uint32_t shared_wred_context_id,
+	int add,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler packet marking - VLAN DEI (IEEE 802.1Q)
+ *
+ * IEEE 802.1p maps the traffic class to the VLAN Priority Code Point (PCP)
+ * field (3 bits), while IEEE 802.1Q maps the drop priority to the VLAN Drop
+ * Eligible Indicator (DEI) field (1 bit), which was previously named Canonical
+ * Format Indicator (CFI).
+ *
+ * All VLAN frames of a given color get their DEI bit set if marking is enabled
+ * for this color; otherwise, their DEI bit is left as is (either set or not).
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mark_green
+ *   Set to non-zero value to enable marking of green packets and to zero to
+ *   disable it.
+ * @param mark_yellow
+ *   Set to non-zero value to enable marking of yellow packets and to zero to
+ *   disable it.
+ * @param mark_red
+ *   Set to non-zero value to enable marking of red packets and to zero to
+ *   disable it.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_mark_vlan_dei(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler packet marking - IPv4 / IPv6 ECN (IETF RFC 3168)
+ *
+ * IETF RFCs 2474 and 3168 reorganize the IPv4 Type of Service (TOS) field
+ * (8 bits) and the IPv6 Traffic Class (TC) field (8 bits) into Differentiated
+ * Services Codepoint (DSCP) field (6 bits) and Explicit Congestion Notification
+ * (ECN) field (2 bits). The DSCP field is typically used to encode the traffic
+ * class and/or drop priority (RFC 2597), while the ECN field is used by RFC
+ * 3168 to implement a congestion notification mechanism to be leveraged by
+ * transport layer protocols such as TCP and SCTP that have congestion control
+ * mechanisms.
+ *
+ * When congestion is experienced, as alternative to dropping the packet,
+ * routers can change the ECN field of input packets from 2'b01 or 2'b10 (values
+ * indicating that source endpoint is ECN-capable) to 2'b11 (meaning that
+ * congestion is experienced). The destination endpoint can use the ECN-Echo
+ * (ECE) TCP flag to relay the congestion indication back to the source
+ * endpoint, which acknowledges it back to the destination endpoint with the
+ * Congestion Window Reduced (CWR) TCP flag.
+ *
+ * All IPv4/IPv6 packets of a given color with ECN set to 2'b01 or 2'b10
+ * carrying TCP or SCTP have their ECN set to 2'b11 if the marking feature is
+ * enabled for the current color; otherwise, the ECN field is left as is.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mark_green
+ *   Set to non-zero value to enable marking of green packets and to zero to
+ *   disable it.
+ * @param mark_yellow
+ *   Set to non-zero value to enable marking of yellow packets and to zero to
+ *   disable it.
+ * @param mark_red
+ *   Set to non-zero value to enable marking of red packets and to zero to
+ *   disable it.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_mark_ip_ecn(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
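+
+/*
+ * Example (informative): enabling ECN marking for yellow and red
+ * packets while leaving green packets unmarked:
+ *
+ *    status = rte_scheddev_mark_ip_ecn(port_id, 0, 1, 1, &error);
+ */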
+
+/**
+ * Scheduler packet marking - IPv4 / IPv6 DSCP (IETF RFC 2597)
+ *
+ * IETF RFC 2597 maps the traffic class and the drop priority to the IPv4/IPv6
+ * Differentiated Services Codepoint (DSCP) field (6 bits). Here are the DSCP
+ * values proposed by this RFC:
+ *
+ *                       Class 1    Class 2    Class 3    Class 4
+ *                     +----------+----------+----------+----------+
+ *    Low Drop Prec    |  001010  |  010010  |  011010  |  100010  |
+ *    Medium Drop Prec |  001100  |  010100  |  011100  |  100100  |
+ *    High Drop Prec   |  001110  |  010110  |  011110  |  100110  |
+ *                     +----------+----------+----------+----------+
+ *
+ * There are 4 traffic classes (classes 1 .. 4) encoded by DSCP bits 1 and 2, as
+ * well as 3 drop priorities (low/medium/high) encoded by DSCP bits 3 and 4.
+ *
+ * All IPv4/IPv6 packets have their color marked into DSCP bits 3 and 4 as
+ * follows: green mapped to Low Drop Precedence (2'b01), yellow to Medium
+ * (2'b10) and red to High (2'b11). Marking needs to be explicitly enabled
+ * for each color; when not enabled for a given color, the DSCP field of all
+ * packets with that color is left as is.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param mark_green
+ *   Set to non-zero value to enable marking of green packets and to zero to
+ *   disable it.
+ * @param mark_yellow
+ *   Set to non-zero value to enable marking of yellow packets and to zero to
+ *   disable it.
+ * @param mark_red
+ *   Set to non-zero value to enable marking of red packets and to zero to
+ *   disable it.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_mark_ip_dscp(uint8_t port_id,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler get statistics counter types enabled for all nodes
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param nonleaf_node_capability_stats_mask
+ *   Statistics counter types available per node for all non-leaf nodes. Needs
+ *   to be pre-allocated.
+ * @param nonleaf_node_enabled_stats_mask
+ *   Statistics counter types currently enabled per node for each non-leaf node.
+ *   This is a subset of *nonleaf_node_capability_stats_mask*. Needs to be
+ *   pre-allocated.
+ * @param leaf_node_capability_stats_mask
+ *   Statistics counter types available per node for all leaf nodes. Needs to
+ *   be pre-allocated.
+ * @param leaf_node_enabled_stats_mask
+ *   Statistics counter types currently enabled for each leaf node. This is
+ *   a subset of *leaf_node_capability_stats_mask*. Needs to be pre-allocated.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_stats_get_enabled(uint8_t port_id,
+	uint64_t *nonleaf_node_capability_stats_mask,
+	uint64_t *nonleaf_node_enabled_stats_mask,
+	uint64_t *leaf_node_capability_stats_mask,
+	uint64_t *leaf_node_enabled_stats_mask,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler enable selected statistics counters for all nodes
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param nonleaf_node_enabled_stats_mask
+ *   Statistics counter types to be enabled per node for each non-leaf node.
+ *   This needs to be a subset of the statistics counter types available per
+ *   node for all non-leaf nodes. Any statistics counter type not included in
+ *   this set is to be disabled for all non-leaf nodes.
+ * @param leaf_node_enabled_stats_mask
+ *   Statistics counter types to be enabled per node for each leaf node. This
+ *   needs to be a subset of the statistics counter types available per node for
+ *   all leaf nodes. Any statistics counter type not included in this set is to
+ *   be disabled for all leaf nodes.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_stats_enable(uint8_t port_id,
+	uint64_t nonleaf_node_enabled_stats_mask,
+	uint64_t leaf_node_enabled_stats_mask,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler get statistics counter types enabled for current node
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param capability_stats_mask
+ *   Statistics counter types available for the current node. Needs to be
+ *   pre-allocated.
+ * @param enabled_stats_mask
+ *   Statistics counter types currently enabled for the current node. This is
+ *   a subset of *capability_stats_mask*. Needs to be pre-allocated.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_stats_get_enabled(uint8_t port_id,
+	uint32_t node_id,
+	uint64_t *capability_stats_mask,
+	uint64_t *enabled_stats_mask,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler enable selected statistics counters for current node
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param enabled_stats_mask
+ *   Statistics counter types to be enabled for the current node. This needs to
+ *   be a subset of the statistics counter types available for the current node.
+ *   Any statistics counter type not included in this set is to be disabled for
+ *   the current node.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_stats_enable(uint8_t port_id,
+	uint32_t node_id,
+	uint64_t enabled_stats_mask,
+	struct rte_scheddev_error *error);
+
+/**
+ * Scheduler node statistics counters read
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param node_id
+ *   Node ID. Needs to be valid.
+ * @param stats
+ *   When non-NULL, it contains the current value for the statistics counters
+ *   enabled for the current node.
+ * @param clear
+ *   When this parameter has a non-zero value, the statistics counters are
+ *   cleared (i.e. set to zero) immediately after they have been read, otherwise
+ *   the statistics counters are left untouched.
+ * @param error
+ *   Error details. Filled in only on error, when not NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int rte_scheddev_node_stats_read(uint8_t port_id,
+	uint32_t node_id,
+	struct rte_scheddev_node_stats *stats,
+	int clear,
+	struct rte_scheddev_error *error);
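+
+/*
+ * Example (informative): enabling packet and byte counters on leaf
+ * node 0, then reading them back with clear-on-read:
+ *
+ *    struct rte_scheddev_node_stats stats;
+ *    uint64_t mask = RTE_SCHEDDEV_STATS_COUNTER_N_PKTS |
+ *        RTE_SCHEDDEV_STATS_COUNTER_N_BYTES;
+ *
+ *    status = rte_scheddev_node_stats_enable(port_id, 0, mask, &error);
+ *    if (status == 0)
+ *        status = rte_scheddev_node_stats_read(port_id, 0, &stats, 1,
+ *            &error);
+ */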
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_SCHEDDEV_H__ */
diff --git a/lib/librte_ether/rte_scheddev_driver.h b/lib/librte_ether/rte_scheddev_driver.h
new file mode 100644
index 0000000..c0a0321
--- /dev/null
+++ b/lib/librte_ether/rte_scheddev_driver.h
@@ -0,0 +1,404 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_SCHEDDEV_DRIVER_H__
+#define __INCLUDE_RTE_SCHEDDEV_DRIVER_H__
+
+/**
+ * @file
+ * RTE Generic Hierarchical Scheduler API (Driver Side)
+ *
+ * This file provides implementation helpers for internal use by PMDs; they
+ * are not intended to be exposed to applications and are not subject to ABI
+ * versioning.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include "rte_ethdev.h"
+#include "rte_scheddev.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+typedef int (*rte_scheddev_capabilities_get_t)(struct rte_eth_dev *dev,
+	struct rte_scheddev_capabilities *cap,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler capabilities get */
+
+typedef int (*rte_scheddev_node_capabilities_get_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_scheddev_node_capabilities *cap,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node capabilities get */
+
+typedef int (*rte_scheddev_wred_profile_add_t)(struct rte_eth_dev *dev,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_wred_params *profile,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler WRED profile add */
+
+typedef int (*rte_scheddev_wred_profile_delete_t)(struct rte_eth_dev *dev,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler WRED profile delete */
+
+typedef int (*rte_scheddev_shared_wred_context_add_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t shared_wred_context_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shared WRED context add/update */
+
+typedef int (*rte_scheddev_shared_wred_context_delete_t)(
+	struct rte_eth_dev *dev,
+	uint32_t shared_wred_context_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shared WRED context delete */
+
+typedef int (*rte_scheddev_shaper_profile_add_t)(struct rte_eth_dev *dev,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_shaper_params *profile,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shaper profile add */
+
+typedef int (*rte_scheddev_shaper_profile_delete_t)(struct rte_eth_dev *dev,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shaper profile delete */
+
+typedef int (*rte_scheddev_shared_shaper_add_update_t)(struct rte_eth_dev *dev,
+	uint32_t shared_shaper_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shared shaper add/update */
+
+typedef int (*rte_scheddev_shared_shaper_delete_t)(struct rte_eth_dev *dev,
+	uint32_t shared_shaper_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler shared shaper delete */
+
+typedef int (*rte_scheddev_node_add_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_node_params *params,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node add */
+
+typedef int (*rte_scheddev_node_delete_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node delete */
+
+typedef int (*rte_scheddev_node_suspend_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node suspend */
+
+typedef int (*rte_scheddev_node_resume_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node resume */
+
+typedef int (*rte_scheddev_hierarchy_set_t)(struct rte_eth_dev *dev,
+	int clear_on_fail,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler hierarchy set */
+
+typedef int (*rte_scheddev_node_parent_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t parent_node_id,
+	uint32_t priority,
+	uint32_t weight,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node parent update */
+
+typedef int (*rte_scheddev_node_shaper_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t shaper_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node shaper update */
+
+typedef int (*rte_scheddev_node_shared_shaper_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t shared_shaper_id,
+	int32_t add,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node shared shaper update */
+
+typedef int (*rte_scheddev_node_scheduling_mode_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t node_id,
+	int *scheduling_mode_per_priority,
+	uint32_t n_priorities,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node scheduling mode update */
+
+typedef int (*rte_scheddev_node_cman_update_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	enum rte_scheddev_cman_mode cman,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node congestion management mode update */
+
+typedef int (*rte_scheddev_node_wred_context_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t wred_profile_id,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node WRED context update */
+
+typedef int (*rte_scheddev_node_shared_wred_context_update_t)(
+	struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint32_t shared_wred_context_id,
+	int add,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler node shared WRED context update */
+
+typedef int (*rte_scheddev_mark_vlan_dei_t)(struct rte_eth_dev *dev,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler packet marking - VLAN DEI */
+
+typedef int (*rte_scheddev_mark_ip_ecn_t)(struct rte_eth_dev *dev,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler packet marking - IPv4/IPv6 ECN */
+
+typedef int (*rte_scheddev_mark_ip_dscp_t)(struct rte_eth_dev *dev,
+	int mark_green,
+	int mark_yellow,
+	int mark_red,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler packet marking - IPv4/IPv6 DSCP */
+
+typedef int (*rte_scheddev_stats_get_enabled_t)(struct rte_eth_dev *dev,
+	uint64_t *nonleaf_node_capability_stats_mask,
+	uint64_t *nonleaf_node_enabled_stats_mask,
+	uint64_t *leaf_node_capability_stats_mask,
+	uint64_t *leaf_node_enabled_stats_mask,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler get set of stats counters enabled for all nodes */
+
+typedef int (*rte_scheddev_stats_enable_t)(struct rte_eth_dev *dev,
+	uint64_t nonleaf_node_enabled_stats_mask,
+	uint64_t leaf_node_enabled_stats_mask,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler enable selected stats counters for all nodes */
+
+typedef int (*rte_scheddev_node_stats_get_enabled_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint64_t *capability_stats_mask,
+	uint64_t *enabled_stats_mask,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler get set of stats counters enabled for specific node */
+
+typedef int (*rte_scheddev_node_stats_enable_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	uint64_t enabled_stats_mask,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler enable selected stats counters for specific node */
+
+typedef int (*rte_scheddev_node_stats_read_t)(struct rte_eth_dev *dev,
+	uint32_t node_id,
+	struct rte_scheddev_node_stats *stats,
+	int clear,
+	struct rte_scheddev_error *error);
+/**< @internal Scheduler read stats counters for specific node */
+
+struct rte_scheddev_ops {
+	/** Scheduler capabilities get */
+	rte_scheddev_capabilities_get_t capabilities_get;
+	/** Scheduler node capabilities get */
+	rte_scheddev_node_capabilities_get_t node_capabilities_get;
+
+	/** Scheduler WRED profile add */
+	rte_scheddev_wred_profile_add_t wred_profile_add;
+	/** Scheduler WRED profile delete */
+	rte_scheddev_wred_profile_delete_t wred_profile_delete;
+	/** Scheduler shared WRED context add/update */
+	rte_scheddev_shared_wred_context_add_update_t
+		shared_wred_context_add_update;
+	/** Scheduler shared WRED context delete */
+	rte_scheddev_shared_wred_context_delete_t
+		shared_wred_context_delete;
+	/** Scheduler shaper profile add */
+	rte_scheddev_shaper_profile_add_t shaper_profile_add;
+	/** Scheduler shaper profile delete */
+	rte_scheddev_shaper_profile_delete_t shaper_profile_delete;
+	/** Scheduler shared shaper add/update */
+	rte_scheddev_shared_shaper_add_update_t shared_shaper_add_update;
+	/** Scheduler shared shaper delete */
+	rte_scheddev_shared_shaper_delete_t shared_shaper_delete;
+
+	/** Scheduler node add */
+	rte_scheddev_node_add_t node_add;
+	/** Scheduler node delete */
+	rte_scheddev_node_delete_t node_delete;
+	/** Scheduler node suspend */
+	rte_scheddev_node_suspend_t node_suspend;
+	/** Scheduler node resume */
+	rte_scheddev_node_resume_t node_resume;
+	/** Scheduler hierarchy set */
+	rte_scheddev_hierarchy_set_t hierarchy_set;
+
+	/** Scheduler node parent update */
+	rte_scheddev_node_parent_update_t node_parent_update;
+	/** Scheduler node shaper update */
+	rte_scheddev_node_shaper_update_t node_shaper_update;
+	/** Scheduler node shared shaper update */
+	rte_scheddev_node_shared_shaper_update_t node_shared_shaper_update;
+	/** Scheduler node scheduling mode update */
+	rte_scheddev_node_scheduling_mode_update_t node_scheduling_mode_update;
+	/** Scheduler node congestion management mode update */
+	rte_scheddev_node_cman_update_t node_cman_update;
+	/** Scheduler node WRED context update */
+	rte_scheddev_node_wred_context_update_t node_wred_context_update;
+	/** Scheduler node shared WRED context update */
+	rte_scheddev_node_shared_wred_context_update_t
+		node_shared_wred_context_update;
+
+	/** Scheduler packet marking - VLAN DEI */
+	rte_scheddev_mark_vlan_dei_t mark_vlan_dei;
+	/** Scheduler packet marking - IPv4/IPv6 ECN */
+	rte_scheddev_mark_ip_ecn_t mark_ip_ecn;
+	/** Scheduler packet marking - IPv4/IPv6 DSCP */
+	rte_scheddev_mark_ip_dscp_t mark_ip_dscp;
+
+	/** Scheduler get statistics counter type enabled for all nodes */
+	rte_scheddev_stats_get_enabled_t stats_get_enabled;
+	/** Scheduler enable selected statistics counters for all nodes */
+	rte_scheddev_stats_enable_t stats_enable;
+	/** Scheduler get statistics counter type enabled for current node */
+	rte_scheddev_node_stats_get_enabled_t node_stats_get_enabled;
+	/** Scheduler enable selected statistics counters for current node */
+	rte_scheddev_node_stats_enable_t node_stats_enable;
+	/** Scheduler read statistics counters for current node */
+	rte_scheddev_node_stats_read_t node_stats_read;
+};
+
+/**
+ * Initialize generic error structure.
+ *
+ * This function also sets rte_errno to a given value.
+ *
+ * @param error
+ *   Pointer to error structure (may be NULL).
+ * @param code
+ *   Related error code (rte_errno).
+ * @param type
+ *   Cause field and error type.
+ * @param cause
+ *   Object responsible for the error.
+ * @param message
+ *   Human-readable error message.
+ *
+ * @return
+ *   Error code.
+ */
+static inline int
+rte_scheddev_error_set(struct rte_scheddev_error *error,
+		   int code,
+		   enum rte_scheddev_error_type type,
+		   const void *cause,
+		   const char *message)
+{
+	if (error) {
+		*error = (struct rte_scheddev_error){
+			.type = type,
+			.cause = cause,
+			.message = message,
+		};
+	}
+	rte_errno = code;
+	return code;
+}
+
+/**
+ * Get generic hierarchical scheduler operations structure from a port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param error
+ *   Error details
+ *
+ * @return
+ *   The hierarchical scheduler operations structure associated with port_id on
+ *   success, NULL otherwise.
+ */
+const struct rte_scheddev_ops *
+rte_scheddev_ops_get(uint8_t port_id, struct rte_scheddev_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_SCHEDDEV_DRIVER_H__ */
-- 
2.5.0
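
For PMD authors reading along: a minimal sketch of how a driver wires this
up, using the node_stats_read callback whose typedef appears above. The
driver name and the error-type constant are illustrative assumptions, not
part of the patch.

static int
xyz_node_stats_read(struct rte_eth_dev *dev, uint32_t node_id,
	struct rte_scheddev_node_stats *stats, int clear,
	struct rte_scheddev_error *error)
{
	if (stats == NULL) {
		/* RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED is assumed here */
		rte_scheddev_error_set(error, EINVAL,
			RTE_SCHEDDEV_ERROR_TYPE_UNSPECIFIED,
			NULL, "stats pointer cannot be NULL");
		return -EINVAL;
	}
	/* ... read, and optionally clear, the hardware counters ... */
	RTE_SET_USED(dev);
	RTE_SET_USED(node_id);
	RTE_SET_USED(clear);
	return 0;
}

static const struct rte_scheddev_ops xyz_scheddev_ops = {
	.node_stats_read = xyz_node_stats_read,
	/* the remaining callbacks are filled in as the PMD implements them */
};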

^ permalink raw reply	[relevance 1%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops structure
  2017-02-10 11:39  9% [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops structure Fan Zhang
@ 2017-02-10 13:59  4% ` Trahe, Fiona
  2017-02-13 16:07  7%   ` Zhang, Roy Fan
  2017-02-14  0:21  4%   ` Hemant Agrawal
  2017-02-14 10:41  9% ` [dpdk-dev] [PATCH v2] " Fan Zhang
  1 sibling, 2 replies; 200+ results
From: Trahe, Fiona @ 2017-02-10 13:59 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev; +Cc: De Lara Guarch, Pablo, Trahe, Fiona

Hi Fan,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Fan Zhang
> Sent: Friday, February 10, 2017 11:39 AM
> To: dev@dpdk.org
> Cc: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops
> structure
> 
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> ---
>  doc/guides/rel_notes/deprecation.rst | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index 755dc65..564d93a 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -62,3 +62,7 @@ Deprecation Notices
>    PMDs that implement the latter.
>    Target release for removal of the legacy API will be defined once most
>    PMDs have switched to rte_flow.
> +
> +* ABI changes are planned for 17.05 in the ``rte_cryptodev_ops`` structure.
> +  The field ``cryptodev_configure_t`` function prototype will be added a
> +  parameter of a struct rte_cryptodev_config type pointer.
> --
> 2.7.4

Can you fix the grammar here, please? I'm not sure what the change is.
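
For reference, my reading of the notice is a change along these lines (a
sketch only; the exact 17.05 signature is an assumption at this point):

    /* today */
    typedef int (*cryptodev_configure_t)(struct rte_cryptodev *dev);

    /* proposed: pass the device configuration explicitly */
    typedef int (*cryptodev_configure_t)(struct rte_cryptodev *dev,
		struct rte_cryptodev_config *config);

i.e. the ``cryptodev_configure_t`` prototype will gain a pointer parameter
of type ``struct rte_cryptodev_config``.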

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev ops structure
@ 2017-02-10 11:39  9% Fan Zhang
  2017-02-10 13:59  4% ` Trahe, Fiona
  2017-02-14 10:41  9% ` [dpdk-dev] [PATCH v2] " Fan Zhang
  0 siblings, 2 replies; 200+ results
From: Fan Zhang @ 2017-02-10 11:39 UTC (permalink / raw)
  To: dev; +Cc: pablo.de.lara.guarch

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 doc/guides/rel_notes/deprecation.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 755dc65..564d93a 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -62,3 +62,7 @@ Deprecation Notices
   PMDs that implement the latter.
   Target release for removal of the legacy API will be defined once most
   PMDs have switched to rte_flow.
+
+* ABI changes are planned for 17.05 in the ``rte_cryptodev_ops`` structure.
+  The field ``cryptodev_configure_t`` function prototype will be added a
+  parameter of a struct rte_cryptodev_config type pointer.
-- 
2.7.4

^ permalink raw reply	[relevance 9%]

* Re: [dpdk-dev] [PATCH v2 3/3] doc: postpone ABI changes in igb_uio
  2017-02-10 10:44  4%       ` Thomas Monjalon
@ 2017-02-10 11:20  4%         ` Tan, Jianfeng
  0 siblings, 0 replies; 200+ results
From: Tan, Jianfeng @ 2017-02-10 11:20 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Yigit, Ferruh, dev, Mcnamara, John, yuanhan.liu, stephen



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Friday, February 10, 2017 6:44 PM
> To: Tan, Jianfeng
> Cc: Yigit, Ferruh; dev@dpdk.org; Mcnamara, John;
> yuanhan.liu@linux.intel.com; stephen@networkplumber.org
> Subject: Re: [dpdk-dev] [PATCH v2 3/3] doc: postpone ABI changes in igb_uio
> 
> 2017-02-09 17:40, Ferruh Yigit:
> > On 2/9/2017 4:06 PM, Jianfeng Tan wrote:
> > > This ABI change removes the iomem and ioport mapping in igb_uio. The
> > > purpose of this change was to fix a bug: when a DPDK app crashes,
> > > the devices bound to igb_uio are stopped by neither the DPDK PMD
> > > nor the igb_uio driver.
> > >
> > > Then it was pointed out by Stephen Hemminger that this has a
> > > backward compatibility issue: an old version of DPDK cannot run on
> > > the modified igb_uio.
> > >
> > > However, we still have not figured out a new way to fix this bug
> > > without this change. Let's postpone this deprecation announcement
> > > in case this change cannot be avoided.
> > >
> > > Fixes: 3bac1dbc1ed ("doc: announce iomem and ioport removal from
> > > igb_uio")
> > >
> > > Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
> > > Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
> > > Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> > > Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> >
> > Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 
> Applied, thanks
> 
> The images are not real vector images and are almost unreadable.
> Please make the effort to use inkscape in order to have images
> we can update.

Apologies for that. I've submitted a patch to change the images. And thank you for the solution.

> 
> I did some changes: s/virtio_user/virtio-user/ in order to be consistent.
> Like for vhost-user, we use the underscore only in code.

Thank you for that.

Regards,
Jianfeng

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2 3/3] doc: postpone ABI changes in igb_uio
  2017-02-09 17:40  4%     ` Ferruh Yigit
@ 2017-02-10 10:44  4%       ` Thomas Monjalon
  2017-02-10 11:20  4%         ` Tan, Jianfeng
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2017-02-10 10:44 UTC (permalink / raw)
  To: Jianfeng Tan; +Cc: Ferruh Yigit, dev, john.mcnamara, yuanhan.liu, stephen

2017-02-09 17:40, Ferruh Yigit:
> On 2/9/2017 4:06 PM, Jianfeng Tan wrote:
> > This ABI change removes the iomem and ioport mapping in igb_uio. The
> > purpose of this change was to fix a bug: when a DPDK app crashes,
> > the devices bound to igb_uio are stopped by neither the DPDK PMD
> > nor the igb_uio driver.
> > 
> > Then it was pointed out by Stephen Hemminger that this has a
> > backward compatibility issue: an old version of DPDK cannot run on
> > the modified igb_uio.
> > 
> > However, we still have not figured out a new way to fix this bug
> > without this change. Let's postpone this deprecation announcement
> > in case this change cannot be avoided.
> > 
> > Fixes: 3bac1dbc1ed ("doc: announce iomem and ioport removal from igb_uio")
> > 
> > Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
> > Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
> > Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> > Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
> 
> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

Applied, thanks

The images are not real vector images and are almost unreadable.
Please make the effort to use inkscape in order to have images
we can update.

I did some changes: s/virtio_user/virtio-user/ in order to be consistent.
Like for vhost-user, we use the underscore only in code.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2 3/3] doc: postpone ABI changes in igb_uio
  2017-02-09 16:06 12%   ` [dpdk-dev] [PATCH v2 3/3] doc: postpone ABI changes in igb_uio Jianfeng Tan
@ 2017-02-09 17:40  4%     ` Ferruh Yigit
  2017-02-10 10:44  4%       ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2017-02-09 17:40 UTC (permalink / raw)
  To: Jianfeng Tan, dev; +Cc: thomas.monjalon, john.mcnamara, yuanhan.liu, stephen

On 2/9/2017 4:06 PM, Jianfeng Tan wrote:
> This ABI change removes the iomem and ioport mapping in igb_uio. The
> purpose of this change was to fix a bug: when a DPDK app crashes,
> the devices bound to igb_uio are stopped by neither the DPDK PMD
> nor the igb_uio driver.
> 
> Then it was pointed out by Stephen Hemminger that this has a
> backward compatibility issue: an old version of DPDK cannot run on
> the modified igb_uio.
> 
> However, we still have not figured out a new way to fix this bug
> without this change. Let's postpone this deprecation announcement
> in case this change cannot be avoided.
> 
> Fixes: 3bac1dbc1ed ("doc: announce iomem and ioport removal from igb_uio")
> 
> Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
> Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
> Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>

Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] Kill off PCI dependencies
  2017-02-08 22:56  3% [dpdk-dev] Kill off PCI dependencies Stephen Hemminger
@ 2017-02-09 16:26  3% ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-09 16:26 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

2017-02-08 14:56, Stephen Hemminger:
> I am trying to make DPDK more agnostic about bus type. The existing API still
> has hardwired into it that ethernet devices are either PCI or not PCI (i.e. pci_dev == NULL).
> Jan, Jerin, and Shreyansh started the process but it hasn't gone far enough.
> 
> It would make more sense if the existing generic device was used everywhere
> including rte_ethdev, rte_ethdev_info, etc.

Yes

> The ABI breakage is not catastrophic. Just change pci_dev to a device pointer.
> One option would be to use NEXT_ABI and/or two different calls and data structures.
> Messy but compatible. Something like
>     rte_dev_info_get returns rte_dev_info but is marked deprecated
>     rte_device_info_get returns rte_device_info

Or we can break the ABI to avoid messy code.

> One fallout is that the existing testpmd code makes lots of assumptions that
> it is working with a PCI device. Things like ability to get/set PCI registers.
> I suspect this is already broken if one tries to run it on a virtual device like TAP.
> 
> Can we just turn off that functionality?

Which functionality exactly?

> Also KNI has more dependencies that ethernet devices are PCI.

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2 3/3] doc: postpone ABI changes in igb_uio
  2017-02-09 16:06  4% ` [dpdk-dev] [PATCH v2 " Jianfeng Tan
@ 2017-02-09 16:06 12%   ` Jianfeng Tan
  2017-02-09 17:40  4%     ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Jianfeng Tan @ 2017-02-09 16:06 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, john.mcnamara, yuanhan.liu, stephen, Jianfeng Tan

This ABI change removes the iomem and ioport mapping in igb_uio. The
purpose of this change was to fix a bug: when a DPDK app crashes,
the devices bound to igb_uio are stopped by neither the DPDK PMD
nor the igb_uio driver.

Then it was pointed out by Stephen Hemminger that this has a
backward compatibility issue: an old version of DPDK cannot run on
the modified igb_uio.

However, we still have not figured out a new way to fix this bug
without this change. Let's postpone this deprecation announcement
in case this change cannot be avoided.

Fixes: 3bac1dbc1ed ("doc: announce iomem and ioport removal from igb_uio")

Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
---
 doc/guides/rel_notes/deprecation.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 755dc65..b49e0a0 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -11,7 +11,7 @@ Deprecation Notices
 * igb_uio: iomem mapping and sysfs files created for iomem and ioport in
   igb_uio will be removed, because we are able to detect these from what Linux
   has exposed, like the way we have done with uio-pci-generic. This change
-  targets release 17.02.
+  targets release 17.05.
 
 * ABI/API changes are planned for 17.02: ``rte_device``, ``rte_driver`` will be
   impacted because of introduction of a new ``rte_bus`` hierarchy. This would
-- 
2.7.4

^ permalink raw reply	[relevance 12%]

* [dpdk-dev] [PATCH v2 0/3] doc updates
      2017-02-09 14:45  0% ` [dpdk-dev] [PATCH 0/3] doc updates Thomas Monjalon
@ 2017-02-09 16:06  4% ` Jianfeng Tan
  2017-02-09 16:06 12%   ` [dpdk-dev] [PATCH v2 3/3] doc: postpone ABI changes in igb_uio Jianfeng Tan
  2 siblings, 1 reply; 200+ results
From: Jianfeng Tan @ 2017-02-09 16:06 UTC (permalink / raw)
  To: dev; +Cc: thomas.monjalon, john.mcnamara, yuanhan.liu, stephen, Jianfeng Tan

v2:
  - Change svg files.
  - Postpone instead of remove ABI changes in igb_uio.

Patch 1: howto doc of virtio_user for container networking.
Patch 2: howto doc of virtio_user as exceptional path.
Patch 3: postpone ABI changes in igb_uio

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>

Jianfeng Tan (3):
  doc: add guide to use virtio_user for container networking
  doc: add guide to use virtio_user as exceptional path
  doc: postpone ABI changes in igb_uio

 .../use_models_for_running_dpdk_in_containers.svg  | 574 ++++++++++++++++++
 .../howto/img/virtio_user_as_exceptional_path.svg  | 386 +++++++++++++
 .../img/virtio_user_for_container_networking.svg   | 638 +++++++++++++++++++++
 doc/guides/howto/index.rst                         |   2 +
 .../howto/virtio_user_as_exceptional_path.rst      | 142 +++++
 .../howto/virtio_user_for_container_networking.rst | 142 +++++
 doc/guides/rel_notes/deprecation.rst               |   2 +-
 7 files changed, 1885 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/howto/img/use_models_for_running_dpdk_in_containers.svg
 create mode 100644 doc/guides/howto/img/virtio_user_as_exceptional_path.svg
 create mode 100644 doc/guides/howto/img/virtio_user_for_container_networking.svg
 create mode 100644 doc/guides/howto/virtio_user_as_exceptional_path.rst
 create mode 100644 doc/guides/howto/virtio_user_for_container_networking.rst

-- 
2.7.4

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH 0/3] doc updates
    @ 2017-02-09 14:45  0% ` Thomas Monjalon
  2017-02-09 16:06  4% ` [dpdk-dev] [PATCH v2 " Jianfeng Tan
  2 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2017-02-09 14:45 UTC (permalink / raw)
  To: Jianfeng Tan; +Cc: dev, john.mcnamara, yuanhan.liu, stephen

2017-01-24 07:34, Jianfeng Tan:
> Patch 1: howto doc of virtio_user for container networking.
> Patch 2: howto doc of virtio_user as exceptional path.
> Patch 3: remove ABI changes in igb_uio

For patch 3, we are waiting for a new revision postponing the notice.

For the first 2 patches, the SVG files are embedding some PNG pictures.
Please try to convert them to full SVG. By the way, the series fails to
apply because of the PNG part.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] Kill off PCI dependencies
@ 2017-02-08 22:56  3% Stephen Hemminger
  2017-02-09 16:26  3% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2017-02-08 22:56 UTC (permalink / raw)
  To: dev

I am trying to make DPDK more agnostic about bus type. The existing API still
has hardwired into it that ethernet devices are either PCI or not PCI (i.e. pci_dev == NULL).
Jan, Jerin, and Shreyansh started the process but it hasn't gone far enough.

It would make more sense if the existing generic device was used everywhere
including rte_ethdev, rte_ethdev_info, etc.

The ABI breakage is not catastrophic. Just change pci_dev to a device pointer.
One option would be to use NEXT_ABI and/or two different calls and data structures.
Messy but compatible. Something like
    rte_dev_info_get returns rte_dev_info but is marked deprecated
    rte_device_info_get returns rte_device_info

One fallout is that the existing testpmd code makes lots of assumptions that
it is working with a PCI device. Things like ability to get/set PCI registers.
I suspect this is already broken if one tries to run it on a virtual device like TAP.

Can we just turn off that functionality?

Also KNI has more dependencies that ethernet devices are PCI.
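
Roughly what the two-call option could look like (a sketch only; the call
names follow the ones above, the struct contents are placeholders):

struct rte_device_info {
	struct rte_device *device;	/* generic handle, works for any bus */
	const char *driver_name;
	unsigned int if_index;
	/* ... the rest of the existing, non-PCI-specific fields ... */
};

/* legacy call, keeps the embedded PCI pointer, marked deprecated */
__rte_deprecated
void rte_dev_info_get(uint8_t port_id, struct rte_dev_info *dev_info);

/* new bus-agnostic call */
void rte_device_info_get(uint8_t port_id, struct rte_device_info *dev_info);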

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH RFCv3 06/19] ring: eliminate duplication of size and mask fields
    2017-02-07 14:12  2% ` [dpdk-dev] [PATCH RFCv3 00/19] ring cleanup and generalization Bruce Richardson
@ 2017-02-07 14:12  3% ` Bruce Richardson
  1 sibling, 0 replies; 200+ results
From: Bruce Richardson @ 2017-02-07 14:12 UTC (permalink / raw)
  To: olivier.matz
  Cc: thomas.monjalon, keith.wiles, konstantin.ananyev, stephen, dev,
	Bruce Richardson

The size and mask fields are duplicated in both the producer and
consumer data structures. Move them out of those structures into the
top-level structure so they are only stored once.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/test_ring.c       |  6 +++---
 lib/librte_ring/rte_ring.c | 20 ++++++++++----------
 lib/librte_ring/rte_ring.h | 32 ++++++++++++++++----------------
 3 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index ebcb896..af74e7d 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -148,7 +148,7 @@ check_live_watermark_change(__attribute__((unused)) void *dummy)
 		}
 
 		/* read watermark, the only change allowed is from 16 to 32 */
-		watermark = r->prod.watermark;
+		watermark = r->watermark;
 		if (watermark != watermark_old &&
 		    (watermark_old != 16 || watermark != 32)) {
 			printf("Bad watermark change %u -> %u\n", watermark_old,
@@ -213,7 +213,7 @@ test_set_watermark( void ){
 		printf( " ring lookup failed\n" );
 		goto error;
 	}
-	count = r->prod.size*2;
+	count = r->size*2;
 	setwm = rte_ring_set_water_mark(r, count);
 	if (setwm != -EINVAL){
 		printf("Test failed to detect invalid watermark count value\n");
@@ -222,7 +222,7 @@ test_set_watermark( void ){
 
 	count = 0;
 	rte_ring_set_water_mark(r, count);
-	if (r->prod.watermark != r->prod.size) {
+	if (r->watermark != r->size) {
 		printf("Test failed to detect invalid watermark count value\n");
 		goto error;
 	}
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index 4bc6da1..183594f 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -144,11 +144,11 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	if (ret < 0 || ret >= (int)sizeof(r->name))
 		return -ENAMETOOLONG;
 	r->flags = flags;
-	r->prod.watermark = count;
+	r->watermark = count;
 	r->prod.sp_enqueue = !!(flags & RING_F_SP_ENQ);
 	r->cons.sc_dequeue = !!(flags & RING_F_SC_DEQ);
-	r->prod.size = r->cons.size = count;
-	r->prod.mask = r->cons.mask = count-1;
+	r->size = count;
+	r->mask = count-1;
 	r->prod.head = r->cons.head = 0;
 	r->prod.tail = r->cons.tail = 0;
 
@@ -269,14 +269,14 @@ rte_ring_free(struct rte_ring *r)
 int
 rte_ring_set_water_mark(struct rte_ring *r, unsigned count)
 {
-	if (count >= r->prod.size)
+	if (count >= r->size)
 		return -EINVAL;
 
 	/* if count is 0, disable the watermarking */
 	if (count == 0)
-		count = r->prod.size;
+		count = r->size;
 
-	r->prod.watermark = count;
+	r->watermark = count;
 	return 0;
 }
 
@@ -291,17 +291,17 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 
 	fprintf(f, "ring <%s>@%p\n", r->name, r);
 	fprintf(f, "  flags=%x\n", r->flags);
-	fprintf(f, "  size=%"PRIu32"\n", r->prod.size);
+	fprintf(f, "  size=%"PRIu32"\n", r->size);
 	fprintf(f, "  ct=%"PRIu32"\n", r->cons.tail);
 	fprintf(f, "  ch=%"PRIu32"\n", r->cons.head);
 	fprintf(f, "  pt=%"PRIu32"\n", r->prod.tail);
 	fprintf(f, "  ph=%"PRIu32"\n", r->prod.head);
 	fprintf(f, "  used=%u\n", rte_ring_count(r));
 	fprintf(f, "  avail=%u\n", rte_ring_free_count(r));
-	if (r->prod.watermark == r->prod.size)
+	if (r->watermark == r->size)
 		fprintf(f, "  watermark=0\n");
 	else
-		fprintf(f, "  watermark=%"PRIu32"\n", r->prod.watermark);
+		fprintf(f, "  watermark=%"PRIu32"\n", r->watermark);
 
 	/* sum and dump statistics */
 #ifdef RTE_LIBRTE_RING_DEBUG
@@ -318,7 +318,7 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 		sum.deq_fail_bulk += r->stats[lcore_id].deq_fail_bulk;
 		sum.deq_fail_objs += r->stats[lcore_id].deq_fail_objs;
 	}
-	fprintf(f, "  size=%"PRIu32"\n", r->prod.size);
+	fprintf(f, "  size=%"PRIu32"\n", r->size);
 	fprintf(f, "  enq_success_bulk=%"PRIu64"\n", sum.enq_success_bulk);
 	fprintf(f, "  enq_success_objs=%"PRIu64"\n", sum.enq_success_objs);
 	fprintf(f, "  enq_quota_bulk=%"PRIu64"\n", sum.enq_quota_bulk);
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 75bbcc1..1e4b8ad 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -143,13 +143,10 @@ struct rte_memzone; /* forward declaration, so as not to require memzone.h */
 struct rte_ring_ht_ptr {
 	volatile uint32_t head;  /**< Prod/consumer head. */
 	volatile uint32_t tail;  /**< Prod/consumer tail. */
-	uint32_t size;           /**< Size of ring. */
-	uint32_t mask;           /**< Mask (size-1) of ring. */
 	union {
 		uint32_t sp_enqueue; /**< True, if single producer. */
 		uint32_t sc_dequeue; /**< True, if single consumer. */
 	};
-	uint32_t watermark;      /**< Max items before EDQUOT in producer. */
 };
 
 /**
@@ -169,9 +166,12 @@ struct rte_ring {
 	 * next time the ABI changes
 	 */
 	char name[RTE_MEMZONE_NAMESIZE];    /**< Name of the ring. */
-	int flags;                       /**< Flags supplied at creation. */
+	int flags;               /**< Flags supplied at creation. */
 	const struct rte_memzone *memzone;
 			/**< Memzone, if any, containing the rte_ring */
+	uint32_t size;           /**< Size of ring. */
+	uint32_t mask;           /**< Mask (size-1) of ring. */
+	uint32_t watermark;      /**< Max items before EDQUOT in producer. */
 
 	/** Ring producer status. */
 	struct rte_ring_ht_ptr prod __rte_aligned(RTE_CACHE_LINE_SIZE * 2);
@@ -350,7 +350,7 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * Placed here since identical code needed in both
  * single and multi producer enqueue functions */
 #define ENQUEUE_PTRS() do { \
-	const uint32_t size = r->prod.size; \
+	const uint32_t size = r->size; \
 	uint32_t idx = prod_head & mask; \
 	if (likely(idx + n < size)) { \
 		for (i = 0; i < (n & ((~(unsigned)0x3))); i+=4, idx+=4) { \
@@ -377,7 +377,7 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * single and multi consumer dequeue functions */
 #define DEQUEUE_PTRS() do { \
 	uint32_t idx = cons_head & mask; \
-	const uint32_t size = r->cons.size; \
+	const uint32_t size = r->size; \
 	if (likely(idx + n < size)) { \
 		for (i = 0; i < (n & (~(unsigned)0x3)); i+=4, idx+=4) {\
 			obj_table[i] = r->ring[idx]; \
@@ -432,7 +432,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	const unsigned max = n;
 	int success;
 	unsigned i, rep = 0;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 	int ret;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
@@ -480,7 +480,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->prod.watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 				(int)(n | RTE_RING_QUOT_EXCEED);
 		__RING_STAT_ADD(r, enq_quota, n);
@@ -539,7 +539,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	uint32_t prod_head, cons_tail;
 	uint32_t prod_next, free_entries;
 	unsigned i;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 	int ret;
 
 	prod_head = r->prod.head;
@@ -575,7 +575,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const *obj_table,
 	rte_smp_wmb();
 
 	/* if we exceed the watermark */
-	if (unlikely(((mask + 1) - free_entries + n) > r->prod.watermark)) {
+	if (unlikely(((mask + 1) - free_entries + n) > r->watermark)) {
 		ret = (behavior == RTE_RING_QUEUE_FIXED) ? -EDQUOT :
 			(int)(n | RTE_RING_QUOT_EXCEED);
 		__RING_STAT_ADD(r, enq_quota, n);
@@ -625,7 +625,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
 	const unsigned max = n;
 	int success;
 	unsigned i, rep = 0;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 
 	/* Avoid the unnecessary cmpset operation below, which is also
 	 * potentially harmful when n equals 0. */
@@ -722,7 +722,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
 	uint32_t cons_head, prod_tail;
 	uint32_t cons_next, entries;
 	unsigned i;
-	uint32_t mask = r->prod.mask;
+	uint32_t mask = r->mask;
 
 	cons_head = r->cons.head;
 	prod_tail = r->prod.tail;
@@ -1051,7 +1051,7 @@ rte_ring_full(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return ((cons_tail - prod_tail - 1) & r->prod.mask) == 0;
+	return ((cons_tail - prod_tail - 1) & r->mask) == 0;
 }
 
 /**
@@ -1084,7 +1084,7 @@ rte_ring_count(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return (prod_tail - cons_tail) & r->prod.mask;
+	return (prod_tail - cons_tail) & r->mask;
 }
 
 /**
@@ -1100,7 +1100,7 @@ rte_ring_free_count(const struct rte_ring *r)
 {
 	uint32_t prod_tail = r->prod.tail;
 	uint32_t cons_tail = r->cons.tail;
-	return (cons_tail - prod_tail - 1) & r->prod.mask;
+	return (cons_tail - prod_tail - 1) & r->mask;
 }
 
 /**
@@ -1114,7 +1114,7 @@ rte_ring_free_count(const struct rte_ring *r)
 static inline unsigned int
 rte_ring_get_size(struct rte_ring *r)
 {
-	return r->prod.size;
+	return r->size;
 }
 
 /**
-- 
2.9.3
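
For out-of-tree code that touches the ring internals directly, the migration
is mechanical (a before/after sketch, not part of the patch itself):

	/* before this patch */       /* after this patch */
	n = r->prod.size;             n = r->size;   /* or rte_ring_get_size(r) */
	m = r->cons.mask;             m = r->mask;
	w = r->prod.watermark;        w = r->watermark;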

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH RFCv3 00/19] ring cleanup and generalization
  @ 2017-02-07 14:12  2% ` Bruce Richardson
  2017-02-14  8:32  3%   ` Olivier Matz
  2017-02-07 14:12  3% ` [dpdk-dev] [PATCH RFCv3 06/19] ring: eliminate duplication of size and mask fields Bruce Richardson
  1 sibling, 1 reply; 200+ results
From: Bruce Richardson @ 2017-02-07 14:12 UTC (permalink / raw)
  To: olivier.matz
  Cc: thomas.monjalon, keith.wiles, konstantin.ananyev, stephen, dev,
	Bruce Richardson

This patchset makes a set of, sometimes non-backward-compatible, cleanup
changes to the rte_ring code in order to improve it. The resulting code is
shorter*, since the existing functions are restructured to reduce code
duplication, as well as being more consistent in behaviour. The specific
changes made are explained in each patch which makes that change.

Key incompatibilities:
* The biggest, and probably most controversial, change is to the
  enqueue and dequeue APIs. The enqueue/dequeue burst and bulk functions
  have their prototypes changed so that they all return an additional
  value via a new parameter, indicating the size of the next call that
  is guaranteed to succeed. In the case of enq, this is the number of
  available slots on the ring, and in the case of deq, it is the number
  of objects which can be pulled. As well as this, the return values
  from the bulk functions have been changed to make them compatible
  with the burst functions. In all cases, the functions to enq/deq a
  set of objs now return the number of objects processed: 0 or N for
  the bulk functions, and 0, N or any value in between for the burst
  ones (see the prototype sketch after this list). [Due to the extra
  parameter, the compiler will flag all instances of the function,
  allowing the user to also change the return value logic at the same
  time.]
* The parameters to the single object enq/deq functions have not been 
  changed. Because of that, the return value is also unmodified - as the
  compiler cannot automatically flag this to the user.
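
For concreteness, the reworked burst prototypes look roughly like this (a
sketch; the exact parameter names are settled in the patches themselves):

    static inline unsigned int
    rte_ring_enqueue_burst(struct rte_ring *r, void * const *obj_table,
		unsigned int n, unsigned int *free_space);

    static inline unsigned int
    rte_ring_dequeue_burst(struct rte_ring *r, void **obj_table,
		unsigned int n, unsigned int *available);

Both return the number of objects actually processed, and report through the
extra pointer how many slots/objects a follow-up call is guaranteed to get.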

Potential further cleanups:
* To a certain extent the rte_ring structure has gone from being a whole
  ring structure, including a "ring" element itself, to just being a
  header which can be reused, along with the head/tail update functions
  to create new rings. For now, the enqueue code works by assuming
  that the ring data goes immediately after the header, but that can
  be changed to allow specialised ring implementations to put additional
  metadata of their own after the ring header. I didn't see this as being
  needed right now, but it may be worth considering for a V1 patchset.
* There are 9 enqueue functions and 9 dequeue functions in rte_ring.h. I
  suspect not all of those are used, so personally I would consider
  dropping the functions to enqueue/dequeue a single value using single
  or multi semantics, i.e. drop 
    rte_ring_sp_enqueue
    rte_ring_mp_enqueue
    rte_ring_sc_dequeue
    rte_ring_mc_dequeue
  That would still leave a single enqueue and dequeue function for working
  with a single object at a time.
* It should be possible to merge the head update code for enqueue and
  dequeue into a single function. The key difference between the two is
  the calculation of how far the index can be moved. I felt that the
  functions for moving the head index are sufficiently complicated with
  many parameters to them already, that trying to merge in more code would
  impede readability. However, if so desired this change can be made at a
  later stage without affecting ABI or API.

PERFORMANCE:
I've run performance autotests on a couple of (Intel) platforms. Looking
particularly at the core-2-core results, which I expect are the main ones
of interest, the performance after this patchset is a few cycles per packet
faster in my testing. I'm hoping it should be at least neutral perf-wise.

REQUEST FOR FEEDBACK:
* Are all of these changes worth making?
* Should they be made in existing ring code, or do we look to provide a 
  new fifo library to completely replace the ring one?
* How does the implementation of new ring types using this code compare vs
  that of the previous RFCs?

[*] LOC original rte_ring.h: 462. After patchset: 363. [Numbers generated
using David A. Wheeler's 'SLOCCount'.]

Bruce Richardson (19):
  app/pdump: fix duplicate macro definition
  ring: remove split cacheline build setting
  ring: create common structure for prod and cons metadata
  ring: add a function to return the ring size
  crypto/null: use ring size function
  ring: eliminate duplication of size and mask fields
  ring: remove debug setting
  ring: remove the yield when waiting for tail update
  ring: remove watermark support
  ring: make bulk and burst fn return vals consistent
  ring: allow enq fns to return free space value
  examples/quota_watermark: use ring space for watermarks
  ring: allow dequeue fns to return remaining entry count
  ring: reduce scope of local variables
  ring: separate out head index manipulation for enq/deq
  ring: create common function for updating tail idx
  ring: allow macros to work with any type of object
  ring: add object size parameter to memory size calculation
  ring: add event ring implementation

 app/pdump/main.c                                   |   3 +-
 app/test-pipeline/pipeline_hash.c                  |   5 +-
 app/test-pipeline/runtime.c                        |  19 +-
 app/test/Makefile                                  |   1 +
 app/test/commands.c                                |  52 --
 app/test/test_event_ring.c                         |  85 +++
 app/test/test_link_bonding_mode4.c                 |   6 +-
 app/test/test_pmd_ring_perf.c                      |  12 +-
 app/test/test_ring.c                               | 704 ++-----------------
 app/test/test_ring_perf.c                          |  36 +-
 app/test/test_table_acl.c                          |   2 +-
 app/test/test_table_pipeline.c                     |   2 +-
 app/test/test_table_ports.c                        |  12 +-
 app/test/virtual_pmd.c                             |   8 +-
 config/common_base                                 |   3 -
 doc/guides/prog_guide/env_abstraction_layer.rst    |   5 -
 doc/guides/prog_guide/ring_lib.rst                 |   7 -
 doc/guides/sample_app_ug/server_node_efd.rst       |   2 +-
 drivers/crypto/null/null_crypto_pmd.c              |   2 +-
 drivers/crypto/null/null_crypto_pmd_ops.c          |   2 +-
 drivers/net/bonding/rte_eth_bond_pmd.c             |   3 +-
 drivers/net/ring/rte_eth_ring.c                    |   4 +-
 examples/distributor/main.c                        |   5 +-
 examples/load_balancer/runtime.c                   |  34 +-
 .../client_server_mp/mp_client/client.c            |   9 +-
 .../client_server_mp/mp_server/main.c              |   2 +-
 examples/packet_ordering/main.c                    |  13 +-
 examples/qos_sched/app_thread.c                    |  14 +-
 examples/quota_watermark/qw/init.c                 |   5 +-
 examples/quota_watermark/qw/main.c                 |  15 +-
 examples/quota_watermark/qw/main.h                 |   1 +
 examples/quota_watermark/qwctl/commands.c          |   2 +-
 examples/quota_watermark/qwctl/qwctl.c             |   2 +
 examples/quota_watermark/qwctl/qwctl.h             |   1 +
 examples/server_node_efd/node/node.c               |   2 +-
 examples/server_node_efd/server/main.c             |   2 +-
 lib/librte_hash/rte_cuckoo_hash.c                  |   5 +-
 lib/librte_mempool/rte_mempool_ring.c              |  12 +-
 lib/librte_pdump/rte_pdump.c                       |   2 +-
 lib/librte_port/rte_port_frag.c                    |   3 +-
 lib/librte_port/rte_port_ras.c                     |   2 +-
 lib/librte_port/rte_port_ring.c                    |  34 +-
 lib/librte_ring/Makefile                           |   2 +
 lib/librte_ring/rte_event_ring.c                   | 220 ++++++
 lib/librte_ring/rte_event_ring.h                   | 507 ++++++++++++++
 lib/librte_ring/rte_ring.c                         |  82 +--
 lib/librte_ring/rte_ring.h                         | 762 ++++++++-------------
 47 files changed, 1340 insertions(+), 1373 deletions(-)
 create mode 100644 app/test/test_event_ring.c
 create mode 100644 lib/librte_ring/rte_event_ring.c
 create mode 100644 lib/librte_ring/rte_event_ring.h

-- 
2.9.3

^ permalink raw reply	[relevance 2%]

* Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching behavior
  @ 2017-02-07  7:50  0%           ` Yang, Zhiyong
  0 siblings, 0 replies; 200+ results
From: Yang, Zhiyong @ 2017-02-07  7:50 UTC (permalink / raw)
  To: Adrien Mazarguil, Richardson, Bruce
  Cc: Ananyev, Konstantin, Andrew Rybchenko, dev, thomas.monjalon

Hi, Adrien:

	Sorry for the late reply due to the Chinese New Year.

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Tuesday, January 24, 2017 12:36 AM
> To: Richardson, Bruce <bruce.richardson@intel.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Andrew
> Rybchenko <arybchenko@solarflare.com>; Yang, Zhiyong
> <zhiyong.yang@intel.com>; dev@dpdk.org; thomas.monjalon@6wind.com
> Subject: Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching
> behavior
> 
> On Fri, Jan 20, 2017 at 11:48:22AM +0000, Bruce Richardson wrote:
> > On Fri, Jan 20, 2017 at 11:24:40AM +0000, Ananyev, Konstantin wrote:
> > > >
> > > > From: Andrew Rybchenko [mailto:arybchenko@solarflare.com]
> > > > Sent: Friday, January 20, 2017 10:26 AM
> > > > To: Yang, Zhiyong <zhiyong.yang@intel.com>; dev@dpdk.org
> > > > Cc: thomas.monjalon@6wind.com; Richardson, Bruce
> > > > <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > <konstantin.ananyev@intel.com>
> > > > Subject: Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD
> > > > batching behavior
> > > >
> > > > On 01/20/2017 12:51 PM, Zhiyong Yang wrote:
> > > > The rte_eth_tx_burst() function in the file Rte_ethdev.h is
> > > > invoked to transmit output packets on the output queue for DPDK
> > > > applications as follows.
> > > >
> > > > static inline uint16_t
> > > > rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
> > > >                  struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
> > > >
> > > > Note: The fourth parameter nb_pkts: The number of packets to
> transmit.
> > > > The rte_eth_tx_burst() function returns the number of packets it
> > > > actually sent. The return value equal to *nb_pkts* means that all
> > > > packets have been sent, and this is likely to signify that other
> > > > output packets could be immediately transmitted again.
> > > > Applications that implement a "send as many packets to transmit as
> > > > possible" policy can check this specific case and keep invoking
> > > > the rte_eth_tx_burst() function until a value less than
> > > > *nb_pkts* is returned.
> > > >
> > > > When you call TX only once in rte_eth_tx_burst, you may get
> > > > different behaviors from different PMDs. One problem that every
> > > > DPDK user has to face is that they need to take the policy into
> > > > consideration at the app- lication level when using any specific
> > > > PMD to send the packets whether or not it is necessary, which
> > > > brings usage complexities and makes DPDK users easily confused
> > > > since they have to learn the details on TX function limit of
> > > > specific PMDs and have to handle the different return value: the
> > > > number of packets transmitted successfully for various PMDs. Some
> > > > PMDs Tx func- tions have a limit of sending at most 32 packets for
> > > > every invoking, some PMDs have another limit of at most 64 packets
> > > > once, another ones have imp- lemented to send as many packets to
> transmit as possible, etc. This will easily cause wrong usage for DPDK users.
> > > >
> > > > This patch proposes to implement the above policy in DPDK lib in
> > > > order to simplify the application implementation and avoid the
> > > > incorrect invoking as well. So, DPDK Users don't need to consider
> > > > the implementation policy and to write duplicated code at the
> > > > application level again when sending packets. In addition to it,
> > > > the users don't need to know the difference of specific PMD TX and
> > > > can transmit the arbitrary number of packets as they expect when
> > > > invoking TX API rte_eth_tx_burst, then check the return value to get
> the number of packets actually sent.
> > > >
> > > > How to implement the policy in DPDK lib? Two solutions are proposed
> below.
> > > >
> > > > Solution 1:
> > > > Implement the wrapper functions to remove some limits for each
> > > > specific PMDs as i40e_xmit_pkts_simple and ixgbe_xmit_pkts_simple
> do like that.
> > > >
> > > > > IMHO, the solution is a bit better since it:
> > > > > 1. Does not affect other PMDs at all
> > > > > 2. Could be a bit faster for the PMDs which require it since has
> > > > >no indirect
> > > > >    function call on each iteration
> > > > > 3. No ABI change
> > >
> > > I also would prefer solution number 1 for the reasons outlined by Andrew
> above.
> > > Also, IMO current limitation for number of packets to TX in some
> > > Intel PMD TX routines are sort of artificial:
> > > - they are not caused by any real HW limitations
> > > - avoiding them at PMD level shouldn't cause any performance or
> functional degradation.
> > > So I don't see any good reason why instead of fixing these
> > > limitations in our own PMDs we are trying to push them to the upper
> (rte_ethdev) layer.
> 
> For what it's worth, I agree with Konstantin. Wrappers should be as thin as
> possible on top of PMD functions, they are not helpers. We could define a
> set of higher level functions for this purpose though.
> 
> In the meantime, exposing and documenting PMD limitations seems safe
> enough.
> 
> We could assert that RX/TX burst requests larger than the size of the target
> queue are unlikely to be fully met (i.e. PMDs usually do not check for
> completions in the middle of a TX burst).

As a tool,  it is very important for its users to easily consume it and make it work
in a right way.  Sort of artificial limits will make things look like a little confused  and
make some users probably get into trouble when writing drivers. 
Why do we correct it and make it easier?  :)

Zhiyong

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] bugs and glitches in rte_cryptodev_devices_get
  2017-02-01 16:53  3% [dpdk-dev] bugs and glitches in rte_cryptodev_devices_get Stephen Hemminger
  2017-02-02 13:55  0% ` Mrozowicz, SlawomirX
@ 2017-02-03 12:26  0% ` Mrozowicz, SlawomirX
  1 sibling, 0 replies; 200+ results
From: Mrozowicz, SlawomirX @ 2017-02-03 12:26 UTC (permalink / raw)
  To: Stephen Hemminger, Doherty, Declan; +Cc: dev, De Lara Guarch, Pablo



>-----Original Message-----
>From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>Sent: Wednesday, February 1, 2017 5:54 PM
>To: Mrozowicz, SlawomirX <slawomirx.mrozowicz@intel.com>; Doherty,
>Declan <declan.doherty@intel.com>
>Cc: dev@dpdk.org
>Subject: bugs and glitches in rte_cryptodev_devices_get
>
>The function rte_cryptodev_devices_get has several issues. I was just going
>to fix it, but think it needs to be explained.
>
>One potentially serious one (reported by coverity) is:
>
>*** CID 141067:    (BAD_COMPARE)
>/lib/librte_cryptodev/rte_cryptodev.c: 503 in rte_cryptodev_devices_get()
>497     				&& (*devs + i)->attached ==
>498     						RTE_CRYPTODEV_ATTACHED)
>{
>499
>500     			dev = (*devs + i)->device;
>501
>502     			if (dev)
>>>>     CID 141067:    (BAD_COMPARE)
>>>>     Truncating the result of "strncmp" to "unsigned char" may cause it to be
>misinterpreted as 0. Note that "strncmp" may return an integer besides -1, 0,
>or 1.
>503     				cmp = strncmp(dev->driver->name,
>504     						dev_name,
>505     						strlen(dev_name));
>506     			else
>507     				cmp = strncmp((*devs + i)->data->name,
>508     						dev_name,
>/lib/librte_cryptodev/rte_cryptodev.c: 507 in rte_cryptodev_devices_get()
>501
>502     			if (dev)
>503     				cmp = strncmp(dev->driver->name,
>504     						dev_name,
>505     						strlen(dev_name));
>506     			else
>>>>     CID 141067:    (BAD_COMPARE)
>>>>     Truncating the result of "strncmp" to "unsigned char" may cause it to be
>misinterpreted as 0. Note that "strncmp" may return an integer besides -1, 0,
>or 1.
>507     				cmp = strncmp((*devs + i)->data->name,
>508     						dev_name,
>509     						strlen(dev_name));
>510
>511     			if (cmp == 0)
>512     				devices[count++] = (*devs + i)->data->dev_id;
>
>
>But also:
>
>1. Incorrect function signature:
>    * function returns int but never a negative value. should be unsigned.
>    * devices argument is not modified should be const.
>
>2. Original ABI seems short-sighted with a limit of 256 cryptodevs
>    * this seems like an 8-bit mindset; should really use unsigned int instead
>      of uint8_t for number of devices.
>
>3. Wacky indentation of the if statement.
>
>4. Make variables local to the block they are used (cmp, dev)
>
>5. Use array instead of pointer:
>     ie. instead of *devs + i use devs[i]
>
We reconsidered your suggestions and addressed all of your changes except adding const to the devices argument, since in our opinion it is not necessary.

>
>The overall code in question is:
>
>
>int
>rte_cryptodev_devices_get(const char *dev_name, uint8_t *devices,
>	uint8_t nb_devices)
>{
>	uint8_t i, cmp, count = 0;
>	struct rte_cryptodev **devs = &rte_cryptodev_globals->devs;
>	struct rte_device *dev;
>
>	for (i = 0; i < rte_cryptodev_globals->max_devs && count <
>nb_devices;
>			i++) {
>
>		if ((*devs + i)
>				&& (*devs + i)->attached ==
>						RTE_CRYPTODEV_ATTACHED)
>{
>
>			dev = (*devs + i)->device;
>
>			if (dev)
>				cmp = strncmp(dev->driver->name,
>						dev_name,
>						strlen(dev_name));
>			else
>				cmp = strncmp((*devs + i)->data->name,
>						dev_name,
>						strlen(dev_name));
>
>			if (cmp == 0)
>				devices[count++] = (*devs + i)->data->dev_id;
>		}
>	}
>
>	return count;
>}
>
>Please fix it.
>
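
For the record, the direction being asked for is roughly this (a sketch
applying points 1-5 above, not the committed code; the uint8_t count is kept
here since widening it is the ABI question raised in point 2):

uint8_t
rte_cryptodev_devices_get(const char *dev_name, uint8_t *devices,
	uint8_t nb_devices)
{
	uint8_t i, count = 0;
	struct rte_cryptodev *devs = rte_cryptodev_globals->devs;

	for (i = 0; i < rte_cryptodev_globals->max_devs && count < nb_devices;
			i++) {
		if (devs[i].attached != RTE_CRYPTODEV_ATTACHED)
			continue;

		struct rte_device *dev = devs[i].device;
		int cmp;	/* int, so the strncmp result is not truncated */

		if (dev)
			cmp = strncmp(dev->driver->name, dev_name,
					strlen(dev_name));
		else
			cmp = strncmp(devs[i].data->name, dev_name,
					strlen(dev_name));

		if (cmp == 0)
			devices[count++] = devs[i].data->dev_id;
	}

	return count;
}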

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v10 3/7] lib: add bitrate statistics library
    2017-02-03 10:33  1% ` [dpdk-dev] [PATCH v10 1/7] lib: add information metrics library Remy Horton
@ 2017-02-03 10:33  2% ` Remy Horton
  1 sibling, 0 replies; 200+ results
From: Remy Horton @ 2017-02-03 10:33 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon

This patch adds a library that calculates peak and average data-rate
statistics for ethernet devices. These statistics are reported using
the metrics library.
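
(For reviewers: the average is the standard EWMA recurrence with alpha fixed
at 20%, done in integer arithmetic. Condensed, what rte_stats_bitrate_calc()
does per sampling period, shown for the incoming direction, is:

	delta = sample_bits - ewma;
	delta = (delta * alpha_percent + (delta > 0 ? 50 : -50)) / 100;
	ewma += delta;

where the +/-50 term rounds the division to nearest.)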

Signed-off-by: Remy Horton <remy.horton@intel.com>
---
 MAINTAINERS                                        |   4 +
 config/common_base                                 |   5 +
 doc/api/doxy-api-index.md                          |   1 +
 doc/api/doxy-api.conf                              |   1 +
 doc/guides/prog_guide/metrics_lib.rst              |  63 ++++++++++
 doc/guides/rel_notes/release_17_02.rst             |   6 +
 lib/Makefile                                       |   1 +
 lib/librte_bitratestats/Makefile                   |  53 +++++++++
 lib/librte_bitratestats/rte_bitrate.c              | 132 +++++++++++++++++++++
 lib/librte_bitratestats/rte_bitrate.h              |  80 +++++++++++++
 .../rte_bitratestats_version.map                   |   9 ++
 mk/rte.app.mk                                      |   1 +
 12 files changed, 356 insertions(+)
 create mode 100644 lib/librte_bitratestats/Makefile
 create mode 100644 lib/librte_bitratestats/rte_bitrate.c
 create mode 100644 lib/librte_bitratestats/rte_bitrate.h
 create mode 100644 lib/librte_bitratestats/rte_bitratestats_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index eceebaa..375adc9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -631,6 +631,10 @@ Metrics
 M: Remy Horton <remy.horton@intel.com>
 F: lib/librte_metrics/
 
+Bit-rate statistics
+M: Remy Horton <remy.horton@intel.com>
+F: lib/librte_bitratestats/
+
 
 Test Applications
 -----------------
diff --git a/config/common_base b/config/common_base
index b819932..e7b0e5c 100644
--- a/config/common_base
+++ b/config/common_base
@@ -633,3 +633,8 @@ CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
 # Compile the crypto performance application
 #
 CONFIG_RTE_APP_CRYPTO_PERF=y
+
+#
+# Compile the bitrate statistics library
+#
+CONFIG_RTE_LIBRTE_BITRATE=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 26a26b7..8492bce 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -157,4 +157,5 @@ There are many libraries, so their headers may be grouped by topics:
   [ABI compat]         (@ref rte_compat.h),
   [keepalive]          (@ref rte_keepalive.h),
   [device metrics]     (@ref rte_metrics.h),
+  [bitrate statistics] (@ref rte_bitrate.h),
   [version]            (@ref rte_version.h)
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index e2e070f..4010340 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -37,6 +37,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_eal/common/include \
                           lib/librte_eal/common/include/generic \
                           lib/librte_acl \
+                          lib/librte_bitratestats \
                           lib/librte_cfgfile \
                           lib/librte_cmdline \
                           lib/librte_compat \
diff --git a/doc/guides/prog_guide/metrics_lib.rst b/doc/guides/prog_guide/metrics_lib.rst
index 87f806d..c06023c 100644
--- a/doc/guides/prog_guide/metrics_lib.rst
+++ b/doc/guides/prog_guide/metrics_lib.rst
@@ -178,3 +178,66 @@ print out all metrics for a given port:
         free(metrics);
         free(names);
     }
+
+
+Bit-rate statistics library
+---------------------------
+
+The bit-rate library calculates the exponentially-weighted moving
+average and peak bit-rates for each active port (i.e. network device).
+These statistics are reported via the metrics library using the
+following names:
+
+    - ``mean_bits_in``: Average inbound bit-rate
+    - ``mean_bits_out``:  Average outbound bit-rate
+    - ``peak_bits_in``:  Peak inbound bit-rate
+    - ``peak_bits_out``:  Peak outbound bit-rate
+
+Once initialised and clocked at the appropriate frequency, these
+statistics can be obtained by querying the metrics library.
+
+Initialization
+~~~~~~~~~~~~~~
+
+Before it is used, the bit-rate statistics library has to be initialised
+by calling ``rte_stats_bitrate_create()``, which will return a bit-rate
+calculation object. Since the bit-rate library uses the metrics library
+to report the calculated statistics, the bit-rate library then needs to
+register the calculated statistics with the metrics library. This is
+done using the helper function ``rte_stats_bitrate_reg()``.
+
+.. code-block:: c
+
+    struct rte_stats_bitrates *bitrate_data;
+
+    bitrate_data = rte_stats_bitrate_create();
+    if (bitrate_data == NULL)
+        rte_exit(EXIT_FAILURE, "Could not allocate bit-rate data.\n");
+    rte_stats_bitrate_reg(bitrate_data);
+
+Controlling the sampling rate
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Since the library works by periodic sampling but does not use an
+internal thread, the application has to periodically call
+``rte_stats_bitrate_calc()``. The frequency at which this function
+is called should be the intended sampling rate required for the
+calculated statistics. For instance, if per-second statistics are
+desired, this function should be called once a second.
+
+.. code-block:: c
+
+    tics_datum = rte_rdtsc();
+    tics_per_1sec = rte_get_timer_hz();
+
+    while (1) {
+        /* ... */
+        tics_current = rte_rdtsc();
+        if (tics_current - tics_datum >= tics_per_1sec) {
+            /* Periodic bitrate calculation */
+            for (idx_port = 0; idx_port < cnt_ports; idx_port++)
+                rte_stats_bitrate_calc(bitrate_data, idx_port);
+            tics_datum = tics_current;
+        }
+        /* ... */
+    }
diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 68581e4..98729e8 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -46,6 +46,11 @@ New Features
   reporting mechanism that is independent of other libraries such
   as ethdev.
 
+* **Added bit-rate calculation library.**
+
+  A library that can be used to calculate device bit-rates. Calculated
+  bit-rates are reported using the metrics library.
+
 * **Added generic EAL API for I/O device memory read/write operations.**
 
   This API introduces 8-bit, 16-bit, 32bit, 64bit I/O device
@@ -348,6 +353,7 @@ The libraries prepended with a plus sign were incremented in this version.
 .. code-block:: diff
 
      librte_acl.so.2
+   + librte_bitratestats.so.1
      librte_cfgfile.so.2
      librte_cmdline.so.2
      librte_cryptodev.so.2
diff --git a/lib/Makefile b/lib/Makefile
index 29f6a81..ecc54c0 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -50,6 +50,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
 DIRS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += librte_ip_frag
 DIRS-$(CONFIG_RTE_LIBRTE_JOBSTATS) += librte_jobstats
 DIRS-$(CONFIG_RTE_LIBRTE_METRICS) += librte_metrics
+DIRS-$(CONFIG_RTE_LIBRTE_BITRATE) += librte_bitratestats
 DIRS-$(CONFIG_RTE_LIBRTE_POWER) += librte_power
 DIRS-$(CONFIG_RTE_LIBRTE_METER) += librte_meter
 DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += librte_sched
diff --git a/lib/librte_bitratestats/Makefile b/lib/librte_bitratestats/Makefile
new file mode 100644
index 0000000..743b62c
--- /dev/null
+++ b/lib/librte_bitratestats/Makefile
@@ -0,0 +1,53 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_bitratestats.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+EXPORT_MAP := rte_bitratestats_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_BITRATE) := rte_bitrate.c
+
+# Install header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_BITRATE)-include += rte_bitrate.h
+
+DEPDIRS-$(CONFIG_RTE_LIBRTE_BITRATE) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_BITRATE) += lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_BITRATE) += lib/librte_metrics
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_bitratestats/rte_bitrate.c b/lib/librte_bitratestats/rte_bitrate.c
new file mode 100644
index 0000000..2c20272
--- /dev/null
+++ b/lib/librte_bitratestats/rte_bitrate.c
@@ -0,0 +1,132 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_common.h>
+#include <rte_ethdev.h>
+#include <rte_malloc.h>
+#include <rte_metrics.h>
+#include <rte_bitrate.h>
+
+/*
+ * Persistent bit-rate data.
+ * @internal
+ */
+struct rte_stats_bitrate {
+	uint64_t last_ibytes;
+	uint64_t last_obytes;
+	uint64_t peak_ibits;
+	uint64_t peak_obits;
+	uint64_t ewma_ibits;
+	uint64_t ewma_obits;
+};
+
+struct rte_stats_bitrates {
+	struct rte_stats_bitrate port_stats[RTE_MAX_ETHPORTS];
+	uint16_t id_stats_set;
+};
+
+struct rte_stats_bitrates *
+rte_stats_bitrate_create(void)
+{
+	return rte_zmalloc(NULL, sizeof(struct rte_stats_bitrates),
+		RTE_CACHE_LINE_SIZE);
+}
+
+int
+rte_stats_bitrate_reg(struct rte_stats_bitrates *bitrate_data)
+{
+	const char * const names[] = {
+		"mean_bits_in", "mean_bits_out",
+		"peak_bits_in", "peak_bits_out",
+	};
+	int return_value;
+
+	return_value = rte_metrics_reg_names(&names[0], 4);
+	if (return_value >= 0)
+		bitrate_data->id_stats_set = return_value;
+	return return_value;
+}
+
+int
+rte_stats_bitrate_calc(struct rte_stats_bitrates *bitrate_data,
+	uint8_t port_id)
+{
+	struct rte_stats_bitrate *port_data;
+	struct rte_eth_stats eth_stats;
+	int ret_code;
+	uint64_t cnt_bits;
+	int64_t delta;
+	const int64_t alpha_percent = 20;
+	uint64_t values[4];
+
+	ret_code = rte_eth_stats_get(port_id, &eth_stats);
+	if (ret_code != 0)
+		return ret_code;
+
+	port_data = &bitrate_data->port_stats[port_id];
+
+	/* Incoming bitrate. This is an iteratively calculated EWMA
+	 * (Exponentially Weighted Moving Average) that uses a
+	 * weighting factor of alpha_percent.
+	 */
+	cnt_bits = (eth_stats.ibytes - port_data->last_ibytes) << 3;
+	port_data->last_ibytes = eth_stats.ibytes;
+	if (cnt_bits > port_data->peak_ibits)
+		port_data->peak_ibits = cnt_bits;
+	delta = cnt_bits;
+	delta -= port_data->ewma_ibits;
+	/* The +-50 fixes integer rounding during division */
+	if (delta > 0)
+		delta = (delta * alpha_percent + 50) / 100;
+	else
+		delta = (delta * alpha_percent - 50) / 100;
+	port_data->ewma_ibits += delta;
+
+	/* Outgoing bitrate (also EWMA) */
+	cnt_bits = (eth_stats.obytes - port_data->last_obytes) << 3;
+	port_data->last_obytes = eth_stats.obytes;
+	if (cnt_bits > port_data->peak_obits)
+		port_data->peak_obits = cnt_bits;
+	delta = cnt_bits;
+	delta -= port_data->ewma_obits;
+	delta = (delta * alpha_percent + 50) / 100;
+	port_data->ewma_obits += delta;
+
+	values[0] = port_data->ewma_ibits;
+	values[1] = port_data->ewma_obits;
+	values[2] = port_data->peak_ibits;
+	values[3] = port_data->peak_obits;
+	rte_metrics_update_values(port_id, bitrate_data->id_stats_set,
+		values, 4);
+	return 0;
+}
diff --git a/lib/librte_bitratestats/rte_bitrate.h b/lib/librte_bitratestats/rte_bitrate.h
new file mode 100644
index 0000000..564e4f7
--- /dev/null
+++ b/lib/librte_bitratestats/rte_bitrate.h
@@ -0,0 +1,80 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+/**
+ *  Bitrate statistics data structure.
+ *  This data structure is intentionally opaque.
+ */
+struct rte_stats_bitrates;
+
+
+/**
+ * Allocate a bitrate statistics structure
+ *
+ * @return
+ *   - Pointer to structure on success
+ *   - NULL on error (zmalloc failure)
+ */
+struct rte_stats_bitrates *rte_stats_bitrate_create(void);
+
+
+/**
+ * Register bitrate statistics with the metric library.
+ *
+ * @param bitrate_data
+ *   Pointer allocated by rte_stats_bitrate_create()
+ *
+ * @return
+ *   - Zero on success
+ *   - Negative on error
+ */
+int rte_stats_bitrate_reg(struct rte_stats_bitrates *bitrate_data);
+
+
+/**
+ * Calculate statistics for current time window. The period with which
+ * this function is called should be the intended sampling window width.
+ *
+ * @param bitrate_data
+ *   Bitrate statistics data pointer
+ *
+ * @param port_id
+ *   Port id to calculate statistics for
+ *
+ * @return
+ *  - Zero on success
+ *  - Negative value on error
+ */
+int rte_stats_bitrate_calc(struct rte_stats_bitrates *bitrate_data,
+	uint8_t port_id);
diff --git a/lib/librte_bitratestats/rte_bitratestats_version.map b/lib/librte_bitratestats/rte_bitratestats_version.map
new file mode 100644
index 0000000..66f232f
--- /dev/null
+++ b/lib/librte_bitratestats/rte_bitratestats_version.map
@@ -0,0 +1,9 @@
+DPDK_17.02 {
+	global:
+
+	rte_stats_bitrate_calc;
+	rte_stats_bitrate_create;
+	rte_stats_bitrate_reg;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 46de3d3..8f1f8d7 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -100,6 +100,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CFGFILE)        += -lrte_cfgfile
 _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METRICS)        += -lrte_metrics
+_LDLIBS-$(CONFIG_RTE_LIBRTE_BITRATE)        += -lrte_bitratestats
 
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)       += -lrte_pmd_bond
-- 
2.5.5

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v10 1/7] lib: add information metrics library
  @ 2017-02-03 10:33  1% ` Remy Horton
  2017-02-03 10:33  2% ` [dpdk-dev] [PATCH v10 3/7] lib: add bitrate statistics library Remy Horton
  1 sibling, 0 replies; 200+ results
From: Remy Horton @ 2017-02-03 10:33 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon

This patch adds a new information metrics library. This Metrics
library implements a mechanism by which producers can publish
numeric information for later querying by consumers. Metrics
themselves are statistics that are not generated by PMDs, and
hence are not reported via ethdev extended statistics.

Metric information is populated using a push model, where
producers update the values contained within the metric
library by calling an update function on the relevant metrics.
Consumers receive metric information by querying the central
metric data, which is held in shared memory.
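
In practice the flow looks like this (a minimal sketch using only the
functions added by this patch; error handling is omitted, and port_id
and bits_in are assumed to exist in the caller):

    /* once, from the primary process */
    rte_metrics_init(rte_socket_id());

    /* producer: declare a metric, then publish updates to it */
    int key = rte_metrics_reg_name("mean_bits_in");
    rte_metrics_update_value(port_id, key, bits_in);

    /* consumer: size the buffer, then bulk-query a port's metrics */
    int len = rte_metrics_get_values(port_id, NULL, 0);
    struct rte_metric_value *vals = malloc(sizeof(*vals) * len);
    rte_metrics_get_values(port_id, vals, len);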

Signed-off-by: Remy Horton <remy.horton@intel.com>
---
 MAINTAINERS                                |   4 +
 config/common_base                         |   5 +
 doc/api/doxy-api-index.md                  |   1 +
 doc/api/doxy-api.conf                      |   1 +
 doc/guides/prog_guide/index.rst            |   1 +
 doc/guides/prog_guide/metrics_lib.rst      | 180 +++++++++++++++++
 doc/guides/rel_notes/release_17_02.rst     |   9 +
 lib/Makefile                               |   1 +
 lib/librte_metrics/Makefile                |  51 +++++
 lib/librte_metrics/rte_metrics.c           | 299 +++++++++++++++++++++++++++++
 lib/librte_metrics/rte_metrics.h           | 240 +++++++++++++++++++++++
 lib/librte_metrics/rte_metrics_version.map |  13 ++
 mk/rte.app.mk                              |   2 +
 13 files changed, 807 insertions(+)
 create mode 100644 doc/guides/prog_guide/metrics_lib.rst
 create mode 100644 lib/librte_metrics/Makefile
 create mode 100644 lib/librte_metrics/rte_metrics.c
 create mode 100644 lib/librte_metrics/rte_metrics.h
 create mode 100644 lib/librte_metrics/rte_metrics_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 27f999b..eceebaa 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -627,6 +627,10 @@ F: lib/librte_jobstats/
 F: examples/l2fwd-jobstats/
 F: doc/guides/sample_app_ug/l2_forward_job_stats.rst
 
+Metrics
+M: Remy Horton <remy.horton@intel.com>
+F: lib/librte_metrics/
+
 
 Test Applications
 -----------------
diff --git a/config/common_base b/config/common_base
index 71a4fcb..b819932 100644
--- a/config/common_base
+++ b/config/common_base
@@ -501,6 +501,11 @@ CONFIG_RTE_LIBRTE_EFD=y
 CONFIG_RTE_LIBRTE_JOBSTATS=y
 
 #
+# Compile the device metrics library
+#
+CONFIG_RTE_LIBRTE_METRICS=y
+
+#
 # Compile librte_lpm
 #
 CONFIG_RTE_LIBRTE_LPM=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index eb39f69..26a26b7 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -156,4 +156,5 @@ There are many libraries, so their headers may be grouped by topics:
   [common]             (@ref rte_common.h),
   [ABI compat]         (@ref rte_compat.h),
   [keepalive]          (@ref rte_keepalive.h),
+  [device metrics]     (@ref rte_metrics.h),
   [version]            (@ref rte_version.h)
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index b8a5fd8..e2e070f 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -53,6 +53,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_mbuf \
                           lib/librte_mempool \
                           lib/librte_meter \
+                          lib/librte_metrics \
                           lib/librte_net \
                           lib/librte_pdump \
                           lib/librte_pipeline \
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 7f825cb..fea651c 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -62,6 +62,7 @@ Programmer's Guide
     packet_classif_access_ctrl
     packet_framework
     vhost_lib
+    metrics_lib
     port_hotplug_framework
     source_org
     dev_kit_build_system
diff --git a/doc/guides/prog_guide/metrics_lib.rst b/doc/guides/prog_guide/metrics_lib.rst
new file mode 100644
index 0000000..87f806d
--- /dev/null
+++ b/doc/guides/prog_guide/metrics_lib.rst
@@ -0,0 +1,180 @@
+..  BSD LICENSE
+    Copyright(c) 2017 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _Metrics_Library:
+
+Metrics Library
+===============
+
+The Metrics library implements a mechanism by which *producers* can
+publish numeric information for later querying by *consumers*. In
+practice producers will typically be other libraries or primary
+processes, whereas consumers will typically be applications.
+
+Metrics themselves are statistics that are not generated by PMDs. Metric
+information is populated using a push model, where producers update the
+values contained within the metric library by calling an update function
+on the relevant metrics. Consumers receive metric information by querying
+the central metric data, which is held in shared memory.
+
+For each metric, a separate value is maintained for each port id, and
+when publishing metric values the producers need to specify which port is
+being updated. In addition there is a special id ``RTE_METRICS_GLOBAL``
+that is intended for global statistics that are not associated with any
+individual device. Since the metrics library is self-contained, the only
+restriction on port numbers is that they are less than ``RTE_MAX_ETHPORTS``
+- there is no requirement for the ports to actually exist.
+
+Initialising the library
+------------------------
+
+Before the library can be used, it has to be initialized by calling
+``rte_metrics_init()`` which sets up the metric store in shared memory.
+This is where producers will publish metric information to, and where
+consumers will query it from.
+
+.. code-block:: c
+
+    rte_metrics_init(rte_socket_id());
+
+This function **must** be called from a primary process, but otherwise
+producers and consumers can be in either primary or secondary processes.
+
+Registering metrics
+-------------------
+
+Metrics must first be *registered*, which is the way producers declare
+the names of the metrics they will be publishing. Registration can either
+be done individually, or a set of metrics can be registered as a group.
+Individual registration is done using ``rte_metrics_reg_name()``:
+
+.. code-block:: c
+
+    id_1 = rte_metrics_reg_name("mean_bits_in");
+    id_2 = rte_metrics_reg_name("mean_bits_out");
+    id_3 = rte_metrics_reg_name("peak_bits_in");
+    id_4 = rte_metrics_reg_name("peak_bits_out");
+
+or alternatively, a set of metrics can be registered together using
+``rte_metrics_reg_names()``:
+
+.. code-block:: c
+
+    const char * const names[] = {
+        "mean_bits_in", "mean_bits_out",
+        "peak_bits_in", "peak_bits_out",
+    };
+    id_set = rte_metrics_reg_names(&names[0], 4);
+
+If the return value is negative, it means registration failed. Otherwise
+the return value is the *key* for the metric, which is used when updating
+values. A table mapping together these key values and the metrics' names
+can be obtained using ``rte_metrics_get_names()``.
+
+Updating metric values
+----------------------
+
+Once registered, producers can update the metric for a given port using
+the ``rte_metrics_update_value()`` function. This uses the metric key
+that is returned when registering the metric, and can also be looked up
+using ``rte_metrics_get_names()``.
+
+.. code-block:: c
+
+    rte_metrics_update_value(port_id, id_1, values[0]);
+    rte_metrics_update_value(port_id, id_2, values[1]);
+    rte_metrics_update_value(port_id, id_3, values[2]);
+    rte_metrics_update_value(port_id, id_4, values[3]);
+
+If metrics were registered as a single set, they can either be updated
+individually using ``rte_metrics_update_value()``, or updated together
+using the ``rte_metrics_update_values()`` function:
+
+.. code-block:: c
+
+    rte_metrics_update_value(port_id, id_set, values[0]);
+    rte_metrics_update_value(port_id, id_set + 1, values[1]);
+    rte_metrics_update_value(port_id, id_set + 2, values[2]);
+    rte_metrics_update_value(port_id, id_set + 3, values[3]);
+
+    rte_metrics_update_values(port_id, id_set, values, 4);
+
+Note that ``rte_metrics_update_values()`` cannot be used to update
+metric values from *multiple* *sets*, as there is no guarantee two
+sets registered one after the other have contiguous id values.
+
+Querying metrics
+----------------
+
+Consumers can obtain metric values by querying the metrics library using
+the ``rte_metrics_get_values()`` function that return an array of
+``struct rte_metric_value``. Each entry within this array contains a metric
+value and its associated key. A key-name mapping can be obtained using the
+``rte_metrics_get_names()`` function that returns an array of
+``struct rte_metric_name`` that is indexed by the key. The following will
+print out all metrics for a given port:
+
+.. code-block:: c
+
+    void print_metrics(int port_id) {
+        struct rte_metric_name *names;
+        struct rte_metric_value *metrics;
+        int i, len, ret;
+        len = rte_metrics_get_names(NULL, 0);
+        if (len < 0) {
+            printf("Cannot get metrics count\n");
+            return;
+        }
+        if (len == 0) {
+            printf("No metrics to display (none have been registered)\n");
+            return;
+        }
+        metrics = malloc(sizeof(struct rte_metric_value) * len);
+        names = malloc(sizeof(struct rte_metric_name) * len);
+        if (metrics == NULL || names == NULL) {
+            printf("Cannot allocate memory\n");
+            free(metrics);
+            free(names);
+            return;
+        }
+        ret = rte_metrics_get_values(port_id, metrics, len);
+        if (ret < 0 || ret > len) {
+            printf("Cannot get metrics values\n");
+            free(metrics);
+            free(names);
+            return;
+        }
+        printf("Metrics for port %i:\n", port_id);
+        for (i = 0; i < len; i++)
+            printf("  %s: %"PRIu64"\n",
+                names[metrics[i].key].name, metrics[i].value);
+        free(metrics);
+        free(names);
+    }
diff --git a/doc/guides/rel_notes/release_17_02.rst b/doc/guides/rel_notes/release_17_02.rst
index 83519dc..68581e4 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -38,6 +38,14 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added information metric library.**
+
+  A library that allows information metrics to be added and updated
+  by producers, typically other libraries, for later retrieval by
+  consumers such as applications. It is intended to provide a
+  reporting mechanism that is independent of other libraries such
+  as ethdev.
+
 * **Added generic EAL API for I/O device memory read/write operations.**
 
   This API introduces 8-bit, 16-bit, 32bit, 64bit I/O device
@@ -355,6 +363,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_mbuf.so.2
      librte_mempool.so.2
      librte_meter.so.1
+   + librte_metrics.so.1
      librte_net.so.1
      librte_pdump.so.1
      librte_pipeline.so.3
diff --git a/lib/Makefile b/lib/Makefile
index 4178325..29f6a81 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -49,6 +49,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
 DIRS-$(CONFIG_RTE_LIBRTE_NET) += librte_net
 DIRS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += librte_ip_frag
 DIRS-$(CONFIG_RTE_LIBRTE_JOBSTATS) += librte_jobstats
+DIRS-$(CONFIG_RTE_LIBRTE_METRICS) += librte_metrics
 DIRS-$(CONFIG_RTE_LIBRTE_POWER) += librte_power
 DIRS-$(CONFIG_RTE_LIBRTE_METER) += librte_meter
 DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += librte_sched
diff --git a/lib/librte_metrics/Makefile b/lib/librte_metrics/Makefile
new file mode 100644
index 0000000..8d6e23a
--- /dev/null
+++ b/lib/librte_metrics/Makefile
@@ -0,0 +1,51 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_metrics.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+EXPORT_MAP := rte_metrics_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_METRICS) := rte_metrics.c
+
+# Install header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_METRICS)-include += rte_metrics.h
+
+DEPDIRS-$(CONFIG_RTE_LIBRTE_METRICS) += lib/librte_eal
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_metrics/rte_metrics.c b/lib/librte_metrics/rte_metrics.c
new file mode 100644
index 0000000..889d377
--- /dev/null
+++ b/lib/librte_metrics/rte_metrics.c
@@ -0,0 +1,299 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <string.h>
+#include <sys/queue.h>
+
+#include <rte_common.h>
+#include <rte_malloc.h>
+#include <rte_metrics.h>
+#include <rte_lcore.h>
+#include <rte_memzone.h>
+#include <rte_spinlock.h>
+
+#define RTE_METRICS_MAX_METRICS 256
+#define RTE_METRICS_MEMZONE_NAME "RTE_METRICS"
+
+/**
+ * Internal stats metadata and value entry.
+ *
+ * @internal
+ */
+struct rte_metrics_meta_s {
+	/** Name of metric */
+	char name[RTE_METRICS_MAX_NAME_LEN];
+	/** Current value for metric */
+	uint64_t value[RTE_MAX_ETHPORTS];
+	/** Used for global metrics */
+	uint64_t nonport_value;
+	/** Index of next root element (zero for none) */
+	uint16_t idx_next_set;
+	/** Index of next metric in set (zero for none) */
+	uint16_t idx_next_stat;
+};
+
+/**
+ * Internal stats info structure.
+ *
+ * @internal
+ * Offsets into metadata are used instead of pointers because ASLR
+ * means that having the same virtual addresses in different
+ * processes is not guaranteed.
+ */
+struct rte_metrics_data_s {
+	/**   Index of last metadata entry with valid data.
+	 * This value is not valid if cnt_stats is zero.
+	 */
+	uint16_t idx_last_set;
+	/**   Number of metrics. */
+	uint16_t cnt_stats;
+	/** Metric data memory block. */
+	struct rte_metrics_meta_s metadata[RTE_METRICS_MAX_METRICS];
+	/** Metric data access lock */
+	rte_spinlock_t lock;
+};
+
+void
+rte_metrics_init(int socket_id)
+{
+	struct rte_metrics_data_s *stats;
+	const struct rte_memzone *memzone;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return;
+
+	memzone = rte_memzone_lookup(RTE_METRICS_MEMZONE_NAME);
+	if (memzone != NULL)
+		return;
+	memzone = rte_memzone_reserve(RTE_METRICS_MEMZONE_NAME,
+		sizeof(struct rte_metrics_data_s), socket_id, 0);
+	if (memzone == NULL)
+		rte_exit(EXIT_FAILURE, "Unable to allocate stats memzone\n");
+	stats = memzone->addr;
+	memset(stats, 0, sizeof(struct rte_metrics_data_s));
+	rte_spinlock_init(&stats->lock);
+}
+
+int
+rte_metrics_reg_name(const char *name)
+{
+	const char * const list_names[] = {name};
+
+	return rte_metrics_reg_names(list_names, 1);
+}
+
+int
+rte_metrics_reg_names(const char * const *names, uint16_t cnt_names)
+{
+	struct rte_metrics_meta_s *entry;
+	struct rte_metrics_data_s *stats;
+	const struct rte_memzone *memzone;
+	uint16_t idx_name;
+	uint16_t idx_base;
+
+	/* Some sanity checks */
+	if (cnt_names < 1 || names == NULL)
+		return -EINVAL;
+
+	memzone = rte_memzone_lookup(RTE_METRICS_MEMZONE_NAME);
+	if (memzone == NULL)
+		return -EIO;
+	stats = memzone->addr;
+
+	if (stats->cnt_stats + cnt_names >= RTE_METRICS_MAX_METRICS)
+		return -ENOMEM;
+
+	rte_spinlock_lock(&stats->lock);
+
+	/* Overwritten later if this is actually the first set. */
+	stats->metadata[stats->idx_last_set].idx_next_set = stats->cnt_stats;
+
+	stats->idx_last_set = idx_base = stats->cnt_stats;
+
+	for (idx_name = 0; idx_name < cnt_names; idx_name++) {
+		entry = &stats->metadata[idx_name + stats->cnt_stats];
+		strncpy(entry->name, names[idx_name],
+			RTE_METRICS_MAX_NAME_LEN);
+		memset(entry->value, 0, sizeof(entry->value));
+		entry->idx_next_stat = idx_name + stats->cnt_stats + 1;
+	}
+	entry->idx_next_stat = 0;
+	entry->idx_next_set = 0;
+	stats->cnt_stats += cnt_names;
+
+	rte_spinlock_unlock(&stats->lock);
+
+	return idx_base;
+}
+
+int
+rte_metrics_update_value(int port_id, uint16_t key, const uint64_t value)
+{
+	return rte_metrics_update_values(port_id, key, &value, 1);
+}
+
+int
+rte_metrics_update_values(int port_id,
+	uint16_t key,
+	const uint64_t *values,
+	uint32_t count)
+{
+	struct rte_metrics_meta_s *entry;
+	struct rte_metrics_data_s *stats;
+	const struct rte_memzone *memzone;
+	uint16_t idx_metric;
+	uint16_t idx_value;
+	uint16_t cnt_setsize;
+
+	if (port_id != RTE_METRICS_GLOBAL &&
+			(port_id < 0 || port_id >= RTE_MAX_ETHPORTS))
+		return -EINVAL;
+
+	memzone = rte_memzone_lookup(RTE_METRICS_MEMZONE_NAME);
+	if (memzone == NULL)
+		return -EIO;
+	stats = memzone->addr;
+
+	rte_spinlock_lock(&stats->lock);
+	idx_metric = key;
+	cnt_setsize = 1;
+	while (idx_metric < stats->cnt_stats) {
+		entry = &stats->metadata[idx_metric];
+		if (entry->idx_next_stat == 0)
+			break;
+		cnt_setsize++;
+		idx_metric++;
+	}
+	/* Check update does not cross set border */
+	if (count > cnt_setsize) {
+		rte_spinlock_unlock(&stats->lock);
+		return -ERANGE;
+	}
+
+	if (port_id == RTE_METRICS_GLOBAL)
+		for (idx_value = 0; idx_value < count; idx_value++) {
+			idx_metric = key + idx_value;
+			stats->metadata[idx_metric].nonport_value =
+				values[idx_value];
+		}
+	else
+		for (idx_value = 0; idx_value < count; idx_value++) {
+			idx_metric = key + idx_value;
+			stats->metadata[idx_metric].value[port_id] =
+				values[idx_value];
+		}
+	rte_spinlock_unlock(&stats->lock);
+	return 0;
+}
+
+int
+rte_metrics_get_names(struct rte_metric_name *names,
+	uint16_t capacity)
+{
+	struct rte_metrics_data_s *stats;
+	const struct rte_memzone *memzone;
+	uint16_t idx_name;
+	int return_value;
+
+	memzone = rte_memzone_lookup(RTE_METRICS_MEMZONE_NAME);
+	/* If not allocated, fail silently */
+	if (memzone == NULL)
+		return 0;
+
+	stats = memzone->addr;
+	rte_spinlock_lock(&stats->lock);
+	if (names != NULL) {
+		if (capacity < stats->cnt_stats) {
+			return_value = stats->cnt_stats;
+			rte_spinlock_unlock(&stats->lock);
+			return return_value;
+		}
+		for (idx_name = 0; idx_name < stats->cnt_stats; idx_name++)
+			strncpy(names[idx_name].name,
+				stats->metadata[idx_name].name,
+				RTE_METRICS_MAX_NAME_LEN);
+	}
+	return_value = stats->cnt_stats;
+	rte_spinlock_unlock(&stats->lock);
+	return return_value;
+}
+
+int
+rte_metrics_get_values(int port_id,
+	struct rte_metric_value *values,
+	uint16_t capacity)
+{
+	struct rte_metrics_meta_s *entry;
+	struct rte_metrics_data_s *stats;
+	const struct rte_memzone *memzone;
+	uint16_t idx_name;
+	int return_value;
+
+	if (port_id != RTE_METRICS_GLOBAL &&
+			(port_id < 0 || port_id >= RTE_MAX_ETHPORTS))
+		return -EINVAL;
+
+	memzone = rte_memzone_lookup(RTE_METRICS_MEMZONE_NAME);
+	/* If not allocated, fail silently */
+	if (memzone == NULL)
+		return 0;
+	stats = memzone->addr;
+	rte_spinlock_lock(&stats->lock);
+
+	if (values != NULL) {
+		if (capacity < stats->cnt_stats) {
+			return_value = stats->cnt_stats;
+			rte_spinlock_unlock(&stats->lock);
+			return return_value;
+		}
+		if (port_id == RTE_METRICS_GLOBAL)
+			for (idx_name = 0;
+					idx_name < stats->cnt_stats;
+					idx_name++) {
+				entry = &stats->metadata[idx_name];
+				values[idx_name].key = idx_name;
+				values[idx_name].value = entry->nonport_value;
+			}
+		else
+			for (idx_name = 0;
+					idx_name < stats->cnt_stats;
+					idx_name++) {
+				entry = &stats->metadata[idx_name];
+				values[idx_name].key = idx_name;
+				values[idx_name].value = entry->value[port_id];
+			}
+	}
+	return_value = stats->cnt_stats;
+	rte_spinlock_unlock(&stats->lock);
+	return return_value;
+}
diff --git a/lib/librte_metrics/rte_metrics.h b/lib/librte_metrics/rte_metrics.h
new file mode 100644
index 0000000..71c57c6
--- /dev/null
+++ b/lib/librte_metrics/rte_metrics.h
@@ -0,0 +1,240 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/**
+ * @file
+ *
+ * DPDK Metrics module
+ *
+ * Metrics are statistics that are not generated by PMDs, and hence
+ * are better reported through a mechanism that is independent from
+ * the ethdev-based extended statistics. Providers will typically
+ * be other libraries and consumers will typically be applications.
+ *
+ * Metric information is populated using a push model, where producers
+ * update the values contained within the metric library by calling
+ * an update function on the relevant metrics. Consumers receive
+ * metric information by querying the central metric data, which is
+ * held in shared memory. Currently only bulk querying of metrics
+ * by consumers is supported.
+ */
+
+#ifndef _RTE_METRICS_H_
+#define _RTE_METRICS_H_
+
+/** Maximum length of metric name (including null-terminator) */
+#define RTE_METRICS_MAX_NAME_LEN 64
+
+/**
+ * Global (rather than port-specific) metric special id.
+ *
+ * When used as the port_id parameter when calling
+ * rte_metrics_update_value() or rte_metrics_update_values(),
+ * the global metrics, which are not associated with any specific
+ * port (i.e. device), are updated.
+ */
+#define RTE_METRICS_GLOBAL -1
+
+
+/**
+ * A name-key lookup for metrics.
+ *
+ * An array of this structure is returned by rte_metrics_get_names().
+ * The struct rte_metric_value references these names via their array index.
+ */
+struct rte_metric_name {
+	/** String describing metric */
+	char name[RTE_METRICS_MAX_NAME_LEN];
+};
+
+
+/**
+ * Metric value structure.
+ *
+ * This structure is used by rte_metrics_get_values() to return metrics,
+ * which are statistics that are not generated by PMDs. It maps a name key,
+ * which corresponds to an index in the array returned by
+ * rte_metrics_get_names().
+ */
+struct rte_metric_value {
+	/** Numeric identifier of metric. */
+	uint16_t key;
+	/** Value for metric */
+	uint64_t value;
+};
+
+
+/**
+ * Initializes metric module. This function must be called from
+ * a primary process before metrics are used.
+ *
+ * @param socket_id
+ *   Socket to use for shared memory allocation.
+ */
+void rte_metrics_init(int socket_id);
+
+/**
+ * Register a metric, making it available as a reporting parameter.
+ *
+ * Registering a metric is the way producers declare a parameter
+ * that they wish to be reported. Once registered, the associated
+ * numeric key can be obtained via rte_metrics_get_names(), which
+ * is required for updating said metric's value.
+ *
+ * @param name
+ *   Metric name
+ *
+ * @return
+ *  - Zero or positive: Success (index key of new metric)
+ *  - -EIO: Error, unable to access metrics shared memory
+ *    (rte_metrics_init() not called)
+ *  - -EINVAL: Error, invalid parameters
+ *  - -ENOMEM: Error, maximum metrics reached
+ */
+int rte_metrics_reg_name(const char *name);
+
+/**
+ * Register a set of metrics.
+ *
+ * This is a bulk version of rte_metrics_reg_name() and aside from
+ * handling multiple keys at once is functionally identical.
+ *
+ * @param names
+ *   List of metric names
+ *
+ * @param cnt_names
+ *   Number of metrics in set
+ *
+ * @return
+ *  - Zero or positive: Success (index key of start of set)
+ *  - -EIO: Error, unable to access metrics shared memory
+ *    (rte_metrics_init() not called)
+ *  - -EINVAL: Error, invalid parameters
+ *  - -ENOMEM: Error, maximum metrics reached
+ */
+int rte_metrics_reg_names(const char * const *names, uint16_t cnt_names);
+
+/**
+ * Get metric name-key lookup table.
+ *
+ * @param names
+ *   A struct rte_metric_name array of at least *capacity* in size to
+ *   receive key names. If this is NULL, function returns the required
+ *   number of elements for this array.
+ *
+ * @param capacity
+ *   Size (number of elements) of struct rte_metric_name array.
+ *   Disregarded if names is NULL.
+ *
+ * @return
+ *   - Positive value above capacity: error, *names* is too small.
+ *     Return value is required size.
+ *   - Positive value equal or less than capacity: Success. Return
+ *     value is number of elements filled in.
+ *   - Negative value: error.
+ */
+int rte_metrics_get_names(
+	struct rte_metric_name *names,
+	uint16_t capacity);
+
+/**
+ * Get metric value table.
+ *
+ * @param port_id
+ *   Port id to query
+ *
+ * @param values
+ *   A struct rte_metric_value array of at least *capacity* in size to
+ *   receive metric ids and values. If this is NULL, function returns
+ *   the required number of elements for this array.
+ *
+ * @param capacity
+ *   Size (number of elements) of struct rte_metric_value array.
+ *   Disregarded if values is NULL.
+ *
+ * @return
+ *   - Positive value above capacity: error, *values* is too small.
+ *     Return value is required size.
+ *   - Positive value equal or less than capacity: Success. Return
+ *     value is number of elements filled in.
+ *   - Negative value: error.
+ */
+int rte_metrics_get_values(
+	int port_id,
+	struct rte_metric_value *values,
+	uint16_t capacity);
+
+/**
+ * Updates a metric
+ *
+ * @param port_id
+ *   Port to update metrics for
+ * @param key
+ *   Id of metric to update
+ * @param value
+ *   New value
+ *
+ * @return
+ *   - -EIO if unable to access shared metrics memory
+ *   - Zero on success
+ */
+int rte_metrics_update_value(
+	int port_id,
+	uint16_t key,
+	const uint64_t value);
+
+/**
+ * Updates a metric set. Note that it is an error to try to
+ * update across a set boundary.
+ *
+ * @param port_id
+ *   Port to update metrics for
+ * @param key
+ *   Base id of metrics set to update
+ * @param values
+ *   Set of new values
+ * @param count
+ *   Number of new values
+ *
+ * @return
+ *   - -ERANGE if count exceeds metric set size
+ *   - -EIO if unable to access shared metrics memory
+ *   - Zero on success
+ */
+int rte_metrics_update_values(
+	int port_id,
+	uint16_t key,
+	const uint64_t *values,
+	uint32_t count);
+
+#endif
diff --git a/lib/librte_metrics/rte_metrics_version.map b/lib/librte_metrics/rte_metrics_version.map
new file mode 100644
index 0000000..ee28fa0
--- /dev/null
+++ b/lib/librte_metrics/rte_metrics_version.map
@@ -0,0 +1,13 @@
+DPDK_17.02 {
+	global:
+
+	rte_metrics_get_names;
+	rte_metrics_get_values;
+	rte_metrics_init;
+	rte_metrics_reg_name;
+	rte_metrics_reg_names;
+	rte_metrics_update_value;
+	rte_metrics_update_values;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 0d0a970..46de3d3 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -99,6 +99,8 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CFGFILE)        += -lrte_cfgfile
 _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_METRICS)        += -lrte_metrics
+
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)       += -lrte_pmd_bond
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT)    += -lrte_pmd_xenvirt -lxenstore
-- 
2.5.5

^ permalink raw reply	[relevance 1%]

* Re: [dpdk-dev] bugs and glitches in rte_cryptodev_devices_get
  2017-02-01 16:53  3% [dpdk-dev] bugs and glitches in rte_cryptodev_devices_get Stephen Hemminger
@ 2017-02-02 13:55  0% ` Mrozowicz, SlawomirX
  2017-02-03 12:26  0% ` Mrozowicz, SlawomirX
  1 sibling, 0 replies; 200+ results
From: Mrozowicz, SlawomirX @ 2017-02-02 13:55 UTC (permalink / raw)
  To: Stephen Hemminger, Doherty, Declan; +Cc: dev



>-----Original Message-----
>From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>Sent: Wednesday, February 1, 2017 5:54 PM
>To: Mrozowicz, SlawomirX <slawomirx.mrozowicz@intel.com>; Doherty,
>Declan <declan.doherty@intel.com>
>Cc: dev@dpdk.org
>Subject: bugs and glitches in rte_cryptodev_devices_get
>
>The function rte_cryptodev_devices_get has several issues. I was just going
>to fix it, but think it need to be explained.
>
>One potentially serious one (reported by coverity) is:
>
>*** CID 141067:    (BAD_COMPARE)
>/lib/librte_cryptodev/rte_cryptodev.c: 503 in rte_cryptodev_devices_get()
>497     				&& (*devs + i)->attached ==
>498     						RTE_CRYPTODEV_ATTACHED)
>{
>499
>500     			dev = (*devs + i)->device;
>501
>502     			if (dev)
>>>>     CID 141067:    (BAD_COMPARE)
>>>>     Truncating the result of "strncmp" to "unsigned char" may cause it to be
>misinterpreted as 0. Note that "strncmp" may return an integer besides -1, 0,
>or 1.
>503     				cmp = strncmp(dev->driver->name,
>504     						dev_name,
>505     						strlen(dev_name));
>506     			else
>507     				cmp = strncmp((*devs + i)->data->name,
>508     						dev_name,
>/lib/librte_cryptodev/rte_cryptodev.c: 507 in rte_cryptodev_devices_get()
>501
>502     			if (dev)
>503     				cmp = strncmp(dev->driver->name,
>504     						dev_name,
>505     						strlen(dev_name));
>506     			else
>>>>     CID 141067:    (BAD_COMPARE)
>>>>     Truncating the result of "strncmp" to "unsigned char" may cause it to be
>misinterpreted as 0. Note that "strncmp" may return an integer besides -1, 0,
>or 1.
>507     				cmp = strncmp((*devs + i)->data->name,
>508     						dev_name,
>509     						strlen(dev_name));
>510
>511     			if (cmp == 0)
>512     				devices[count++] = (*devs + i)->data->dev_id;
>
>
>But also:
>
>1. Incorrect function signature:
>    * function returns int but never a negative value. should be unsigned.
>    * devices argument is not modified should be const.

[SM] Ok. To be changed.

>
>2. Original ABI seems short sighted with limit of 256 cryptodevs
>    * this seems like 8 bit mindset,  should really use unsigned int instead
>      of uint8_t for number of devices.

[SM] Ok. To be changed to uint8_t.

>
>3. Wacky indention of the if statement.

[SM] To be changed.

>
>4. Make variables local to the block they are used (cmp, dev)

[SM] Ok. To be changed.

>
>5. Use array instead of pointer:
>     ie. instead of *devs + i use devs[i]

[SM] We can't change it like this. devs[i] provide wrong address (null) for i>0

>
>
>The overall code in question is:
>
>
>int
>rte_cryptodev_devices_get(const char *dev_name, uint8_t *devices,
>	uint8_t nb_devices)
>{
>	uint8_t i, cmp, count = 0;
>	struct rte_cryptodev **devs = &rte_cryptodev_globals->devs;
>	struct rte_device *dev;
>
>	for (i = 0; i < rte_cryptodev_globals->max_devs && count <
>nb_devices;
>			i++) {
>
>		if ((*devs + i)
>				&& (*devs + i)->attached ==
>						RTE_CRYPTODEV_ATTACHED)
>{
>
>			dev = (*devs + i)->device;
>
>			if (dev)
>				cmp = strncmp(dev->driver->name,
>						dev_name,
>						strlen(dev_name));
>			else
>				cmp = strncmp((*devs + i)->data->name,
>						dev_name,
>						strlen(dev_name));
>
>			if (cmp == 0)
>				devices[count++] = (*devs + i)->data->dev_id;
>		}
>	}
>
>	return count;
>}
>
>Please fix it.
>

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] bugs and glitches in rte_cryptodev_devices_get
@ 2017-02-01 16:53  3% Stephen Hemminger
  2017-02-02 13:55  0% ` Mrozowicz, SlawomirX
  2017-02-03 12:26  0% ` Mrozowicz, SlawomirX
  0 siblings, 2 replies; 200+ results
From: Stephen Hemminger @ 2017-02-01 16:53 UTC (permalink / raw)
  To: Slawomir Mrozowicz, Declan Doherty; +Cc: dev

The function rte_cryptodev_devices_get has several issues. I was just going to
fix it, but think it needs to be explained.
 
One potentially serious one (reported by coverity) is:

*** CID 141067:    (BAD_COMPARE)
/lib/librte_cryptodev/rte_cryptodev.c: 503 in rte_cryptodev_devices_get()
497     				&& (*devs + i)->attached ==
498     						RTE_CRYPTODEV_ATTACHED) {
499     
500     			dev = (*devs + i)->device;
501     
502     			if (dev)
>>>     CID 141067:    (BAD_COMPARE)
>>>     Truncating the result of "strncmp" to "unsigned char" may cause it to be misinterpreted as 0. Note that "strncmp" may return an integer besides -1, 0, or 1.  
503     				cmp = strncmp(dev->driver->name,
504     						dev_name,
505     						strlen(dev_name));
506     			else
507     				cmp = strncmp((*devs + i)->data->name,
508     						dev_name,
/lib/librte_cryptodev/rte_cryptodev.c: 507 in rte_cryptodev_devices_get()
501     
502     			if (dev)
503     				cmp = strncmp(dev->driver->name,
504     						dev_name,
505     						strlen(dev_name));
506     			else
>>>     CID 141067:    (BAD_COMPARE)
>>>     Truncating the result of "strncmp" to "unsigned char" may cause it to be misinterpreted as 0. Note that "strncmp" may return an integer besides -1, 0, or 1.  
507     				cmp = strncmp((*devs + i)->data->name,
508     						dev_name,
509     						strlen(dev_name));
510     
511     			if (cmp == 0)
512     				devices[count++] = (*devs + i)->data->dev_id;


But also:

1. Incorrect function signature:
    * function returns int but never a negative value. should be unsigned.
    * devices argument is not modified should be const.

2. Original ABI seems short sighted with limit of 256 cryptodevs
    * this seems like 8 bit mindset,  should really use unsigned int instead
      of uint8_t for number of devices.

3. Wacky indentation of the if statement.

4. Make variables local to the block they are used (cmp, dev)

5. Use array instead of pointer:
     ie. instead of *devs + i use devs[i]


The overall code in question is:


int
rte_cryptodev_devices_get(const char *dev_name, uint8_t *devices,
	uint8_t nb_devices)
{
	uint8_t i, cmp, count = 0;
	struct rte_cryptodev **devs = &rte_cryptodev_globals->devs;
	struct rte_device *dev;

	for (i = 0; i < rte_cryptodev_globals->max_devs && count < nb_devices;
			i++) {

		if ((*devs + i)
				&& (*devs + i)->attached ==
						RTE_CRYPTODEV_ATTACHED) {

			dev = (*devs + i)->device;

			if (dev)
				cmp = strncmp(dev->driver->name,
						dev_name,
						strlen(dev_name));
			else
				cmp = strncmp((*devs + i)->data->name,
						dev_name,
						strlen(dev_name));

			if (cmp == 0)
				devices[count++] = (*devs + i)->data->dev_id;
		}
	}

	return count;
}

Please fix it.
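
For reference, a minimal sketch (not compile-tested) of what the function
could look like after the points above, assuming rte_cryptodev_globals->devs
points at the base of the device array, which is how the existing
*devs + i arithmetic already treats it. The devices argument stays
non-const since it is the output array:

unsigned int
rte_cryptodev_devices_get(const char *dev_name, uint8_t *devices,
	uint8_t nb_devices)
{
	/* local alias so plain array indexing works */
	struct rte_cryptodev *devs = rte_cryptodev_globals->devs;
	unsigned int count = 0;
	uint8_t i;

	for (i = 0; i < rte_cryptodev_globals->max_devs &&
			count < nb_devices; i++) {
		int cmp;

		if (devs[i].attached != RTE_CRYPTODEV_ATTACHED)
			continue;

		if (devs[i].device)
			cmp = strncmp(devs[i].device->driver->name,
					dev_name, strlen(dev_name));
		else
			cmp = strncmp(devs[i].data->name,
					dev_name, strlen(dev_name));

		if (cmp == 0)
			devices[count++] = devs[i].data->dev_id;
	}

	return count;
}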

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH 25/25] rte_eal_init: add info about rte_errno codes
  2017-02-01 12:06  0%               ` Jan Blunck
@ 2017-02-01 14:18  0%                 ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2017-02-01 14:18 UTC (permalink / raw)
  To: Jan Blunck
  Cc: Adrien Mazarguil, Thomas Monjalon, Aaron Conole, dev, Stephen Hemminger

On Wed, Feb 01, 2017 at 01:06:03PM +0100, Jan Blunck wrote:
> On Wed, Feb 1, 2017 at 11:54 AM, Adrien Mazarguil
> <adrien.mazarguil@6wind.com> wrote:
> > On Mon, Jan 30, 2017 at 09:19:29PM +0100, Thomas Monjalon wrote:
> >> 2017-01-30 13:38, Aaron Conole:
> >> > Stephen Hemminger <stephen@networkplumber.org> writes:
> >> > > Bruce Richardson <bruce.richardson@intel.com> wrote:
> >> > >> On Fri, Jan 27, 2017 at 08:33:46AM -0800, Stephen Hemminger wrote:
> >> > >> > Why use rte_errno?
> >> > >> > Most DPDK calls just return negative value on error which
> >> > >> > corresponds to error number.
> >> > >> > Are you trying to keep ABI compatibility? Doesn't make sense
> >> > >> > because before all these
> >> > >> > errors were panics, no working application is going to care.
> >> > >>
> >> > >> Either will work, but I actually prefer this way. I view using rte_errno
> >> > >> to be better as it can work in just about all cases, including with
> >> > >> functions which return pointers. This allows you to have a standard
> >> > >> method across all functions for returning error codes, and it only
> >> > >> requires a single sentinel value to indicate error, rather than using a
> >> > >> whole range of values.
> >> > >
> >> > > The problem is DPDK is getting more inconsistent on how this is done.
> >> > > As long as error returns are always same as kernel/glibc errno's it really doesn't
> >> > > matter much which way the value is returned from a technical point of view
> >> > > but the inconsistency is sure to be a usability problem and source of errors.
> >> >
> >> > I am using rte_errno here because I assumed it was the preferred
> >> > method.  In fact, looking at some recently contributed modules (for
> >> > instance pdump), it seems that folks are using it.
> >> >
> >> > I'm not really sure the purpose of having rte_errno if it isn't used, so
> >> > it'd be helpful to know if there's some consensus on reflecting errors
> >> > via this variable, or on returning error codes.  Whichever is the more
> >> > consistent with the way the DPDK project does things, I'm game :).
> >>
> >> I think we can use both return value and rte_errno.
> >> We could try to enforce rte_errno as mandatory everywhere.
> >>
> >> Adrien did the recent rte_flow API.
> >> Please Adrien, could you give your thought?
> >
> > Sure, actually as already pointed out in this thread, both approaches have
> > pros and cons depending on the use-case.
> >
> > Through return value:
> >
> > Pros
> > ----
> >
> > - Most common approach used in DPDK today.
> > - Used internally by the Linux kernel (negative errno) and in the pthreads
> >   library (positive errno).
> > - Avoids the need to access an external, global variable requiring its own
> >   thread-local storage.
> > - Inherently thread-safe and reentrant (i.e. safe with signal handlers).
> > - Returned value is also the error code, two facts reported at once.
> 
> Caller can decide to ignore return value if no error handling is wanted.
>
Not always the case. In the case of an rx or tx burst call, a negative
error return must be checked for, or clamped to zero in some cases, to
make other logic in the path work sanely, e.g. when updating an array
of stats using the return value; see the sketch below.
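
To illustrate, a contrived sketch of what every caller would need if
rx_burst signalled errors through negative return values (it does not
today; the stats array and BURST_SIZE here are hypothetical):

	int nb_rx = rte_eth_rx_burst(port, queue, pkts, BURST_SIZE);

	/* clamp before use, since a negative "count" breaks the maths */
	if (nb_rx < 0)
		nb_rx = 0;
	stats[port].rx_packets += nb_rx;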

> >
> > Cons
> > ----
> >
> > - Difficult to use with functions returning anything other than signed
> >   integers with negative values having no other meaning.
> > - The returned value must be assigned to a local variable in order not to
> >   discard it and process it later most of the time.
> 
> I believe this is a Pro, since rte_errno itself needs to be assigned
> to a thread-local variable anyway.

No, it's a con, since for errno the value will be preserved in the
absence of other errors. The application can delay handling the error as
long as it wants, in the absence of causes of subsequent errors.

> 
> > - All function calls must be tested for errors.
> 
> The rte_errno approach needs to do this too, to decide if it needs to
> assign a value to rte_errno.
> 
That's inside the called function, not the application. See my earlier
comment above about having to check your return value is in the valid
"logical range" expected from the call. Having a negative number of
packets received does not make logical sense, so you have to check the
return value when updating stats etc.


> >
> > Through rte_errno:
> >
> > Pros
> > ----
> >
> > - errno-like, well known behavior defined by the C standard and used
> >   everywhere in the C library.
> > - Testing return values is not mandatory, e.g. rte_errno can be initialized
> >   to zero before calling a group of functions and checking its value
> >   afterward (rte_errno is only updated in case of error).
> > - Assigning a local variable to store its value is not necessary as long as
> >   another function that may affect rte_errno is not called.
> >
> > Cons
> > ----
> >
> > - Not fully reentrant, thread-safety is fine for most purposes but signal
> >   handlers affecting it still cause undefined behavior (they must at least
> >   save and restore its value in case they modify it).
> > - Accessing non-local storage may affect CPU cycle-sensitive functions such
> >   as TX/RX burst.
> 
> Actually testing for errors means you also have to reset the rte_errno
> variable beforehand. That also means you have to access thread-local
> storage twice.
> 
Not true. Your return value still indicates an error via a single
sentinel value. Only in that case do you (the app) access the global value,
to find out the exact error reason.

> Besides that the problem of rte_errno is that you do error handling
> twice because the implementation still needs to check for the error
> condition before assigning a meaningful error value to rte_errno.
> After that again the user code needs to check for the return value to
> decide if looking at rte_errno makes any sense.
> 
Yes, in the case of an error occurring there will be an extra write to a
global variable, and a subsequent read from that value (which should not
be a problem, as the write will have occurred in the same thread).
However, this is irrelevant to normal path processing. Error should be
the exception not the rule.

> 
> > My opinion is that rte_errno is best for control path operations while using
> > the return value makes more sense in the data path. The major issue being
> > that function returning anything other than int (e.g. TX/RX burst) cannot
> > describe any kind of error to the application.
> >
> > I went with both in rte_flow (return + rte_errno) mostly due to the return
> > type of a few functions (e.g. rte_flow_create()) and wanted to keep the API
> > consistent while maintaining compatibility with other DPDK APIs. Note there
> > is little overhead for API functions to set rte_errno _and_ return its
> > value, it's mostly free.
+1, and error cases should be rare, even if there is a small cost.
> >
> > I think using both is best also because it leaves applications the choice of
> > error-handling method, however if I had to pick one I'd go with rte_errno
> > and standardize on -1 as the default error value (as in the C library).
> >
+1
though I think the sentinel value will vary depending on each case. I would
look to keep the standard packet rx/tx functions and ones like them
returning a zero on any error, to simplify programming logic, and also
because in many cases the only real cause of error they can produce is
bad parameters.
Functions returning pointers obviously will use NULL as error value.
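
i.e. something like this (sketch; the attr/pattern/actions setup and the
rte_flow_error argument handling are omitted):

	struct rte_flow *flow;

	flow = rte_flow_create(port_id, &attr, pattern, actions, &err);
	if (flow == NULL) {
		/* NULL is the single sentinel; rte_errno has the cause */
		printf("flow rule rejected: %s\n", strerror(rte_errno));
	}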


> > Below are a bunch of use-case examples to illustrate how rte_errno could
> > be convenient to applications.
> >
> > Easily creating many flow rules during init in a all-or-nothing fashion:
> >
> >  rte_errno = 0;
> >  for (i = 0; i != num; ++i)
> >      rule[i] = rte_flow_create(port, ...);
> >  if (unlikely(rte_errno)) {
> >      rte_flow_flush(port);
> >      return -1;
> >  }
> >
> > Complete TX packet burst failure with explanation (could also detect partial
> > failures by initializing rte_errno to 0):
> >
> >  sent = rte_eth_tx_burst(...);
> >  if (unlikely(!sent)) {
> >      switch (rte_errno) {
> >          case E2BIG:
> >              // too many packets in burst
> >          ...
> >          case EMSGSIZE:
> >              // first packet is too large
> >          ...
> >          case ENOBUFS:
> >              // TX queue is full
> >          ...
> >      }
> >  }
> >
> > TX burst functions in PMDs could be modified as follows with minimal impact
> > on their performance and no ABI change:
> >
> >      uint16_t sent = 0;
> >      int error; // new variable
> >
> >      [process burst]
> >      if (unlikely([something went wrong])) { // this check already exists
> >          error = EPROBLEM; // new assignment
> >          goto error; // instead of "return sent"
> >      }
> >      [process burst]
> >      return sent;
> >  error:
> >      rte_errno = error;
> >      return sent;
> >
> > --
> > Adrien Mazarguil
> > 6WIND

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 25/25] rte_eal_init: add info about rte_errno codes
  2017-02-01 10:54  3%             ` Adrien Mazarguil
@ 2017-02-01 12:06  0%               ` Jan Blunck
  2017-02-01 14:18  0%                 ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Jan Blunck @ 2017-02-01 12:06 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Thomas Monjalon, Aaron Conole, dev, Stephen Hemminger, Bruce Richardson

On Wed, Feb 1, 2017 at 11:54 AM, Adrien Mazarguil
<adrien.mazarguil@6wind.com> wrote:
> On Mon, Jan 30, 2017 at 09:19:29PM +0100, Thomas Monjalon wrote:
>> 2017-01-30 13:38, Aaron Conole:
>> > Stephen Hemminger <stephen@networkplumber.org> writes:
>> > > Bruce Richardson <bruce.richardson@intel.com> wrote:
>> > >> On Fri, Jan 27, 2017 at 08:33:46AM -0800, Stephen Hemminger wrote:
>> > >> > Why use rte_errno?
>> > >> > Most DPDK calls just return negative value on error which
>> > >> > corresponds to error number.
>> > >> > Are you trying to keep ABI compatibility? Doesn't make sense
>> > >> > because before all these
> > >> > errors were panics, no working application is going to care.
>> > >>
>> > >> Either will work, but I actually prefer this way. I view using rte_errno
>> > >> to be better as it can work in just about all cases, including with
>> > >> functions which return pointers. This allows you to have a standard
>> > >> method across all functions for returning error codes, and it only
> > >> requires a single sentinel value to indicate error, rather than using a
>> > >> whole range of values.
>> > >
>> > > The problem is DPDK is getting more inconsistent on how this is done.
>> > > As long as error returns are always same as kernel/glibc errno's it really doesn't
>> > > matter much which way the value is returned from a technical point of view
>> > > but the inconsistency is sure to be a usability problem and source of errors.
>> >
>> > I am using rte_errno here because I assumed it was the preferred
>> > method.  In fact, looking at some recently contributed modules (for
>> > instance pdump), it seems that folks are using it.
>> >
>> > I'm not really sure the purpose of having rte_errno if it isn't used, so
>> > it'd be helpful to know if there's some consensus on reflecting errors
>> > via this variable, or on returning error codes.  Whichever is the more
>> > consistent with the way the DPDK project does things, I'm game :).
>>
>> I think we can use both return value and rte_errno.
>> We could try to enforce rte_errno as mandatory everywhere.
>>
>> Adrien did the recent rte_flow API.
>> Please Adrien, could you give your thought?
>
> Sure, actually as already pointed out in this thread, both approaches have
> pros and cons depending on the use-case.
>
> Through return value:
>
> Pros
> ----
>
> - Most common approach used in DPDK today.
> - Used internally by the Linux kernel (negative errno) and in the pthreads
>   library (positive errno).
> - Avoids the need to access an external, global variable requiring its own
>   thread-local storage.
> - Inherently thread-safe and reentrant (i.e. safe with signal handlers).
> - Returned value is also the error code, two facts reported at once.

The caller can decide to ignore the return value if no error handling is wanted.

>
> Cons
> ----
>
> - Difficult to use with functions returning anything other than signed
>   integers with negative values having no other meaning.
> - The returned value must be assigned to a local variable in order not to
>   discard it and process it later most of the time.

I believe this is actually a pro, since rte_errno likewise needs to be
assigned, and to a thread-local variable at that.

> - All function calls must be tested for errors.

The rte_errno approach needs to do this too, in order to decide whether
to assign a value to rte_errno.

>
> Through rte_errno:
>
> Pros
> ----
>
> - errno-like, well known behavior defined by the C standard and used
>   everywhere in the C library.
> - Testing return values is not mandatory, e.g. rte_errno can be initialized
>   to zero before calling a group of functions and checking its value
>   afterward (rte_errno is only updated in case of error).
> - Assigning a local variable to store its value is not necessary as long as
>   another function that may affect rte_errno is not called.
>
> Cons
> ----
>
> - Not fully reentrant, thread-safety is fine for most purposes but signal
>   handlers affecting it still cause undefined behavior (they must at least
>   save and restore its value in case they modify it).
> - Accessing non-local storage may affect CPU cycle-sensitive functions such
>   as TX/RX burst.

Actually, testing for errors means you also have to reset the rte_errno
variable beforehand. That means you have to access thread-local storage
twice.
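
A minimal sketch of that double access (do_work() here is a hypothetical
API call that reports failure through rte_errno):

    rte_errno = 0;          /* first thread-local access: reset */
    do_work();
    if (rte_errno != 0) {   /* second thread-local access: test */
        /* handle the error */
    }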

Besides that, the problem with rte_errno is that you do the error
handling twice: the implementation still needs to check for the error
condition before assigning a meaningful error value to rte_errno, and
after that the user code needs to check the return value again to
decide whether looking at rte_errno makes any sense.
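
In other words, sketched with hypothetical names:

    int do_work(void)
    {
        if (condition_failed) { /* first check, inside the implementation */
            rte_errno = EIO;
            return -1;
        }
        return 0;
    }

    /* ... and the user code still has to check again: */
    if (do_work() < 0)
        handle_error(rte_errno); /* second check, in the caller */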


> My opinion is that rte_errno is best for control path operations while using
> the return value makes more sense in the data path. The major issue being
> that functions returning anything other than int (e.g. TX/RX burst) cannot
> describe any kind of error to the application.
>
> I went with both in rte_flow (return + rte_errno) mostly due to the return
> type of a few functions (e.g. rte_flow_create()) and wanted to keep the API
> consistent while maintaining compatibility with other DPDK APIs. Note there
> is little overhead for API functions to set rte_errno _and_ return its
> value, it's mostly free.
>
> I think using both is best also because it leaves applications the choice of
> error-handling method, however if I had to pick one I'd go with rte_errno
> and standardize on -1 as the default error value (as in the C library).
>
> Below are a bunch of use-case examples to illustrate how rte_errno could
> be convenient to applications.
>
> Easily creating many flow rules during init in an all-or-nothing fashion:
>
>  rte_errno = 0;
>  for (i = 0; i != num; ++i)
>      rule[i] = rte_flow_create(port, ...);
>  if (unlikely(rte_errno)) {
>      rte_flow_flush(port);
>      return -1;
>  }
>
> Complete TX packet burst failure with explanation (could also detect partial
> failures by initializing rte_errno to 0):
>
>  sent = rte_eth_tx_burst(...);
>  if (unlikely(!sent)) {
>      switch (rte_errno) {
>          case E2BIG:
>              // too many packets in burst
>          ...
>          case EMSGSIZE:
>              // first packet is too large
>          ...
>          case ENOBUFS:
>              // TX queue is full
>          ...
>      }
>  }
>
> TX burst functions in PMDs could be modified as follows with minimal impact
> on their performance and no ABI change:
>
>      uint16_t sent = 0;
>      int error; // new variable
>
>      [process burst]
>      if (unlikely([something went wrong])) { // this check already exists
>          error = EPROBLEM; // new assignment
>          goto error; // instead of "return sent"
>      }
>      [process burst]
>      return sent;
>  error:
>      rte_errno = error;
>      return sent;
>
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 25/25] rte_eal_init: add info about rte_errno codes
  @ 2017-02-01 10:54  3%             ` Adrien Mazarguil
  2017-02-01 12:06  0%               ` Jan Blunck
  0 siblings, 1 reply; 200+ results
From: Adrien Mazarguil @ 2017-02-01 10:54 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Aaron Conole, dev, Stephen Hemminger, Bruce Richardson

On Mon, Jan 30, 2017 at 09:19:29PM +0100, Thomas Monjalon wrote:
> 2017-01-30 13:38, Aaron Conole:
> > Stephen Hemminger <stephen@networkplumber.org> writes:
> > > Bruce Richardson <bruce.richardson@intel.com> wrote:
> > >> On Fri, Jan 27, 2017 at 08:33:46AM -0800, Stephen Hemminger wrote:
> > >> > Why use rte_errno?
> > >> > Most DPDK calls just return negative value on error which
> > >> > corresponds to error number.
> > >> > Are you trying to keep ABI compatibility? Doesn't make sense
> > >> > because before all these
> > >> > errors were panics, no working application is going to care.
> > >> 
> > >> Either will work, but I actually prefer this way. I view using rte_errno
> > >> to be better as it can work in just about all cases, including with
> > >> functions which return pointers. This allows you to have a standard
> > >> method across all functions for returning error codes, and it only
> > >> requires a single sentinel value to indicate error, rather than using a
> > >> whole range of values.
> > >
> > > The problem is DPDK is getting more inconsistent on how this is done.
> > > As long as error returns are always same as kernel/glibc errno's it really doesn't
> > > matter much which way the value is returned from a technical point of view
> > > but the inconsistency is sure to be a usability problem and source of errors.
> > 
> > I am using rte_errno here because I assumed it was the preferred
> > method.  In fact, looking at some recently contributed modules (for
> > instance pdump), it seems that folks are using it.
> > 
> > I'm not really sure the purpose of having rte_errno if it isn't used, so
> > it'd be helpful to know if there's some consensus on reflecting errors
> > via this variable, or on returning error codes.  Whichever is the more
> > consistent with the way the DPDK project does things, I'm game :).
> 
> I think we can use both return value and rte_errno.
> We could try to enforce rte_errno as mandatory everywhere.
> 
> Adrien did the recent rte_flow API.
> Please Adrien, could you give your thought?

Sure, actually as already pointed out in this thread, both approaches have
pros and cons depending on the use-case.

Through return value:

Pros
----

- Most common approach used in DPDK today.
- Used internally by the Linux kernel (negative errno) and in the pthreads
  library (positive errno).
- Avoids the need to access an external, global variable requiring its own
  thread-local storage.
- Inherently thread-safe and reentrant (i.e. safe with signal handlers).
- Returned value is also the error code, two facts reported at once.
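
For instance, under the negative-errno convention a single test on a
control-path call both detects the failure and yields the error code
(port_id and port_conf are assumed set up):

    int ret = rte_eth_dev_configure(port_id, 1, 1, &port_conf);
    if (ret < 0)
        printf("configure failed: %s\n", strerror(-ret));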

Cons
----

- Difficult to use with functions returning anything other than signed
  integers whose negative values have no other meaning.
- Most of the time, the returned value must be assigned to a local
  variable so it is not discarded and can be processed later.
- All function calls must be tested for errors.

Through rte_errno:

Pros
----

- errno-like, well known behavior defined by the C standard and used
  everywhere in the C library.
- Testing return values is not mandatory, e.g. rte_errno can be initialized
  to zero before calling a group of functions and checking its value
  afterward (rte_errno is only updated in case of error).
- Assigning a local variable to store its value is not necessary as long as
  another function that may affect rte_errno is not called.

Cons
----

- Not fully reentrant: thread-safety is fine for most purposes, but signal
  handlers affecting it still cause undefined behavior (they must at least
  save and restore its value in case they modify it).
- Accessing non-local storage may affect CPU cycle-sensitive functions such
  as TX/RX burst.
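
To illustrate the first caveat above, a signal handler that may touch
rte_errno has to save and restore it around its work (sketch only):

    static void sig_handler(int signum)
    {
        int saved = rte_errno; /* preserve the interrupted code's value */
        /* ... work that may modify rte_errno ... */
        rte_errno = saved;     /* restore before returning */
    }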

My opinion is that rte_errno is best for control path operations, while
using the return value makes more sense in the data path. The major issue
is that functions returning anything other than int (e.g. TX/RX burst)
cannot describe any kind of error to the application.

I went with both in rte_flow (return + rte_errno) mostly due to the return
type of a few functions (e.g. rte_flow_create()) and wanted to keep the API
consistent while maintaining compatibility with other DPDK APIs. Note there
is little overhead for API functions to set rte_errno _and_ return its
value, it's mostly free.

I think using both is best also because it leaves applications the choice of
error-handling method; however, if I had to pick one I'd go with rte_errno
and standardize on -1 as the default error value (as in the C library).

Below are a bunch of use-case examples to illustrate how rte_errno could
be convenient to applications.

Easily creating many flow rules during init in an all-or-nothing fashion:

 rte_errno = 0;
 for (i = 0; i != num; ++i)
     rule[i] = rte_flow_create(port, ...);
 if (unlikely(rte_errno)) {
     rte_flow_flush(port);
     return -1;
 }

Complete TX packet burst failure with explanation (could also detect partial
failures by initializing rte_errno to 0):

 sent = rte_eth_tx_burst(...);
 if (unlikely(!sent)) {
     switch (rte_errno) {
         case E2BIG:
             // too many packets in burst
         ...
         case EMSGSIZE:
             // first packet is too large
         ...
         case ENOBUFS:
             // TX queue is full
         ...
     }
 }
 
TX burst functions in PMDs could be modified as follows with minimal impact
on their performance and no ABI change:

     uint16_t sent = 0;
     int error; // new variable
 
     [process burst]
     if (unlikely([something went wrong])) { // this check already exists
         error = EPROBLEM; // new assignment
         goto error; // instead of "return sent"
     }
     [process burst]
     return sent;
 error:
     rte_errno = error;
     return sent;

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH 3/3] doc: remove ABI changes in igb_uio
  @ 2017-02-01  7:24  4%       ` Tan, Jianfeng
  0 siblings, 0 replies; 200+ results
From: Tan, Jianfeng @ 2017-02-01  7:24 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit; +Cc: dev, john.mcnamara, yuanhan.liu, stephen



On 1/31/2017 1:52 AM, Thomas Monjalon wrote:
> 2017-01-24 13:35, Ferruh Yigit:
>> On 1/24/2017 7:34 AM, Jianfeng Tan wrote:
>>> We announced ABI changes to remove iomem and ioport mapping in
>>> igb_uio. But it has a potential backward compatibility issue: old
>>> versions of DPDK cannot run on the modified igb_uio.
>>>
>>> The purpose of this change was to fix a bug: when a DPDK app crashes,
>>> devices bound by igb_uio are not stopped by either the DPDK PMD
>>> driver or the igb_uio driver. We need to figure out a new way to fix
>>> this bug.
>> Hi Jianfeng,
>>
>> I believe it would be good to fix this potential defect.
>>
>> Is "remove iomem and ioport" a must for that fix? If so, I suggest
>> re-thinking it.
>>
>> If I see correctly, dpdk1.8 and older use igb_uio iomem files, so
>> backward compatibility is a possible issue for dpdk1.8 and older.
>> Since v1.8 is two years old, I would prefer fixing the defect instead
>> of keeping that backward compatibility.
>>
>> Jianfeng, Thomas,
>>
>> What do you think about postponing this deprecation notice to the next
>> release, instead of removing it, and discussing it more?
>>
>>
>> And overall, if "remove iomem and ioport" is not a must for this fix,
>> there is no problem with removing the deprecation notice.
> I have no strong opinion here.
> Jianfeng, do you agree with Ferruh?

Hi Ferruh & Thomas,

I agree with Ferruh to postpone this deprecation notice.

In another thread, we discussed the possibility of fixing this problem
without the deprecation, but I have no time to verify it in this release
cycle. Let's postpone it then.

Thanks,
Jianfeng

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH 25/25] rte_eal_init: add info about rte_errno codes
  @ 2017-01-31 16:56  0%             ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2017-01-31 16:56 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: Aaron Conole, dev

On Tue, 31 Jan 2017 09:33:45 +0000
Bruce Richardson <bruce.richardson@intel.com> wrote:

> On Mon, Jan 30, 2017 at 01:38:00PM -0500, Aaron Conole wrote:
> > Stephen Hemminger <stephen@networkplumber.org> writes:
> >   
> > > On Fri, 27 Jan 2017 16:47:40 +0000
> > > Bruce Richardson <bruce.richardson@intel.com> wrote:
> > >  
> > >> On Fri, Jan 27, 2017 at 08:33:46AM -0800, Stephen Hemminger wrote:  
> > >> > On Fri, 27 Jan 2017 09:57:03 -0500
> > >> > Aaron Conole <aconole@redhat.com> wrote:
> > >> >     
> > >> > > diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
> > >> > > index 03fee50..46e427f 100644
> > >> > > --- a/lib/librte_eal/common/include/rte_eal.h
> > >> > > +++ b/lib/librte_eal/common/include/rte_eal.h
> > >> > > @@ -159,7 +159,29 @@ int rte_eal_iopl_init(void);
> > >> > >   *     function call and should not be further interpreted by the
> > >> > >   *     application.  The EAL does not take any ownership of the memory used
> > >> > >   *     for either the argv array, or its members.
> > >> > > - *   - On failure, a negative error value.
> > >> > > + *   - On failure, -1 and rte_errno is set to a value indicating the cause
> > >> > > + *     for failure.
> > >> > > + *
> > >> > > + *   The error codes returned via rte_errno:
> > >> > > + *     EACCES indicates a permissions issue.
> > >> > > + *
> > >> > > + *     EAGAIN indicates either a bus or system resource was not available,
> > >> > > + *            try again.
> > >> > > + *
> > >> > > + *     EALREADY indicates that the rte_eal_init function has already been
> > >> > > + *              called, and cannot be called again.
> > >> > > + *
> > >> > > + *     EINVAL indicates invalid parameters were passed as argv/argc.
> > >> > > + *
> > >> > > + *     EIO indicates failure to setup the logging handlers.  This is usually
> > >> > > + *         caused by an out-of-memory condition.
> > >> > > + *
> > >> > > + *     ENODEV indicates memory setup issues.
> > >> > > + *
> > >> > > + *     ENOTSUP indicates that the EAL cannot initialize on this system.
> > >> > > + *
> > >> > > + *     EUNATCH indicates that the PCI bus is either not present, or is not
> > >> > > + *             readable by the eal.
> > >> > >   */
> > >> > >  int rte_eal_init(int argc, char **argv);    
> > >> > 
> > >> > Why use rte_errno?
> > >> > Most DPDK calls just return negative value on error which
> > >> > corresponds to error number.
> > >> > Are you trying to keep ABI compatibility? Doesn't make sense
> > >> > because before all these
> > >> > errors were panics, no working application is going to care.
> > >> 
> > >> Either will work, but I actually prefer this way. I view using rte_errno
> > >> to be better as it can work in just about all cases, including with
> > >> functions which return pointers. This allows you to have a standard
> > >> method across all functions for returning error codes, and it only
> > >> requires a single sentinel value to indicate error, rather than using a
> > >> whole range of values.  
> > >
> > > The problem is DPDK is getting more inconsistent on how this is done.
> > > As long as error returns are always same as kernel/glibc errno's it really doesn't
> > > matter much which way the value is returned from a technical point of view
> > > but the inconsistency is sure to be a usability problem and source of errors.  
> > 
> > I am using rte_errno here because I assumed it was the preferred
> > method.  In fact, looking at some recently contributed modules (for
> > instance pdump), it seems that folks are using it.
> > 
> > I'm not really sure the purpose of having rte_errno if it isn't used, so
> > it'd be helpful to know if there's some consensus on reflecting errors
> > via this variable, or on returning error codes.  Whichever is the more
> > consistent with the way the DPDK project does things, I'm game :).
> >   
> Unfortunately, this is one area where DPDK is inconsistent, and both
> schemes are widely used. I much prefer using the rte_errno method, but
> returning error codes directly is also common in DPDK.

One argument in favor of returning error codes directly is that it makes
things safer in the application when one user function passes an error code
back up through its internal call tree.
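
For example (a sketch; the helper names are hypothetical), a negative
errno-style return can be forwarded unchanged up the call tree:

    static int setup_port(uint16_t port)
    {
        int ret = configure_queues(port);
        if (ret < 0)
            return ret; /* propagate the callee's error code as-is */
        return do_start(port);
    }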

Also, the API does not really do a good job of distinguishing between normal
conditions (no data present) and exceptional ones (NIC has died). At least it
doesn't depend on something like Structured Exception Handling...

Feel free to clean the stables on this one.

^ permalink raw reply	[relevance 0%]

2016-11-07  7:38     [dpdk-dev] [PATCH] maintainers: claim responsability for xen Jianfeng Tan
2016-11-10 18:59     ` Konrad Rzeszutek Wilk
2016-11-10 20:49       ` Tan, Jianfeng
2017-02-16 11:06         ` Thomas Monjalon
2017-02-16 13:36           ` Konrad Rzeszutek Wilk
2017-02-16 21:51             ` Vincent JARDIN
2017-02-17 16:07               ` Konrad Rzeszutek Wilk
2017-02-20  9:56                 ` Jan Blunck
2017-02-20 17:36  3%               ` Joao Martins
2017-01-05 10:44     [dpdk-dev] [PATCH v1] doc: announce API and ABI change for ethdev Bernard Iremonger
2017-01-05 15:25     ` [dpdk-dev] [PATCH v2] " Bernard Iremonger
2017-02-13 17:57  4%   ` Thomas Monjalon
2017-02-14  3:17  4%     ` Jerin Jacob
2017-02-14 10:33  4%       ` Iremonger, Bernard
2017-02-14 19:37  4%   ` Thomas Monjalon
2017-01-19  5:34     [dpdk-dev] [PATCH] doc: announce ABI change for cloud filter Yong Liu
2017-01-19 18:45     ` Adrien Mazarguil
2017-01-20  2:14       ` Lu, Wenzhuo
2017-01-20 14:57         ` Thomas Monjalon
2017-02-14  3:19  4%       ` Jerin Jacob
2017-01-20  9:51     [dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching behavior Zhiyong Yang
2017-01-20 10:26     ` Andrew Rybchenko
     [not found]       ` <2601191342CEEE43887BDE71AB9772583F108924@irsmsx105.ger.corp.intel.com>
2017-01-20 11:24         ` Ananyev, Konstantin
2017-01-20 11:48           ` Bruce Richardson
2017-01-23 16:36             ` Adrien Mazarguil
2017-02-07  7:50  0%           ` Yang, Zhiyong
2017-01-23  9:24     [dpdk-dev] [PATCH v6 1/6] lib: distributor performance enhancements David Hunt
2017-02-21  3:17  3% ` [dpdk-dev] [PATCH v7 0/17] distributor library " David Hunt
2017-02-21  3:17  1%   ` [dpdk-dev] [PATCH v7 01/17] lib: rename legacy distributor lib files David Hunt
2017-02-21 10:27  0%     ` Hunt, David
2017-02-24 14:03  0%     ` Bruce Richardson
2017-03-01  9:55  0%       ` Hunt, David
2017-03-01  7:47  2%     ` [dpdk-dev] [PATCH v8 0/18] distributor library performance enhancements David Hunt
2017-03-01  7:47  1%       ` [dpdk-dev] [PATCH v8 01/18] lib: rename legacy distributor lib files David Hunt
2017-03-06  9:10  2%         ` [dpdk-dev] [PATCH v9 00/18] distributor lib performance enhancements David Hunt
2017-03-06  9:10  1%           ` [dpdk-dev] [PATCH v9 01/18] lib: rename legacy distributor lib files David Hunt
2017-03-15  6:19  2%             ` [dpdk-dev] [PATCH v10 0/18] distributor library performance enhancements David Hunt
2017-03-15  6:19  1%               ` [dpdk-dev] [PATCH v10 01/18] lib: rename legacy distributor lib files David Hunt
2017-03-20 10:08  2%                 ` [dpdk-dev] [PATCH v11 0/18] distributor lib performance enhancements David Hunt
2017-03-20 10:08  1%                   ` [dpdk-dev] [PATCH v11 01/18] lib: rename legacy distributor lib files David Hunt
2017-03-20 10:08  2%                   ` [dpdk-dev] [PATCH v11 08/18] lib: add symbol versioning to distributor David Hunt
2017-03-27 13:02  3%                     ` Thomas Monjalon
2017-03-15  6:19  2%               ` [dpdk-dev] [PATCH v10 " David Hunt
2017-03-06  9:10  2%           ` [dpdk-dev] [PATCH v9 09/18] " David Hunt
2017-03-10 16:22  0%             ` Bruce Richardson
2017-03-13 10:17  0%               ` Hunt, David
2017-03-13 10:28  0%               ` Hunt, David
2017-03-13 11:01  0%                 ` Van Haaren, Harry
2017-03-13 11:02  0%                   ` Hunt, David
2017-03-01  7:47  3%       ` [dpdk-dev] [PATCH v8 " David Hunt
2017-03-01 14:50  0%         ` Hunt, David
2017-02-24 14:01  0%   ` [dpdk-dev] [PATCH v7 0/17] distributor library performance enhancements Bruce Richardson
2017-01-23 13:04     [dpdk-dev] [PATCH] doc: announce API/ABI changes for vhost Yuanhan Liu
2017-02-13 18:02  4% ` Thomas Monjalon
2017-02-14  3:21  4%   ` Jerin Jacob
2017-02-14 13:54  4% ` Maxime Coquelin
2017-02-14 20:28  4% ` Thomas Monjalon
2017-01-24  7:34     [dpdk-dev] [PATCH 0/3] doc upates Jianfeng Tan
2017-01-24  7:34     ` [dpdk-dev] [PATCH 3/3] doc: remove ABI changes in igb_uio Jianfeng Tan
2017-01-24 13:35       ` Ferruh Yigit
2017-01-30 17:52         ` Thomas Monjalon
2017-02-01  7:24  4%       ` Tan, Jianfeng
2017-02-09 14:45  0% ` [dpdk-dev] [PATCH 0/3] doc upates Thomas Monjalon
2017-02-09 16:06  4% ` [dpdk-dev] [PATCH v2 " Jianfeng Tan
2017-02-09 16:06 12%   ` [dpdk-dev] [PATCH v2 3/3] doc: postpone ABI changes in igb_uio Jianfeng Tan
2017-02-09 17:40  4%     ` Ferruh Yigit
2017-02-10 10:44  4%       ` Thomas Monjalon
2017-02-10 11:20  4%         ` Tan, Jianfeng
2017-01-25 12:14     [dpdk-dev] rte_ring features in use (or not) Bruce Richardson
2017-02-07 14:12  2% ` [dpdk-dev] [PATCH RFCv3 00/19] ring cleanup and generalization Bruce Richardson
2017-02-14  8:32  3%   ` Olivier Matz
2017-02-14  9:39  0%     ` Bruce Richardson
2017-02-07 14:12  3% ` [dpdk-dev] [PATCH RFCv3 06/19] ring: eliminate duplication of size and mask fields Bruce Richardson
2017-01-27 14:56     [dpdk-dev] [PATCH 00/24] linux/eal: Remove most causes of panic on init Aaron Conole
2017-01-27 14:57     ` [dpdk-dev] [PATCH 25/25] rte_eal_init: add info about rte_errno codes Aaron Conole
2017-01-27 16:33       ` Stephen Hemminger
2017-01-27 16:47         ` Bruce Richardson
2017-01-27 17:37           ` Stephen Hemminger
2017-01-30 18:38             ` Aaron Conole
2017-01-30 20:19               ` Thomas Monjalon
2017-02-01 10:54  3%             ` Adrien Mazarguil
2017-02-01 12:06  0%               ` Jan Blunck
2017-02-01 14:18  0%                 ` Bruce Richardson
2017-01-31  9:33               ` Bruce Richardson
2017-01-31 16:56  0%             ` Stephen Hemminger
2017-02-01 16:53  3% [dpdk-dev] bugs and glitches in rte_cryptodev_devices_get Stephen Hemminger
2017-02-02 13:55  0% ` Mrozowicz, SlawomirX
2017-02-03 12:26  0% ` Mrozowicz, SlawomirX
2017-02-03 10:33     [dpdk-dev] [PATCH v10 0/7] Expanded statistics reporting Remy Horton
2017-02-03 10:33  1% ` [dpdk-dev] [PATCH v10 1/7] lib: add information metrics library Remy Horton
2017-02-03 10:33  2% ` [dpdk-dev] [PATCH v10 3/7] lib: add bitrate statistics library Remy Horton
2017-02-06 13:35     [dpdk-dev] cryptodev - Session and queue pair relationship Akhil Goyal
2017-02-07 20:52     ` Declan Doherty
2017-02-13 14:38       ` Akhil Goyal
2017-02-13 14:44         ` Trahe, Fiona
2017-02-13 15:09  3%       ` Trahe, Fiona
2017-02-08 22:56  3% [dpdk-dev] Kill off PCI dependencies Stephen Hemminger
2017-02-09 16:26  3% ` Thomas Monjalon
2017-02-10 11:39  9% [dpdk-dev] [PATCH] doc: annouce ABI change for cryptodev ops structure Fan Zhang
2017-02-10 13:59  4% ` Trahe, Fiona
2017-02-13 16:07  7%   ` Zhang, Roy Fan
2017-02-13 17:34  4%     ` Trahe, Fiona
2017-02-14  0:21  4%   ` Hemant Agrawal
2017-02-14  5:11  4%     ` Hemant Agrawal
2017-02-14 10:41  9% ` [dpdk-dev] [PATCH v2] " Fan Zhang
2017-02-14 10:48  4%   ` Doherty, Declan
2017-02-14 11:03  4%     ` De Lara Guarch, Pablo
2017-02-14 20:37  4%   ` Thomas Monjalon
2017-02-10 14:05     [dpdk-dev] [PATCH 0/2] ethdev: abstraction layer for QoS hierarchical scheduler Cristian Dumitrescu
2017-02-10 14:05  1% ` [dpdk-dev] [PATCH 2/2] ethdev: add hierarchical scheduler API Cristian Dumitrescu
2017-02-21 10:35  0%   ` Hemant Agrawal
2017-02-13 10:56  9% [dpdk-dev] [PATCH] doc: remove announce of Tx preparation Thomas Monjalon
2017-02-13 14:22  0% ` Thomas Monjalon
2017-02-13 11:05 19% [dpdk-dev] [PATCH] doc: postpone ABI changes to 17.05 Olivier Matz
2017-02-13 14:21  4% ` Thomas Monjalon
2017-02-13 11:55  9% [dpdk-dev] [PATCH] doc: add deprecation note for rework of PCI in EAL Shreyansh Jain
2017-02-13 12:00  0% ` Shreyansh Jain
2017-02-13 14:44  0%   ` Thomas Monjalon
2017-02-13 21:56  0%   ` Jan Blunck
2017-02-14  5:18  0%     ` Shreyansh Jain
2017-02-13 11:55  5% [dpdk-dev] [PATCH] doc: remove deprecation notice for rte_bus Shreyansh Jain
2017-02-13 14:36  0% ` Thomas Monjalon
2017-02-13 13:25     [dpdk-dev] crypto drivers in the API Thomas Monjalon
2017-02-14 10:44  4% ` Doherty, Declan
2017-02-14 11:04  0%   ` Thomas Monjalon
2017-02-14 14:46  4%     ` Doherty, Declan
2017-02-14 15:47  0%       ` Thomas Monjalon
2017-02-13 14:26  4% [dpdk-dev] [PATCH] doc: postpone API change in ethdev Thomas Monjalon
2017-02-13 16:02  3% [dpdk-dev] doc: deprecation notice for ethdev ops? Dumitrescu, Cristian
2017-02-13 16:09  0% ` Thomas Monjalon
2017-02-13 16:46  4%   ` Ferruh Yigit
2017-02-13 17:21  0%     ` Dumitrescu, Cristian
2017-02-13 17:36  0%       ` Ferruh Yigit
2017-02-13 17:38  3%     ` Thomas Monjalon
2017-02-13 17:38  9% [dpdk-dev] [PATCH] doc: add ABI change notification for ring library Bruce Richardson
2017-02-14  0:32  4% ` Mcnamara, John
2017-02-14  3:25  4% ` Jerin Jacob
2017-02-14  8:33  4% ` Olivier Matz
2017-02-14 11:43  4%   ` Hemant Agrawal
2017-02-14 18:42  4% ` [dpdk-dev] " Thomas Monjalon
2017-02-14 10:52  8% [dpdk-dev] Further fun with ABI tracking Christian Ehrhardt
2017-02-14 16:19  4% ` Bruce Richardson
2017-02-14 20:31  9% ` Jan Blunck
2017-02-22 13:12  7%   ` Christian Ehrhardt
2017-02-22 13:24 20%     ` [dpdk-dev] [PATCH] mk: Provide option to set Major ABI version Christian Ehrhardt
2017-02-28  8:34  4%       ` Jan Blunck
2017-03-01  9:31  4%         ` Christian Ehrhardt
2017-03-01  9:34 20%           ` [dpdk-dev] [PATCH v2] " Christian Ehrhardt
2017-03-01 14:35  4%             ` Jan Blunck
2017-03-16 17:19  4%               ` Thomas Monjalon
2017-03-17  8:27  4%                 ` Christian Ehrhardt
2017-03-17  9:16  4%                   ` Jan Blunck
2017-02-23 18:48  4%     ` [dpdk-dev] Further fun with ABI tracking Ferruh Yigit
2017-02-24  7:32  8%       ` Christian Ehrhardt
2017-02-14 15:32  4% [dpdk-dev] [PATCH v1] doc: update release notes for 17.02 John McNamara
2017-02-14 16:26  2% ` [dpdk-dev] [PATCH v2] " John McNamara
2017-02-15 10:02     [dpdk-dev] [PATCH 0/7] Rework vdev probing to use rte_bus infrastructure Jan Blunck
2017-02-20 14:17     ` [dpdk-dev] [PATCH v2 1/8] eal: use different constructor priorities for initcalls Jan Blunck
2017-02-21 12:30  3%   ` Ferruh Yigit
2017-02-15 12:38  6% [dpdk-dev] [PATCH v1] doc: add template release notes for 17.05 John McNamara
2017-02-15 13:15  1% [dpdk-dev] [PATCH] kni: remove KNI vhost support Ferruh Yigit
2017-02-20 14:30  5% ` [dpdk-dev] [PATCH v2 1/2] doc: add removed items section to release notes Ferruh Yigit
2017-02-20 14:30  1%   ` [dpdk-dev] [PATCH v2 2/2] kni: remove KNI vhost support Ferruh Yigit
2017-02-17 12:00     [dpdk-dev] [PATCH 0/3] cryptodev: change device configuration API Fan Zhang
2017-02-17 12:01  5% ` [dpdk-dev] [PATCH 3/3] doc: remove deprecation notice Fan Zhang
2017-02-19 17:14  4% [dpdk-dev] [PATCH] lpm: extend IPv6 next hop field Vladyslav Buslov
2017-02-21 14:46  4% ` [dpdk-dev] [PATCH v2] " Vladyslav Buslov
2017-02-21 10:22 16% [dpdk-dev] [PATCH] maintainers: fix script paths Thomas Monjalon
2017-02-22 16:09     [dpdk-dev] [PATCH 0/4] net/mlx5 add TSO support Shahaf Shuler
2017-03-01 11:11  3% ` [dpdk-dev] [PATCH v2 0/1] net/mlx5: " Shahaf Shuler
2017-02-23 17:23     [dpdk-dev] [PATCH v1 00/14] refactor and cleanup of rte_ring Bruce Richardson
2017-02-23 17:23  4% ` [dpdk-dev] [PATCH v1 01/14] ring: remove split cacheline build setting Bruce Richardson
2017-02-28 11:35  0%   ` Jerin Jacob
2017-02-28 11:57  0%     ` Bruce Richardson
2017-02-28 12:08  0%       ` Jerin Jacob
2017-02-28 13:52  0%         ` Bruce Richardson
2017-02-28 17:54  0%           ` Jerin Jacob
2017-03-01  9:47  0%             ` Bruce Richardson
2017-02-23 17:23  3% ` [dpdk-dev] [PATCH v1 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
2017-02-23 17:23  2% ` [dpdk-dev] [PATCH v1 04/14] ring: remove debug setting Bruce Richardson
2017-02-23 17:23  4% ` [dpdk-dev] [PATCH v1 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
2017-02-23 17:23  2% ` [dpdk-dev] [PATCH v1 06/14] ring: remove watermark support Bruce Richardson
2017-02-23 17:24  2% ` [dpdk-dev] [PATCH v1 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
2017-02-23 17:24  2% ` [dpdk-dev] [PATCH v1 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
2017-03-07 11:32     ` [dpdk-dev] [PATCH v2 00/14] refactor and cleanup of rte_ring Bruce Richardson
2017-03-07 11:32  4%   ` [dpdk-dev] [PATCH v2 01/14] ring: remove split cacheline build setting Bruce Richardson
2017-03-07 11:32  3%   ` [dpdk-dev] [PATCH v2 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
2017-03-07 11:32  2%   ` [dpdk-dev] [PATCH v2 04/14] ring: remove debug setting Bruce Richardson
2017-03-07 11:32  4%   ` [dpdk-dev] [PATCH v2 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
2017-03-07 11:32  2%   ` [dpdk-dev] [PATCH v2 06/14] ring: remove watermark support Bruce Richardson
2017-03-07 11:32  2%   ` [dpdk-dev] [PATCH v2 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
2017-03-07 11:32  2%   ` [dpdk-dev] [PATCH v2 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
2017-03-24 17:09       ` [dpdk-dev] [PATCH v3 00/14] refactor and cleanup of rte_ring Bruce Richardson
2017-03-24 17:09  5%     ` [dpdk-dev] [PATCH v3 01/14] ring: remove split cacheline build setting Bruce Richardson
2017-03-24 17:09  3%     ` [dpdk-dev] [PATCH v3 03/14] ring: eliminate duplication of size and mask fields Bruce Richardson
2017-03-27  9:52           ` Thomas Monjalon
2017-03-27 10:13  3%         ` Bruce Richardson
2017-03-24 17:09  2%     ` [dpdk-dev] [PATCH v3 04/14] ring: remove debug setting Bruce Richardson
2017-03-24 17:09  4%     ` [dpdk-dev] [PATCH v3 05/14] ring: remove the yield when waiting for tail update Bruce Richardson
2017-03-24 17:10  2%     ` [dpdk-dev] [PATCH v3 06/14] ring: remove watermark support Bruce Richardson
2017-03-24 17:10  2%     ` [dpdk-dev] [PATCH v3 07/14] ring: make bulk and burst fn return vals consistent Bruce Richardson
2017-03-24 17:10  2%     ` [dpdk-dev] [PATCH v3 09/14] ring: allow dequeue fns to return remaining entry count Bruce Richardson
2017-02-24 16:28     [dpdk-dev] [PATCH v2 0/2] ethdev: abstraction layer for QoS hierarchical scheduler Cristian Dumitrescu
2017-02-24 16:28  1% ` [dpdk-dev] [PATCH v2 2/2] ethdev: add hierarchical scheduler API Cristian Dumitrescu
2017-02-25  1:22     [dpdk-dev] [PATCH 00/16] Wind River Systems AVP PMD Allain Legacy
2017-02-25  1:23  3% ` [dpdk-dev] [PATCH 04/16] net/avp: add PMD version map file Allain Legacy
2017-02-26 19:08     ` [dpdk-dev] [PATCH v2 00/16] Wind River Systems AVP PMD Allain Legacy
2017-02-26 19:08  3%   ` [dpdk-dev] [PATCH v2 04/15] net/avp: add PMD version map file Allain Legacy
2017-03-02  0:19       ` [dpdk-dev] [PATCH v3 00/16] Wind River Systems AVP PMD Allain Legacy
2017-03-02  0:19  3%     ` [dpdk-dev] [PATCH v3 04/16] net/avp: add PMD version map file Allain Legacy
2017-03-13 19:16         ` [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD Allain Legacy
2017-03-13 19:16  3%       ` [dpdk-dev] [PATCH v4 04/17] net/avp: add PMD version map file Allain Legacy
2017-03-16 14:52  0%         ` Ferruh Yigit
2017-03-14 17:37           ` [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD vs virtio? Vincent JARDIN
2017-03-15  4:10             ` O'Driscoll, Tim
2017-03-16 23:17  3%           ` Stephen Hemminger
2017-03-16 23:41  0%             ` [dpdk-dev] [PATCH v4 00/17] Wind River Systems AVP PMD vs virtio? - ivshmem is back Vincent JARDIN
2017-03-17  0:08  0%               ` Wiles, Keith
2017-03-17  0:15  0%                 ` O'Driscoll, Tim
2017-03-17  0:11  0%               ` Wiles, Keith
2017-03-17  0:14  0%                 ` Stephen Hemminger
2017-03-02  4:03  3% [dpdk-dev] [PATCH 0/6] introduce prgdev abstraction library Chen Jing D(Mark)
2017-03-02  4:03  4% ` [dpdk-dev] [PATCH 5/6] prgdev: add ABI control info Chen Jing D(Mark)
2017-03-02 19:29     [dpdk-dev] [PATCH 0/5] librte_cfgfile enhancement Allain Legacy
2017-03-02 19:29     ` [dpdk-dev] [PATCH 1/5] cfgfile: configurable comment character Allain Legacy
2017-03-02 21:10       ` Bruce Richardson
2017-03-03  0:53         ` Yuanhan Liu
2017-03-03 11:17           ` Dumitrescu, Cristian
2017-03-03 11:31             ` Legacy, Allain
2017-03-03 12:10  4%           ` Bruce Richardson
2017-03-03 12:17  0%             ` Legacy, Allain
2017-03-03 13:10  0%               ` Bruce Richardson
2017-03-03  9:31     [dpdk-dev] [PATCH 0/4] support replace filter function Beilei Xing
2017-03-03  9:31     ` [dpdk-dev] [PATCH 1/4] net/i40e: support replace filter type Beilei Xing
2017-03-08 15:50       ` Ferruh Yigit
2017-03-09  5:59  3%     ` Xing, Beilei
2017-03-09 10:01  0%       ` Ferruh Yigit
2017-03-09 10:43  0%         ` Xing, Beilei
2017-03-03  9:31     ` [dpdk-dev] [PATCH 4/4] net/i40e: refine consistent tunnel filter Beilei Xing
2017-03-08 15:50       ` Ferruh Yigit
2017-03-09  6:11  3%     ` Xing, Beilei
2017-03-03  9:51  4% [dpdk-dev] [PATCH 00/17] vhost: generic vhost API Yuanhan Liu
2017-03-03  9:51  3% ` [dpdk-dev] [PATCH 16/17] vhost: rename header file Yuanhan Liu
2017-03-23  7:10  4% ` [dpdk-dev] [PATCH v2 00/22] vhost: generic vhost API Yuanhan Liu
2017-03-23  7:10  5%   ` [dpdk-dev] [PATCH v2 02/22] net/vhost: remove feature related APIs Yuanhan Liu
2017-03-23  7:10  3%   ` [dpdk-dev] [PATCH v2 04/22] vhost: make notify ops per vhost driver Yuanhan Liu
2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 10/22] vhost: export the number of vrings Yuanhan Liu
2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 12/22] vhost: drop the Rx and Tx queue macro Yuanhan Liu
2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 13/22] vhost: do not include net specific headers Yuanhan Liu
2017-03-23  7:10  4%   ` [dpdk-dev] [PATCH v2 14/22] vhost: rename device ops struct Yuanhan Liu
2017-03-23  7:10  3%   ` [dpdk-dev] [PATCH v2 18/22] vhost: introduce API to start a specific driver Yuanhan Liu
2017-03-23  7:10  5%   ` [dpdk-dev] [PATCH v2 19/22] vhost: rename header file Yuanhan Liu
2017-03-03 15:40     [dpdk-dev] [PATCH 00/12] introduce fail-safe PMD Gaetan Rivet
2017-03-14 14:49     ` [dpdk-dev] [PATCH v2 00/13] " Gaëtan Rivet
2017-03-15  3:28       ` Bruce Richardson
2017-03-15 11:15         ` Thomas Monjalon
2017-03-15 14:25           ` Gaëtan Rivet
2017-03-16 20:50             ` Neil Horman
2017-03-17 10:56               ` Gaëtan Rivet
2017-03-18 19:51  3%             ` Neil Horman
2017-03-04  1:10     [dpdk-dev] [PATCH v3 0/2] ethdev: abstraction layer for QoS hierarchical scheduler Cristian Dumitrescu
2017-03-04  1:10  1% ` [dpdk-dev] [PATCH v3 2/2] ethdev: add hierarchical scheduler API Cristian Dumitrescu
2017-03-06 16:57     ` [dpdk-dev] [PATCH v3 1/2] ethdev: add capability control API Thomas Monjalon
2017-03-06 18:28       ` Dumitrescu, Cristian
2017-03-06 20:21         ` Thomas Monjalon
2017-03-06 20:41  3%       ` Wiles, Keith
2017-03-07 11:11     [dpdk-dev] Issues with ixgbe and rte_flow Le Scouarnec Nicolas
2017-03-08  3:16     ` Lu, Wenzhuo
2017-03-08  9:24       ` Le Scouarnec Nicolas
2017-03-08 15:41  3%     ` Adrien Mazarguil
2017-03-09 15:37     [dpdk-dev] [PATCH v2] lpm: extend IPv6 next hop field Thomas Monjalon
2017-03-14 17:17  4% ` [dpdk-dev] [PATCH v3] " Vladyslav Buslov
2017-03-09 16:25     [dpdk-dev] [PATCH v11 0/7] Expanded statistics reporting Remy Horton
2017-03-09 16:25  1% ` [dpdk-dev] [PATCH v11 1/7] lib: add information metrics library Remy Horton
2017-03-09 16:25  2% ` [dpdk-dev] [PATCH v11 3/7] lib: add bitrate statistics library Remy Horton
2017-03-17 21:15  1% [dpdk-dev] [PATCH] net/ark: poll-mode driver for AtomicRules Arkville Ed Czeck
2017-03-20 21:14  1% [dpdk-dev] [PATCH v2] " Ed Czeck
2017-03-21 21:43  3% [dpdk-dev] [PATCH v3 1/7] net/ark: PMD for Atomic Rules Arkville driver stub Ed Czeck
2017-03-22 18:16  0% ` Ferruh Yigit
2017-03-23  1:03  3% [dpdk-dev] [PATCH v4 " Ed Czeck
2017-03-23 22:59  3% ` [dpdk-dev] [PATCH v5 " Ed Czeck
2017-03-23 10:02     [dpdk-dev] [PATCH v3 0/5] pipeline personalization profile support Beilei Xing
2017-03-24 10:19     ` [dpdk-dev] [PATCH v4 " Beilei Xing
2017-03-24 10:19       ` [dpdk-dev] [PATCH v4 1/5] net/i40e: add pipeline personalization profile processing Beilei Xing
2017-03-24 14:52         ` Chilikin, Andrey
2017-03-25  4:04           ` Xing, Beilei
2017-03-25 21:03  3%         ` Chilikin, Andrey
2017-03-27  2:09  0%           ` Xing, Beilei
