DPDK patches and discussions
 help / color / mirror / Atom feed
Search results ordered by [date|relevance]  view[summary|nested|Atom feed]
thread overview below | download: 
* Re: [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (9 preceding siblings ...)
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
@ 2015-07-23 14:18  3%       ` Liang, Cunming
  10 siblings, 0 replies; 200+ results
From: Liang, Cunming @ 2015-07-23 14:18 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

Hi Thomas and all,

This patch set postponed from v2.0, and widely reviewed during this release cycle.
The packet I/O interrupt framework is the prerequisite of all PMDs to support packet I/O intr.
There's no significant change since the last three version(v13~v15), the changes are about ABI and version map fixing.
It missed the rc1 in the last minutes, I'm now asking for the exception to make it go into v2.1 rc2 if nobody will reject it.
Again, thanks for all the comments by Stephen, David, Neil and Thomas.

Thanks,
Steve

> -----Original Message-----
> From: Liang, Cunming
> Sent: Monday, July 20, 2015 11:02 AM
> To: dev@dpdk.org; thomas.monjalon@6wind.com
> Cc: shemming@brocade.com; david.marchand@6wind.com; Zhou, Danny; Liu,
> Yong; nhorman@tuxdriver.com; Liang, Cunming
> Subject: [PATCH v15 00/13] Interrupt mode PMD
> 
> v15 changes
>  - remove unnecessary RTE_NEXT_ABI comment
>  - remove ifdef RTE_NEXT_ABI from header file
> 
> v14 changes
>  - per-patch basis ABI compatibility rework
>  - remove unnecessary 'local: *' from version map
>  - minor comments rework
> 
> v13 changes
>  - version map cleanup for v2.1
>  - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility
> 
> Patch series v12
> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
> Acked-by: Danny Zhou <danny.zhou@intel.com>
> 
> v12 changes
>  - bsd cleanup for unused variable warning
>  - fix awkward line split in debug message
> 
> v11 changes
>  - typo cleanup and check kernel style
> 
> v10 changes
>  - code rework to return actual error code
>  - bug fix for lsc when using uio_pci_generic
> 
> v9 changes
>  - code rework to fix open comment
>  - bug fix for igb lsc when both lsc and rxq are enabled in vfio-msix
>  - new patch to turn off the feature by default so as to avoid v2.1 abi broken
> 
> v8 changes
>  - remove condition check for only vfio-msix
>  - add multiplex intr support when only one intr vector allowed
>  - lsc and rxq interrupt runtime enable decision
>  - add safe event delete while the event wakeup execution happens
> 
> v7 changes
>  - decouple epoll event and intr operation
>  - add condition check in the case intr vector is disabled
>  - renaming some APIs
> 
> v6 changes
>  - split rte_intr_wait_rx_pkt into two APIs 'wait' and 'set'.
>  - rewrite rte_intr_rx_wait/rte_intr_rx_set.
>  - using vector number instead of queue_id as interrupt API params.
>  - patch reorder and split.
> 
> v5 changes
>  - Rebase the patchset onto the HEAD
>  - Isolate ethdev from EAL for new-added wait-for-rx interrupt function
>  - Export wait-for-rx interrupt function for shared libraries
>  - Split-off a new patch file for changed struct rte_intr_handle that
>    other patches depend on, to avoid breaking git bisect
>  - Change sample applicaiton to accomodate EAL function spec change
>    accordingly
> 
> v4 changes
>  - Export interrupt enable/disable functions for shared libraries
>  - Adjust position of new-added structure fields and functions to
>    avoid breaking ABI
> 
> v3 changes
>  - Add return value for interrupt enable/disable functions
>  - Move spinlok from PMD to L3fwd-power
>  - Remove unnecessary variables in e1000_mac_info
>  - Fix miscelleous review comments
> 
> v2 changes
>  - Fix compilation issue in Makefile for missed header file.
>  - Consolidate internal and community review comments of v1 patch set.
> 
> The patch series introduce low-latency one-shot rx interrupt into DPDK with
> polling and interrupt mode switch control example.
> 
> DPDK userspace interrupt notification and handling mechanism is based on UIO
> with below limitation:
> 1) It is designed to handle LSC interrupt only with inefficient suspended
>    pthread wakeup procedure (e.g. UIO wakes up LSC interrupt handling thread
>    which then wakes up DPDK polling thread). In this way, it introduces
>    non-deterministic wakeup latency for DPDK polling thread as well as packet
>    latency if it is used to handle Rx interrupt.
> 2) UIO only supports a single interrupt vector which has to been shared by
>    LSC interrupt and interrupts assigned to dedicated rx queues.
> 
> This patchset includes below features:
> 1) Enable one-shot rx queue interrupt in ixgbe PMD(PF & VF) and igb PMD(PF
> only)
> .
> 2) Build on top of the VFIO mechanism instead of UIO, so it could support
>    up to 64 interrupt vectors for rx queue interrupts.
> 3) Have 1 DPDK polling thread handle per Rx queue interrupt with a dedicated
>    VFIO eventfd, which eliminates non-deterministic pthread wakeup latency in
>    user space.
> 4) Demonstrate interrupts control APIs and userspace NAIP-like polling/interrupt
>    switch algorithms in L3fwd-power example.
> 
> Known limitations:
> 1) It does not work for UIO due to a single interrupt eventfd shared by LSC
>    and rx queue interrupt handlers causes a mess. [FIXED]
> 2) LSC interrupt is not supported by VF driver, so it is by default disabled
>    in L3fwd-power now. Feel free to turn in on if you want to support both LSC
>    and rx queue interrupts on a PF.
> 
> Cunming Liang (13):
>   eal/linux: add interrupt vectors support in intr_handle
>   eal/linux: add rte_epoll_wait/ctl support
>   eal/linux: add API to set rx interrupt event monitor
>   eal/linux: fix comments typo on vfio msi
>   eal/linux: map eventfd to VFIO MSI-X intr vector
>   eal/linux: standalone intr event fd create support
>   eal/linux: fix lsc read error in uio_pci_generic
>   eal/bsd: dummy for new intr definition
>   eal/bsd: fix inappropriate linuxapp referred in bsd
>   ethdev: add rx intr enable, disable and ctl functions
>   ixgbe: enable rx queue interrupts for both PF and VF
>   igb: enable rx queue interrupts for PF
>   l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode
>     switch
> 
>  drivers/net/e1000/igb_ethdev.c                     | 311 ++++++++++--
>  drivers/net/ixgbe/ixgbe_ethdev.c                   | 527
> ++++++++++++++++++++-
>  drivers/net/ixgbe/ixgbe_ethdev.h                   |   4 +
>  examples/l3fwd-power/main.c                        | 205 ++++++--
>  lib/librte_eal/bsdapp/eal/eal_interrupts.c         |  42 ++
>  .../bsdapp/eal/include/exec-env/rte_interrupts.h   |  74 ++-
>  lib/librte_eal/bsdapp/eal/rte_eal_version.map      |   5 +
>  lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 414 ++++++++++++++--
>  .../linuxapp/eal/include/exec-env/rte_interrupts.h | 153 ++++++
>  lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   8 +
>  lib/librte_ether/rte_ethdev.c                      | 147 ++++++
>  lib/librte_ether/rte_ethdev.h                      | 104 ++++
>  lib/librte_ether/rte_ether_version.map             |   4 +
>  13 files changed, 1870 insertions(+), 128 deletions(-)
> 
> --
> 1.8.1.4

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
  2015-07-23 10:59  4% [dpdk-dev] [PATCH v2] announce ABI change for librte_table Cristian Dumitrescu
  2015-07-23 11:05  4% ` Singh, Jasvinder
  2015-07-23 11:07  4% ` Mrzyglod, DanielX T
@ 2015-07-23 11:34  4% ` Gajdzica, MaciejX T
  2 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-23 11:34 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 23, 2015 1:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
> 
> v2 changes:
> -changed item on LPM table to add LPM IPv6 -removed item for ACL table and
> replaced with item on table ops -added item for hash tables
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
  2015-07-23 10:59  4% [dpdk-dev] [PATCH v2] announce ABI change for librte_table Cristian Dumitrescu
@ 2015-07-23 11:05  4% ` Singh, Jasvinder
  2015-07-23 11:07  4% ` Mrzyglod, DanielX T
  2015-07-23 11:34  4% ` Gajdzica, MaciejX T
  2 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-23 11:05 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 23, 2015 12:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
> 
> v2 changes:
> -changed item on LPM table to add LPM IPv6 -removed item for ACL table
> and replaced with item on table ops -added item for hash tables
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
  2015-07-23 10:59  4% [dpdk-dev] [PATCH v2] announce ABI change for librte_table Cristian Dumitrescu
  2015-07-23 11:05  4% ` Singh, Jasvinder
@ 2015-07-23 11:07  4% ` Mrzyglod, DanielX T
  2015-07-23 11:34  4% ` Gajdzica, MaciejX T
  2 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-23 11:07 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 23, 2015 1:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
> 
> v2 changes:
> -changed item on LPM table to add LPM IPv6
> -removed item for ACL table and replaced with item on table ops
> -added item for hash tables
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---
>  doc/guides/rel_notes/deprecation.rst |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index 5330d3b..677f111 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -35,3 +35,13 @@ Deprecation Notices
>  * The following fields have been deprecated in rte_eth_stats:
>    imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
>    tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
> +
> +* librte_table (rte_table_lpm.h, rte_table_lpm_ipv6.h): A new parameter to
> hold
> +  the table name will be added to the LPM table parameter structure.
> +
> +* librte_table (rte_table.h): New functions for table entry bulk add/delete will
> +  be added to the table operations structure.
> +
> +* librte_table (rte_table_hash.h): Key mask parameter will be added to the hash
> +  table parameter structure for 8-byte key and 16-byte key extendible bucket
> and
> +  LRU tables.
> --
> 1.7.4.1

Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2] announce ABI change for librte_table
@ 2015-07-23 10:59  4% Cristian Dumitrescu
  2015-07-23 11:05  4% ` Singh, Jasvinder
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-23 10:59 UTC (permalink / raw)
  To: dev

v2 changes:
-changed item on LPM table to add LPM IPv6
-removed item for ACL table and replaced with item on table ops 
-added item for hash tables

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/deprecation.rst |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 5330d3b..677f111 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -35,3 +35,13 @@ Deprecation Notices
 * The following fields have been deprecated in rte_eth_stats:
   imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
+
+* librte_table (rte_table_lpm.h, rte_table_lpm_ipv6.h): A new parameter to hold
+  the table name will be added to the LPM table parameter structure.
+
+* librte_table (rte_table.h): New functions for table entry bulk add/delete will
+  be added to the table operations structure.
+
+* librte_table (rte_table_hash.h): Key mask parameter will be added to the hash
+  table parameter structure for 8-byte key and 16-byte key extendible bucket and
+  LRU tables.
-- 
1.7.4.1

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-22 19:48  0%     ` Stephen Hemminger
@ 2015-07-23 10:52  0%       ` Ananyev, Konstantin
  0 siblings, 0 replies; 200+ results
From: Ananyev, Konstantin @ 2015-07-23 10:52 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev



> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, July 22, 2015 8:48 PM
> To: Ananyev, Konstantin
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
> 
> On Wed, 22 Jul 2015 19:28:51 +0100
> Konstantin Ananyev <konstantin.ananyev@intel.com> wrote:
> 
> > Add the ability for the upper layer to query RX/TX queue information.
> >
> > Add new structures:
> > struct rte_eth_rxq_info
> > struct rte_eth_txq_info
> >
> > new functions:
> > rte_eth_rx_queue_info_get
> > rte_eth_tx_queue_info_get
> >
> > into rte_etdev API.
> >
> > Left extra free space in the queue info structures,
> > so extra fields could be added later without ABI breakage.
> >
> > v2 changes:
> > - Add formal check for the qinfo input parameter.
> > - As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'
> >
> > v3 changes:
> > - Updated rte_ether_version.map
> > - Merged with latest changes
> >
> > v4 changes:
> > - rte_ether_version.map: move new functions into DPDK_2.1 sub-space.
> >
> > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> 
> Since all this data should be rxconf already, Is it possible
> to do a generic version of this and not have to change every driver.

I don't think it is possible to implement these two functions at rte_etdev level only.
At least not with current ethdev/PMD implementation:
-  Inside struct rte_eth_dev_info we have only: 'struct rte_eth_rxconf default_rxconf;'.
We don't have rxconf here for each configured rx queue.
That information is maintained by PMD and inside PMD, different devices have different format for queue structure.
- rte_eth_rxq_info contains not only rxconf but some extra information: mempool in use by that queue,
 min/max possible number of descriptors.
 Also my intention was that in future that structure would be extended to provide some RT info about queue:
 (number of free/used descriptors from SW point of view, etc).  

Konstantin

> 
> You only handled the Intel hardware drivers. But there also
> all the virtual drivers, other vendors etc.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched
  2015-07-16 21:21 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched Stephen Hemminger
  2015-07-16 21:25  4% ` Dumitrescu, Cristian
  2015-07-16 21:28  4% ` Neil Horman
@ 2015-07-23 10:18  4% ` Dumitrescu, Cristian
  2 siblings, 0 replies; 200+ results
From: Dumitrescu, Cristian @ 2015-07-23 10:18 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-22 18:28  2%   ` [dpdk-dev] [PATCHv4 " Konstantin Ananyev
@ 2015-07-22 19:48  0%     ` Stephen Hemminger
  2015-07-23 10:52  0%       ` Ananyev, Konstantin
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2015-07-22 19:48 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev

On Wed, 22 Jul 2015 19:28:51 +0100
Konstantin Ananyev <konstantin.ananyev@intel.com> wrote:

> Add the ability for the upper layer to query RX/TX queue information.
> 
> Add new structures:
> struct rte_eth_rxq_info
> struct rte_eth_txq_info
> 
> new functions:
> rte_eth_rx_queue_info_get
> rte_eth_tx_queue_info_get
> 
> into rte_etdev API.
> 
> Left extra free space in the queue info structures,
> so extra fields could be added later without ABI breakage.
> 
> v2 changes:
> - Add formal check for the qinfo input parameter.
> - As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'
> 
> v3 changes:
> - Updated rte_ether_version.map
> - Merged with latest changes
> 
> v4 changes:
> - rte_ether_version.map: move new functions into DPDK_2.1 sub-space.
> 
> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Since all this data should be rxconf already, Is it possible
to do a generic version of this and not have to change every driver.

You only handled the Intel hardware drivers. But there also
all the virtual drivers, other vendors etc.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-20 12:19  2% ` [dpdk-dev] [PATCHv3 1/5] " Konstantin Ananyev
  2015-07-22 16:50  0%   ` Zhang, Helin
@ 2015-07-22 18:28  2%   ` Konstantin Ananyev
  2015-07-22 19:48  0%     ` Stephen Hemminger
  1 sibling, 1 reply; 200+ results
From: Konstantin Ananyev @ 2015-07-22 18:28 UTC (permalink / raw)
  To: dev

Add the ability for the upper layer to query RX/TX queue information.

Add new structures:
struct rte_eth_rxq_info
struct rte_eth_txq_info

new functions:
rte_eth_rx_queue_info_get
rte_eth_tx_queue_info_get

into rte_etdev API.

Left extra free space in the queue info structures,
so extra fields could be added later without ABI breakage.

v2 changes:
- Add formal check for the qinfo input parameter.
- As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'

v3 changes:
- Updated rte_ether_version.map
- Merged with latest changes

v4 changes:
- rte_ether_version.map: move new functions into DPDK_2.1 sub-space.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_ether/rte_ethdev.c          | 54 +++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h          | 87 +++++++++++++++++++++++++++++++---
 lib/librte_ether/rte_ether_version.map |  2 +
 3 files changed, 137 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 94104ce..a94c119 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3341,6 +3341,60 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 }
 
 int
+rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_rxq_info *qinfo)
+{
+	struct rte_eth_dev *dev;
+
+	if (qinfo == NULL)
+		return -EINVAL;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -EINVAL;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_rx_queues) {
+		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
+		return -EINVAL;
+	}
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rxq_info_get, -ENOTSUP);
+
+	memset(qinfo, 0, sizeof(*qinfo));
+	dev->dev_ops->rxq_info_get(dev, queue_id, qinfo);
+	return 0;
+}
+
+int
+rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_txq_info *qinfo)
+{
+	struct rte_eth_dev *dev;
+
+	if (qinfo == NULL)
+		return -EINVAL;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -EINVAL;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_tx_queues) {
+		PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
+		return -EINVAL;
+	}
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->txq_info_get, -ENOTSUP);
+
+	memset(qinfo, 0, sizeof(*qinfo));
+	dev->dev_ops->txq_info_get(dev, queue_id, qinfo);
+	return 0;
+}
+
+int
 rte_eth_dev_set_mc_addr_list(uint8_t port_id,
 			     struct ether_addr *mc_addr_set,
 			     uint32_t nb_mc_addr)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index c901a2c..0c6705e 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -960,6 +960,30 @@ struct rte_eth_xstats {
 	uint64_t value;
 };
 
+/**
+ * Ethernet device RX queue information strcuture.
+ * Used to retieve information about configured queue.
+ */
+struct rte_eth_rxq_info {
+	struct rte_mempool *mp;     /**< mempool used by that queue. */
+	struct rte_eth_rxconf conf; /**< queue config parameters. */
+	uint8_t scattered_rx;       /**< scattered packets RX supported. */
+	uint16_t nb_desc;           /**< configured number of RXDs. */
+	uint16_t max_desc;          /**< max allowed number of RXDs. */
+	uint16_t min_desc;          /**< min allowed number of RXDs. */
+} __rte_cache_aligned;
+
+/**
+ * Ethernet device TX queue information strcuture.
+ * Used to retieve information about configured queue.
+ */
+struct rte_eth_txq_info {
+	struct rte_eth_txconf conf; /**< queue config parameters. */
+	uint16_t nb_desc;           /**< configured number of TXDs. */
+	uint16_t max_desc;          /**< max allowed number of TXDs. */
+	uint16_t min_desc;          /**< min allowed number of TXDs. */
+} __rte_cache_aligned;
+
 struct rte_eth_dev;
 
 struct rte_eth_dev_callback;
@@ -1063,6 +1087,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
 typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
 /**< @internal Check DD bit of specific RX descriptor */
 
+typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
+	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
+
+typedef void (*eth_txq_info_get_t)(struct rte_eth_dev *dev,
+	uint16_t tx_queue_id, struct rte_eth_txq_info *qinfo);
+
 typedef int (*mtu_set_t)(struct rte_eth_dev *dev, uint16_t mtu);
 /**< @internal Set MTU. */
 
@@ -1451,9 +1481,13 @@ struct eth_dev_ops {
 	rss_hash_update_t rss_hash_update;
 	/** Get current RSS hash configuration. */
 	rss_hash_conf_get_t rss_hash_conf_get;
-	eth_filter_ctrl_t              filter_ctrl;          /**< common filter control*/
+	eth_filter_ctrl_t              filter_ctrl;
+	/**< common filter control. */
 	eth_set_mc_addr_list_t set_mc_addr_list; /**< set list of mcast addrs */
-
+	eth_rxq_info_get_t rxq_info_get;
+	/**< retrieve RX queue information. */
+	eth_txq_info_get_t txq_info_get;
+	/**< retrieve TX queue information. */
 	/** Turn IEEE1588/802.1AS timestamping on. */
 	eth_timesync_enable_t timesync_enable;
 	/** Turn IEEE1588/802.1AS timestamping off. */
@@ -3721,6 +3755,46 @@ int rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 		struct rte_eth_rxtx_callback *user_cb);
 
 /**
+ * Retrieve information about given port's RX queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The RX queue on the Ethernet device for which information
+ *   will be retrieved.
+ * @param qinfo
+ *   A pointer to a structure of type *rte_eth_rxq_info_info* to be filled with
+ *   the information of the Ethernet device.
+ *
+ * @return
+ *   - 0: Success
+ *   - -ENOTSUP: routine is not supported by the device PMD.
+ *   - -EINVAL:  The port_id or the queue_id is out of range.
+ */
+int rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_rxq_info *qinfo);
+
+/**
+ * Retrieve information about given port's TX queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The TX queue on the Ethernet device for which information
+ *   will be retrieved.
+ * @param qinfo
+ *   A pointer to a structure of type *rte_eth_txq_info_info* to be filled with
+ *   the information of the Ethernet device.
+ *
+ * @return
+ *   - 0: Success
+ *   - -ENOTSUP: routine is not supported by the device PMD.
+ *   - -EINVAL:  The port_id or the queue_id is out of range.
+ */
+int rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_txq_info *qinfo);
+
+/*
  * Retrieve number of available registers for access
  *
  * @param port_id
@@ -3793,10 +3867,6 @@ int rte_eth_dev_get_eeprom(uint8_t port_id, struct rte_dev_eeprom_info *info);
  */
 int rte_eth_dev_set_eeprom(uint8_t port_id, struct rte_dev_eeprom_info *info);
 
-#ifdef __cplusplus
-}
-#endif
-
 /**
  * Set the list of multicast addresses to filter on an Ethernet device.
  *
@@ -3882,4 +3952,9 @@ extern int rte_eth_timesync_read_rx_timestamp(uint8_t port_id,
  */
 extern int rte_eth_timesync_read_tx_timestamp(uint8_t port_id,
 					      struct timespec *timestamp);
+
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _RTE_ETHDEV_H_ */
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 23cfee9..2bae2cf 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -117,9 +117,11 @@ DPDK_2.1 {
 	rte_eth_dev_is_valid_port;
 	rte_eth_dev_set_eeprom;
 	rte_eth_dev_set_mc_addr_list;
+	rte_eth_rx_queue_info_get;
 	rte_eth_timesync_disable;
 	rte_eth_timesync_enable;
 	rte_eth_timesync_read_rx_timestamp;
 	rte_eth_timesync_read_tx_timestamp;
+	rte_eth_tx_queue_info_get;
 
 } DPDK_2.0;
-- 
1.8.3.1

^ permalink raw reply	[relevance 2%]

* Re: [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-22 16:50  0%   ` Zhang, Helin
@ 2015-07-22 17:00  0%     ` Ananyev, Konstantin
  0 siblings, 0 replies; 200+ results
From: Ananyev, Konstantin @ 2015-07-22 17:00 UTC (permalink / raw)
  To: Zhang, Helin; +Cc: dev



> -----Original Message-----
> From: Zhang, Helin
> Sent: Wednesday, July 22, 2015 5:51 PM
> To: Ananyev, Konstantin
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue information
> 
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Konstantin Ananyev
> > Sent: Monday, July 20, 2015 5:19 AM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue
> > information
> >
> > Add the ability for the upper layer to query RX/TX queue information.
> >
> > Add new structures:
> > struct rte_eth_rxq_info
> > struct rte_eth_txq_info
> >
> > new functions:
> > rte_eth_rx_queue_info_get
> > rte_eth_tx_queue_info_get
> >
> > into rte_etdev API.
> >
> > Left extra free space in the queue info structures, so extra fields could be added
> > later without ABI breakage.
> >
> > v2 changes:
> > - Add formal check for the qinfo input parameter.
> > - As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'
> >
> > v3 changes:
> > - Updated rte_ether_version.map
> > - Merged with latest changes
> >
> > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > ---
> >  lib/librte_ether/rte_ethdev.c          | 54 +++++++++++++++++++++
> >  lib/librte_ether/rte_ethdev.h          | 87
> > +++++++++++++++++++++++++++++++---
> >  lib/librte_ether/rte_ether_version.map |  2 +
> >  3 files changed, 137 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index
> > 94104ce..a94c119 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -3341,6 +3341,60 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t
> > queue_id,  }
> >
> >  int
> > +rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> > +	struct rte_eth_rxq_info *qinfo)
> > +{
> > +	struct rte_eth_dev *dev;
> > +
> > +	if (qinfo == NULL)
> > +		return -EINVAL;
> > +
> > +	if (!rte_eth_dev_is_valid_port(port_id)) {
> > +		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +		return -EINVAL;
> > +	}
> > +
> > +	dev = &rte_eth_devices[port_id];
> > +	if (queue_id >= dev->data->nb_rx_queues) {
> > +		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
> > +		return -EINVAL;
> > +	}
> > +
> > +	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rxq_info_get, -ENOTSUP);
> > +
> > +	memset(qinfo, 0, sizeof(*qinfo));
> > +	dev->dev_ops->rxq_info_get(dev, queue_id, qinfo);
> > +	return 0;
> > +}
> > +
> > +int
> > +rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> > +	struct rte_eth_txq_info *qinfo)
> > +{
> > +	struct rte_eth_dev *dev;
> > +
> > +	if (qinfo == NULL)
> > +		return -EINVAL;
> > +
> > +	if (!rte_eth_dev_is_valid_port(port_id)) {
> > +		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +		return -EINVAL;
> > +	}
> > +
> > +	dev = &rte_eth_devices[port_id];
> > +	if (queue_id >= dev->data->nb_tx_queues) {
> > +		PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
> > +		return -EINVAL;
> > +	}
> > +
> > +	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->txq_info_get, -ENOTSUP);
> > +
> > +	memset(qinfo, 0, sizeof(*qinfo));
> > +	dev->dev_ops->txq_info_get(dev, queue_id, qinfo);
> > +	return 0;
> > +}
> > +
> > +int
> >  rte_eth_dev_set_mc_addr_list(uint8_t port_id,
> >  			     struct ether_addr *mc_addr_set,
> >  			     uint32_t nb_mc_addr)
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index
> > c901a2c..0c6705e 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -960,6 +960,30 @@ struct rte_eth_xstats {
> >  	uint64_t value;
> >  };
> >
> > +/**
> > + * Ethernet device RX queue information strcuture.
> > + * Used to retieve information about configured queue.
> > + */
> > +struct rte_eth_rxq_info {
> > +	struct rte_mempool *mp;     /**< mempool used by that queue. */
> > +	struct rte_eth_rxconf conf; /**< queue config parameters. */
> > +	uint8_t scattered_rx;       /**< scattered packets RX supported. */
> > +	uint16_t nb_desc;           /**< configured number of RXDs. */
> > +	uint16_t max_desc;          /**< max allowed number of RXDs. */
> > +	uint16_t min_desc;          /**< min allowed number of RXDs. */
> > +} __rte_cache_aligned;
> > +
> > +/**
> > + * Ethernet device TX queue information strcuture.
> > + * Used to retieve information about configured queue.
> > + */
> > +struct rte_eth_txq_info {
> > +	struct rte_eth_txconf conf; /**< queue config parameters. */
> > +	uint16_t nb_desc;           /**< configured number of TXDs. */
> > +	uint16_t max_desc;          /**< max allowed number of TXDs. */
> > +	uint16_t min_desc;          /**< min allowed number of TXDs. */
> > +} __rte_cache_aligned;
> > +
> >  struct rte_eth_dev;
> >
> >  struct rte_eth_dev_callback;
> > @@ -1063,6 +1087,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct
> > rte_eth_dev *dev,  typedef int (*eth_rx_descriptor_done_t)(void *rxq,
> > uint16_t offset);  /**< @internal Check DD bit of specific RX descriptor */
> >
> > +typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
> > +	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
> > +
> > +typedef void (*eth_txq_info_get_t)(struct rte_eth_dev *dev,
> > +	uint16_t tx_queue_id, struct rte_eth_txq_info *qinfo);
> > +
> >  typedef int (*mtu_set_t)(struct rte_eth_dev *dev, uint16_t mtu);  /**<
> > @internal Set MTU. */
> >
> > @@ -1451,9 +1481,13 @@ struct eth_dev_ops {
> >  	rss_hash_update_t rss_hash_update;
> >  	/** Get current RSS hash configuration. */
> >  	rss_hash_conf_get_t rss_hash_conf_get;
> > -	eth_filter_ctrl_t              filter_ctrl;          /**< common filter
> > control*/
> > +	eth_filter_ctrl_t              filter_ctrl;
> > +	/**< common filter control. */
> >  	eth_set_mc_addr_list_t set_mc_addr_list; /**< set list of mcast addrs */
> > -
> > +	eth_rxq_info_get_t rxq_info_get;
> > +	/**< retrieve RX queue information. */
> > +	eth_txq_info_get_t txq_info_get;
> > +	/**< retrieve TX queue information. */
> >  	/** Turn IEEE1588/802.1AS timestamping on. */
> >  	eth_timesync_enable_t timesync_enable;
> >  	/** Turn IEEE1588/802.1AS timestamping off. */ @@ -3721,6 +3755,46 @@
> > int rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
> >  		struct rte_eth_rxtx_callback *user_cb);
> Is it targeting R2.1? If no, the new ops should be added at the end of this structure?
> 
> >
> >  /**
> > + * Retrieve information about given port's RX queue.
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @param queue_id
> > + *   The RX queue on the Ethernet device for which information
> > + *   will be retrieved.
> > + * @param qinfo
> > + *   A pointer to a structure of type *rte_eth_rxq_info_info* to be filled with
> > + *   the information of the Ethernet device.
> > + *
> > + * @return
> > + *   - 0: Success
> > + *   - -ENOTSUP: routine is not supported by the device PMD.
> > + *   - -EINVAL:  The port_id or the queue_id is out of range.
> > + */
> > +int rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> > +	struct rte_eth_rxq_info *qinfo);
> > +
> > +/**
> > + * Retrieve information about given port's TX queue.
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @param queue_id
> > + *   The TX queue on the Ethernet device for which information
> > + *   will be retrieved.
> > + * @param qinfo
> > + *   A pointer to a structure of type *rte_eth_txq_info_info* to be filled with
> > + *   the information of the Ethernet device.
> > + *
> > + * @return
> > + *   - 0: Success
> > + *   - -ENOTSUP: routine is not supported by the device PMD.
> > + *   - -EINVAL:  The port_id or the queue_id is out of range.
> > + */
> > +int rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> > +	struct rte_eth_txq_info *qinfo);
> > +
> > +/*
> >   * Retrieve number of available registers for access
> >   *
> >   * @param port_id
> > @@ -3793,10 +3867,6 @@ int rte_eth_dev_get_eeprom(uint8_t port_id, struct
> > rte_dev_eeprom_info *info);
> >   */
> >  int rte_eth_dev_set_eeprom(uint8_t port_id, struct rte_dev_eeprom_info
> > *info);
> >
> > -#ifdef __cplusplus
> > -}
> > -#endif
> 
> 
> > -
> >  /**
> >   * Set the list of multicast addresses to filter on an Ethernet device.
> >   *
> > @@ -3882,4 +3952,9 @@ extern int
> > rte_eth_timesync_read_rx_timestamp(uint8_t port_id,
> >   */
> >  extern int rte_eth_timesync_read_tx_timestamp(uint8_t port_id,
> >  					      struct timespec *timestamp);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> This is fix for the issue introduced by new ieee1588. It must be added in R2.1 no matter
> these patches can be applied or not.
> 
> > +
> >  #endif /* _RTE_ETHDEV_H_ */
> > diff --git a/lib/librte_ether/rte_ether_version.map
> > b/lib/librte_ether/rte_ether_version.map
> > index 23cfee9..8de0928 100644
> > --- a/lib/librte_ether/rte_ether_version.map
> > +++ b/lib/librte_ether/rte_ether_version.map
> > @@ -92,6 +92,7 @@ DPDK_2.0 {
> >  	rte_eth_rx_burst;
> >  	rte_eth_rx_descriptor_done;
> >  	rte_eth_rx_queue_count;
> > +	rte_eth_rx_queue_info_get;
> >  	rte_eth_rx_queue_setup;
> >  	rte_eth_set_queue_rate_limit;
> >  	rte_eth_set_vf_rate_limit;
> > @@ -99,6 +100,7 @@ DPDK_2.0 {
> >  	rte_eth_stats_get;
> >  	rte_eth_stats_reset;
> >  	rte_eth_tx_burst;
> > +	rte_eth_tx_queue_info_get;
> >  	rte_eth_tx_queue_setup;
> >  	rte_eth_xstats_get;
> >  	rte_eth_xstats_reset;
> I am not quite sure about the version map. But I have a question: does it need to create new {} for DPDK_2.1?
> And should your changes be in DPDK_2.1?

Ah yes, I think you right - it should be in 2.1, not 2.0.
Will re-spin.
Konstantin

> 
> Regards,
> Helin
> 
> > --
> > 1.8.3.1

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-20 12:19  2% ` [dpdk-dev] [PATCHv3 1/5] " Konstantin Ananyev
@ 2015-07-22 16:50  0%   ` Zhang, Helin
  2015-07-22 17:00  0%     ` Ananyev, Konstantin
  2015-07-22 18:28  2%   ` [dpdk-dev] [PATCHv4 " Konstantin Ananyev
  1 sibling, 1 reply; 200+ results
From: Zhang, Helin @ 2015-07-22 16:50 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Konstantin Ananyev
> Sent: Monday, July 20, 2015 5:19 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue
> information
> 
> Add the ability for the upper layer to query RX/TX queue information.
> 
> Add new structures:
> struct rte_eth_rxq_info
> struct rte_eth_txq_info
> 
> new functions:
> rte_eth_rx_queue_info_get
> rte_eth_tx_queue_info_get
> 
> into rte_etdev API.
> 
> Left extra free space in the queue info structures, so extra fields could be added
> later without ABI breakage.
> 
> v2 changes:
> - Add formal check for the qinfo input parameter.
> - As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'
> 
> v3 changes:
> - Updated rte_ether_version.map
> - Merged with latest changes
> 
> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> ---
>  lib/librte_ether/rte_ethdev.c          | 54 +++++++++++++++++++++
>  lib/librte_ether/rte_ethdev.h          | 87
> +++++++++++++++++++++++++++++++---
>  lib/librte_ether/rte_ether_version.map |  2 +
>  3 files changed, 137 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index
> 94104ce..a94c119 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -3341,6 +3341,60 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t
> queue_id,  }
> 
>  int
> +rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> +	struct rte_eth_rxq_info *qinfo)
> +{
> +	struct rte_eth_dev *dev;
> +
> +	if (qinfo == NULL)
> +		return -EINVAL;
> +
> +	if (!rte_eth_dev_is_valid_port(port_id)) {
> +		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> +		return -EINVAL;
> +	}
> +
> +	dev = &rte_eth_devices[port_id];
> +	if (queue_id >= dev->data->nb_rx_queues) {
> +		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
> +		return -EINVAL;
> +	}
> +
> +	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rxq_info_get, -ENOTSUP);
> +
> +	memset(qinfo, 0, sizeof(*qinfo));
> +	dev->dev_ops->rxq_info_get(dev, queue_id, qinfo);
> +	return 0;
> +}
> +
> +int
> +rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> +	struct rte_eth_txq_info *qinfo)
> +{
> +	struct rte_eth_dev *dev;
> +
> +	if (qinfo == NULL)
> +		return -EINVAL;
> +
> +	if (!rte_eth_dev_is_valid_port(port_id)) {
> +		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> +		return -EINVAL;
> +	}
> +
> +	dev = &rte_eth_devices[port_id];
> +	if (queue_id >= dev->data->nb_tx_queues) {
> +		PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
> +		return -EINVAL;
> +	}
> +
> +	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->txq_info_get, -ENOTSUP);
> +
> +	memset(qinfo, 0, sizeof(*qinfo));
> +	dev->dev_ops->txq_info_get(dev, queue_id, qinfo);
> +	return 0;
> +}
> +
> +int
>  rte_eth_dev_set_mc_addr_list(uint8_t port_id,
>  			     struct ether_addr *mc_addr_set,
>  			     uint32_t nb_mc_addr)
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index
> c901a2c..0c6705e 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -960,6 +960,30 @@ struct rte_eth_xstats {
>  	uint64_t value;
>  };
> 
> +/**
> + * Ethernet device RX queue information strcuture.
> + * Used to retieve information about configured queue.
> + */
> +struct rte_eth_rxq_info {
> +	struct rte_mempool *mp;     /**< mempool used by that queue. */
> +	struct rte_eth_rxconf conf; /**< queue config parameters. */
> +	uint8_t scattered_rx;       /**< scattered packets RX supported. */
> +	uint16_t nb_desc;           /**< configured number of RXDs. */
> +	uint16_t max_desc;          /**< max allowed number of RXDs. */
> +	uint16_t min_desc;          /**< min allowed number of RXDs. */
> +} __rte_cache_aligned;
> +
> +/**
> + * Ethernet device TX queue information strcuture.
> + * Used to retieve information about configured queue.
> + */
> +struct rte_eth_txq_info {
> +	struct rte_eth_txconf conf; /**< queue config parameters. */
> +	uint16_t nb_desc;           /**< configured number of TXDs. */
> +	uint16_t max_desc;          /**< max allowed number of TXDs. */
> +	uint16_t min_desc;          /**< min allowed number of TXDs. */
> +} __rte_cache_aligned;
> +
>  struct rte_eth_dev;
> 
>  struct rte_eth_dev_callback;
> @@ -1063,6 +1087,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct
> rte_eth_dev *dev,  typedef int (*eth_rx_descriptor_done_t)(void *rxq,
> uint16_t offset);  /**< @internal Check DD bit of specific RX descriptor */
> 
> +typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
> +	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
> +
> +typedef void (*eth_txq_info_get_t)(struct rte_eth_dev *dev,
> +	uint16_t tx_queue_id, struct rte_eth_txq_info *qinfo);
> +
>  typedef int (*mtu_set_t)(struct rte_eth_dev *dev, uint16_t mtu);  /**<
> @internal Set MTU. */
> 
> @@ -1451,9 +1481,13 @@ struct eth_dev_ops {
>  	rss_hash_update_t rss_hash_update;
>  	/** Get current RSS hash configuration. */
>  	rss_hash_conf_get_t rss_hash_conf_get;
> -	eth_filter_ctrl_t              filter_ctrl;          /**< common filter
> control*/
> +	eth_filter_ctrl_t              filter_ctrl;
> +	/**< common filter control. */
>  	eth_set_mc_addr_list_t set_mc_addr_list; /**< set list of mcast addrs */
> -
> +	eth_rxq_info_get_t rxq_info_get;
> +	/**< retrieve RX queue information. */
> +	eth_txq_info_get_t txq_info_get;
> +	/**< retrieve TX queue information. */
>  	/** Turn IEEE1588/802.1AS timestamping on. */
>  	eth_timesync_enable_t timesync_enable;
>  	/** Turn IEEE1588/802.1AS timestamping off. */ @@ -3721,6 +3755,46 @@
> int rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
>  		struct rte_eth_rxtx_callback *user_cb);
Is it targeting R2.1? If no, the new ops should be added at the end of this structure?

> 
>  /**
> + * Retrieve information about given port's RX queue.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param queue_id
> + *   The RX queue on the Ethernet device for which information
> + *   will be retrieved.
> + * @param qinfo
> + *   A pointer to a structure of type *rte_eth_rxq_info_info* to be filled with
> + *   the information of the Ethernet device.
> + *
> + * @return
> + *   - 0: Success
> + *   - -ENOTSUP: routine is not supported by the device PMD.
> + *   - -EINVAL:  The port_id or the queue_id is out of range.
> + */
> +int rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> +	struct rte_eth_rxq_info *qinfo);
> +
> +/**
> + * Retrieve information about given port's TX queue.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param queue_id
> + *   The TX queue on the Ethernet device for which information
> + *   will be retrieved.
> + * @param qinfo
> + *   A pointer to a structure of type *rte_eth_txq_info_info* to be filled with
> + *   the information of the Ethernet device.
> + *
> + * @return
> + *   - 0: Success
> + *   - -ENOTSUP: routine is not supported by the device PMD.
> + *   - -EINVAL:  The port_id or the queue_id is out of range.
> + */
> +int rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> +	struct rte_eth_txq_info *qinfo);
> +
> +/*
>   * Retrieve number of available registers for access
>   *
>   * @param port_id
> @@ -3793,10 +3867,6 @@ int rte_eth_dev_get_eeprom(uint8_t port_id, struct
> rte_dev_eeprom_info *info);
>   */
>  int rte_eth_dev_set_eeprom(uint8_t port_id, struct rte_dev_eeprom_info
> *info);
> 
> -#ifdef __cplusplus
> -}
> -#endif


> -
>  /**
>   * Set the list of multicast addresses to filter on an Ethernet device.
>   *
> @@ -3882,4 +3952,9 @@ extern int
> rte_eth_timesync_read_rx_timestamp(uint8_t port_id,
>   */
>  extern int rte_eth_timesync_read_tx_timestamp(uint8_t port_id,
>  					      struct timespec *timestamp);
> +
> +#ifdef __cplusplus
> +}
> +#endif
This is fix for the issue introduced by new ieee1588. It must be added in R2.1 no matter
these patches can be applied or not.

> +
>  #endif /* _RTE_ETHDEV_H_ */
> diff --git a/lib/librte_ether/rte_ether_version.map
> b/lib/librte_ether/rte_ether_version.map
> index 23cfee9..8de0928 100644
> --- a/lib/librte_ether/rte_ether_version.map
> +++ b/lib/librte_ether/rte_ether_version.map
> @@ -92,6 +92,7 @@ DPDK_2.0 {
>  	rte_eth_rx_burst;
>  	rte_eth_rx_descriptor_done;
>  	rte_eth_rx_queue_count;
> +	rte_eth_rx_queue_info_get;
>  	rte_eth_rx_queue_setup;
>  	rte_eth_set_queue_rate_limit;
>  	rte_eth_set_vf_rate_limit;
> @@ -99,6 +100,7 @@ DPDK_2.0 {
>  	rte_eth_stats_get;
>  	rte_eth_stats_reset;
>  	rte_eth_tx_burst;
> +	rte_eth_tx_queue_info_get;
>  	rte_eth_tx_queue_setup;
>  	rte_eth_xstats_get;
>  	rte_eth_xstats_reset;
I am not quite sure about the version map. But I have a question: does it need to create new {} for DPDK_2.1?
And should your changes be in DPDK_2.1?

Regards,
Helin

> --
> 1.8.3.1

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] hash: move field hash_func_init_val in rte_hash struct
  2015-07-21 14:10  3% [dpdk-dev] [PATCH] hash: move field hash_func_init_val in rte_hash struct Pablo de Lara
@ 2015-07-22  9:08  0% ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-22  9:08 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

> In order to keep the ABI consistent with the old hash library,
> hash_func_init_val field has been moved, so it remains
> at the same offset as previously, since hash_func and
> hash_func_init_val are fields accesed by the public function
> rte_hash_hash and must keep the same offset as older versions.
> 
> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>

Applied, thanks

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] hash: move field hash_func_init_val in rte_hash struct
@ 2015-07-21 14:10  3% Pablo de Lara
  2015-07-22  9:08  0% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Pablo de Lara @ 2015-07-21 14:10 UTC (permalink / raw)
  To: dev

In order to keep the ABI consistent with the old hash library,
hash_func_init_val field has been moved, so it remains
at the same offset as previously, since hash_func and
hash_func_init_val are fields accesed by the public function
rte_hash_hash and must keep the same offset as older versions.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 lib/librte_hash/rte_cuckoo_hash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index dec18ce..5cf4af6 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -105,8 +105,8 @@ struct rte_hash {
 	uint32_t num_buckets;           /**< Number of buckets in table. */
 	uint32_t key_len;               /**< Length of hash key. */
 	rte_hash_function hash_func;    /**< Function used to calculate hash. */
-	rte_hash_cmp_eq_t rte_hash_cmp_eq; /**< Function used to compare keys. */
 	uint32_t hash_func_init_val;    /**< Init value used by hash_func. */
+	rte_hash_cmp_eq_t rte_hash_cmp_eq; /**< Function used to compare keys. */
 	uint32_t bucket_bitmask;        /**< Bitmask for getting bucket index
 						from hash signature. */
 	uint32_t key_entry_size;         /**< Size of each key entry. */
-- 
2.4.2

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation
  2015-07-21 13:20  7%   ` Dumitrescu, Cristian
@ 2015-07-21 14:03  7%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-21 14:03 UTC (permalink / raw)
  To: Dumitrescu, Cristian; +Cc: dev

2015-07-21 13:20, Dumitrescu, Cristian:
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> > --- a/doc/guides/rel_notes/abi.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> 
> There are some pending doc patches on ABI changes that have been sent and ack-ed prior to this change.
> 
> Due to this change, they cannot be applied cleanly anymore.
> Are you OK to integrate them with the small local change required from your side?

Yes it is a basic merge issue that is easily managed locally.

Though, it would be nice to have at least 3 acks on these patches,
as specified in the ABI policy.

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation
  2015-07-19 10:52 36% ` [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation Thomas Monjalon
@ 2015-07-21 13:20  7%   ` Dumitrescu, Cristian
  2015-07-21 14:03  7%     ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Dumitrescu, Cristian @ 2015-07-21 13:20 UTC (permalink / raw)
  To: Thomas Monjalon, dev


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Sunday, July 19, 2015 11:52 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation
> 
> This chapter is for ABI and API. That's why a renaming is required.
> 
> Remove also the examples which are now in the referenced guidelines.
> 
> Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> ---
>  MAINTAINERS                                       |  2 +-
>  doc/guides/rel_notes/{abi.rst => deprecation.rst} | 16 +++++-----------
>  doc/guides/rel_notes/index.rst                    |  2 +-
>  3 files changed, 7 insertions(+), 13 deletions(-)
>  rename doc/guides/rel_notes/{abi.rst => deprecation.rst} (51%)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 2a32659..6531900 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -60,7 +60,7 @@ F: doc/guides/prog_guide/ext_app_lib_make_help.rst
>  ABI versioning
>  M: Neil Horman <nhorman@tuxdriver.com>
>  F: lib/librte_compat/
> -F: doc/guides/rel_notes/abi.rst
> +F: doc/guides/rel_notes/deprecation.rst
>  F: scripts/validate-abi.sh
> 
> 
> diff --git a/doc/guides/rel_notes/abi.rst
> b/doc/guides/rel_notes/deprecation.rst
> similarity index 51%
> rename from doc/guides/rel_notes/abi.rst
> rename to doc/guides/rel_notes/deprecation.rst
> index 7a08830..eef01f1 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -1,17 +1,11 @@
> -ABI policy
> -==========
> +Deprecation
> +===========
> 
>  See the :doc:`guidelines document for details of the ABI policy
> </guidelines/versioning>`.
> -ABI deprecation notices are to be posted here.
> +API and ABI deprecation notices are to be posted here.
> 
> -
> -Examples of Deprecation Notices
> --------------------------------
> -
> -* The Macro #RTE_FOO is deprecated and will be removed with version 2.0,
> to be replaced with the inline function rte_bar()
> -* The function rte_mbuf_grok has been updated to include new parameter
> in version 2.0.  Backwards compatibility will be maintained for this function
> until the release of version 2.1
> -* The members struct foo have been reorganized in release 2.0.  Existing
> binary applications will have backwards compatibility in release 2.0, while
> newly built binaries will need to reference new structure variant struct foo2.
> Compatibility will be removed in release 2.2, and all applications will require
> updating and rebuilding to the new structure at that time, which will be
> renamed to the original struct foo.
> -* Significant ABI changes are planned for the librte_dostuff library.  The
> upcoming release 2.0 will not contain these changes, but release 2.1 will, and
> no backwards compatibility is planned due to the invasive nature of these
> changes.  Binaries using this library built prior to version 2.1 will require
> updating and recompilation.
> +Help to update from a previous release is provided in
> +:doc:`another section </rel_notes/updating_apps>`.
> 
> 
>  Deprecation Notices
> diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst
> index d790783..9d66cd8 100644
> --- a/doc/guides/rel_notes/index.rst
> +++ b/doc/guides/rel_notes/index.rst
> @@ -48,5 +48,5 @@ Contents
>      updating_apps
>      known_issues
>      resolved_issues
> -    abi
> +    deprecation
>      faq
> --
> 2.4.2

Hi Thomas,

There are some pending doc patches on ABI changes that have been sent and ack-ed prior to this change.

Due to this change, they cannot be applied cleanly anymore. Are you OK to integrate them with the small local change required from your side?

Thanks,
Cristian

^ permalink raw reply	[relevance 7%]

* [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue information
  @ 2015-07-20 12:19  2% ` Konstantin Ananyev
  2015-07-22 16:50  0%   ` Zhang, Helin
  2015-07-22 18:28  2%   ` [dpdk-dev] [PATCHv4 " Konstantin Ananyev
  0 siblings, 2 replies; 200+ results
From: Konstantin Ananyev @ 2015-07-20 12:19 UTC (permalink / raw)
  To: dev

Add the ability for the upper layer to query RX/TX queue information.

Add new structures:
struct rte_eth_rxq_info
struct rte_eth_txq_info

new functions:
rte_eth_rx_queue_info_get
rte_eth_tx_queue_info_get

into rte_etdev API.

Left extra free space in the queue info structures,
so extra fields could be added later without ABI breakage.

v2 changes:
- Add formal check for the qinfo input parameter.
- As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'

v3 changes:
- Updated rte_ether_version.map 
- Merged with latest changes

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_ether/rte_ethdev.c          | 54 +++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h          | 87 +++++++++++++++++++++++++++++++---
 lib/librte_ether/rte_ether_version.map |  2 +
 3 files changed, 137 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 94104ce..a94c119 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3341,6 +3341,60 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 }
 
 int
+rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_rxq_info *qinfo)
+{
+	struct rte_eth_dev *dev;
+
+	if (qinfo == NULL)
+		return -EINVAL;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -EINVAL;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_rx_queues) {
+		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
+		return -EINVAL;
+	}
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rxq_info_get, -ENOTSUP);
+
+	memset(qinfo, 0, sizeof(*qinfo));
+	dev->dev_ops->rxq_info_get(dev, queue_id, qinfo);
+	return 0;
+}
+
+int
+rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_txq_info *qinfo)
+{
+	struct rte_eth_dev *dev;
+
+	if (qinfo == NULL)
+		return -EINVAL;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -EINVAL;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_tx_queues) {
+		PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
+		return -EINVAL;
+	}
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->txq_info_get, -ENOTSUP);
+
+	memset(qinfo, 0, sizeof(*qinfo));
+	dev->dev_ops->txq_info_get(dev, queue_id, qinfo);
+	return 0;
+}
+
+int
 rte_eth_dev_set_mc_addr_list(uint8_t port_id,
 			     struct ether_addr *mc_addr_set,
 			     uint32_t nb_mc_addr)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index c901a2c..0c6705e 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -960,6 +960,30 @@ struct rte_eth_xstats {
 	uint64_t value;
 };
 
+/**
+ * Ethernet device RX queue information strcuture.
+ * Used to retieve information about configured queue.
+ */
+struct rte_eth_rxq_info {
+	struct rte_mempool *mp;     /**< mempool used by that queue. */
+	struct rte_eth_rxconf conf; /**< queue config parameters. */
+	uint8_t scattered_rx;       /**< scattered packets RX supported. */
+	uint16_t nb_desc;           /**< configured number of RXDs. */
+	uint16_t max_desc;          /**< max allowed number of RXDs. */
+	uint16_t min_desc;          /**< min allowed number of RXDs. */
+} __rte_cache_aligned;
+
+/**
+ * Ethernet device TX queue information strcuture.
+ * Used to retieve information about configured queue.
+ */
+struct rte_eth_txq_info {
+	struct rte_eth_txconf conf; /**< queue config parameters. */
+	uint16_t nb_desc;           /**< configured number of TXDs. */
+	uint16_t max_desc;          /**< max allowed number of TXDs. */
+	uint16_t min_desc;          /**< min allowed number of TXDs. */
+} __rte_cache_aligned;
+
 struct rte_eth_dev;
 
 struct rte_eth_dev_callback;
@@ -1063,6 +1087,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
 typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
 /**< @internal Check DD bit of specific RX descriptor */
 
+typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
+	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
+
+typedef void (*eth_txq_info_get_t)(struct rte_eth_dev *dev,
+	uint16_t tx_queue_id, struct rte_eth_txq_info *qinfo);
+
 typedef int (*mtu_set_t)(struct rte_eth_dev *dev, uint16_t mtu);
 /**< @internal Set MTU. */
 
@@ -1451,9 +1481,13 @@ struct eth_dev_ops {
 	rss_hash_update_t rss_hash_update;
 	/** Get current RSS hash configuration. */
 	rss_hash_conf_get_t rss_hash_conf_get;
-	eth_filter_ctrl_t              filter_ctrl;          /**< common filter control*/
+	eth_filter_ctrl_t              filter_ctrl;
+	/**< common filter control. */
 	eth_set_mc_addr_list_t set_mc_addr_list; /**< set list of mcast addrs */
-
+	eth_rxq_info_get_t rxq_info_get;
+	/**< retrieve RX queue information. */
+	eth_txq_info_get_t txq_info_get;
+	/**< retrieve TX queue information. */
 	/** Turn IEEE1588/802.1AS timestamping on. */
 	eth_timesync_enable_t timesync_enable;
 	/** Turn IEEE1588/802.1AS timestamping off. */
@@ -3721,6 +3755,46 @@ int rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 		struct rte_eth_rxtx_callback *user_cb);
 
 /**
+ * Retrieve information about given port's RX queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The RX queue on the Ethernet device for which information
+ *   will be retrieved.
+ * @param qinfo
+ *   A pointer to a structure of type *rte_eth_rxq_info_info* to be filled with
+ *   the information of the Ethernet device.
+ *
+ * @return
+ *   - 0: Success
+ *   - -ENOTSUP: routine is not supported by the device PMD.
+ *   - -EINVAL:  The port_id or the queue_id is out of range.
+ */
+int rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_rxq_info *qinfo);
+
+/**
+ * Retrieve information about given port's TX queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The TX queue on the Ethernet device for which information
+ *   will be retrieved.
+ * @param qinfo
+ *   A pointer to a structure of type *rte_eth_txq_info_info* to be filled with
+ *   the information of the Ethernet device.
+ *
+ * @return
+ *   - 0: Success
+ *   - -ENOTSUP: routine is not supported by the device PMD.
+ *   - -EINVAL:  The port_id or the queue_id is out of range.
+ */
+int rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_txq_info *qinfo);
+
+/*
  * Retrieve number of available registers for access
  *
  * @param port_id
@@ -3793,10 +3867,6 @@ int rte_eth_dev_get_eeprom(uint8_t port_id, struct rte_dev_eeprom_info *info);
  */
 int rte_eth_dev_set_eeprom(uint8_t port_id, struct rte_dev_eeprom_info *info);
 
-#ifdef __cplusplus
-}
-#endif
-
 /**
  * Set the list of multicast addresses to filter on an Ethernet device.
  *
@@ -3882,4 +3952,9 @@ extern int rte_eth_timesync_read_rx_timestamp(uint8_t port_id,
  */
 extern int rte_eth_timesync_read_tx_timestamp(uint8_t port_id,
 					      struct timespec *timestamp);
+
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _RTE_ETHDEV_H_ */
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 23cfee9..8de0928 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -92,6 +92,7 @@ DPDK_2.0 {
 	rte_eth_rx_burst;
 	rte_eth_rx_descriptor_done;
 	rte_eth_rx_queue_count;
+	rte_eth_rx_queue_info_get;
 	rte_eth_rx_queue_setup;
 	rte_eth_set_queue_rate_limit;
 	rte_eth_set_vf_rate_limit;
@@ -99,6 +100,7 @@ DPDK_2.0 {
 	rte_eth_stats_get;
 	rte_eth_stats_reset;
 	rte_eth_tx_burst;
+	rte_eth_tx_queue_info_get;
 	rte_eth_tx_queue_setup;
 	rte_eth_xstats_get;
 	rte_eth_xstats_reset;
-- 
1.8.3.1

^ permalink raw reply	[relevance 2%]

* Re: [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes
  2015-07-19 10:52  4% [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
  2015-07-19 10:52 36% ` [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation Thomas Monjalon
  2015-07-19 21:32  0% ` [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
@ 2015-07-20 10:45  0% ` Neil Horman
  2 siblings, 0 replies; 200+ results
From: Neil Horman @ 2015-07-20 10:45 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Sun, Jul 19, 2015 at 12:52:13PM +0200, Thomas Monjalon wrote:
> The main change of these patches is to improve naming consistency
> across ethdev and EAL.
> It should be applied shortly to be part of rc1. If some comments arise,
> it can be fixed/improved in rc2.
> 
> Thomas Monjalon (4):
>   doc: rename ABI chapter to deprecation
>   pci: fix detach and uninit naming
>   ethdev: refactor port release
>   ethdev: fix doxygen internal comments
> 
>  MAINTAINERS                                       |  2 +-
>  doc/guides/rel_notes/{abi.rst => deprecation.rst} | 19 ++++++++-----------
>  doc/guides/rel_notes/index.rst                    |  2 +-
>  lib/librte_eal/bsdapp/eal/rte_eal_version.map     |  2 ++
>  lib/librte_eal/common/eal_common_pci.c            | 20 ++++++++++++--------
>  lib/librte_eal/common/include/rte_pci.h           |  6 ++++--
>  lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  2 ++
>  lib/librte_ether/rte_ethdev.c                     | 11 +++++------
>  lib/librte_ether/rte_ethdev.h                     |  9 ++++-----
>  9 files changed, 39 insertions(+), 34 deletions(-)
>  rename doc/guides/rel_notes/{abi.rst => deprecation.rst} (51%)
> 
> -- 
> 2.4.2
> 
> 

Series
Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] doc: announce ABI change for rte_eth_fdir_filter
@ 2015-07-20  7:03 13% Jingjing Wu
  0 siblings, 0 replies; 200+ results
From: Jingjing Wu @ 2015-07-20  7:03 UTC (permalink / raw)
  To: dev

To fix the FVL's flow director issue for SCTP flow, rte_eth_fdir_filter
need to be change to support SCTP flow keys extension. Here announce
the ABI deprecation.

Signed-off-by: jingjing.wu <jingjing.wu@intel.com>
---
 doc/guides/rel_notes/deprecation.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 5330d3b..63e19c7 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -35,3 +35,7 @@ Deprecation Notices
 * The following fields have been deprecated in rte_eth_stats:
   imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
+
+* Significant ABI change is planned for struct rte_eth_fdir_filter to extend
+  the SCTP flow's key input from release 2.1. The change may be enabled in
+  the upcoming release 2.1 with CONFIG_RTE_NEXT_ABI.
-- 
2.4.0

^ permalink raw reply	[relevance 13%]

* [dpdk-dev] [PATCH v15 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (8 preceding siblings ...)
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 12/13] igb: enable rx queue interrupts for PF Cunming Liang
@ 2015-07-20  3:02  2%       ` Cunming Liang
  2015-07-23 14:18  3%       ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Liang, Cunming
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch demonstrates how to handle per rx queue interrupt in a NAPI-like
implementation in userspace. The working thread mainly runs in polling mode
and switch to interrupt mode only if there is no packet received in recent polls.
The working thread returns to polling mode immediately once it receives an
interrupt notification caused by the incoming packets.
The sample keeps running in polling mode if the binding PMD hasn't supported
the rx interrupt yet. Now only ixgbe(pf/vf) and igb support it.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - reword commit comments

v7 changes
 - using new APIs
 - demo multiple port/queue pair wait on the same epoll instance

v6 changes
 - Split event fd add and wait

v5 changes
 - Change invoked function name and parameter to accomodate EAL change

v3 changes
 - Add spinlock to ensure thread safe when accessing interrupt mask
   register

v2 changes
 - Remove unused function which is for debug purpose

 examples/l3fwd-power/main.c | 205 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 165 insertions(+), 40 deletions(-)

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index b3c5f43..14f6fba 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -74,12 +74,14 @@
 #include <rte_string_fns.h>
 #include <rte_timer.h>
 #include <rte_power.h>
+#include <rte_eal.h>
+#include <rte_spinlock.h>
 
 #define RTE_LOGTYPE_L3FWD_POWER RTE_LOGTYPE_USER1
 
 #define MAX_PKT_BURST 32
 
-#define MIN_ZERO_POLL_COUNT 5
+#define MIN_ZERO_POLL_COUNT 10
 
 /* around 100ms at 2 Ghz */
 #define TIMER_RESOLUTION_CYCLES           200000000ULL
@@ -153,6 +155,9 @@ static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
 /* ethernet addresses of ports */
 static struct ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];
 
+/* ethernet addresses of ports */
+static rte_spinlock_t locks[RTE_MAX_ETHPORTS];
+
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;
 /* Ports set in promiscuous mode off by default. */
@@ -185,6 +190,9 @@ struct lcore_rx_queue {
 #define MAX_TX_QUEUE_PER_PORT RTE_MAX_ETHPORTS
 #define MAX_RX_QUEUE_PER_PORT 128
 
+#define MAX_RX_QUEUE_INTERRUPT_PER_PORT 16
+
+
 #define MAX_LCORE_PARAMS 1024
 struct lcore_params {
 	uint8_t port_id;
@@ -211,7 +219,7 @@ static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
 
 static struct rte_eth_conf port_conf = {
 	.rxmode = {
-		.mq_mode	= ETH_MQ_RX_RSS,
+		.mq_mode        = ETH_MQ_RX_RSS,
 		.max_rx_pkt_len = ETHER_MAX_LEN,
 		.split_hdr_size = 0,
 		.header_split   = 0, /**< Header Split disabled */
@@ -223,11 +231,17 @@ static struct rte_eth_conf port_conf = {
 	.rx_adv_conf = {
 		.rss_conf = {
 			.rss_key = NULL,
-			.rss_hf = ETH_RSS_IP,
+			.rss_hf = ETH_RSS_UDP,
 		},
 	},
 	.txmode = {
-		.mq_mode = ETH_DCB_NONE,
+		.mq_mode = ETH_MQ_TX_NONE,
+	},
+	.intr_conf = {
+		.lsc = 1,
+#ifdef RTE_NEXT_ABI
+		.rxq = 1,
+#endif
 	},
 };
 
@@ -399,19 +413,22 @@ power_timer_cb(__attribute__((unused)) struct rte_timer *tim,
 	/* accumulate total execution time in us when callback is invoked */
 	sleep_time_ratio = (float)(stats[lcore_id].sleep_time) /
 					(float)SCALING_PERIOD;
-
 	/**
 	 * check whether need to scale down frequency a step if it sleep a lot.
 	 */
-	if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD)
-		rte_power_freq_down(lcore_id);
+	if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
+		if (rte_power_freq_down)
+			rte_power_freq_down(lcore_id);
+	}
 	else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
-		stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST)
+		stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
 		/**
 		 * scale down a step if average packet per iteration less
 		 * than expectation.
 		 */
-		rte_power_freq_down(lcore_id);
+		if (rte_power_freq_down)
+			rte_power_freq_down(lcore_id);
+	}
 
 	/**
 	 * initialize another timer according to current frequency to ensure
@@ -712,22 +729,20 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,
 
 }
 
-#define SLEEP_GEAR1_THRESHOLD            100
-#define SLEEP_GEAR2_THRESHOLD            1000
+#define MINIMUM_SLEEP_TIME         1
+#define SUSPEND_THRESHOLD          300
 
 static inline uint32_t
 power_idle_heuristic(uint32_t zero_rx_packet_count)
 {
-	/* If zero count is less than 100, use it as the sleep time in us */
-	if (zero_rx_packet_count < SLEEP_GEAR1_THRESHOLD)
-		return zero_rx_packet_count;
-	/* If zero count is less than 1000, sleep time should be 100 us */
-	else if ((zero_rx_packet_count >= SLEEP_GEAR1_THRESHOLD) &&
-			(zero_rx_packet_count < SLEEP_GEAR2_THRESHOLD))
-		return SLEEP_GEAR1_THRESHOLD;
-	/* If zero count is greater than 1000, sleep time should be 1000 us */
-	else if (zero_rx_packet_count >= SLEEP_GEAR2_THRESHOLD)
-		return SLEEP_GEAR2_THRESHOLD;
+	/* If zero count is less than 100,  sleep 1us */
+	if (zero_rx_packet_count < SUSPEND_THRESHOLD)
+		return MINIMUM_SLEEP_TIME;
+	/* If zero count is less than 1000, sleep 100 us which is the
+		minimum latency switching from C3/C6 to C0
+	*/
+	else
+		return SUSPEND_THRESHOLD;
 
 	return 0;
 }
@@ -767,6 +782,84 @@ power_freq_scaleup_heuristic(unsigned lcore_id,
 	return FREQ_CURRENT;
 }
 
+/**
+ * force polling thread sleep until one-shot rx interrupt triggers
+ * @param port_id
+ *  Port id.
+ * @param queue_id
+ *  Rx queue id.
+ * @return
+ *  0 on success
+ */
+static int
+sleep_until_rx_interrupt(int num)
+{
+	struct rte_epoll_event event[num];
+	int n, i;
+	uint8_t port_id, queue_id;
+	void *data;
+
+	RTE_LOG(INFO, L3FWD_POWER,
+		"lcore %u sleeps until interrupt triggers\n",
+		rte_lcore_id());
+
+	n = rte_epoll_wait(RTE_EPOLL_PER_THREAD, event, num, -1);
+	for (i = 0; i < n; i++) {
+		data = event[i].epdata.data;
+		port_id = ((uintptr_t)data) >> CHAR_BIT;
+		queue_id = ((uintptr_t)data) &
+			RTE_LEN2MASK(CHAR_BIT, uint8_t);
+		RTE_LOG(INFO, L3FWD_POWER,
+			"lcore %u is waked up from rx interrupt on"
+			" port %d queue %d\n",
+			rte_lcore_id(), port_id, queue_id);
+	}
+
+	return 0;
+}
+
+static int turn_on_intr(struct lcore_conf *qconf)
+{
+	int i;
+	struct lcore_rx_queue *rx_queue;
+	uint8_t port_id, queue_id;
+
+	for (i = 0; i < qconf->n_rx_queue; ++i) {
+		rx_queue = &(qconf->rx_queue_list[i]);
+		port_id = rx_queue->port_id;
+		queue_id = rx_queue->queue_id;
+
+		rte_spinlock_lock(&(locks[port_id]));
+		rte_eth_dev_rx_intr_enable(port_id, queue_id);
+		rte_spinlock_unlock(&(locks[port_id]));
+	}
+}
+
+static int event_register(struct lcore_conf *qconf)
+{
+	struct lcore_rx_queue *rx_queue;
+	uint8_t portid, queueid;
+	uint32_t data;
+	int ret;
+	int i;
+
+	for (i = 0; i < qconf->n_rx_queue; ++i) {
+		rx_queue = &(qconf->rx_queue_list[i]);
+		portid = rx_queue->port_id;
+		queueid = rx_queue->queue_id;
+		data = portid << CHAR_BIT | queueid;
+
+		ret = rte_eth_dev_rx_intr_ctl_q(portid, queueid,
+						RTE_EPOLL_PER_THREAD,
+						RTE_INTR_EVENT_ADD,
+						(void *)((uintptr_t)data));
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 /* main processing loop */
 static int
 main_loop(__attribute__((unused)) void *dummy)
@@ -780,9 +873,9 @@ main_loop(__attribute__((unused)) void *dummy)
 	struct lcore_conf *qconf;
 	struct lcore_rx_queue *rx_queue;
 	enum freq_scale_hint_t lcore_scaleup_hint;
-
 	uint32_t lcore_rx_idle_count = 0;
 	uint32_t lcore_idle_hint = 0;
+	int intr_en = 0;
 
 	const uint64_t drain_tsc = (rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
 
@@ -799,13 +892,18 @@ main_loop(__attribute__((unused)) void *dummy)
 	RTE_LOG(INFO, L3FWD_POWER, "entering main loop on lcore %u\n", lcore_id);
 
 	for (i = 0; i < qconf->n_rx_queue; i++) {
-
 		portid = qconf->rx_queue_list[i].port_id;
 		queueid = qconf->rx_queue_list[i].queue_id;
 		RTE_LOG(INFO, L3FWD_POWER, " -- lcoreid=%u portid=%hhu "
 			"rxqueueid=%hhu\n", lcore_id, portid, queueid);
 	}
 
+	/* add into event wait list */
+	if (event_register(qconf) == 0)
+		intr_en = 1;
+	else
+		RTE_LOG(INFO, L3FWD_POWER, "RX interrupt won't enable.\n");
+
 	while (1) {
 		stats[lcore_id].nb_iteration_looped++;
 
@@ -840,6 +938,7 @@ main_loop(__attribute__((unused)) void *dummy)
 			prev_tsc_power = cur_tsc_power;
 		}
 
+start_rx:
 		/*
 		 * Read packet from RX queues
 		 */
@@ -853,6 +952,7 @@ main_loop(__attribute__((unused)) void *dummy)
 
 			nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
 								MAX_PKT_BURST);
+
 			stats[lcore_id].nb_rx_processed += nb_rx;
 			if (unlikely(nb_rx == 0)) {
 				/**
@@ -915,10 +1015,13 @@ main_loop(__attribute__((unused)) void *dummy)
 						rx_queue->freq_up_hint;
 			}
 
-			if (lcore_scaleup_hint == FREQ_HIGHEST)
-				rte_power_freq_max(lcore_id);
-			else if (lcore_scaleup_hint == FREQ_HIGHER)
-				rte_power_freq_up(lcore_id);
+			if (lcore_scaleup_hint == FREQ_HIGHEST) {
+				if (rte_power_freq_max)
+					rte_power_freq_max(lcore_id);
+			} else if (lcore_scaleup_hint == FREQ_HIGHER) {
+				if (rte_power_freq_up)
+					rte_power_freq_up(lcore_id);
+			}
 		} else {
 			/**
 			 * All Rx queues empty in recent consecutive polls,
@@ -933,16 +1036,23 @@ main_loop(__attribute__((unused)) void *dummy)
 					lcore_idle_hint = rx_queue->idle_hint;
 			}
 
-			if ( lcore_idle_hint < SLEEP_GEAR1_THRESHOLD)
+			if (lcore_idle_hint < SUSPEND_THRESHOLD)
 				/**
 				 * execute "pause" instruction to avoid context
-				 * switch for short sleep.
+				 * switch which generally take hundred of
+				 * microseconds for short sleep.
 				 */
 				rte_delay_us(lcore_idle_hint);
-			else
-				/* long sleep force runing thread to suspend */
-				usleep(lcore_idle_hint);
-
+			else {
+				/* suspend until rx interrupt trigges */
+				if (intr_en) {
+					turn_on_intr(qconf);
+					sleep_until_rx_interrupt(
+						qconf->n_rx_queue);
+				}
+				/* start receiving packets immediately */
+				goto start_rx;
+			}
 			stats[lcore_id].sleep_time += lcore_idle_hint;
 		}
 	}
@@ -1273,7 +1383,7 @@ setup_hash(int socketid)
 	char s[64];
 
 	/* create ipv4 hash */
-	snprintf(s, sizeof(s), "ipv4_l3fwd_hash_%d", socketid);
+	rte_snprintf(s, sizeof(s), "ipv4_l3fwd_hash_%d", socketid);
 	ipv4_l3fwd_hash_params.name = s;
 	ipv4_l3fwd_hash_params.socket_id = socketid;
 	ipv4_l3fwd_lookup_struct[socketid] =
@@ -1283,7 +1393,7 @@ setup_hash(int socketid)
 				"socket %d\n", socketid);
 
 	/* create ipv6 hash */
-	snprintf(s, sizeof(s), "ipv6_l3fwd_hash_%d", socketid);
+	rte_snprintf(s, sizeof(s), "ipv6_l3fwd_hash_%d", socketid);
 	ipv6_l3fwd_hash_params.name = s;
 	ipv6_l3fwd_hash_params.socket_id = socketid;
 	ipv6_l3fwd_lookup_struct[socketid] =
@@ -1477,6 +1587,7 @@ main(int argc, char **argv)
 	unsigned lcore_id;
 	uint64_t hz;
 	uint32_t n_tx_queue, nb_lcores;
+	uint32_t dev_rxq_num, dev_txq_num;
 	uint8_t portid, nb_rx_queue, queue, socketid;
 
 	/* catch SIGINT and restore cpufreq governor to ondemand */
@@ -1526,10 +1637,19 @@ main(int argc, char **argv)
 		printf("Initializing port %d ... ", portid );
 		fflush(stdout);
 
+		rte_eth_dev_info_get(portid, &dev_info);
+		dev_rxq_num = dev_info.max_rx_queues;
+		dev_txq_num = dev_info.max_tx_queues;
+
 		nb_rx_queue = get_port_n_rx_queues(portid);
+		if (nb_rx_queue > dev_rxq_num)
+			rte_exit(EXIT_FAILURE,
+				"Cannot configure not existed rxq: "
+				"port=%d\n", portid);
+
 		n_tx_queue = nb_lcores;
-		if (n_tx_queue > MAX_TX_QUEUE_PER_PORT)
-			n_tx_queue = MAX_TX_QUEUE_PER_PORT;
+		if (n_tx_queue > dev_txq_num)
+			n_tx_queue = dev_txq_num;
 		printf("Creating queues: nb_rxq=%d nb_txq=%u... ",
 			nb_rx_queue, (unsigned)n_tx_queue );
 		ret = rte_eth_dev_configure(portid, nb_rx_queue,
@@ -1553,6 +1673,9 @@ main(int argc, char **argv)
 			if (rte_lcore_is_enabled(lcore_id) == 0)
 				continue;
 
+			if (queueid >= dev_txq_num)
+				continue;
+
 			if (numa_on)
 				socketid = \
 				(uint8_t)rte_lcore_to_socket_id(lcore_id);
@@ -1587,8 +1710,9 @@ main(int argc, char **argv)
 		/* init power management library */
 		ret = rte_power_init(lcore_id);
 		if (ret)
-			rte_exit(EXIT_FAILURE, "Power management library "
-				"initialization failed on core%u\n", lcore_id);
+			rte_log(RTE_LOG_ERR, RTE_LOGTYPE_POWER,
+				"Power management library initialization "
+				"failed on core%u", lcore_id);
 
 		/* init timer structures for each enabled lcore */
 		rte_timer_init(&power_timers[lcore_id]);
@@ -1636,7 +1760,6 @@ main(int argc, char **argv)
 		if (ret < 0)
 			rte_exit(EXIT_FAILURE, "rte_eth_dev_start: err=%d, "
 						"port=%d\n", ret, portid);
-
 		/*
 		 * If enabled, put device in promiscuous mode.
 		 * This allows IO forwarding mode to forward packets
@@ -1645,6 +1768,8 @@ main(int argc, char **argv)
 		 */
 		if (promiscuous_on)
 			rte_eth_promiscuous_enable(portid);
+		/* initialize spinlock for each port */
+		rte_spinlock_init(&(locks[portid]));
 	}
 
 	check_all_ports_link_status((uint8_t)nb_ports, enabled_port_mask);
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v15 12/13] igb: enable rx queue interrupts for PF
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (7 preceding siblings ...)
  2015-07-20  3:02  1%       ` [dpdk-dev] [PATCH v15 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
@ 2015-07-20  3:02  2%       ` Cunming Liang
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
  2015-07-23 14:18  3%       ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Liang, Cunming
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch does below for igb PF:
- Setup NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v9 changes
 - move queue-vec mapping init from dev_configure to dev_start
 - fix link interrupt not working issue in vfio-msix

v8 changes
 - add vfio-msi/vfio-legacy and uio-legacy support

v7 changes
 - add condition check when intr vector is not enabled

v6 changes
 - fill queue-vector mapping table

v5 changes
 - Rebase the patchset onto the HEAD

v3 changes
 - Remove unnecessary variables in e1000_mac_info
 - Remove spinlok from PMD

v2 changes
 - Consolidate review comments related to coding style

 drivers/net/e1000/igb_ethdev.c | 311 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 277 insertions(+), 34 deletions(-)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index ddc7186..56734a3 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -105,6 +105,9 @@ static int  eth_igb_flow_ctrl_get(struct rte_eth_dev *dev,
 static int  eth_igb_flow_ctrl_set(struct rte_eth_dev *dev,
 				struct rte_eth_fc_conf *fc_conf);
 static int eth_igb_lsc_interrupt_setup(struct rte_eth_dev *dev);
+#ifdef RTE_NEXT_ABI
+static int eth_igb_rxq_interrupt_setup(struct rte_eth_dev *dev);
+#endif
 static int eth_igb_interrupt_get_status(struct rte_eth_dev *dev);
 static int eth_igb_interrupt_action(struct rte_eth_dev *dev);
 static void eth_igb_interrupt_handler(struct rte_intr_handle *handle,
@@ -218,7 +221,6 @@ static int eth_igb_get_eeprom(struct rte_eth_dev *dev,
 		struct rte_dev_eeprom_info *eeprom);
 static int eth_igb_set_eeprom(struct rte_eth_dev *dev,
 		struct rte_dev_eeprom_info *eeprom);
-
 static int eth_igb_set_mc_addr_list(struct rte_eth_dev *dev,
 				    struct ether_addr *mc_addr_set,
 				    uint32_t nb_mc_addr);
@@ -229,6 +231,17 @@ static int igb_timesync_read_rx_timestamp(struct rte_eth_dev *dev,
 					  uint32_t flags);
 static int igb_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
 					  struct timespec *timestamp);
+#ifdef RTE_NEXT_ABI
+static int eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					uint16_t queue_id);
+static int eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					 uint16_t queue_id);
+static void eth_igb_assign_msix_vector(struct e1000_hw *hw, int8_t direction,
+				       uint8_t queue, uint8_t msix_vector);
+static void eth_igb_write_ivar(struct e1000_hw *hw, uint8_t msix_vector,
+			       uint8_t index, uint8_t offset);
+#endif
+static void eth_igb_configure_msix_intr(struct rte_eth_dev *dev);
 
 /*
  * Define VF Stats MACRO for Non "cleared on read" register
@@ -289,6 +302,10 @@ static const struct eth_dev_ops eth_igb_ops = {
 	.vlan_tpid_set        = eth_igb_vlan_tpid_set,
 	.vlan_offload_set     = eth_igb_vlan_offload_set,
 	.rx_queue_setup       = eth_igb_rx_queue_setup,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = eth_igb_rx_queue_intr_enable,
+	.rx_queue_intr_disable = eth_igb_rx_queue_intr_disable,
+#endif
 	.rx_queue_release     = eth_igb_rx_queue_release,
 	.rx_queue_count       = eth_igb_rx_queue_count,
 	.rx_descriptor_done   = eth_igb_rx_descriptor_done,
@@ -639,12 +656,6 @@ eth_igb_dev_init(struct rte_eth_dev *eth_dev)
 		     eth_dev->data->port_id, pci_dev->id.vendor_id,
 		     pci_dev->id.device_id);
 
-	rte_intr_callback_register(&(pci_dev->intr_handle),
-		eth_igb_interrupt_handler, (void *)eth_dev);
-
-	/* enable uio intr after callback register */
-	rte_intr_enable(&(pci_dev->intr_handle));
-
 	/* enable support intr */
 	igb_intr_enable(eth_dev);
 
@@ -879,7 +890,11 @@ eth_igb_start(struct rte_eth_dev *dev)
 		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct e1000_adapter *adapter =
 		E1000_DEV_PRIVATE(dev->data->dev_private);
-	int ret, i, mask;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+	int ret, mask;
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
 	uint32_t ctrl_ext;
 
 	PMD_INIT_FUNC_TRACE();
@@ -920,6 +935,29 @@ eth_igb_start(struct rte_eth_dev *dev)
 	/* configure PF module if SRIOV enabled */
 	igb_pf_host_configure(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle)) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int), 0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+
+	/* confiugre msix for rx interrupt */
+	eth_igb_configure_msix_intr(dev);
+
 	/* Configure for OS presence */
 	igb_init_manageability(hw);
 
@@ -947,33 +985,9 @@ eth_igb_start(struct rte_eth_dev *dev)
 		igb_vmdq_vlan_hw_filter_enable(dev);
 	}
 
-	/*
-	 * Configure the Interrupt Moderation register (EITR) with the maximum
-	 * possible value (0xFFFF) to minimize "System Partial Write" issued by
-	 * spurious [DMA] memory updates of RX and TX ring descriptors.
-	 *
-	 * With a EITR granularity of 2 microseconds in the 82576, only 7/8
-	 * spurious memory updates per second should be expected.
-	 * ((65535 * 2) / 1000.1000 ~= 0.131 second).
-	 *
-	 * Because interrupts are not used at all, the MSI-X is not activated
-	 * and interrupt moderation is controlled by EITR[0].
-	 *
-	 * Note that having [almost] disabled memory updates of RX and TX ring
-	 * descriptors through the Interrupt Moderation mechanism, memory
-	 * updates of ring descriptors are now moderated by the configurable
-	 * value of Write-Back Threshold registers.
-	 */
 	if ((hw->mac.type == e1000_82576) || (hw->mac.type == e1000_82580) ||
 		(hw->mac.type == e1000_i350) || (hw->mac.type == e1000_i210) ||
 		(hw->mac.type == e1000_i211)) {
-		uint32_t ivar;
-
-		/* Enable all RX & TX queues in the IVAR registers */
-		ivar = (uint32_t) ((E1000_IVAR_VALID << 16) | E1000_IVAR_VALID);
-		for (i = 0; i < 8; i++)
-			E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, i, ivar);
-
 		/* Configure EITR with the maximum possible value (0xFFFF) */
 		E1000_WRITE_REG(hw, E1000_EITR(0), 0xFFFF);
 	}
@@ -1024,8 +1038,25 @@ eth_igb_start(struct rte_eth_dev *dev)
 	e1000_setup_link(hw);
 
 	/* check if lsc interrupt feature is enabled */
-	if (dev->data->dev_conf.intr_conf.lsc != 0)
-		ret = eth_igb_lsc_interrupt_setup(dev);
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle)) {
+			rte_intr_callback_register(intr_handle,
+						   eth_igb_interrupt_handler,
+						   (void *)dev);
+			eth_igb_lsc_interrupt_setup(dev);
+		} else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+#ifdef RTE_NEXT_ABI
+	/* check if rxq interrupt is enabled */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		eth_igb_rxq_interrupt_setup(dev);
+#endif
+
+	/* enable uio/vfio intr/eventfd mapping */
+	rte_intr_enable(intr_handle);
 
 	/* resume enabled intr since hw reset */
 	igb_intr_enable(dev);
@@ -1058,8 +1089,13 @@ eth_igb_stop(struct rte_eth_dev *dev)
 	struct e1000_flex_filter *p_flex;
 	struct e1000_5tuple_filter *p_5tuple, *p_5tuple_next;
 	struct e1000_2tuple_filter *p_2tuple, *p_2tuple_next;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 
 	igb_intr_disable(hw);
+
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
 	igb_pf_reset_hw(hw);
 	E1000_WRITE_REG(hw, E1000_WUC, 0);
 
@@ -1108,6 +1144,15 @@ eth_igb_stop(struct rte_eth_dev *dev)
 		rte_free(p_2tuple);
 	}
 	filter_info->twotuple_mask = 0;
+
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 static void
@@ -1117,6 +1162,9 @@ eth_igb_close(struct rte_eth_dev *dev)
 	struct e1000_adapter *adapter =
 		E1000_DEV_PRIVATE(dev->data->dev_private);
 	struct rte_eth_link link;
+#ifdef RTE_NEXT_ABI
+	struct rte_pci_device *pci_dev;
+#endif
 
 	eth_igb_stop(dev);
 	adapter->stopped = 1;
@@ -1136,6 +1184,14 @@ eth_igb_close(struct rte_eth_dev *dev)
 
 	igb_dev_free_queues(dev);
 
+#ifdef RTE_NEXT_ABI
+	pci_dev = dev->pci_dev;
+	if (pci_dev->intr_handle.intr_vec) {
+		rte_free(pci_dev->intr_handle.intr_vec);
+		pci_dev->intr_handle.intr_vec = NULL;
+	}
+#endif
+
 	memset(&link, 0, sizeof(link));
 	rte_igb_dev_atomic_write_link_status(dev, &link);
 }
@@ -1960,6 +2016,35 @@ eth_igb_lsc_interrupt_setup(struct rte_eth_dev *dev)
 	return 0;
 }
 
+#ifdef RTE_NEXT_ABI
+/* It clears the interrupt causes and enables the interrupt.
+ * It will be called once only during nic initialized.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+static int eth_igb_rxq_interrupt_setup(struct rte_eth_dev *dev)
+{
+	uint32_t mask, regval;
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct rte_eth_dev_info dev_info;
+
+	memset(&dev_info, 0, sizeof(dev_info));
+	eth_igb_infos_get(dev, &dev_info);
+
+	mask = 0xFFFFFFFF >> (32 - dev_info.max_rx_queues);
+	regval = E1000_READ_REG(hw, E1000_EIMS);
+	E1000_WRITE_REG(hw, E1000_EIMS, regval | mask);
+
+	return 0;
+}
+#endif
+
 /*
  * It reads ICR and gets interrupt causes, check it and set a bit flag
  * to update link status.
@@ -4051,5 +4136,163 @@ static struct rte_driver pmd_igbvf_drv = {
 	.init = rte_igbvf_pmd_init,
 };
 
+#ifdef RTE_NEXT_ABI
+static int
+eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t mask = 1 << queue_id;
+
+	E1000_WRITE_REG(hw, E1000_EIMC, mask);
+	E1000_WRITE_FLUSH(hw);
+
+	return 0;
+}
+
+static int
+eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t mask = 1 << queue_id;
+	uint32_t regval;
+
+	regval = E1000_READ_REG(hw, E1000_EIMS);
+	E1000_WRITE_REG(hw, E1000_EIMS, regval | mask);
+	E1000_WRITE_FLUSH(hw);
+
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static void
+eth_igb_write_ivar(struct e1000_hw *hw, uint8_t  msix_vector,
+		   uint8_t index, uint8_t offset)
+{
+	uint32_t val = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index);
+
+	/* clear bits */
+	val &= ~((uint32_t)0xFF << offset);
+
+	/* write vector and valid bit */
+	val |= (msix_vector | E1000_IVAR_VALID) << offset;
+
+	E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, val);
+}
+
+static void
+eth_igb_assign_msix_vector(struct e1000_hw *hw, int8_t direction,
+			   uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp = 0;
+
+	if (hw->mac.type == e1000_82575) {
+		if (direction == 0)
+			tmp = E1000_EICR_RX_QUEUE0 << queue;
+		else if (direction == 1)
+			tmp = E1000_EICR_TX_QUEUE0 << queue;
+		E1000_WRITE_REG(hw, E1000_MSIXBM(msix_vector), tmp);
+	} else if (hw->mac.type == e1000_82576) {
+		if ((direction == 0) || (direction == 1))
+			eth_igb_write_ivar(hw, msix_vector, queue & 0x7,
+					   ((queue & 0x8) << 1) +
+					   8 * direction);
+	} else if ((hw->mac.type == e1000_82580) ||
+			(hw->mac.type == e1000_i350) ||
+			(hw->mac.type == e1000_i354) ||
+			(hw->mac.type == e1000_i210) ||
+			(hw->mac.type == e1000_i211)) {
+		if ((direction == 0) || (direction == 1))
+			eth_igb_write_ivar(hw, msix_vector,
+					   queue >> 1,
+					   ((queue & 0x1) << 4) +
+					   8 * direction);
+	}
+}
+#endif
+
+/* Sets up the hardware to generate MSI-X interrupts properly
+ * @hw
+ *  board private structure
+ */
+static void
+eth_igb_configure_msix_intr(struct rte_eth_dev *dev)
+{
+#ifdef RTE_NEXT_ABI
+	int queue_id;
+	uint32_t tmpval, regval, intr_mask;
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t vec = 0;
+#endif
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* set interrupt vector for other causes */
+	if (hw->mac.type == e1000_82575) {
+		tmpval = E1000_READ_REG(hw, E1000_CTRL_EXT);
+		/* enable MSI-X PBA support */
+		tmpval |= E1000_CTRL_EXT_PBA_CLR;
+
+		/* Auto-Mask interrupts upon ICR read */
+		tmpval |= E1000_CTRL_EXT_EIAME;
+		tmpval |= E1000_CTRL_EXT_IRCA;
+
+		E1000_WRITE_REG(hw, E1000_CTRL_EXT, tmpval);
+
+		/* enable msix_other interrupt */
+		E1000_WRITE_REG_ARRAY(hw, E1000_MSIXBM(0), 0, E1000_EIMS_OTHER);
+		regval = E1000_READ_REG(hw, E1000_EIAC);
+		E1000_WRITE_REG(hw, E1000_EIAC, regval | E1000_EIMS_OTHER);
+		regval = E1000_READ_REG(hw, E1000_EIAM);
+		E1000_WRITE_REG(hw, E1000_EIMS, regval | E1000_EIMS_OTHER);
+	} else if ((hw->mac.type == e1000_82576) ||
+			(hw->mac.type == e1000_82580) ||
+			(hw->mac.type == e1000_i350) ||
+			(hw->mac.type == e1000_i354) ||
+			(hw->mac.type == e1000_i210) ||
+			(hw->mac.type == e1000_i211)) {
+		/* turn on MSI-X capability first */
+		E1000_WRITE_REG(hw, E1000_GPIE, E1000_GPIE_MSIX_MODE |
+					E1000_GPIE_PBA | E1000_GPIE_EIAME |
+					E1000_GPIE_NSICR);
+
+		intr_mask = (1 << intr_handle->max_intr) - 1;
+		regval = E1000_READ_REG(hw, E1000_EIAC);
+		E1000_WRITE_REG(hw, E1000_EIAC, regval | intr_mask);
+
+		/* enable msix_other interrupt */
+		regval = E1000_READ_REG(hw, E1000_EIMS);
+		E1000_WRITE_REG(hw, E1000_EIMS, regval | intr_mask);
+		tmpval = (dev->data->nb_rx_queues | E1000_IVAR_VALID) << 8;
+		E1000_WRITE_REG(hw, E1000_IVAR_MISC, tmpval);
+	}
+
+	/* use EIAM to auto-mask when MSI-X interrupt
+	 * is asserted, this saves a register write for every interrupt
+	 */
+	intr_mask = (1 << intr_handle->nb_efd) - 1;
+	regval = E1000_READ_REG(hw, E1000_EIAM);
+	E1000_WRITE_REG(hw, E1000_EIAM, regval | intr_mask);
+
+	for (queue_id = 0; queue_id < dev->data->nb_rx_queues; queue_id++) {
+		eth_igb_assign_msix_vector(hw, 0, queue_id, vec);
+		intr_handle->intr_vec[queue_id] = vec;
+		if (vec < intr_handle->nb_efd - 1)
+			vec++;
+	}
+
+	E1000_WRITE_FLUSH(hw);
+#endif
+}
+
 PMD_REGISTER_DRIVER(pmd_igb_drv);
 PMD_REGISTER_DRIVER(pmd_igbvf_drv);
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v15 11/13] ixgbe: enable rx queue interrupts for both PF and VF
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (6 preceding siblings ...)
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
@ 2015-07-20  3:02  1%       ` Cunming Liang
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 12/13] igb: enable rx queue interrupts for PF Cunming Liang
                         ` (2 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch does below things for ixgbe PF and VF:
- Setup NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v10 changes
 - return an actual error code rather than -1

v9 changes
 - move queue-vec mapping init from dev_configure to dev_start

v8 changes
 - add vfio-msi/vfio-legacy and uio-legacy support

v7 changes
 - add condition check when intr vector is not enabled

v6 changes
 - fill queue-vector mapping table

v5 changes
 - Rebase the patchset onto the HEAD

v3 changes
 - Remove spinlok from PMD

v2 changes
 - Consolidate review comments related to coding style

 drivers/net/ixgbe/ixgbe_ethdev.c | 527 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/ixgbe/ixgbe_ethdev.h |   4 +
 2 files changed, 518 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 3a8cff0..7f43fb6 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -85,6 +85,9 @@
  */
 #define IXGBE_FC_LO    0x40
 
+/* Default minimum inter-interrupt interval for EITR configuration */
+#define IXGBE_MIN_INTER_INTERRUPT_INTERVAL_DEFAULT    0x79E
+
 /* Timer value included in XOFF frames. */
 #define IXGBE_FC_PAUSE 0x680
 
@@ -187,6 +190,9 @@ static int ixgbe_dev_rss_reta_query(struct rte_eth_dev *dev,
 			uint16_t reta_size);
 static void ixgbe_dev_link_status_print(struct rte_eth_dev *dev);
 static int ixgbe_dev_lsc_interrupt_setup(struct rte_eth_dev *dev);
+#ifdef RTE_NEXT_ABI
+static int ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev);
+#endif
 static int ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev);
 static int ixgbe_dev_interrupt_action(struct rte_eth_dev *dev);
 static void ixgbe_dev_interrupt_handler(struct rte_intr_handle *handle,
@@ -202,11 +208,14 @@ static void ixgbe_dcb_init(struct ixgbe_hw *hw,struct ixgbe_dcb_config *dcb_conf
 /* For Virtual Function support */
 static int eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev);
 static int eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev);
+static int ixgbevf_dev_interrupt_get_status(struct rte_eth_dev *dev);
+static int ixgbevf_dev_interrupt_action(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_configure(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_start(struct rte_eth_dev *dev);
 static void ixgbevf_dev_stop(struct rte_eth_dev *dev);
 static void ixgbevf_dev_close(struct rte_eth_dev *dev);
 static void ixgbevf_intr_disable(struct ixgbe_hw *hw);
+static void ixgbevf_intr_enable(struct ixgbe_hw *hw);
 static void ixgbevf_dev_stats_get(struct rte_eth_dev *dev,
 		struct rte_eth_stats *stats);
 static void ixgbevf_dev_stats_reset(struct rte_eth_dev *dev);
@@ -216,6 +225,17 @@ static void ixgbevf_vlan_strip_queue_set(struct rte_eth_dev *dev,
 		uint16_t queue, int on);
 static void ixgbevf_vlan_offload_set(struct rte_eth_dev *dev, int mask);
 static void ixgbevf_set_vfta_all(struct rte_eth_dev *dev, bool on);
+static void ixgbevf_dev_interrupt_handler(struct rte_intr_handle *handle,
+					  void *param);
+#ifdef RTE_NEXT_ABI
+static int ixgbevf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					    uint16_t queue_id);
+static int ixgbevf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					     uint16_t queue_id);
+static void ixgbevf_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+				 uint8_t queue, uint8_t msix_vector);
+#endif
+static void ixgbevf_configure_msix(struct rte_eth_dev *dev);
 
 /* For Eth VMDQ APIs support */
 static int ixgbe_uc_hash_table_set(struct rte_eth_dev *dev, struct
@@ -232,6 +252,15 @@ static int ixgbe_mirror_rule_set(struct rte_eth_dev *dev,
 		uint8_t rule_id, uint8_t on);
 static int ixgbe_mirror_rule_reset(struct rte_eth_dev *dev,
 		uint8_t	rule_id);
+#ifdef RTE_NEXT_ABI
+static int ixgbe_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					  uint16_t queue_id);
+static int ixgbe_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					   uint16_t queue_id);
+static void ixgbe_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+			       uint8_t queue, uint8_t msix_vector);
+#endif
+static void ixgbe_configure_msix(struct rte_eth_dev *dev);
 
 static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev,
 		uint16_t queue_idx, uint16_t tx_rate);
@@ -308,7 +337,7 @@ static int ixgbe_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
  */
 #define UPDATE_VF_STAT(reg, last, cur)	                        \
 {                                                               \
-	u32 latest = IXGBE_READ_REG(hw, reg);                   \
+	uint32_t latest = IXGBE_READ_REG(hw, reg);              \
 	cur += latest - last;                                   \
 	last = latest;                                          \
 }
@@ -391,6 +420,10 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
 	.tx_queue_start	      = ixgbe_dev_tx_queue_start,
 	.tx_queue_stop        = ixgbe_dev_tx_queue_stop,
 	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = ixgbe_dev_rx_queue_intr_enable,
+	.rx_queue_intr_disable = ixgbe_dev_rx_queue_intr_disable,
+#endif
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_queue_count       = ixgbe_dev_rx_queue_count,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
@@ -461,8 +494,13 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
 	.vlan_offload_set     = ixgbevf_vlan_offload_set,
 	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
+	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
 	.tx_queue_setup       = ixgbe_dev_tx_queue_setup,
 	.tx_queue_release     = ixgbe_dev_tx_queue_release,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = ixgbevf_dev_rx_queue_intr_enable,
+	.rx_queue_intr_disable = ixgbevf_dev_rx_queue_intr_disable,
+#endif
 	.mac_addr_add         = ixgbevf_add_mac_addr,
 	.mac_addr_remove      = ixgbevf_remove_mac_addr,
 	.set_mc_addr_list     = ixgbe_dev_set_mc_addr_list,
@@ -1000,12 +1038,6 @@ eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev)
 			eth_dev->data->port_id, pci_dev->id.vendor_id,
 			pci_dev->id.device_id);
 
-	rte_intr_callback_register(&(pci_dev->intr_handle),
-		ixgbe_dev_interrupt_handler, (void *)eth_dev);
-
-	/* enable uio intr after callback register */
-	rte_intr_enable(&(pci_dev->intr_handle));
-
 	/* enable support intr */
 	ixgbe_enable_intr(eth_dev);
 
@@ -1647,6 +1679,10 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct ixgbe_vf_info *vfinfo =
 		*IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private);
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
 	int err, link_up = 0, negotiate = 0;
 	uint32_t speed = 0;
 	int mask = 0;
@@ -1679,6 +1715,30 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 	/* configure PF module if SRIOV enabled */
 	ixgbe_pf_host_configure(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle) && !intr_handle->intr_vec) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int),
+				    0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+
+	/* confiugre msix for sleep until rx interrupt */
+	ixgbe_configure_msix(dev);
+
 	/* initialize transmission unit */
 	ixgbe_dev_tx_init(dev);
 
@@ -1756,8 +1816,25 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 skip_link_setup:
 
 	/* check if lsc interrupt is enabled */
-	if (dev->data->dev_conf.intr_conf.lsc != 0)
-		ixgbe_dev_lsc_interrupt_setup(dev);
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle)) {
+			rte_intr_callback_register(intr_handle,
+						   ixgbe_dev_interrupt_handler,
+						   (void *)dev);
+			ixgbe_dev_lsc_interrupt_setup(dev);
+		} else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+#ifdef RTE_NEXT_ABI
+	/* check if rxq interrupt is enabled */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		ixgbe_dev_rxq_interrupt_setup(dev);
+#endif
+
+	/* enable uio/vfio intr/eventfd mapping */
+	rte_intr_enable(intr_handle);
 
 	/* resume enabled intr since hw reset */
 	ixgbe_enable_intr(dev);
@@ -1814,6 +1891,7 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	struct ixgbe_filter_info *filter_info =
 		IXGBE_DEV_PRIVATE_TO_FILTER_INFO(dev->data->dev_private);
 	struct ixgbe_5tuple_filter *p_5tuple, *p_5tuple_next;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 	int vf;
 
 	PMD_INIT_FUNC_TRACE();
@@ -1821,6 +1899,9 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	/* disable interrupts */
 	ixgbe_disable_intr(hw);
 
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
 	/* reset the NIC */
 	ixgbe_pf_reset_hw(hw);
 	hw->adapter_stopped = 0;
@@ -1861,6 +1942,14 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	memset(filter_info->fivetuple_mask, 0,
 		sizeof(uint32_t) * IXGBE_5TUPLE_ARRAY_SIZE);
 
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 /*
@@ -2535,6 +2624,30 @@ ixgbe_dev_lsc_interrupt_setup(struct rte_eth_dev *dev)
 	return 0;
 }
 
+/**
+ * It clears the interrupt causes and enables the interrupt.
+ * It will be called once only during nic initialized.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+static int
+ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev)
+{
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	intr->mask |= IXGBE_EICR_RTX_QUEUE;
+
+	return 0;
+}
+#endif
+
 /*
  * It reads ICR and sets flag (IXGBE_EICR_LSC) for the link_update.
  *
@@ -2561,10 +2674,10 @@ ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev)
 	PMD_DRV_LOG(INFO, "eicr %x", eicr);
 
 	intr->flags = 0;
-	if (eicr & IXGBE_EICR_LSC) {
-		/* set flag for async link update */
+
+	/* set flag for async link update */
+	if (eicr & IXGBE_EICR_LSC)
 		intr->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
-	}
 
 	if (eicr & IXGBE_EICR_MAILBOX)
 		intr->flags |= IXGBE_FLAG_MAILBOX;
@@ -2572,6 +2685,30 @@ ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static int
+ixgbevf_dev_interrupt_get_status(struct rte_eth_dev *dev)
+{
+	uint32_t eicr;
+	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	/* clear all cause mask */
+	ixgbevf_intr_disable(hw);
+
+	/* read-on-clear nic registers here */
+	eicr = IXGBE_READ_REG(hw, IXGBE_VTEICR);
+	PMD_DRV_LOG(INFO, "eicr %x", eicr);
+
+	intr->flags = 0;
+
+	/* set flag for async link update */
+	if (eicr & IXGBE_EICR_LSC)
+		intr->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
+
+	return 0;
+}
+
 /**
  * It gets and then prints the link status.
  *
@@ -2667,6 +2804,18 @@ ixgbe_dev_interrupt_action(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static int
+ixgbevf_dev_interrupt_action(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	PMD_DRV_LOG(DEBUG, "enable intr immediately");
+	ixgbevf_intr_enable(hw);
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+	return 0;
+}
+
 /**
  * Interrupt handler which shall be registered for alarm callback for delayed
  * handling specific interrupt to wait for the stable nic state. As the
@@ -2721,13 +2870,24 @@ ixgbe_dev_interrupt_delayed_handler(void *param)
  */
 static void
 ixgbe_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
-							void *param)
+			    void *param)
 {
 	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
 	ixgbe_dev_interrupt_get_status(dev);
 	ixgbe_dev_interrupt_action(dev);
 }
 
+static void
+ixgbevf_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+			      void *param)
+{
+	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+	ixgbevf_dev_interrupt_get_status(dev);
+	ixgbevf_dev_interrupt_action(dev);
+}
+
 static int
 ixgbe_dev_led_on(struct rte_eth_dev *dev)
 {
@@ -3233,6 +3393,19 @@ ixgbevf_intr_disable(struct ixgbe_hw *hw)
 	IXGBE_WRITE_FLUSH(hw);
 }
 
+static void
+ixgbevf_intr_enable(struct ixgbe_hw *hw)
+{
+	PMD_INIT_FUNC_TRACE();
+
+	/* VF enable interrupt autoclean */
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIAM, IXGBE_VF_IRQ_ENABLE_MASK);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIAC, IXGBE_VF_IRQ_ENABLE_MASK);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, IXGBE_VF_IRQ_ENABLE_MASK);
+
+	IXGBE_WRITE_FLUSH(hw);
+}
+
 static int
 ixgbevf_dev_configure(struct rte_eth_dev *dev)
 {
@@ -3274,6 +3447,11 @@ ixgbevf_dev_start(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw =
 		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+
 	int err, mask = 0;
 
 	PMD_INIT_FUNC_TRACE();
@@ -3304,6 +3482,42 @@ ixgbevf_dev_start(struct rte_eth_dev *dev)
 
 	ixgbevf_dev_rxtx_start(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle) && !intr_handle->intr_vec) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int), 0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+	ixgbevf_configure_msix(dev);
+
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle))
+			rte_intr_callback_register(intr_handle,
+					ixgbevf_dev_interrupt_handler,
+					(void *)dev);
+		else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+	rte_intr_enable(intr_handle);
+
+	/* Re-enable interrupt for VF */
+	ixgbevf_intr_enable(hw);
+
 	return 0;
 }
 
@@ -3311,6 +3525,7 @@ static void
 ixgbevf_dev_stop(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -3327,12 +3542,27 @@ ixgbevf_dev_stop(struct rte_eth_dev *dev)
 	dev->data->scattered_rx = 0;
 
 	ixgbe_dev_clear_queues(dev);
+
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 static void
 ixgbevf_dev_close(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+#ifdef RTE_NEXT_ABI
+	struct rte_pci_device *pci_dev;
+#endif
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -3344,6 +3574,14 @@ ixgbevf_dev_close(struct rte_eth_dev *dev)
 
 	/* reprogram the RAR[0] in case user changed it. */
 	ixgbe_set_rar(hw, 0, hw->mac.addr, 0, IXGBE_RAH_AV);
+
+#ifdef RTE_NEXT_ABI
+	pci_dev = dev->pci_dev;
+	if (pci_dev->intr_handle.intr_vec) {
+		rte_free(pci_dev->intr_handle.intr_vec);
+		pci_dev->intr_handle.intr_vec = NULL;
+	}
+#endif
 }
 
 static void ixgbevf_set_vfta_all(struct rte_eth_dev *dev, bool on)
@@ -3861,6 +4099,269 @@ ixgbe_mirror_rule_reset(struct rte_eth_dev *dev, uint8_t rule_id)
 	return 0;
 }
 
+#ifdef RTE_NEXT_ABI
+static int
+ixgbevf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	mask = IXGBE_READ_REG(hw, IXGBE_VTEIMS);
+	mask |= (1 << queue_id);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, mask);
+
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static int
+ixgbevf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	mask = IXGBE_READ_REG(hw, IXGBE_VTEIMS);
+	mask &= ~(1 << queue_id);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, mask);
+
+	return 0;
+}
+
+static int
+ixgbe_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	if (queue_id < 16) {
+		ixgbe_disable_intr(hw);
+		intr->mask |= (1 << queue_id);
+		ixgbe_enable_intr(dev);
+	} else if (queue_id < 32) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(0));
+		mask &= (1 << queue_id);
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(0), mask);
+	} else if (queue_id < 64) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(1));
+		mask &= (1 << (queue_id - 32));
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(1), mask);
+	}
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static int
+ixgbe_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	if (queue_id < 16) {
+		ixgbe_disable_intr(hw);
+		intr->mask &= ~(1 << queue_id);
+		ixgbe_enable_intr(dev);
+	} else if (queue_id < 32) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(0));
+		mask &= ~(1 << queue_id);
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(0), mask);
+	} else if (queue_id < 64) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(1));
+		mask &= ~(1 << (queue_id - 32));
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(1), mask);
+	}
+
+	return 0;
+}
+
+static void
+ixgbevf_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+		     uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp, idx;
+
+	if (direction == -1) {
+		/* other causes */
+		msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+		tmp = IXGBE_READ_REG(hw, IXGBE_VTIVAR_MISC);
+		tmp &= ~0xFF;
+		tmp |= msix_vector;
+		IXGBE_WRITE_REG(hw, IXGBE_VTIVAR_MISC, tmp);
+	} else {
+		/* rx or tx cause */
+		msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+		idx = ((16 * (queue & 1)) + (8 * direction));
+		tmp = IXGBE_READ_REG(hw, IXGBE_VTIVAR(queue >> 1));
+		tmp &= ~(0xFF << idx);
+		tmp |= (msix_vector << idx);
+		IXGBE_WRITE_REG(hw, IXGBE_VTIVAR(queue >> 1), tmp);
+	}
+}
+
+/**
+ * set the IVAR registers, mapping interrupt causes to vectors
+ * @param hw
+ *  pointer to ixgbe_hw struct
+ * @direction
+ *  0 for Rx, 1 for Tx, -1 for other causes
+ * @queue
+ *  queue to map the corresponding interrupt to
+ * @msix_vector
+ *  the vector to map to the corresponding queue
+ */
+static void
+ixgbe_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+		   uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp, idx;
+
+	msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+	if (hw->mac.type == ixgbe_mac_82598EB) {
+		if (direction == -1)
+			direction = 0;
+		idx = (((direction * 64) + queue) >> 2) & 0x1F;
+		tmp = IXGBE_READ_REG(hw, IXGBE_IVAR(idx));
+		tmp &= ~(0xFF << (8 * (queue & 0x3)));
+		tmp |= (msix_vector << (8 * (queue & 0x3)));
+		IXGBE_WRITE_REG(hw, IXGBE_IVAR(idx), tmp);
+	} else if ((hw->mac.type == ixgbe_mac_82599EB) ||
+			(hw->mac.type == ixgbe_mac_X540)) {
+		if (direction == -1) {
+			/* other causes */
+			idx = ((queue & 1) * 8);
+			tmp = IXGBE_READ_REG(hw, IXGBE_IVAR_MISC);
+			tmp &= ~(0xFF << idx);
+			tmp |= (msix_vector << idx);
+			IXGBE_WRITE_REG(hw, IXGBE_IVAR_MISC, tmp);
+		} else {
+			/* rx or tx causes */
+			idx = ((16 * (queue & 1)) + (8 * direction));
+			tmp = IXGBE_READ_REG(hw, IXGBE_IVAR(queue >> 1));
+			tmp &= ~(0xFF << idx);
+			tmp |= (msix_vector << idx);
+			IXGBE_WRITE_REG(hw, IXGBE_IVAR(queue >> 1), tmp);
+		}
+	}
+}
+#endif
+
+static void
+ixgbevf_configure_msix(struct rte_eth_dev *dev)
+{
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t q_idx;
+	uint32_t vector_idx = 0;
+#endif
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd.
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* Configure all RX queues of VF */
+	for (q_idx = 0; q_idx < dev->data->nb_rx_queues; q_idx++) {
+		/* Force all queue use vector 0,
+		 * as IXGBE_VF_MAXMSIVECOTR = 1
+		 */
+		ixgbevf_set_ivar_map(hw, 0, q_idx, vector_idx);
+		intr_handle->intr_vec[q_idx] = vector_idx;
+	}
+
+	/* Configure VF Rx queue ivar */
+	ixgbevf_set_ivar_map(hw, -1, 1, vector_idx);
+#endif
+}
+
+/**
+ * Sets up the hardware to properly generate MSI-X interrupts
+ * @hw
+ *  board private structure
+ */
+static void
+ixgbe_configure_msix(struct rte_eth_dev *dev)
+{
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t queue_id, vec = 0;
+	uint32_t mask;
+	uint32_t gpie;
+#endif
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* setup GPIE for MSI-x mode */
+	gpie = IXGBE_READ_REG(hw, IXGBE_GPIE);
+	gpie |= IXGBE_GPIE_MSIX_MODE | IXGBE_GPIE_PBA_SUPPORT |
+		IXGBE_GPIE_OCD | IXGBE_GPIE_EIAME;
+	/* auto clearing and auto setting corresponding bits in EIMS
+	 * when MSI-X interrupt is triggered
+	 */
+	if (hw->mac.type == ixgbe_mac_82598EB) {
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM, IXGBE_EICS_RTX_QUEUE);
+	} else {
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM_EX(0), 0xFFFFFFFF);
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM_EX(1), 0xFFFFFFFF);
+	}
+	IXGBE_WRITE_REG(hw, IXGBE_GPIE, gpie);
+
+	/* Populate the IVAR table and set the ITR values to the
+	 * corresponding register.
+	 */
+	for (queue_id = 0; queue_id < dev->data->nb_rx_queues;
+	     queue_id++) {
+		/* by default, 1:1 mapping */
+		ixgbe_set_ivar_map(hw, 0, queue_id, vec);
+		intr_handle->intr_vec[queue_id] = vec;
+		if (vec < intr_handle->nb_efd - 1)
+			vec++;
+	}
+
+	switch (hw->mac.type) {
+	case ixgbe_mac_82598EB:
+		ixgbe_set_ivar_map(hw, -1, IXGBE_IVAR_OTHER_CAUSES_INDEX,
+				   intr_handle->max_intr - 1);
+		break;
+	case ixgbe_mac_82599EB:
+	case ixgbe_mac_X540:
+		ixgbe_set_ivar_map(hw, -1, 1, intr_handle->max_intr - 1);
+		break;
+	default:
+		break;
+	}
+	IXGBE_WRITE_REG(hw, IXGBE_EITR(queue_id),
+			IXGBE_MIN_INTER_INTERRUPT_INTERVAL_DEFAULT & 0xFFF);
+
+	/* set up to autoclear timer, and the vectors */
+	mask = IXGBE_EIMS_ENABLE_MASK;
+	mask &= ~(IXGBE_EIMS_OTHER |
+		  IXGBE_EIMS_MAILBOX |
+		  IXGBE_EIMS_LSC);
+
+	IXGBE_WRITE_REG(hw, IXGBE_EIAC, mask);
+#endif
+}
+
 static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev,
 	uint16_t queue_idx, uint16_t tx_rate)
 {
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index c16c11d..c3d4f4f 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -117,6 +117,9 @@
 	ETH_RSS_IPV6_TCP_EX | \
 	ETH_RSS_IPV6_UDP_EX)
 
+#define IXGBE_VF_IRQ_ENABLE_MASK        3          /* vf irq enable mask */
+#define IXGBE_VF_MAXMSIVECTOR           1
+
 /*
  * Information about the fdir mode.
  */
@@ -332,6 +335,7 @@ uint32_t ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev,
 		uint16_t rx_queue_id);
 
 int ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
+int ixgbevf_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
 int ixgbe_dev_rx_init(struct rte_eth_dev *dev);
 
-- 
1.8.1.4

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v15 10/13] ethdev: add rx intr enable, disable and ctl functions
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (5 preceding siblings ...)
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 08/13] eal/bsd: dummy for new intr definition Cunming Liang
@ 2015-07-20  3:02  3%       ` Cunming Liang
  2015-07-20  3:02  1%       ` [dpdk-dev] [PATCH v15 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
                         ` (3 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch adds two dev_ops functions to enable and disable rx queue interrupts.
In addtion, it adds rte_eth_dev_rx_intr_ctl/rx_intr_q to support per port or per queue rx intr event set.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v15 changes
 - remove ifdef RTE_NEXT_ABI from header file

v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v9 changes
 - remove unnecessary check after rte_eth_dev_is_valid_port.
   the same as http://www.dpdk.org/dev/patchwork/patch/4784

v8 changes
 - add addtion check for EEXIT

v7 changes
 - remove rx_intr_vec_get
 - add rx_intr_ctl and rx_intr_ctl_q

v6 changes
 - add rx_intr_vec_get to retrieve the vector num of the queue.

v5 changes
 - Rebase the patchset onto the HEAD

v4 changes
 - Export interrupt enable/disable functions for shared libraries
 - Put new functions at the end of eth_dev_ops to avoid breaking ABI

v3 changes
 - Add return value for interrupt enable/disable functions

 lib/librte_ether/rte_ethdev.c          | 147 +++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h          | 104 +++++++++++++++++++++++
 lib/librte_ether/rte_ether_version.map |   4 +
 3 files changed, 255 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 94104ce..a24c399 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3031,6 +3031,153 @@ _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
 	}
 	rte_spinlock_unlock(&rte_eth_dev_cb_lock);
 }
+
+#ifdef RTE_NEXT_ABI
+int
+rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
+{
+	uint32_t vec;
+	struct rte_eth_dev *dev;
+	struct rte_intr_handle *intr_handle;
+	uint16_t qid;
+	int rc;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%u\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	intr_handle = &dev->pci_dev->intr_handle;
+	if (!intr_handle->intr_vec) {
+		PMD_DEBUG_TRACE("RX Intr vector unset\n");
+		return -EPERM;
+	}
+
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		vec = intr_handle->intr_vec[qid];
+		rc = rte_intr_rx_ctl(intr_handle, epfd, op, vec, data);
+		if (rc && rc != -EEXIST) {
+			PMD_DEBUG_TRACE("p %u q %u rx ctl error"
+					" op %d epfd %d vec %u\n",
+					port_id, qid, op, epfd, vec);
+		}
+	}
+
+	return 0;
+}
+
+int
+rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			  int epfd, int op, void *data)
+{
+	uint32_t vec;
+	struct rte_eth_dev *dev;
+	struct rte_intr_handle *intr_handle;
+	int rc;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%u\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_rx_queues) {
+		PMD_DEBUG_TRACE("Invalid RX queue_id=%u\n", queue_id);
+		return -EINVAL;
+	}
+
+	intr_handle = &dev->pci_dev->intr_handle;
+	if (!intr_handle->intr_vec) {
+		PMD_DEBUG_TRACE("RX Intr vector unset\n");
+		return -EPERM;
+	}
+
+	vec = intr_handle->intr_vec[queue_id];
+	rc = rte_intr_rx_ctl(intr_handle, epfd, op, vec, data);
+	if (rc && rc != -EEXIST) {
+		PMD_DEBUG_TRACE("p %u q %u rx ctl error"
+				" op %d epfd %d vec %u\n",
+				port_id, queue_id, op, epfd, vec);
+		return rc;
+	}
+
+	return 0;
+}
+
+int
+rte_eth_dev_rx_intr_enable(uint8_t port_id,
+			   uint16_t queue_id)
+{
+	struct rte_eth_dev *dev;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
+	return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
+}
+
+int
+rte_eth_dev_rx_intr_disable(uint8_t port_id,
+			    uint16_t queue_id)
+{
+	struct rte_eth_dev *dev;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
+	return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
+}
+#else
+int
+rte_eth_dev_rx_intr_enable(uint8_t port_id, uint16_t queue_id)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	return -ENOTSUP;
+}
+
+int
+rte_eth_dev_rx_intr_disable(uint8_t port_id, uint16_t queue_id)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	return -ENOTSUP;
+}
+
+int
+rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(data);
+	return -1;
+}
+
+int
+rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			  int epfd, int op, void *data)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(data);
+	return -1;
+}
+#endif
+
 #ifdef RTE_NIC_BYPASS
 int rte_eth_dev_bypass_init(uint8_t port_id)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index c901a2c..662e106 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -845,6 +845,10 @@ struct rte_eth_fdir {
 struct rte_intr_conf {
 	/** enable/disable lsc interrupt. 0 (default) - disable, 1 enable */
 	uint16_t lsc;
+#ifdef RTE_NEXT_ABI
+	/** enable/disable rxq interrupt. 0 (default) - disable, 1 enable */
+	uint16_t rxq;
+#endif
 };
 
 /**
@@ -1053,6 +1057,14 @@ typedef int (*eth_tx_queue_setup_t)(struct rte_eth_dev *dev,
 				    const struct rte_eth_txconf *tx_conf);
 /**< @internal Setup a transmit queue of an Ethernet device. */
 
+typedef int (*eth_rx_enable_intr_t)(struct rte_eth_dev *dev,
+				    uint16_t rx_queue_id);
+/**< @internal Enable interrupt of a receive queue of an Ethernet device. */
+
+typedef int (*eth_rx_disable_intr_t)(struct rte_eth_dev *dev,
+				    uint16_t rx_queue_id);
+/**< @internal Disable interrupt of a receive queue of an Ethernet device. */
+
 typedef void (*eth_queue_release_t)(void *queue);
 /**< @internal Release memory resources allocated by given RX/TX queue. */
 
@@ -1380,6 +1392,12 @@ struct eth_dev_ops {
 	eth_queue_release_t        rx_queue_release;/**< Release RX queue.*/
 	eth_rx_queue_count_t       rx_queue_count; /**< Get Rx queue count. */
 	eth_rx_descriptor_done_t   rx_descriptor_done;  /**< Check rxd DD bit */
+#ifdef RTE_NEXT_ABI
+	/**< Enable Rx queue interrupt. */
+	eth_rx_enable_intr_t       rx_queue_intr_enable;
+	/**< Disable Rx queue interrupt.*/
+	eth_rx_disable_intr_t      rx_queue_intr_disable;
+#endif
 	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue.*/
 	eth_queue_release_t        tx_queue_release;/**< Release TX queue.*/
 	eth_dev_led_on_t           dev_led_on;    /**< Turn on LED. */
@@ -2957,6 +2975,92 @@ void _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
 				enum rte_eth_event_type event);
 
 /**
+ * When there is no rx packet coming in Rx Queue for a long time, we can
+ * sleep lcore related to RX Queue for power saving, and enable rx interrupt
+ * to be triggered when rx packect arrives.
+ *
+ * The rte_eth_dev_rx_intr_enable() function enables rx queue
+ * interrupt on specific rx queue of a port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ *     that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ */
+int rte_eth_dev_rx_intr_enable(uint8_t port_id, uint16_t queue_id);
+
+/**
+ * When lcore wakes up from rx interrupt indicating packet coming, disable rx
+ * interrupt and returns to polling mode.
+ *
+ * The rte_eth_dev_rx_intr_disable() function disables rx queue
+ * interrupt on specific rx queue of a port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ *     that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ */
+int rte_eth_dev_rx_intr_disable(uint8_t port_id, uint16_t queue_id);
+
+/**
+ * RX Interrupt control per port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ *   Using RTE_EPOLL_PER_THREAD allows to use per thread epoll instance.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data);
+
+/**
+ * RX Interrupt control per queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ *   Using RTE_EPOLL_PER_THREAD allows to use per thread epoll instance.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			      int epfd, int op, void *data);
+
+/**
  * Turn on the LED on the Ethernet device.
  * This function turns on the LED on the Ethernet device.
  *
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 23cfee9..8345a6c 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -115,6 +115,10 @@ DPDK_2.1 {
 	rte_eth_dev_get_reg_info;
 	rte_eth_dev_get_reg_length;
 	rte_eth_dev_is_valid_port;
+	rte_eth_dev_rx_intr_ctl;
+	rte_eth_dev_rx_intr_ctl_q;
+	rte_eth_dev_rx_intr_disable;
+	rte_eth_dev_rx_intr_enable;
 	rte_eth_dev_set_eeprom;
 	rte_eth_dev_set_mc_addr_list;
 	rte_eth_timesync_disable;
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v15 08/13] eal/bsd: dummy for new intr definition
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (4 preceding siblings ...)
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 06/13] eal/linux: standalone intr event fd create support Cunming Liang
@ 2015-07-20  3:02  3%       ` Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
                         ` (4 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

To make bsd compiling happy with new intr changes.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v15 changes
 - remove ifdef RTE_NEXT_ABI from header file

v14 changes
 - per-patch basis ABI compatibility rework

v13 changes
 - version map cleanup for v2.1

v12 changes
 - fix unused variables compiling warning

v8 changes
 - add stub for new function

v7 changes
 - remove stub 'linux only' function from source file

 lib/librte_eal/bsdapp/eal/eal_interrupts.c         | 42 +++++++++++++
 .../bsdapp/eal/include/exec-env/rte_interrupts.h   | 68 ++++++++++++++++++++++
 lib/librte_eal/bsdapp/eal/rte_eal_version.map      |  5 ++
 3 files changed, 115 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_interrupts.c b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
index 26a55c7..51a13fa 100644
--- a/lib/librte_eal/bsdapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
@@ -68,3 +68,45 @@ rte_eal_intr_init(void)
 {
 	return 0;
 }
+
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(vec);
+	RTE_SET_USED(data);
+
+	return -ENOTSUP;
+}
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(nb_efd);
+
+	return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+}
+
+int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 0;
+}
+
+int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 1;
+}
diff --git a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
index d4c388f..91d1900 100644
--- a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
@@ -50,6 +50,74 @@ struct rte_intr_handle {
 	int fd;                          /**< file descriptor */
 	int uio_cfg_fd;                  /**< UIO config file descriptor */
 	enum rte_intr_handle_type type;  /**< handle type */
+#ifdef RTE_NEXT_ABI
+	int max_intr;                    /**< max interrupt requested */
+	uint32_t nb_efd;                 /**< number of available efds */
+	int *intr_vec;               /**< intr vector number array */
+#endif
 };
 
+/**
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {ADD, DEL}.
+ * @param vec
+ *   RX intr vector number added to the epoll instance wait list.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data);
+
+/**
+ * It enables the fastpath event fds if it's necessary.
+ * It creates event fds when multi-vectors allowed,
+ * otherwise it multiplexes the single event fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param nb_vec
+ *   Number of interrupt vector trying to enable.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd);
+
+/**
+ * It disable the fastpath event fds.
+ * It deletes registered eventfds and closes the open fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle);
+
+/**
+ * The fastpath interrupt is enabled or not.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+int rte_intr_dp_is_en(struct rte_intr_handle *intr_handle);
+
+/**
+ * The interrupt handle instance allows other cause or not.
+ * Other cause stands for none fastpath interrupt.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+int rte_intr_allow_others(struct rte_intr_handle *intr_handle);
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index b2d4441..cfeb0fb 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -116,6 +116,11 @@ DPDK_2.1 {
 	global:
 
 	rte_eal_pci_detach;
+	rte_intr_allow_others;
+	rte_intr_dp_is_en;
+	rte_intr_efd_enable;
+	rte_intr_efd_disable;
+	rte_intr_rx_ctl;
 	rte_memzone_free;
 
 } DPDK_2.0;
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v15 06/13] eal/linux: standalone intr event fd create support
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (3 preceding siblings ...)
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
@ 2015-07-20  3:02  3%       ` Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 08/13] eal/bsd: dummy for new intr definition Cunming Liang
                         ` (5 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch exposes intr event fd create and release for PMD.
The device driver can assign the number of event associated with interrupt vector.
It also provides misc functions to check 1) allows other slowpath intr(e.g. lsc);
2) intr event on fastpath is enabled or not.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v15 changes
 - remove ifdef RTE_NEXT_ABI from header file

v14 changes
 - per-patch basis ABI compatibility rework
 - minor changes on API decription comments

v13 changes
 - version map cleanup for v2.1

v11 changes
 - typo cleanup

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 97 ++++++++++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h | 45 ++++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |  4 +
 3 files changed, 146 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 12105cc..1cea4bf 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -44,6 +44,7 @@
 #include <sys/epoll.h>
 #include <sys/signalfd.h>
 #include <sys/ioctl.h>
+#include <sys/eventfd.h>
 
 #include <rte_common.h>
 #include <rte_interrupts.h>
@@ -68,6 +69,7 @@
 #include "eal_vfio.h"
 
 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
+#define NB_OTHER_INTR               1
 
 static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
 
@@ -1123,6 +1125,73 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
 
 	return rc;
 }
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	uint32_t i;
+	int fd;
+	uint32_t n = RTE_MIN(nb_efd, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+
+	if (intr_handle->type == RTE_INTR_HANDLE_VFIO_MSIX) {
+		for (i = 0; i < n; i++) {
+			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+			if (fd < 0) {
+				RTE_LOG(ERR, EAL,
+					"can't setup eventfd, error %i (%s)\n",
+					errno, strerror(errno));
+				return -1;
+			}
+			intr_handle->efds[i] = fd;
+		}
+		intr_handle->nb_efd   = n;
+		intr_handle->max_intr = NB_OTHER_INTR + n;
+	} else {
+		intr_handle->efds[0]  = intr_handle->fd;
+		intr_handle->nb_efd   = RTE_MIN(nb_efd, 1U);
+		intr_handle->max_intr = NB_OTHER_INTR;
+	}
+
+	return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	uint32_t i;
+	struct rte_epoll_event *rev;
+
+	for (i = 0; i < intr_handle->nb_efd; i++) {
+		rev = &intr_handle->elist[i];
+		if (rev->status == RTE_EPOLL_INVALID)
+			continue;
+		if (rte_epoll_ctl(rev->epfd, EPOLL_CTL_DEL, rev->fd, rev)) {
+			/* force free if the entry valid */
+			eal_epoll_data_safe_free(rev);
+			rev->status = RTE_EPOLL_INVALID;
+		}
+	}
+
+	if (intr_handle->max_intr > intr_handle->nb_efd) {
+		for (i = 0; i < intr_handle->nb_efd; i++)
+			close(intr_handle->efds[i]);
+	}
+	intr_handle->nb_efd = 0;
+	intr_handle->max_intr = 0;
+}
+
+int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	return !(!intr_handle->nb_efd);
+}
+
+int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	return !!(intr_handle->max_intr - intr_handle->nb_efd);
+}
+
 #else
 int
 rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
@@ -1135,4 +1204,32 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
 	RTE_SET_USED(data);
 	return -ENOTSUP;
 }
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(nb_efd);
+	return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+}
+
+int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 0;
+}
+
+int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 1;
+}
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index acf4be9..b05f4c8 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -168,4 +168,49 @@ int
 rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
 		int epfd, int op, unsigned int vec, void *data);
 
+/**
+ * It enables the packet I/O interrupt event if it's necessary.
+ * It creates event fd for each interrupt vector when MSIX is used,
+ * otherwise it multiplexes a single event fd.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param nb_vec
+ *   Number of interrupt vector trying to enable.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd);
+
+/**
+ * It disables the packet I/O interrupt event.
+ * It deletes registered eventfds and closes the open fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle);
+
+/**
+ * The packet I/O interrupt on datapath is enabled or not.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle);
+
+/**
+ * The interrupt handle instance allows other causes or not.
+ * Other causes stand for any none packet I/O interrupts.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle);
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 095b2c5..f44bc34 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -118,6 +118,10 @@ DPDK_2.1 {
 	rte_eal_pci_detach;
 	rte_epoll_ctl;
 	rte_epoll_wait;
+	rte_intr_allow_others;
+	rte_intr_dp_is_en;
+	rte_intr_efd_enable;
+	rte_intr_efd_disable;
 	rte_intr_rx_ctl;
 	rte_intr_tls_epfd;
 	rte_memzone_free;
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v15 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (2 preceding siblings ...)
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
@ 2015-07-20  3:02  3%       ` Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 06/13] eal/linux: standalone intr event fd create support Cunming Liang
                         ` (6 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch maps each of the eventfd to the interrupt vector of VFIO MSI-X.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - reword commit comments

v8 changes
 - move eventfd creation out of the setup_interrupts to a standalone function

v7 changes
 - cleanup unnecessary code change
 - split event and intr operation to other patches

 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 56 ++++++++++------------------
 1 file changed, 20 insertions(+), 36 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 5acc3b7..12105cc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -128,6 +128,9 @@ static pthread_t intr_thread;
 #ifdef VFIO_PRESENT
 
 #define IRQ_SET_BUF_LEN  (sizeof(struct vfio_irq_set) + sizeof(int))
+/* irq set buffer length for queue interrupts and LSC interrupt */
+#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
+			      sizeof(int) * (RTE_MAX_RXTX_INTR_VEC_ID + 1))
 
 /* enable legacy (INTx) interrupts */
 static int
@@ -245,23 +248,6 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {
 						intr_handle->fd);
 		return -1;
 	}
-
-	/* manually trigger interrupt to enable it */
-	memset(irq_set, 0, len);
-	len = sizeof(struct vfio_irq_set);
-	irq_set->argsz = len;
-	irq_set->count = 1;
-	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
-	irq_set->start = 0;
-
-	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-	if (ret) {
-		RTE_LOG(ERR, EAL, "Error triggering MSI interrupts for fd %d\n",
-						intr_handle->fd);
-		return -1;
-	}
 	return 0;
 }
 
@@ -294,7 +280,7 @@ vfio_disable_msi(struct rte_intr_handle *intr_handle) {
 static int
 vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 	int len, ret;
-	char irq_set_buf[IRQ_SET_BUF_LEN];
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
 	struct vfio_irq_set *irq_set;
 	int *fd_ptr;
 
@@ -302,12 +288,26 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 
 	irq_set = (struct vfio_irq_set *) irq_set_buf;
 	irq_set->argsz = len;
+#ifdef RTE_NEXT_ABI
+	if (!intr_handle->max_intr)
+		intr_handle->max_intr = 1;
+	else if (intr_handle->max_intr > RTE_MAX_RXTX_INTR_VEC_ID)
+		intr_handle->max_intr = RTE_MAX_RXTX_INTR_VEC_ID + 1;
+
+	irq_set->count = intr_handle->max_intr;
+#else
 	irq_set->count = 1;
+#endif
 	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
 	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
 	irq_set->start = 0;
 	fd_ptr = (int *) &irq_set->data;
-	*fd_ptr = intr_handle->fd;
+#ifdef RTE_NEXT_ABI
+	memcpy(fd_ptr, intr_handle->efds, sizeof(intr_handle->efds));
+	fd_ptr[intr_handle->max_intr - 1] = intr_handle->fd;
+#else
+	fd_ptr[0] = intr_handle->fd;
+#endif
 
 	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
 
@@ -317,22 +317,6 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 		return -1;
 	}
 
-	/* manually trigger interrupt to enable it */
-	memset(irq_set, 0, len);
-	len = sizeof(struct vfio_irq_set);
-	irq_set->argsz = len;
-	irq_set->count = 1;
-	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-	irq_set->start = 0;
-
-	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-	if (ret) {
-		RTE_LOG(ERR, EAL, "Error triggering MSI-X interrupts for fd %d\n",
-						intr_handle->fd);
-		return -1;
-	}
 	return 0;
 }
 
@@ -340,7 +324,7 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 static int
 vfio_disable_msix(struct rte_intr_handle *intr_handle) {
 	struct vfio_irq_set *irq_set;
-	char irq_set_buf[IRQ_SET_BUF_LEN];
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
 	int len, ret;
 
 	len = sizeof(struct vfio_irq_set);
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v15 03/13] eal/linux: add API to set rx interrupt event monitor
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
@ 2015-07-20  3:02  2%       ` Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
                         ` (7 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch adds 'rte_intr_rx_ctl' to add or delete interrupt vector events monitor on specified epoll instance.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v15 changes
 - remove ifdef RTE_NEXT_ABI from header file

v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v12 changes:
 - fix awkward line split in using RTE_LOG

v10 changes:
 - add RTE_INTR_HANDLE_UIO_INTX for uio_pci_generic

v8 changes
 - fix EWOULDBLOCK and EINTR processing
 - add event status check

v7 changes
 - rename rte_intr_rx_set to rte_intr_rx_ctl.
 - rte_intr_rx_ctl uses rte_epoll_ctl to register epoll event instance.
 - the intr rx event instance includes a intr process callback.

v6 changes
 - split rte_intr_wait_rx_pkt into two function, wait and set.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal.
 - rte_intr_rx_wait to support multiplexing.
 - allow epfd as input to support flexible event fd combination.

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 117 +++++++++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  20 ++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   1 +
 3 files changed, 138 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 55be263..ffccb0e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -899,6 +899,51 @@ rte_eal_intr_init(void)
 	return -ret;
 }
 
+#ifdef RTE_NEXT_ABI
+static void
+eal_intr_proc_rxtx_intr(int fd, const struct rte_intr_handle *intr_handle)
+{
+	union rte_intr_read_buffer buf;
+	int bytes_read = 1;
+
+	switch (intr_handle->type) {
+	case RTE_INTR_HANDLE_UIO:
+	case RTE_INTR_HANDLE_UIO_INTX:
+		bytes_read = sizeof(buf.uio_intr_count);
+		break;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		bytes_read = sizeof(buf.vfio_intr_count);
+		break;
+#endif
+	default:
+		bytes_read = 1;
+		RTE_LOG(INFO, EAL, "unexpected intr type\n");
+		break;
+	}
+
+	/**
+	 * read out to clear the ready-to-be-read flag
+	 * for epoll_wait.
+	 */
+	do {
+		bytes_read = read(fd, &buf, bytes_read);
+		if (bytes_read < 0) {
+			if (errno == EINTR || errno == EWOULDBLOCK ||
+			    errno == EAGAIN)
+				continue;
+			RTE_LOG(ERR, EAL,
+				"Error reading from fd %d: %s\n",
+				fd, strerror(errno));
+		} else if (bytes_read == 0)
+			RTE_LOG(ERR, EAL, "Read nothing from fd %d\n", fd);
+		return;
+	} while (1);
+}
+#endif
+
 static int
 eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
 			struct rte_epoll_event *events)
@@ -1035,3 +1080,75 @@ rte_epoll_ctl(int epfd, int op, int fd,
 
 	return 0;
 }
+
+#ifdef RTE_NEXT_ABI
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
+		int op, unsigned int vec, void *data)
+{
+	struct rte_epoll_event *rev;
+	struct rte_epoll_data *epdata;
+	int epfd_op;
+	int rc = 0;
+
+	if (!intr_handle || intr_handle->nb_efd == 0 ||
+	    vec >= intr_handle->nb_efd) {
+		RTE_LOG(ERR, EAL, "Wrong intr vector number.\n");
+		return -EPERM;
+	}
+
+	switch (op) {
+	case RTE_INTR_EVENT_ADD:
+		epfd_op = EPOLL_CTL_ADD;
+		rev = &intr_handle->elist[vec];
+		if (rev->status != RTE_EPOLL_INVALID) {
+			RTE_LOG(INFO, EAL, "Event already been added.\n");
+			return -EEXIST;
+		}
+
+		/* attach to intr vector fd */
+		epdata = &rev->epdata;
+		epdata->event  = EPOLLIN | EPOLLPRI | EPOLLET;
+		epdata->data   = data;
+		epdata->cb_fun = (rte_intr_event_cb_t)eal_intr_proc_rxtx_intr;
+		epdata->cb_arg = (void *)intr_handle;
+		rc = rte_epoll_ctl(epfd, epfd_op, intr_handle->efds[vec], rev);
+		if (!rc)
+			RTE_LOG(DEBUG, EAL,
+				"efd %d associated with vec %d added on epfd %d"
+				"\n", rev->fd, vec, epfd);
+		else
+			rc = -EPERM;
+		break;
+	case RTE_INTR_EVENT_DEL:
+		epfd_op = EPOLL_CTL_DEL;
+		rev = &intr_handle->elist[vec];
+		if (rev->status == RTE_EPOLL_INVALID) {
+			RTE_LOG(INFO, EAL, "Event does not exist.\n");
+			return -EPERM;
+		}
+
+		rc = rte_epoll_ctl(rev->epfd, epfd_op, rev->fd, rev);
+		if (rc)
+			rc = -EPERM;
+		break;
+	default:
+		RTE_LOG(ERR, EAL, "event op type mismatch\n");
+		rc = -EPERM;
+	}
+
+	return rc;
+}
+#else
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(vec);
+	RTE_SET_USED(data);
+	return -ENOTSUP;
+}
+#endif
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index 886608c..acf4be9 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -148,4 +148,24 @@ rte_epoll_ctl(int epfd, int op, int fd,
 int
 rte_intr_tls_epfd(void);
 
+/**
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {ADD, DEL}.
+ * @param vec
+ *   RX intr vector number added to the epoll instance wait list.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data);
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 39cc2d2..095b2c5 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -118,6 +118,7 @@ DPDK_2.1 {
 	rte_eal_pci_detach;
 	rte_epoll_ctl;
 	rte_epoll_wait;
+	rte_intr_rx_ctl;
 	rte_intr_tls_epfd;
 	rte_memzone_free;
 
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v15 02/13] eal/linux: add rte_epoll_wait/ctl support
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
@ 2015-07-20  3:02  2%       ` Cunming Liang
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
                         ` (8 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch adds 'rte_epoll_wait' and 'rte_epoll_ctl' for async event wakeup.
It defines 'struct rte_epoll_event' as the event param.
When the event fds add to a specified epoll instance, 'eptrs' will hold the rte_epoll_event object pointer.
The 'op' uses the same enum as epoll_wait/ctl does.
The epoll event support to carry a raw user data and to register a callback which is executed during wakeup.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v11 changes
 - cleanup spelling error

v9 changes
 - rework on coding style

v8 changes
 - support delete event in safety during the wakeup execution
 - add EINTR process during epoll_wait

v7 changes
 - split v6[4/8] into two patches, one for epoll event(this one)
   another for rx intr(next patch)
 - introduce rte_epoll_event definition
 - rte_epoll_wait/ctl for more generic RTE epoll API

v6 changes
 - split rte_intr_wait_rx_pkt into two function, wait and set.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal.
 - rte_intr_rx_wait to support multiplexing.
 - allow epfd as input to support flexible event fd combination.

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 139 +++++++++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  80 ++++++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   3 +
 3 files changed, 222 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 61e7c85..55be263 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -69,6 +69,8 @@
 
 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
 
+static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
+
 /**
  * union for pipe fds.
  */
@@ -896,3 +898,140 @@ rte_eal_intr_init(void)
 
 	return -ret;
 }
+
+static int
+eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
+			struct rte_epoll_event *events)
+{
+	unsigned int i, count = 0;
+	struct rte_epoll_event *rev;
+
+	for (i = 0; i < n; i++) {
+		rev = evs[i].data.ptr;
+		if (!rev || !rte_atomic32_cmpset(&rev->status, RTE_EPOLL_VALID,
+						 RTE_EPOLL_EXEC))
+			continue;
+
+		events[count].status        = RTE_EPOLL_VALID;
+		events[count].fd            = rev->fd;
+		events[count].epfd          = rev->epfd;
+		events[count].epdata.event  = rev->epdata.event;
+		events[count].epdata.data   = rev->epdata.data;
+		if (rev->epdata.cb_fun)
+			rev->epdata.cb_fun(rev->fd,
+					   rev->epdata.cb_arg);
+
+		rte_compiler_barrier();
+		rev->status = RTE_EPOLL_VALID;
+		count++;
+	}
+	return count;
+}
+
+static inline int
+eal_init_tls_epfd(void)
+{
+	int pfd = epoll_create(255);
+
+	if (pfd < 0) {
+		RTE_LOG(ERR, EAL,
+			"Cannot create epoll instance\n");
+		return -1;
+	}
+	return pfd;
+}
+
+int
+rte_intr_tls_epfd(void)
+{
+	if (RTE_PER_LCORE(_epfd) == -1)
+		RTE_PER_LCORE(_epfd) = eal_init_tls_epfd();
+
+	return RTE_PER_LCORE(_epfd);
+}
+
+int
+rte_epoll_wait(int epfd, struct rte_epoll_event *events,
+	       int maxevents, int timeout)
+{
+	struct epoll_event evs[maxevents];
+	int rc;
+
+	if (!events) {
+		RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
+		return -1;
+	}
+
+	/* using per thread epoll fd */
+	if (epfd == RTE_EPOLL_PER_THREAD)
+		epfd = rte_intr_tls_epfd();
+
+	while (1) {
+		rc = epoll_wait(epfd, evs, maxevents, timeout);
+		if (likely(rc > 0)) {
+			/* epoll_wait has at least one fd ready to read */
+			rc = eal_epoll_process_event(evs, rc, events);
+			break;
+		} else if (rc < 0) {
+			if (errno == EINTR)
+				continue;
+			/* epoll_wait fail */
+			RTE_LOG(ERR, EAL, "epoll_wait returns with fail %s\n",
+				strerror(errno));
+			rc = -1;
+			break;
+		}
+	}
+
+	return rc;
+}
+
+static inline void
+eal_epoll_data_safe_free(struct rte_epoll_event *ev)
+{
+	while (!rte_atomic32_cmpset(&ev->status, RTE_EPOLL_VALID,
+				    RTE_EPOLL_INVALID))
+		while (ev->status != RTE_EPOLL_VALID)
+			rte_pause();
+	memset(&ev->epdata, 0, sizeof(ev->epdata));
+	ev->fd = -1;
+	ev->epfd = -1;
+}
+
+int
+rte_epoll_ctl(int epfd, int op, int fd,
+	      struct rte_epoll_event *event)
+{
+	struct epoll_event ev;
+
+	if (!event) {
+		RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
+		return -1;
+	}
+
+	/* using per thread epoll fd */
+	if (epfd == RTE_EPOLL_PER_THREAD)
+		epfd = rte_intr_tls_epfd();
+
+	if (op == EPOLL_CTL_ADD) {
+		event->status = RTE_EPOLL_VALID;
+		event->fd = fd;  /* ignore fd in event */
+		event->epfd = epfd;
+		ev.data.ptr = (void *)event;
+	}
+
+	ev.events = event->epdata.event;
+	if (epoll_ctl(epfd, op, fd, &ev) < 0) {
+		RTE_LOG(ERR, EAL, "Error op %d fd %d epoll_ctl, %s\n",
+			op, fd, strerror(errno));
+		if (op == EPOLL_CTL_ADD)
+			/* rollback status when CTL_ADD fail */
+			event->status = RTE_EPOLL_INVALID;
+		return -1;
+	}
+
+	if (op == EPOLL_CTL_DEL && event->status != RTE_EPOLL_INVALID)
+		eal_epoll_data_safe_free(event);
+
+	return 0;
+}
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index ac33eda..886608c 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -51,6 +51,32 @@ enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_MAX
 };
 
+#define RTE_INTR_EVENT_ADD            1UL
+#define RTE_INTR_EVENT_DEL            2UL
+
+typedef void (*rte_intr_event_cb_t)(int fd, void *arg);
+
+struct rte_epoll_data {
+	uint32_t event;               /**< event type */
+	void *data;                   /**< User data */
+	rte_intr_event_cb_t cb_fun;   /**< IN: callback fun */
+	void *cb_arg;	              /**< IN: callback arg */
+};
+
+enum {
+	RTE_EPOLL_INVALID = 0,
+	RTE_EPOLL_VALID,
+	RTE_EPOLL_EXEC,
+};
+
+/** interrupt epoll event obj, taken by epoll_event.ptr */
+struct rte_epoll_event {
+	volatile uint32_t status;  /**< OUT: event status */
+	int fd;                    /**< OUT: event fd */
+	int epfd;       /**< OUT: epoll instance the ev associated with */
+	struct rte_epoll_data epdata;
+};
+
 /** Handle for interrupts. */
 struct rte_intr_handle {
 	union {
@@ -64,8 +90,62 @@ struct rte_intr_handle {
 	uint32_t max_intr;             /**< max interrupt requested */
 	uint32_t nb_efd;               /**< number of available efd(event fd) */
 	int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
+	struct rte_epoll_event elist[RTE_MAX_RXTX_INTR_VEC_ID];
+				       /**< intr vector epoll event */
 	int *intr_vec;                 /**< intr vector number array */
 #endif
 };
 
+#define RTE_EPOLL_PER_THREAD        -1  /**< to hint using per thread epfd */
+
+/**
+ * It waits for events on the epoll instance.
+ *
+ * @param epfd
+ *   Epoll instance fd on which the caller wait for events.
+ * @param events
+ *   Memory area contains the events that will be available for the caller.
+ * @param maxevents
+ *   Up to maxevents are returned, must greater than zero.
+ * @param timeout
+ *   Specifying a timeout of -1 causes a block indefinitely.
+ *   Specifying a timeout equal to zero cause to return immediately.
+ * @return
+ *   - On success, returns the number of available event.
+ *   - On failure, a negative value.
+ */
+int
+rte_epoll_wait(int epfd, struct rte_epoll_event *events,
+	       int maxevents, int timeout);
+
+/**
+ * It performs control operations on epoll instance referred by the epfd.
+ * It requests that the operation op be performed for the target fd.
+ *
+ * @param epfd
+ *   Epoll instance fd on which the caller perform control operations.
+ * @param op
+ *   The operation be performed for the target fd.
+ * @param fd
+ *   The target fd on which the control ops perform.
+ * @param event
+ *   Describes the object linked to the fd.
+ *   Note: The caller must take care the object deletion after CTL_DEL.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_epoll_ctl(int epfd, int op, int fd,
+	      struct rte_epoll_event *event);
+
+/**
+ * The function returns the per thread epoll instance.
+ *
+ * @return
+ *   epfd the epoll instance referred to.
+ */
+int
+rte_intr_tls_epfd(void);
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index b2d4441..39cc2d2 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -116,6 +116,9 @@ DPDK_2.1 {
 	global:
 
 	rte_eal_pci_detach;
+	rte_epoll_ctl;
+	rte_epoll_wait;
+	rte_intr_tls_epfd;
 	rte_memzone_free;
 
 } DPDK_2.0;
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v15 01/13] eal/linux: add interrupt vectors support in intr_handle
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
@ 2015-07-20  3:02  3%       ` Cunming Liang
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
                         ` (9 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch adds interrupt vectors support in rte_intr_handle.
'vec_en' is set when interrupt vectors are detected and associated event fds are set.
Those event fds are stored in efds[].
'intr_vec' is reserved for device driver to initialize the vector mapping table.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v15 changes
 - remove unnecessary RTE_NEXT_ABI comment

v14 changes
 - per-patch basis ABI compatibility rework

v7 changes:
 - add eptrs[], it's used to store the register rte_epoll_event instances.
 - add vec_en, to log the vector capability status.

v6 changes:
 - add mapping table between irq vector number and queue id.

v5 changes:
 - Create this new patch file for changed struct rte_intr_handle that
   other patches depend on, to avoid breaking git bisect.

 lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index bdeb3fc..ac33eda 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -38,6 +38,8 @@
 #ifndef _RTE_LINUXAPP_INTERRUPTS_H_
 #define _RTE_LINUXAPP_INTERRUPTS_H_
 
+#define RTE_MAX_RXTX_INTR_VEC_ID     32
+
 enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_UNKNOWN = 0,
 	RTE_INTR_HANDLE_UIO,          /**< uio device handle */
@@ -58,6 +60,12 @@ struct rte_intr_handle {
 	};
 	int fd;	 /**< interrupt event file descriptor */
 	enum rte_intr_handle_type type;  /**< handle type */
+#ifdef RTE_NEXT_ABI
+	uint32_t max_intr;             /**< max interrupt requested */
+	uint32_t nb_efd;               /**< number of available efd(event fd) */
+	int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
+	int *intr_vec;                 /**< intr vector number array */
+#endif
 };
 
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (9 preceding siblings ...)
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
@ 2015-07-20  3:02  4%     ` Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
                         ` (10 more replies)
  10 siblings, 11 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

v15 changes
 - remove unnecessary RTE_NEXT_ABI comment
 - remove ifdef RTE_NEXT_ABI from header file

v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map
 - minor comments rework

v13 changes
 - version map cleanup for v2.1
 - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility

Patch series v12
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Danny Zhou <danny.zhou@intel.com>

v12 changes
 - bsd cleanup for unused variable warning
 - fix awkward line split in debug message

v11 changes
 - typo cleanup and check kernel style

v10 changes
 - code rework to return actual error code
 - bug fix for lsc when using uio_pci_generic

v9 changes
 - code rework to fix open comment
 - bug fix for igb lsc when both lsc and rxq are enabled in vfio-msix
 - new patch to turn off the feature by default so as to avoid v2.1 abi broken

v8 changes
 - remove condition check for only vfio-msix
 - add multiplex intr support when only one intr vector allowed
 - lsc and rxq interrupt runtime enable decision
 - add safe event delete while the event wakeup execution happens

v7 changes
 - decouple epoll event and intr operation
 - add condition check in the case intr vector is disabled
 - renaming some APIs

v6 changes
 - split rte_intr_wait_rx_pkt into two APIs 'wait' and 'set'.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set.
 - using vector number instead of queue_id as interrupt API params.
 - patch reorder and split.

v5 changes
 - Rebase the patchset onto the HEAD
 - Isolate ethdev from EAL for new-added wait-for-rx interrupt function
 - Export wait-for-rx interrupt function for shared libraries
 - Split-off a new patch file for changed struct rte_intr_handle that
   other patches depend on, to avoid breaking git bisect
 - Change sample applicaiton to accomodate EAL function spec change
   accordingly

v4 changes
 - Export interrupt enable/disable functions for shared libraries
 - Adjust position of new-added structure fields and functions to
   avoid breaking ABI

v3 changes
 - Add return value for interrupt enable/disable functions
 - Move spinlok from PMD to L3fwd-power
 - Remove unnecessary variables in e1000_mac_info
 - Fix miscelleous review comments

v2 changes
 - Fix compilation issue in Makefile for missed header file.
 - Consolidate internal and community review comments of v1 patch set.

The patch series introduce low-latency one-shot rx interrupt into DPDK with
polling and interrupt mode switch control example.

DPDK userspace interrupt notification and handling mechanism is based on UIO
with below limitation:
1) It is designed to handle LSC interrupt only with inefficient suspended
   pthread wakeup procedure (e.g. UIO wakes up LSC interrupt handling thread
   which then wakes up DPDK polling thread). In this way, it introduces
   non-deterministic wakeup latency for DPDK polling thread as well as packet
   latency if it is used to handle Rx interrupt.
2) UIO only supports a single interrupt vector which has to been shared by
   LSC interrupt and interrupts assigned to dedicated rx queues.

This patchset includes below features:
1) Enable one-shot rx queue interrupt in ixgbe PMD(PF & VF) and igb PMD(PF only)
.
2) Build on top of the VFIO mechanism instead of UIO, so it could support
   up to 64 interrupt vectors for rx queue interrupts.
3) Have 1 DPDK polling thread handle per Rx queue interrupt with a dedicated
   VFIO eventfd, which eliminates non-deterministic pthread wakeup latency in
   user space.
4) Demonstrate interrupts control APIs and userspace NAIP-like polling/interrupt
   switch algorithms in L3fwd-power example.

Known limitations:
1) It does not work for UIO due to a single interrupt eventfd shared by LSC
   and rx queue interrupt handlers causes a mess. [FIXED]
2) LSC interrupt is not supported by VF driver, so it is by default disabled
   in L3fwd-power now. Feel free to turn in on if you want to support both LSC
   and rx queue interrupts on a PF.

Cunming Liang (13):
  eal/linux: add interrupt vectors support in intr_handle
  eal/linux: add rte_epoll_wait/ctl support
  eal/linux: add API to set rx interrupt event monitor
  eal/linux: fix comments typo on vfio msi
  eal/linux: map eventfd to VFIO MSI-X intr vector
  eal/linux: standalone intr event fd create support
  eal/linux: fix lsc read error in uio_pci_generic
  eal/bsd: dummy for new intr definition
  eal/bsd: fix inappropriate linuxapp referred in bsd
  ethdev: add rx intr enable, disable and ctl functions
  ixgbe: enable rx queue interrupts for both PF and VF
  igb: enable rx queue interrupts for PF
  l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode
    switch

 drivers/net/e1000/igb_ethdev.c                     | 311 ++++++++++--
 drivers/net/ixgbe/ixgbe_ethdev.c                   | 527 ++++++++++++++++++++-
 drivers/net/ixgbe/ixgbe_ethdev.h                   |   4 +
 examples/l3fwd-power/main.c                        | 205 ++++++--
 lib/librte_eal/bsdapp/eal/eal_interrupts.c         |  42 ++
 .../bsdapp/eal/include/exec-env/rte_interrupts.h   |  74 ++-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map      |   5 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 414 ++++++++++++++--
 .../linuxapp/eal/include/exec-env/rte_interrupts.h | 153 ++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   8 +
 lib/librte_ether/rte_ethdev.c                      | 147 ++++++
 lib/librte_ether/rte_ethdev.h                      | 104 ++++
 lib/librte_ether/rte_ether_version.map             |   4 +
 13 files changed, 1870 insertions(+), 128 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle
  2015-07-19 23:31  0%       ` Thomas Monjalon
@ 2015-07-20  2:02  0%         ` Liang, Cunming
  0 siblings, 0 replies; 200+ results
From: Liang, Cunming @ 2015-07-20  2:02 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, shemming



On 7/20/2015 7:31 AM, Thomas Monjalon wrote:
> 2015-07-17 14:16, Cunming Liang:
>> +#ifdef RTE_NEXT_ABI
>> +	/**
>> +	 * RTE_NEXT_ABI will be removed from v2.2.
>> +	 * It's only used to avoid ABI(unannounced) broken in v2.1.
>> +	 * Make sure being aware of the impact before turning on the feature.
>> +	 */
> We are not going to put this comment each time NEXT_ABI is used with ifdef.
Ok, will remove the comment.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
@ 2015-07-19 23:31  0%       ` Thomas Monjalon
  2015-07-20  2:02  0%         ` Liang, Cunming
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-19 23:31 UTC (permalink / raw)
  To: Cunming Liang; +Cc: dev, shemming

2015-07-17 14:16, Cunming Liang:
> +#ifdef RTE_NEXT_ABI
> +	/**
> +	 * RTE_NEXT_ABI will be removed from v2.2.
> +	 * It's only used to avoid ABI(unannounced) broken in v2.1.
> +	 * Make sure being aware of the impact before turning on the feature.
> +	 */

We are not going to put this comment each time NEXT_ABI is used with ifdef.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 0/3] fix the issue sctp flow cannot be matched in FVL FDIR
  2015-07-07  9:04  0% ` Liu, Yong
@ 2015-07-19 22:54  3%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-19 22:54 UTC (permalink / raw)
  To: Wu, Jingjing; +Cc: dev

> > This patch set fixes the issue SCTP flow cannot be matched by FVL's flow
> > director. The issue's root cause is that due to the NIC's firmware update,
> > the input set of sctp flow are changed to source IP, destination IP,
> > source port, destination port and Verification-Tag, which are source IP,
> > destination IP and Verification-Tag previously.
> > And because this fix will affect the struct rte_eth_fdir_flow, use
> > RTE_NEXT_ABI to avoid ABI breaking.
> > 
> > Jingjing Wu (3):
> >   ethdev: change the input set of sctp flow
> >   i40e: make sport and dport of sctp flow involved in match
> >   testpmd: add sport and dport configuration for sctp flow
> 
> Tested-by: Marvin Liu <yong.liu@intel.com>

Applied, thanks

An ABI deprecation announce must be sent.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes
  2015-07-19 10:52  4% [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
  2015-07-19 10:52 36% ` [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation Thomas Monjalon
@ 2015-07-19 21:32  0% ` Thomas Monjalon
  2015-07-20 10:45  0% ` Neil Horman
  2 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-19 21:32 UTC (permalink / raw)
  To: dev

2015-07-19 12:52, Thomas Monjalon:
> The main change of these patches is to improve naming consistency
> across ethdev and EAL.
> It should be applied shortly to be part of rc1. If some comments arise,
> it can be fixed/improved in rc2.
> 
> Thomas Monjalon (4):
>   doc: rename ABI chapter to deprecation
>   pci: fix detach and uninit naming
>   ethdev: refactor port release
>   ethdev: fix doxygen internal comments

Applied, do not hesitate to comment if something must be fixed.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation
  2015-07-19 10:52  4% [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
@ 2015-07-19 10:52 36% ` Thomas Monjalon
  2015-07-21 13:20  7%   ` Dumitrescu, Cristian
  2015-07-19 21:32  0% ` [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
  2015-07-20 10:45  0% ` Neil Horman
  2 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-19 10:52 UTC (permalink / raw)
  To: dev

This chapter is for ABI and API. That's why a renaming is required.

Remove also the examples which are now in the referenced guidelines.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
 MAINTAINERS                                       |  2 +-
 doc/guides/rel_notes/{abi.rst => deprecation.rst} | 16 +++++-----------
 doc/guides/rel_notes/index.rst                    |  2 +-
 3 files changed, 7 insertions(+), 13 deletions(-)
 rename doc/guides/rel_notes/{abi.rst => deprecation.rst} (51%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 2a32659..6531900 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -60,7 +60,7 @@ F: doc/guides/prog_guide/ext_app_lib_make_help.rst
 ABI versioning
 M: Neil Horman <nhorman@tuxdriver.com>
 F: lib/librte_compat/
-F: doc/guides/rel_notes/abi.rst
+F: doc/guides/rel_notes/deprecation.rst
 F: scripts/validate-abi.sh
 
 
diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/deprecation.rst
similarity index 51%
rename from doc/guides/rel_notes/abi.rst
rename to doc/guides/rel_notes/deprecation.rst
index 7a08830..eef01f1 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -1,17 +1,11 @@
-ABI policy
-==========
+Deprecation
+===========
 
 See the :doc:`guidelines document for details of the ABI policy </guidelines/versioning>`.
-ABI deprecation notices are to be posted here.
+API and ABI deprecation notices are to be posted here.
 
-
-Examples of Deprecation Notices
--------------------------------
-
-* The Macro #RTE_FOO is deprecated and will be removed with version 2.0, to be replaced with the inline function rte_bar()
-* The function rte_mbuf_grok has been updated to include new parameter in version 2.0.  Backwards compatibility will be maintained for this function until the release of version 2.1
-* The members struct foo have been reorganized in release 2.0.  Existing binary applications will have backwards compatibility in release 2.0, while newly built binaries will need to reference new structure variant struct foo2.  Compatibility will be removed in release 2.2, and all applications will require updating and rebuilding to the new structure at that time, which will be renamed to the original struct foo.
-* Significant ABI changes are planned for the librte_dostuff library.  The upcoming release 2.0 will not contain these changes, but release 2.1 will, and no backwards compatibility is planned due to the invasive nature of these changes.  Binaries using this library built prior to version 2.1 will require updating and recompilation.
+Help to update from a previous release is provided in
+:doc:`another section </rel_notes/updating_apps>`.
 
 
 Deprecation Notices
diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst
index d790783..9d66cd8 100644
--- a/doc/guides/rel_notes/index.rst
+++ b/doc/guides/rel_notes/index.rst
@@ -48,5 +48,5 @@ Contents
     updating_apps
     known_issues
     resolved_issues
-    abi
+    deprecation
     faq
-- 
2.4.2

^ permalink raw reply	[relevance 36%]

* [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes
@ 2015-07-19 10:52  4% Thomas Monjalon
  2015-07-19 10:52 36% ` [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation Thomas Monjalon
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Thomas Monjalon @ 2015-07-19 10:52 UTC (permalink / raw)
  To: dev

The main change of these patches is to improve naming consistency
across ethdev and EAL.
It should be applied shortly to be part of rc1. If some comments arise,
it can be fixed/improved in rc2.

Thomas Monjalon (4):
  doc: rename ABI chapter to deprecation
  pci: fix detach and uninit naming
  ethdev: refactor port release
  ethdev: fix doxygen internal comments

 MAINTAINERS                                       |  2 +-
 doc/guides/rel_notes/{abi.rst => deprecation.rst} | 19 ++++++++-----------
 doc/guides/rel_notes/index.rst                    |  2 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map     |  2 ++
 lib/librte_eal/common/eal_common_pci.c            | 20 ++++++++++++--------
 lib/librte_eal/common/include/rte_pci.h           |  6 ++++--
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  2 ++
 lib/librte_ether/rte_ethdev.c                     | 11 +++++------
 lib/librte_ether/rte_ethdev.h                     |  9 ++++-----
 9 files changed, 39 insertions(+), 34 deletions(-)
 rename doc/guides/rel_notes/{abi.rst => deprecation.rst} (51%)

-- 
2.4.2

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-17 11:45  7%       ` Mcnamara, John
@ 2015-07-17 12:25  4%         ` Neil Horman
  0 siblings, 0 replies; 200+ results
From: Neil Horman @ 2015-07-17 12:25 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

On Fri, Jul 17, 2015 at 11:45:10AM +0000, Mcnamara, John wrote:
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Monday, July 13, 2015 3:00 PM
> > To: Mcnamara, John
> > Cc: dev@dpdk.org; vladz@cloudius-systems.com
> > Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> > 
> > > > > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> > > > >
> > > > >
> > >
> > Thank you, I'll ack as soon as Chao confirms its not a problem on ppc Neil
> 
> Hi,
> 
> Just pinging Chao Zhu on this again so that it isn't forgotten.
> 
> Neil, just to be clear, are you looking for a validate-abi.sh check on PPC?
> 
Yes, correct.
> Just for context, the lro flag doesn't seem to be used anywhere that would be affected by endianness:
> 
>     $ ag -w "\->lro"             
>     drivers/net/ixgbe/ixgbe_rxtx.c
>     3767:   if (dev->data->lro) {
>     3967:   dev->data->lro = 1;
> 
>     drivers/net/ixgbe/ixgbe_ethdev.c
>     1689:   dev->data->lro = 0;
> 
But this data is visible to the outside application, correct?  If so then we
can't rely on internal-only usage as a guide.  If it is only internally visible,
then yes, you are correct, endianess is not an issue then
neil

> John.
> -- 
> 
> 

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
  2015-07-16 17:07 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
  2015-07-17  8:10  4% ` Mrzyglod, DanielX T
@ 2015-07-17 12:03  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-17 12:03 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 6:08 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
  2015-07-16 16:59 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_table Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
  2015-07-17  8:09  4% ` Mrzyglod, DanielX T
@ 2015-07-17 12:02  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-17 12:02 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 6:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 13:59  4%     ` Neil Horman
@ 2015-07-17 11:45  7%       ` Mcnamara, John
  2015-07-17 12:25  4%         ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-17 11:45 UTC (permalink / raw)
  To: Neil Horman, Chao Zhu; +Cc: dev

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Monday, July 13, 2015 3:00 PM
> To: Mcnamara, John
> Cc: dev@dpdk.org; vladz@cloudius-systems.com
> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> 
> > > > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> > > >
> > > >
> >
> Thank you, I'll ack as soon as Chao confirms its not a problem on ppc Neil

Hi,

Just pinging Chao Zhu on this again so that it isn't forgotten.

Neil, just to be clear, are you looking for a validate-abi.sh check on PPC?

Just for context, the lro flag doesn't seem to be used anywhere that would be affected by endianness:

    $ ag -w "\->lro"             
    drivers/net/ixgbe/ixgbe_rxtx.c
    3767:   if (dev->data->lro) {
    3967:   dev->data->lro = 1;

    drivers/net/ixgbe/ixgbe_ethdev.c
    1689:   dev->data->lro = 0;

John.
-- 

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
  2015-07-16 17:07 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
@ 2015-07-17  8:10  4% ` Mrzyglod, DanielX T
  2015-07-17 12:03  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-17  8:10 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:08 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
  2015-07-16 16:59 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_table Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
@ 2015-07-17  8:09  4% ` Mrzyglod, DanielX T
  2015-07-17 12:02  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-17  8:09 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
  2015-07-16 15:27 14% [dpdk-dev] [PATCH v2] " Cristian Dumitrescu
  2015-07-16 15:51  4% ` Singh, Jasvinder
  2015-07-17  7:56  4% ` Gajdzica, MaciejX T
@ 2015-07-17  8:08  4% ` Mrzyglod, DanielX T
  2 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-17  8:08 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 5:27 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
> 
> v2 changes:
> -text simplification
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
  2015-07-16 15:27 14% [dpdk-dev] [PATCH v2] " Cristian Dumitrescu
  2015-07-16 15:51  4% ` Singh, Jasvinder
@ 2015-07-17  7:56  4% ` Gajdzica, MaciejX T
  2015-07-17  8:08  4% ` Mrzyglod, DanielX T
  2 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-17  7:56 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 5:27 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
> 
> v2 changes:
> -text simplification
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
  2015-07-16 16:59 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_table Cristian Dumitrescu
@ 2015-07-17  7:54  4% ` Gajdzica, MaciejX T
  2015-07-17  8:09  4% ` Mrzyglod, DanielX T
  2015-07-17 12:02  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-17  7:54 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
  2015-07-16 17:07 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline Cristian Dumitrescu
@ 2015-07-17  7:54  4% ` Gajdzica, MaciejX T
  2015-07-17  8:10  4% ` Mrzyglod, DanielX T
  2015-07-17 12:03  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-17  7:54 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:08 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v14 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (8 preceding siblings ...)
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 12/13] igb: enable rx queue interrupts for PF Cunming Liang
@ 2015-07-17  6:16  2%     ` Cunming Liang
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch demonstrates how to handle per rx queue interrupt in a NAPI-like
implementation in userspace. The working thread mainly runs in polling mode
and switch to interrupt mode only if there is no packet received in recent polls.
The working thread returns to polling mode immediately once it receives an
interrupt notification caused by the incoming packets.
The sample keeps running in polling mode if the binding PMD hasn't supported
the rx interrupt yet. Now only ixgbe(pf/vf) and igb support it.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - reword commit comments

v7 changes
 - using new APIs
 - demo multiple port/queue pair wait on the same epoll instance

v6 changes
 - Split event fd add and wait

v5 changes
 - Change invoked function name and parameter to accomodate EAL change

v3 changes
 - Add spinlock to ensure thread safe when accessing interrupt mask
   register

v2 changes
 - Remove unused function which is for debug purpose

 examples/l3fwd-power/main.c | 202 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 162 insertions(+), 40 deletions(-)

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index b3c5f43..bec78e1 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -74,12 +74,14 @@
 #include <rte_string_fns.h>
 #include <rte_timer.h>
 #include <rte_power.h>
+#include <rte_eal.h>
+#include <rte_spinlock.h>
 
 #define RTE_LOGTYPE_L3FWD_POWER RTE_LOGTYPE_USER1
 
 #define MAX_PKT_BURST 32
 
-#define MIN_ZERO_POLL_COUNT 5
+#define MIN_ZERO_POLL_COUNT 10
 
 /* around 100ms at 2 Ghz */
 #define TIMER_RESOLUTION_CYCLES           200000000ULL
@@ -153,6 +155,9 @@ static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
 /* ethernet addresses of ports */
 static struct ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];
 
+/* ethernet addresses of ports */
+static rte_spinlock_t locks[RTE_MAX_ETHPORTS];
+
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;
 /* Ports set in promiscuous mode off by default. */
@@ -185,6 +190,9 @@ struct lcore_rx_queue {
 #define MAX_TX_QUEUE_PER_PORT RTE_MAX_ETHPORTS
 #define MAX_RX_QUEUE_PER_PORT 128
 
+#define MAX_RX_QUEUE_INTERRUPT_PER_PORT 16
+
+
 #define MAX_LCORE_PARAMS 1024
 struct lcore_params {
 	uint8_t port_id;
@@ -211,7 +219,7 @@ static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
 
 static struct rte_eth_conf port_conf = {
 	.rxmode = {
-		.mq_mode	= ETH_MQ_RX_RSS,
+		.mq_mode = ETH_MQ_RX_RSS,
 		.max_rx_pkt_len = ETHER_MAX_LEN,
 		.split_hdr_size = 0,
 		.header_split   = 0, /**< Header Split disabled */
@@ -223,11 +231,14 @@ static struct rte_eth_conf port_conf = {
 	.rx_adv_conf = {
 		.rss_conf = {
 			.rss_key = NULL,
-			.rss_hf = ETH_RSS_IP,
+			.rss_hf = ETH_RSS_UDP,
 		},
 	},
 	.txmode = {
-		.mq_mode = ETH_DCB_NONE,
+		.mq_mode = ETH_MQ_TX_NONE,
+	},
+	.intr_conf = {
+		.lsc = 1,
 	},
 };
 
@@ -399,19 +410,22 @@ power_timer_cb(__attribute__((unused)) struct rte_timer *tim,
 	/* accumulate total execution time in us when callback is invoked */
 	sleep_time_ratio = (float)(stats[lcore_id].sleep_time) /
 					(float)SCALING_PERIOD;
-
 	/**
 	 * check whether need to scale down frequency a step if it sleep a lot.
 	 */
-	if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD)
-		rte_power_freq_down(lcore_id);
+	if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
+		if (rte_power_freq_down)
+			rte_power_freq_down(lcore_id);
+	}
 	else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
-		stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST)
+		stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
 		/**
 		 * scale down a step if average packet per iteration less
 		 * than expectation.
 		 */
-		rte_power_freq_down(lcore_id);
+		if (rte_power_freq_down)
+			rte_power_freq_down(lcore_id);
+	}
 
 	/**
 	 * initialize another timer according to current frequency to ensure
@@ -712,22 +726,20 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,
 
 }
 
-#define SLEEP_GEAR1_THRESHOLD            100
-#define SLEEP_GEAR2_THRESHOLD            1000
+#define MINIMUM_SLEEP_TIME         1
+#define SUSPEND_THRESHOLD          300
 
 static inline uint32_t
 power_idle_heuristic(uint32_t zero_rx_packet_count)
 {
-	/* If zero count is less than 100, use it as the sleep time in us */
-	if (zero_rx_packet_count < SLEEP_GEAR1_THRESHOLD)
-		return zero_rx_packet_count;
-	/* If zero count is less than 1000, sleep time should be 100 us */
-	else if ((zero_rx_packet_count >= SLEEP_GEAR1_THRESHOLD) &&
-			(zero_rx_packet_count < SLEEP_GEAR2_THRESHOLD))
-		return SLEEP_GEAR1_THRESHOLD;
-	/* If zero count is greater than 1000, sleep time should be 1000 us */
-	else if (zero_rx_packet_count >= SLEEP_GEAR2_THRESHOLD)
-		return SLEEP_GEAR2_THRESHOLD;
+	/* If zero count is less than 100,  sleep 1us */
+	if (zero_rx_packet_count < SUSPEND_THRESHOLD)
+		return MINIMUM_SLEEP_TIME;
+	/* If zero count is less than 1000, sleep 100 us which is the
+		minimum latency switching from C3/C6 to C0
+	*/
+	else
+		return SUSPEND_THRESHOLD;
 
 	return 0;
 }
@@ -767,6 +779,84 @@ power_freq_scaleup_heuristic(unsigned lcore_id,
 	return FREQ_CURRENT;
 }
 
+/**
+ * force polling thread sleep until one-shot rx interrupt triggers
+ * @param port_id
+ *  Port id.
+ * @param queue_id
+ *  Rx queue id.
+ * @return
+ *  0 on success
+ */
+static int
+sleep_until_rx_interrupt(int num)
+{
+	struct rte_epoll_event event[num];
+	int n, i;
+	uint8_t port_id, queue_id;
+	void *data;
+
+	RTE_LOG(INFO, L3FWD_POWER,
+		"lcore %u sleeps until interrupt triggers\n",
+		rte_lcore_id());
+
+	n = rte_epoll_wait(RTE_EPOLL_PER_THREAD, event, num, -1);
+	for (i = 0; i < n; i++) {
+		data = event[i].epdata.data;
+		port_id = ((uintptr_t)data) >> CHAR_BIT;
+		queue_id = ((uintptr_t)data) &
+			RTE_LEN2MASK(CHAR_BIT, uint8_t);
+		RTE_LOG(INFO, L3FWD_POWER,
+			"lcore %u is waked up from rx interrupt on"
+			" port %d queue %d\n",
+			rte_lcore_id(), port_id, queue_id);
+	}
+
+	return 0;
+}
+
+static int turn_on_intr(struct lcore_conf *qconf)
+{
+	int i;
+	struct lcore_rx_queue *rx_queue;
+	uint8_t port_id, queue_id;
+
+	for (i = 0; i < qconf->n_rx_queue; ++i) {
+		rx_queue = &(qconf->rx_queue_list[i]);
+		port_id = rx_queue->port_id;
+		queue_id = rx_queue->queue_id;
+
+		rte_spinlock_lock(&(locks[port_id]));
+		rte_eth_dev_rx_intr_enable(port_id, queue_id);
+		rte_spinlock_unlock(&(locks[port_id]));
+	}
+}
+
+static int event_register(struct lcore_conf *qconf)
+{
+	struct lcore_rx_queue *rx_queue;
+	uint8_t portid, queueid;
+	uint32_t data;
+	int ret;
+	int i;
+
+	for (i = 0; i < qconf->n_rx_queue; ++i) {
+		rx_queue = &(qconf->rx_queue_list[i]);
+		portid = rx_queue->port_id;
+		queueid = rx_queue->queue_id;
+		data = portid << CHAR_BIT | queueid;
+
+		ret = rte_eth_dev_rx_intr_ctl_q(portid, queueid,
+						RTE_EPOLL_PER_THREAD,
+						RTE_INTR_EVENT_ADD,
+						(void *)((uintptr_t)data));
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 /* main processing loop */
 static int
 main_loop(__attribute__((unused)) void *dummy)
@@ -780,9 +870,9 @@ main_loop(__attribute__((unused)) void *dummy)
 	struct lcore_conf *qconf;
 	struct lcore_rx_queue *rx_queue;
 	enum freq_scale_hint_t lcore_scaleup_hint;
-
 	uint32_t lcore_rx_idle_count = 0;
 	uint32_t lcore_idle_hint = 0;
+	int intr_en = 0;
 
 	const uint64_t drain_tsc = (rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
 
@@ -799,13 +889,18 @@ main_loop(__attribute__((unused)) void *dummy)
 	RTE_LOG(INFO, L3FWD_POWER, "entering main loop on lcore %u\n", lcore_id);
 
 	for (i = 0; i < qconf->n_rx_queue; i++) {
-
 		portid = qconf->rx_queue_list[i].port_id;
 		queueid = qconf->rx_queue_list[i].queue_id;
 		RTE_LOG(INFO, L3FWD_POWER, " -- lcoreid=%u portid=%hhu "
 			"rxqueueid=%hhu\n", lcore_id, portid, queueid);
 	}
 
+	/* add into event wait list */
+	if (event_register(qconf) == 0)
+		intr_en = 1;
+	else
+		RTE_LOG(INFO, L3FWD_POWER, "RX interrupt won't enable.\n");
+
 	while (1) {
 		stats[lcore_id].nb_iteration_looped++;
 
@@ -840,6 +935,7 @@ main_loop(__attribute__((unused)) void *dummy)
 			prev_tsc_power = cur_tsc_power;
 		}
 
+start_rx:
 		/*
 		 * Read packet from RX queues
 		 */
@@ -853,6 +949,7 @@ main_loop(__attribute__((unused)) void *dummy)
 
 			nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
 								MAX_PKT_BURST);
+
 			stats[lcore_id].nb_rx_processed += nb_rx;
 			if (unlikely(nb_rx == 0)) {
 				/**
@@ -915,10 +1012,13 @@ main_loop(__attribute__((unused)) void *dummy)
 						rx_queue->freq_up_hint;
 			}
 
-			if (lcore_scaleup_hint == FREQ_HIGHEST)
-				rte_power_freq_max(lcore_id);
-			else if (lcore_scaleup_hint == FREQ_HIGHER)
-				rte_power_freq_up(lcore_id);
+			if (lcore_scaleup_hint == FREQ_HIGHEST) {
+				if (rte_power_freq_max)
+					rte_power_freq_max(lcore_id);
+			} else if (lcore_scaleup_hint == FREQ_HIGHER) {
+				if (rte_power_freq_up)
+					rte_power_freq_up(lcore_id);
+			}
 		} else {
 			/**
 			 * All Rx queues empty in recent consecutive polls,
@@ -933,16 +1033,23 @@ main_loop(__attribute__((unused)) void *dummy)
 					lcore_idle_hint = rx_queue->idle_hint;
 			}
 
-			if ( lcore_idle_hint < SLEEP_GEAR1_THRESHOLD)
+			if (lcore_idle_hint < SUSPEND_THRESHOLD)
 				/**
 				 * execute "pause" instruction to avoid context
-				 * switch for short sleep.
+				 * switch which generally take hundred of
+				 * microseconds for short sleep.
 				 */
 				rte_delay_us(lcore_idle_hint);
-			else
-				/* long sleep force runing thread to suspend */
-				usleep(lcore_idle_hint);
-
+			else {
+				/* suspend until rx interrupt trigges */
+				if (intr_en) {
+					turn_on_intr(qconf);
+					sleep_until_rx_interrupt(
+						qconf->n_rx_queue);
+				}
+				/* start receiving packets immediately */
+				goto start_rx;
+			}
 			stats[lcore_id].sleep_time += lcore_idle_hint;
 		}
 	}
@@ -1273,7 +1380,7 @@ setup_hash(int socketid)
 	char s[64];
 
 	/* create ipv4 hash */
-	snprintf(s, sizeof(s), "ipv4_l3fwd_hash_%d", socketid);
+	rte_snprintf(s, sizeof(s), "ipv4_l3fwd_hash_%d", socketid);
 	ipv4_l3fwd_hash_params.name = s;
 	ipv4_l3fwd_hash_params.socket_id = socketid;
 	ipv4_l3fwd_lookup_struct[socketid] =
@@ -1283,7 +1390,7 @@ setup_hash(int socketid)
 				"socket %d\n", socketid);
 
 	/* create ipv6 hash */
-	snprintf(s, sizeof(s), "ipv6_l3fwd_hash_%d", socketid);
+	rte_snprintf(s, sizeof(s), "ipv6_l3fwd_hash_%d", socketid);
 	ipv6_l3fwd_hash_params.name = s;
 	ipv6_l3fwd_hash_params.socket_id = socketid;
 	ipv6_l3fwd_lookup_struct[socketid] =
@@ -1477,6 +1584,7 @@ main(int argc, char **argv)
 	unsigned lcore_id;
 	uint64_t hz;
 	uint32_t n_tx_queue, nb_lcores;
+	uint32_t dev_rxq_num, dev_txq_num;
 	uint8_t portid, nb_rx_queue, queue, socketid;
 
 	/* catch SIGINT and restore cpufreq governor to ondemand */
@@ -1526,10 +1634,19 @@ main(int argc, char **argv)
 		printf("Initializing port %d ... ", portid );
 		fflush(stdout);
 
+		rte_eth_dev_info_get(portid, &dev_info);
+		dev_rxq_num = dev_info.max_rx_queues;
+		dev_txq_num = dev_info.max_tx_queues;
+
 		nb_rx_queue = get_port_n_rx_queues(portid);
+		if (nb_rx_queue > dev_rxq_num)
+			rte_exit(EXIT_FAILURE,
+				"Cannot configure not existed rxq: "
+				"port=%d\n", portid);
+
 		n_tx_queue = nb_lcores;
-		if (n_tx_queue > MAX_TX_QUEUE_PER_PORT)
-			n_tx_queue = MAX_TX_QUEUE_PER_PORT;
+		if (n_tx_queue > dev_txq_num)
+			n_tx_queue = dev_txq_num;
 		printf("Creating queues: nb_rxq=%d nb_txq=%u... ",
 			nb_rx_queue, (unsigned)n_tx_queue );
 		ret = rte_eth_dev_configure(portid, nb_rx_queue,
@@ -1553,6 +1670,9 @@ main(int argc, char **argv)
 			if (rte_lcore_is_enabled(lcore_id) == 0)
 				continue;
 
+			if (queueid >= dev_txq_num)
+				continue;
+
 			if (numa_on)
 				socketid = \
 				(uint8_t)rte_lcore_to_socket_id(lcore_id);
@@ -1587,8 +1707,9 @@ main(int argc, char **argv)
 		/* init power management library */
 		ret = rte_power_init(lcore_id);
 		if (ret)
-			rte_exit(EXIT_FAILURE, "Power management library "
-				"initialization failed on core%u\n", lcore_id);
+			rte_log(RTE_LOG_ERR, RTE_LOGTYPE_POWER,
+				"Power management library initialization "
+				"failed on core%u", lcore_id);
 
 		/* init timer structures for each enabled lcore */
 		rte_timer_init(&power_timers[lcore_id]);
@@ -1636,7 +1757,6 @@ main(int argc, char **argv)
 		if (ret < 0)
 			rte_exit(EXIT_FAILURE, "rte_eth_dev_start: err=%d, "
 						"port=%d\n", ret, portid);
-
 		/*
 		 * If enabled, put device in promiscuous mode.
 		 * This allows IO forwarding mode to forward packets
@@ -1645,6 +1765,8 @@ main(int argc, char **argv)
 		 */
 		if (promiscuous_on)
 			rte_eth_promiscuous_enable(portid);
+		/* initialize spinlock for each port */
+		rte_spinlock_init(&(locks[portid]));
 	}
 
 	check_all_ports_link_status((uint8_t)nb_ports, enabled_port_mask);
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v14 12/13] igb: enable rx queue interrupts for PF
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (7 preceding siblings ...)
  2015-07-17  6:16  1%     ` [dpdk-dev] [PATCH v14 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
@ 2015-07-17  6:16  2%     ` Cunming Liang
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch does below for igb PF:
- Setup NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v9 changes
 - move queue-vec mapping init from dev_configure to dev_start
 - fix link interrupt not working issue in vfio-msix

v8 changes
 - add vfio-msi/vfio-legacy and uio-legacy support

v7 changes
 - add condition check when intr vector is not enabled

v6 changes
 - fill queue-vector mapping table

v5 changes
 - Rebase the patchset onto the HEAD

v3 changes
 - Remove unnecessary variables in e1000_mac_info
 - Remove spinlok from PMD

v2 changes
 - Consolidate review comments related to coding style

 drivers/net/e1000/igb_ethdev.c | 311 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 277 insertions(+), 34 deletions(-)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index eb97218..fd92c80 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -104,6 +104,9 @@ static int  eth_igb_flow_ctrl_get(struct rte_eth_dev *dev,
 static int  eth_igb_flow_ctrl_set(struct rte_eth_dev *dev,
 				struct rte_eth_fc_conf *fc_conf);
 static int eth_igb_lsc_interrupt_setup(struct rte_eth_dev *dev);
+#ifdef RTE_NEXT_ABI
+static int eth_igb_rxq_interrupt_setup(struct rte_eth_dev *dev);
+#endif
 static int eth_igb_interrupt_get_status(struct rte_eth_dev *dev);
 static int eth_igb_interrupt_action(struct rte_eth_dev *dev);
 static void eth_igb_interrupt_handler(struct rte_intr_handle *handle,
@@ -201,7 +204,6 @@ static int eth_igb_filter_ctrl(struct rte_eth_dev *dev,
 		     enum rte_filter_type filter_type,
 		     enum rte_filter_op filter_op,
 		     void *arg);
-
 static int eth_igb_set_mc_addr_list(struct rte_eth_dev *dev,
 				    struct ether_addr *mc_addr_set,
 				    uint32_t nb_mc_addr);
@@ -212,6 +214,17 @@ static int igb_timesync_read_rx_timestamp(struct rte_eth_dev *dev,
 					  uint32_t flags);
 static int igb_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
 					  struct timespec *timestamp);
+#ifdef RTE_NEXT_ABI
+static int eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					uint16_t queue_id);
+static int eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					 uint16_t queue_id);
+static void eth_igb_assign_msix_vector(struct e1000_hw *hw, int8_t direction,
+				       uint8_t queue, uint8_t msix_vector);
+static void eth_igb_write_ivar(struct e1000_hw *hw, uint8_t msix_vector,
+			       uint8_t index, uint8_t offset);
+#endif
+static void eth_igb_configure_msix_intr(struct rte_eth_dev *dev);
 
 /*
  * Define VF Stats MACRO for Non "cleared on read" register
@@ -272,6 +285,10 @@ static const struct eth_dev_ops eth_igb_ops = {
 	.vlan_tpid_set        = eth_igb_vlan_tpid_set,
 	.vlan_offload_set     = eth_igb_vlan_offload_set,
 	.rx_queue_setup       = eth_igb_rx_queue_setup,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = eth_igb_rx_queue_intr_enable,
+	.rx_queue_intr_disable = eth_igb_rx_queue_intr_disable,
+#endif
 	.rx_queue_release     = eth_igb_rx_queue_release,
 	.rx_queue_count       = eth_igb_rx_queue_count,
 	.rx_descriptor_done   = eth_igb_rx_descriptor_done,
@@ -609,12 +626,6 @@ eth_igb_dev_init(struct rte_eth_dev *eth_dev)
 		     eth_dev->data->port_id, pci_dev->id.vendor_id,
 		     pci_dev->id.device_id);
 
-	rte_intr_callback_register(&(pci_dev->intr_handle),
-		eth_igb_interrupt_handler, (void *)eth_dev);
-
-	/* enable uio intr after callback register */
-	rte_intr_enable(&(pci_dev->intr_handle));
-
 	/* enable support intr */
 	igb_intr_enable(eth_dev);
 
@@ -777,7 +788,11 @@ eth_igb_start(struct rte_eth_dev *dev)
 {
 	struct e1000_hw *hw =
 		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-	int ret, i, mask;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
+	int ret, mask;
 	uint32_t ctrl_ext;
 
 	PMD_INIT_FUNC_TRACE();
@@ -817,6 +832,29 @@ eth_igb_start(struct rte_eth_dev *dev)
 	/* configure PF module if SRIOV enabled */
 	igb_pf_host_configure(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle)) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int), 0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+
+	/* confiugre msix for rx interrupt */
+	eth_igb_configure_msix_intr(dev);
+
 	/* Configure for OS presence */
 	igb_init_manageability(hw);
 
@@ -844,33 +882,9 @@ eth_igb_start(struct rte_eth_dev *dev)
 		igb_vmdq_vlan_hw_filter_enable(dev);
 	}
 
-	/*
-	 * Configure the Interrupt Moderation register (EITR) with the maximum
-	 * possible value (0xFFFF) to minimize "System Partial Write" issued by
-	 * spurious [DMA] memory updates of RX and TX ring descriptors.
-	 *
-	 * With a EITR granularity of 2 microseconds in the 82576, only 7/8
-	 * spurious memory updates per second should be expected.
-	 * ((65535 * 2) / 1000.1000 ~= 0.131 second).
-	 *
-	 * Because interrupts are not used at all, the MSI-X is not activated
-	 * and interrupt moderation is controlled by EITR[0].
-	 *
-	 * Note that having [almost] disabled memory updates of RX and TX ring
-	 * descriptors through the Interrupt Moderation mechanism, memory
-	 * updates of ring descriptors are now moderated by the configurable
-	 * value of Write-Back Threshold registers.
-	 */
 	if ((hw->mac.type == e1000_82576) || (hw->mac.type == e1000_82580) ||
 		(hw->mac.type == e1000_i350) || (hw->mac.type == e1000_i210) ||
 		(hw->mac.type == e1000_i211)) {
-		uint32_t ivar;
-
-		/* Enable all RX & TX queues in the IVAR registers */
-		ivar = (uint32_t) ((E1000_IVAR_VALID << 16) | E1000_IVAR_VALID);
-		for (i = 0; i < 8; i++)
-			E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, i, ivar);
-
 		/* Configure EITR with the maximum possible value (0xFFFF) */
 		E1000_WRITE_REG(hw, E1000_EITR(0), 0xFFFF);
 	}
@@ -921,8 +935,25 @@ eth_igb_start(struct rte_eth_dev *dev)
 	e1000_setup_link(hw);
 
 	/* check if lsc interrupt feature is enabled */
-	if (dev->data->dev_conf.intr_conf.lsc != 0)
-		ret = eth_igb_lsc_interrupt_setup(dev);
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle)) {
+			rte_intr_callback_register(intr_handle,
+						   eth_igb_interrupt_handler,
+						   (void *)dev);
+			eth_igb_lsc_interrupt_setup(dev);
+		} else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+#ifdef RTE_NEXT_ABI
+	/* check if rxq interrupt is enabled */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		eth_igb_rxq_interrupt_setup(dev);
+#endif
+
+	/* enable uio/vfio intr/eventfd mapping */
+	rte_intr_enable(intr_handle);
 
 	/* resume enabled intr since hw reset */
 	igb_intr_enable(dev);
@@ -955,8 +986,13 @@ eth_igb_stop(struct rte_eth_dev *dev)
 	struct e1000_flex_filter *p_flex;
 	struct e1000_5tuple_filter *p_5tuple, *p_5tuple_next;
 	struct e1000_2tuple_filter *p_2tuple, *p_2tuple_next;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 
 	igb_intr_disable(hw);
+
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
 	igb_pf_reset_hw(hw);
 	E1000_WRITE_REG(hw, E1000_WUC, 0);
 
@@ -1005,6 +1041,15 @@ eth_igb_stop(struct rte_eth_dev *dev)
 		rte_free(p_2tuple);
 	}
 	filter_info->twotuple_mask = 0;
+
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 static void
@@ -1012,6 +1057,9 @@ eth_igb_close(struct rte_eth_dev *dev)
 {
 	struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct rte_eth_link link;
+#ifdef RTE_NEXT_ABI
+	struct rte_pci_device *pci_dev;
+#endif
 
 	eth_igb_stop(dev);
 	e1000_phy_hw_reset(hw);
@@ -1029,6 +1077,14 @@ eth_igb_close(struct rte_eth_dev *dev)
 
 	igb_dev_clear_queues(dev);
 
+#ifdef RTE_NEXT_ABI
+	pci_dev = dev->pci_dev;
+	if (pci_dev->intr_handle.intr_vec) {
+		rte_free(pci_dev->intr_handle.intr_vec);
+		pci_dev->intr_handle.intr_vec = NULL;
+	}
+#endif
+
 	memset(&link, 0, sizeof(link));
 	rte_igb_dev_atomic_write_link_status(dev, &link);
 }
@@ -1853,6 +1909,35 @@ eth_igb_lsc_interrupt_setup(struct rte_eth_dev *dev)
 	return 0;
 }
 
+#ifdef RTE_NEXT_ABI
+/* It clears the interrupt causes and enables the interrupt.
+ * It will be called once only during nic initialized.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+static int eth_igb_rxq_interrupt_setup(struct rte_eth_dev *dev)
+{
+	uint32_t mask, regval;
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct rte_eth_dev_info dev_info;
+
+	memset(&dev_info, 0, sizeof(dev_info));
+	eth_igb_infos_get(dev, &dev_info);
+
+	mask = 0xFFFFFFFF >> (32 - dev_info.max_rx_queues);
+	regval = E1000_READ_REG(hw, E1000_EIMS);
+	E1000_WRITE_REG(hw, E1000_EIMS, regval | mask);
+
+	return 0;
+}
+#endif
+
 /*
  * It reads ICR and gets interrupt causes, check it and set a bit flag
  * to update link status.
@@ -3788,5 +3873,163 @@ static struct rte_driver pmd_igbvf_drv = {
 	.init = rte_igbvf_pmd_init,
 };
 
+#ifdef RTE_NEXT_ABI
+static int
+eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t mask = 1 << queue_id;
+
+	E1000_WRITE_REG(hw, E1000_EIMC, mask);
+	E1000_WRITE_FLUSH(hw);
+
+	return 0;
+}
+
+static int
+eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t mask = 1 << queue_id;
+	uint32_t regval;
+
+	regval = E1000_READ_REG(hw, E1000_EIMS);
+	E1000_WRITE_REG(hw, E1000_EIMS, regval | mask);
+	E1000_WRITE_FLUSH(hw);
+
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static void
+eth_igb_write_ivar(struct e1000_hw *hw, uint8_t  msix_vector,
+		   uint8_t index, uint8_t offset)
+{
+	uint32_t val = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index);
+
+	/* clear bits */
+	val &= ~((uint32_t)0xFF << offset);
+
+	/* write vector and valid bit */
+	val |= (msix_vector | E1000_IVAR_VALID) << offset;
+
+	E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, val);
+}
+
+static void
+eth_igb_assign_msix_vector(struct e1000_hw *hw, int8_t direction,
+			   uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp = 0;
+
+	if (hw->mac.type == e1000_82575) {
+		if (direction == 0)
+			tmp = E1000_EICR_RX_QUEUE0 << queue;
+		else if (direction == 1)
+			tmp = E1000_EICR_TX_QUEUE0 << queue;
+		E1000_WRITE_REG(hw, E1000_MSIXBM(msix_vector), tmp);
+	} else if (hw->mac.type == e1000_82576) {
+		if ((direction == 0) || (direction == 1))
+			eth_igb_write_ivar(hw, msix_vector, queue & 0x7,
+					   ((queue & 0x8) << 1) +
+					   8 * direction);
+	} else if ((hw->mac.type == e1000_82580) ||
+			(hw->mac.type == e1000_i350) ||
+			(hw->mac.type == e1000_i354) ||
+			(hw->mac.type == e1000_i210) ||
+			(hw->mac.type == e1000_i211)) {
+		if ((direction == 0) || (direction == 1))
+			eth_igb_write_ivar(hw, msix_vector,
+					   queue >> 1,
+					   ((queue & 0x1) << 4) +
+					   8 * direction);
+	}
+}
+#endif
+
+/* Sets up the hardware to generate MSI-X interrupts properly
+ * @hw
+ *  board private structure
+ */
+static void
+eth_igb_configure_msix_intr(struct rte_eth_dev *dev)
+{
+#ifdef RTE_NEXT_ABI
+	int queue_id;
+	uint32_t tmpval, regval, intr_mask;
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t vec = 0;
+#endif
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* set interrupt vector for other causes */
+	if (hw->mac.type == e1000_82575) {
+		tmpval = E1000_READ_REG(hw, E1000_CTRL_EXT);
+		/* enable MSI-X PBA support */
+		tmpval |= E1000_CTRL_EXT_PBA_CLR;
+
+		/* Auto-Mask interrupts upon ICR read */
+		tmpval |= E1000_CTRL_EXT_EIAME;
+		tmpval |= E1000_CTRL_EXT_IRCA;
+
+		E1000_WRITE_REG(hw, E1000_CTRL_EXT, tmpval);
+
+		/* enable msix_other interrupt */
+		E1000_WRITE_REG_ARRAY(hw, E1000_MSIXBM(0), 0, E1000_EIMS_OTHER);
+		regval = E1000_READ_REG(hw, E1000_EIAC);
+		E1000_WRITE_REG(hw, E1000_EIAC, regval | E1000_EIMS_OTHER);
+		regval = E1000_READ_REG(hw, E1000_EIAM);
+		E1000_WRITE_REG(hw, E1000_EIMS, regval | E1000_EIMS_OTHER);
+	} else if ((hw->mac.type == e1000_82576) ||
+			(hw->mac.type == e1000_82580) ||
+			(hw->mac.type == e1000_i350) ||
+			(hw->mac.type == e1000_i354) ||
+			(hw->mac.type == e1000_i210) ||
+			(hw->mac.type == e1000_i211)) {
+		/* turn on MSI-X capability first */
+		E1000_WRITE_REG(hw, E1000_GPIE, E1000_GPIE_MSIX_MODE |
+					E1000_GPIE_PBA | E1000_GPIE_EIAME |
+					E1000_GPIE_NSICR);
+
+		intr_mask = (1 << intr_handle->max_intr) - 1;
+		regval = E1000_READ_REG(hw, E1000_EIAC);
+		E1000_WRITE_REG(hw, E1000_EIAC, regval | intr_mask);
+
+		/* enable msix_other interrupt */
+		regval = E1000_READ_REG(hw, E1000_EIMS);
+		E1000_WRITE_REG(hw, E1000_EIMS, regval | intr_mask);
+		tmpval = (dev->data->nb_rx_queues | E1000_IVAR_VALID) << 8;
+		E1000_WRITE_REG(hw, E1000_IVAR_MISC, tmpval);
+	}
+
+	/* use EIAM to auto-mask when MSI-X interrupt
+	 * is asserted, this saves a register write for every interrupt
+	 */
+	intr_mask = (1 << intr_handle->nb_efd) - 1;
+	regval = E1000_READ_REG(hw, E1000_EIAM);
+	E1000_WRITE_REG(hw, E1000_EIAM, regval | intr_mask);
+
+	for (queue_id = 0; queue_id < dev->data->nb_rx_queues; queue_id++) {
+		eth_igb_assign_msix_vector(hw, 0, queue_id, vec);
+		intr_handle->intr_vec[queue_id] = vec;
+		if (vec < intr_handle->nb_efd - 1)
+			vec++;
+	}
+
+	E1000_WRITE_FLUSH(hw);
+#endif
+}
+
 PMD_REGISTER_DRIVER(pmd_igb_drv);
 PMD_REGISTER_DRIVER(pmd_igbvf_drv);
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v14 11/13] ixgbe: enable rx queue interrupts for both PF and VF
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (6 preceding siblings ...)
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
@ 2015-07-17  6:16  1%     ` Cunming Liang
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 12/13] igb: enable rx queue interrupts for PF Cunming Liang
                       ` (2 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch does below things for ixgbe PF and VF:
- Setup NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v10 changes
 - return an actual error code rather than -1

v9 changes
 - move queue-vec mapping init from dev_configure to dev_start

v8 changes
 - add vfio-msi/vfio-legacy and uio-legacy support

v7 changes
 - add condition check when intr vector is not enabled

v6 changes
 - fill queue-vector mapping table

v5 changes
 - Rebase the patchset onto the HEAD

v3 changes
 - Remove spinlok from PMD

v2 changes
 - Consolidate review comments related to coding style

 drivers/net/ixgbe/ixgbe_ethdev.c | 527 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/ixgbe/ixgbe_ethdev.h |   4 +
 2 files changed, 518 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 8d68125..8145da9 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -82,6 +82,9 @@
  */
 #define IXGBE_FC_LO    0x40
 
+/* Default minimum inter-interrupt interval for EITR configuration */
+#define IXGBE_MIN_INTER_INTERRUPT_INTERVAL_DEFAULT    0x79E
+
 /* Timer value included in XOFF frames. */
 #define IXGBE_FC_PAUSE 0x680
 
@@ -179,6 +182,9 @@ static int ixgbe_dev_rss_reta_query(struct rte_eth_dev *dev,
 			uint16_t reta_size);
 static void ixgbe_dev_link_status_print(struct rte_eth_dev *dev);
 static int ixgbe_dev_lsc_interrupt_setup(struct rte_eth_dev *dev);
+#ifdef RTE_NEXT_ABI
+static int ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev);
+#endif
 static int ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev);
 static int ixgbe_dev_interrupt_action(struct rte_eth_dev *dev);
 static void ixgbe_dev_interrupt_handler(struct rte_intr_handle *handle,
@@ -191,11 +197,14 @@ static void ixgbe_dcb_init(struct ixgbe_hw *hw,struct ixgbe_dcb_config *dcb_conf
 
 /* For Virtual Function support */
 static int eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev);
+static int ixgbevf_dev_interrupt_get_status(struct rte_eth_dev *dev);
+static int ixgbevf_dev_interrupt_action(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_configure(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_start(struct rte_eth_dev *dev);
 static void ixgbevf_dev_stop(struct rte_eth_dev *dev);
 static void ixgbevf_dev_close(struct rte_eth_dev *dev);
 static void ixgbevf_intr_disable(struct ixgbe_hw *hw);
+static void ixgbevf_intr_enable(struct ixgbe_hw *hw);
 static void ixgbevf_dev_stats_get(struct rte_eth_dev *dev,
 		struct rte_eth_stats *stats);
 static void ixgbevf_dev_stats_reset(struct rte_eth_dev *dev);
@@ -205,6 +214,17 @@ static void ixgbevf_vlan_strip_queue_set(struct rte_eth_dev *dev,
 		uint16_t queue, int on);
 static void ixgbevf_vlan_offload_set(struct rte_eth_dev *dev, int mask);
 static void ixgbevf_set_vfta_all(struct rte_eth_dev *dev, bool on);
+static void ixgbevf_dev_interrupt_handler(struct rte_intr_handle *handle,
+					  void *param);
+#ifdef RTE_NEXT_ABI
+static int ixgbevf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					    uint16_t queue_id);
+static int ixgbevf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					     uint16_t queue_id);
+static void ixgbevf_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+				 uint8_t queue, uint8_t msix_vector);
+#endif
+static void ixgbevf_configure_msix(struct rte_eth_dev *dev);
 
 /* For Eth VMDQ APIs support */
 static int ixgbe_uc_hash_table_set(struct rte_eth_dev *dev, struct
@@ -221,6 +241,15 @@ static int ixgbe_mirror_rule_set(struct rte_eth_dev *dev,
 		uint8_t rule_id, uint8_t on);
 static int ixgbe_mirror_rule_reset(struct rte_eth_dev *dev,
 		uint8_t	rule_id);
+#ifdef RTE_NEXT_ABI
+static int ixgbe_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					  uint16_t queue_id);
+static int ixgbe_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					   uint16_t queue_id);
+static void ixgbe_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+			       uint8_t queue, uint8_t msix_vector);
+#endif
+static void ixgbe_configure_msix(struct rte_eth_dev *dev);
 
 static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev,
 		uint16_t queue_idx, uint16_t tx_rate);
@@ -282,7 +311,7 @@ static int ixgbe_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
  */
 #define UPDATE_VF_STAT(reg, last, cur)	                        \
 {                                                               \
-	u32 latest = IXGBE_READ_REG(hw, reg);                   \
+	uint32_t latest = IXGBE_READ_REG(hw, reg);                   \
 	cur += latest - last;                                   \
 	last = latest;                                          \
 }
@@ -363,6 +392,10 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
 	.tx_queue_start	      = ixgbe_dev_tx_queue_start,
 	.tx_queue_stop        = ixgbe_dev_tx_queue_stop,
 	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = ixgbe_dev_rx_queue_intr_enable,
+	.rx_queue_intr_disable = ixgbe_dev_rx_queue_intr_disable,
+#endif
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_queue_count       = ixgbe_dev_rx_queue_count,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
@@ -427,8 +460,13 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
 	.vlan_offload_set     = ixgbevf_vlan_offload_set,
 	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
+	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
 	.tx_queue_setup       = ixgbe_dev_tx_queue_setup,
 	.tx_queue_release     = ixgbe_dev_tx_queue_release,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = ixgbevf_dev_rx_queue_intr_enable,
+	.rx_queue_intr_disable = ixgbevf_dev_rx_queue_intr_disable,
+#endif
 	.mac_addr_add         = ixgbevf_add_mac_addr,
 	.mac_addr_remove      = ixgbevf_remove_mac_addr,
 	.set_mc_addr_list     = ixgbe_dev_set_mc_addr_list,
@@ -928,12 +966,6 @@ eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev)
 			eth_dev->data->port_id, pci_dev->id.vendor_id,
 			pci_dev->id.device_id);
 
-	rte_intr_callback_register(&(pci_dev->intr_handle),
-		ixgbe_dev_interrupt_handler, (void *)eth_dev);
-
-	/* enable uio intr after callback register */
-	rte_intr_enable(&(pci_dev->intr_handle));
-
 	/* enable support intr */
 	ixgbe_enable_intr(eth_dev);
 
@@ -1489,6 +1521,10 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct ixgbe_vf_info *vfinfo =
 		*IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private);
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
 	int err, link_up = 0, negotiate = 0;
 	uint32_t speed = 0;
 	int mask = 0;
@@ -1521,6 +1557,30 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 	/* configure PF module if SRIOV enabled */
 	ixgbe_pf_host_configure(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle) && !intr_handle->intr_vec) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int),
+				    0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+
+	/* confiugre msix for sleep until rx interrupt */
+	ixgbe_configure_msix(dev);
+
 	/* initialize transmission unit */
 	ixgbe_dev_tx_init(dev);
 
@@ -1598,8 +1658,25 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 skip_link_setup:
 
 	/* check if lsc interrupt is enabled */
-	if (dev->data->dev_conf.intr_conf.lsc != 0)
-		ixgbe_dev_lsc_interrupt_setup(dev);
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle)) {
+			rte_intr_callback_register(intr_handle,
+						   ixgbe_dev_interrupt_handler,
+						   (void *)dev);
+			ixgbe_dev_lsc_interrupt_setup(dev);
+		} else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+#ifdef RTE_NEXT_ABI
+	/* check if rxq interrupt is enabled */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		ixgbe_dev_rxq_interrupt_setup(dev);
+#endif
+
+	/* enable uio/vfio intr/eventfd mapping */
+	rte_intr_enable(intr_handle);
 
 	/* resume enabled intr since hw reset */
 	ixgbe_enable_intr(dev);
@@ -1656,6 +1733,7 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	struct ixgbe_filter_info *filter_info =
 		IXGBE_DEV_PRIVATE_TO_FILTER_INFO(dev->data->dev_private);
 	struct ixgbe_5tuple_filter *p_5tuple, *p_5tuple_next;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 	int vf;
 
 	PMD_INIT_FUNC_TRACE();
@@ -1663,6 +1741,9 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	/* disable interrupts */
 	ixgbe_disable_intr(hw);
 
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
 	/* reset the NIC */
 	ixgbe_pf_reset_hw(hw);
 	hw->adapter_stopped = FALSE;
@@ -1703,6 +1784,14 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	memset(filter_info->fivetuple_mask, 0,
 		sizeof(uint32_t) * IXGBE_5TUPLE_ARRAY_SIZE);
 
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 /*
@@ -2298,6 +2387,30 @@ ixgbe_dev_lsc_interrupt_setup(struct rte_eth_dev *dev)
 	return 0;
 }
 
+/**
+ * It clears the interrupt causes and enables the interrupt.
+ * It will be called once only during nic initialized.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+static int
+ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev)
+{
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	intr->mask |= IXGBE_EICR_RTX_QUEUE;
+
+	return 0;
+}
+#endif
+
 /*
  * It reads ICR and sets flag (IXGBE_EICR_LSC) for the link_update.
  *
@@ -2324,10 +2437,10 @@ ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev)
 	PMD_DRV_LOG(INFO, "eicr %x", eicr);
 
 	intr->flags = 0;
-	if (eicr & IXGBE_EICR_LSC) {
-		/* set flag for async link update */
+
+	/* set flag for async link update */
+	if (eicr & IXGBE_EICR_LSC)
 		intr->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
-	}
 
 	if (eicr & IXGBE_EICR_MAILBOX)
 		intr->flags |= IXGBE_FLAG_MAILBOX;
@@ -2335,6 +2448,30 @@ ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static int
+ixgbevf_dev_interrupt_get_status(struct rte_eth_dev *dev)
+{
+	uint32_t eicr;
+	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	/* clear all cause mask */
+	ixgbevf_intr_disable(hw);
+
+	/* read-on-clear nic registers here */
+	eicr = IXGBE_READ_REG(hw, IXGBE_VTEICR);
+	PMD_DRV_LOG(INFO, "eicr %x", eicr);
+
+	intr->flags = 0;
+
+	/* set flag for async link update */
+	if (eicr & IXGBE_EICR_LSC)
+		intr->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
+
+	return 0;
+}
+
 /**
  * It gets and then prints the link status.
  *
@@ -2430,6 +2567,18 @@ ixgbe_dev_interrupt_action(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static int
+ixgbevf_dev_interrupt_action(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	PMD_DRV_LOG(DEBUG, "enable intr immediately");
+	ixgbevf_intr_enable(hw);
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+	return 0;
+}
+
 /**
  * Interrupt handler which shall be registered for alarm callback for delayed
  * handling specific interrupt to wait for the stable nic state. As the
@@ -2484,13 +2633,24 @@ ixgbe_dev_interrupt_delayed_handler(void *param)
  */
 static void
 ixgbe_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
-							void *param)
+			    void *param)
 {
 	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
 	ixgbe_dev_interrupt_get_status(dev);
 	ixgbe_dev_interrupt_action(dev);
 }
 
+static void
+ixgbevf_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+			      void *param)
+{
+	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+	ixgbevf_dev_interrupt_get_status(dev);
+	ixgbevf_dev_interrupt_action(dev);
+}
+
 static int
 ixgbe_dev_led_on(struct rte_eth_dev *dev)
 {
@@ -2988,6 +3148,19 @@ ixgbevf_intr_disable(struct ixgbe_hw *hw)
 	IXGBE_WRITE_FLUSH(hw);
 }
 
+static void
+ixgbevf_intr_enable(struct ixgbe_hw *hw)
+{
+	PMD_INIT_FUNC_TRACE();
+
+	/* VF enable interrupt autoclean */
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIAM, IXGBE_VF_IRQ_ENABLE_MASK);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIAC, IXGBE_VF_IRQ_ENABLE_MASK);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, IXGBE_VF_IRQ_ENABLE_MASK);
+
+	IXGBE_WRITE_FLUSH(hw);
+}
+
 static int
 ixgbevf_dev_configure(struct rte_eth_dev *dev)
 {
@@ -3029,6 +3202,11 @@ ixgbevf_dev_start(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw =
 		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+
 	int err, mask = 0;
 
 	PMD_INIT_FUNC_TRACE();
@@ -3059,6 +3237,42 @@ ixgbevf_dev_start(struct rte_eth_dev *dev)
 
 	ixgbevf_dev_rxtx_start(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle) && !intr_handle->intr_vec) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int), 0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+	ixgbevf_configure_msix(dev);
+
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle))
+			rte_intr_callback_register(intr_handle,
+					       ixgbevf_dev_interrupt_handler,
+					       (void *)dev);
+		else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+	rte_intr_enable(intr_handle);
+
+	/* Re-enable interrupt for VF */
+	ixgbevf_intr_enable(hw);
+
 	return 0;
 }
 
@@ -3066,6 +3280,7 @@ static void
 ixgbevf_dev_stop(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -3082,12 +3297,27 @@ ixgbevf_dev_stop(struct rte_eth_dev *dev)
 	dev->data->scattered_rx = 0;
 
 	ixgbe_dev_clear_queues(dev);
+
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 static void
 ixgbevf_dev_close(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+#ifdef RTE_NEXT_ABI
+	struct rte_pci_device *pci_dev;
+#endif
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -3097,6 +3327,14 @@ ixgbevf_dev_close(struct rte_eth_dev *dev)
 
 	/* reprogram the RAR[0] in case user changed it. */
 	ixgbe_set_rar(hw, 0, hw->mac.addr, 0, IXGBE_RAH_AV);
+
+#ifdef RTE_NEXT_ABI
+	pci_dev = dev->pci_dev;
+	if (pci_dev->intr_handle.intr_vec) {
+		rte_free(pci_dev->intr_handle.intr_vec);
+		pci_dev->intr_handle.intr_vec = NULL;
+	}
+#endif
 }
 
 static void ixgbevf_set_vfta_all(struct rte_eth_dev *dev, bool on)
@@ -3614,6 +3852,269 @@ ixgbe_mirror_rule_reset(struct rte_eth_dev *dev, uint8_t rule_id)
 	return 0;
 }
 
+#ifdef RTE_NEXT_ABI
+static int
+ixgbevf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	mask = IXGBE_READ_REG(hw, IXGBE_VTEIMS);
+	mask |= (1 << queue_id);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, mask);
+
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static int
+ixgbevf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	mask = IXGBE_READ_REG(hw, IXGBE_VTEIMS);
+	mask &= ~(1 << queue_id);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, mask);
+
+	return 0;
+}
+
+static int
+ixgbe_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	if (queue_id < 16) {
+		ixgbe_disable_intr(hw);
+		intr->mask |= (1 << queue_id);
+		ixgbe_enable_intr(dev);
+	} else if (queue_id < 32) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(0));
+		mask &= (1 << queue_id);
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(0), mask);
+	} else if (queue_id < 64) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(1));
+		mask &= (1 << (queue_id - 32));
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(1), mask);
+	}
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static int
+ixgbe_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	if (queue_id < 16) {
+		ixgbe_disable_intr(hw);
+		intr->mask &= ~(1 << queue_id);
+		ixgbe_enable_intr(dev);
+	} else if (queue_id < 32) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(0));
+		mask &= ~(1 << queue_id);
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(0), mask);
+	} else if (queue_id < 64) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(1));
+		mask &= ~(1 << (queue_id - 32));
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(1), mask);
+	}
+
+	return 0;
+}
+
+static void
+ixgbevf_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+		     uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp, idx;
+
+	if (direction == -1) {
+		/* other causes */
+		msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+		tmp = IXGBE_READ_REG(hw, IXGBE_VTIVAR_MISC);
+		tmp &= ~0xFF;
+		tmp |= msix_vector;
+		IXGBE_WRITE_REG(hw, IXGBE_VTIVAR_MISC, tmp);
+	} else {
+		/* rx or tx cause */
+		msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+		idx = ((16 * (queue & 1)) + (8 * direction));
+		tmp = IXGBE_READ_REG(hw, IXGBE_VTIVAR(queue >> 1));
+		tmp &= ~(0xFF << idx);
+		tmp |= (msix_vector << idx);
+		IXGBE_WRITE_REG(hw, IXGBE_VTIVAR(queue >> 1), tmp);
+	}
+}
+
+/**
+ * set the IVAR registers, mapping interrupt causes to vectors
+ * @param hw
+ *  pointer to ixgbe_hw struct
+ * @direction
+ *  0 for Rx, 1 for Tx, -1 for other causes
+ * @queue
+ *  queue to map the corresponding interrupt to
+ * @msix_vector
+ *  the vector to map to the corresponding queue
+ */
+static void
+ixgbe_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+		   uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp, idx;
+
+	msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+	if (hw->mac.type == ixgbe_mac_82598EB) {
+		if (direction == -1)
+			direction = 0;
+		idx = (((direction * 64) + queue) >> 2) & 0x1F;
+		tmp = IXGBE_READ_REG(hw, IXGBE_IVAR(idx));
+		tmp &= ~(0xFF << (8 * (queue & 0x3)));
+		tmp |= (msix_vector << (8 * (queue & 0x3)));
+		IXGBE_WRITE_REG(hw, IXGBE_IVAR(idx), tmp);
+	} else if ((hw->mac.type == ixgbe_mac_82599EB) ||
+			(hw->mac.type == ixgbe_mac_X540)) {
+		if (direction == -1) {
+			/* other causes */
+			idx = ((queue & 1) * 8);
+			tmp = IXGBE_READ_REG(hw, IXGBE_IVAR_MISC);
+			tmp &= ~(0xFF << idx);
+			tmp |= (msix_vector << idx);
+			IXGBE_WRITE_REG(hw, IXGBE_IVAR_MISC, tmp);
+		} else {
+			/* rx or tx causes */
+			idx = ((16 * (queue & 1)) + (8 * direction));
+			tmp = IXGBE_READ_REG(hw, IXGBE_IVAR(queue >> 1));
+			tmp &= ~(0xFF << idx);
+			tmp |= (msix_vector << idx);
+			IXGBE_WRITE_REG(hw, IXGBE_IVAR(queue >> 1), tmp);
+		}
+	}
+}
+#endif
+
+static void
+ixgbevf_configure_msix(struct rte_eth_dev *dev)
+{
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t q_idx;
+	uint32_t vector_idx = 0;
+#endif
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd.
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* Configure all RX queues of VF */
+	for (q_idx = 0; q_idx < dev->data->nb_rx_queues; q_idx++) {
+		/* Force all queue use vector 0,
+		 * as IXGBE_VF_MAXMSIVECOTR = 1
+		 */
+		ixgbevf_set_ivar_map(hw, 0, q_idx, vector_idx);
+		intr_handle->intr_vec[q_idx] = vector_idx;
+	}
+
+	/* Configure VF Rx queue ivar */
+	ixgbevf_set_ivar_map(hw, -1, 1, vector_idx);
+#endif
+}
+
+/**
+ * Sets up the hardware to properly generate MSI-X interrupts
+ * @hw
+ *  board private structure
+ */
+static void
+ixgbe_configure_msix(struct rte_eth_dev *dev)
+{
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t queue_id, vec = 0;
+	uint32_t mask;
+	uint32_t gpie;
+#endif
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* setup GPIE for MSI-x mode */
+	gpie = IXGBE_READ_REG(hw, IXGBE_GPIE);
+	gpie |= IXGBE_GPIE_MSIX_MODE | IXGBE_GPIE_PBA_SUPPORT |
+		IXGBE_GPIE_OCD | IXGBE_GPIE_EIAME;
+	/* auto clearing and auto setting corresponding bits in EIMS
+	 * when MSI-X interrupt is triggered
+	 */
+	if (hw->mac.type == ixgbe_mac_82598EB) {
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM, IXGBE_EICS_RTX_QUEUE);
+	} else {
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM_EX(0), 0xFFFFFFFF);
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM_EX(1), 0xFFFFFFFF);
+	}
+	IXGBE_WRITE_REG(hw, IXGBE_GPIE, gpie);
+
+	/* Populate the IVAR table and set the ITR values to the
+	 * corresponding register.
+	 */
+	for (queue_id = 0; queue_id < dev->data->nb_rx_queues;
+	     queue_id++) {
+		/* by default, 1:1 mapping */
+		ixgbe_set_ivar_map(hw, 0, queue_id, vec);
+		intr_handle->intr_vec[queue_id] = vec;
+		if (vec < intr_handle->nb_efd - 1)
+			vec++;
+	}
+
+	switch (hw->mac.type) {
+	case ixgbe_mac_82598EB:
+		ixgbe_set_ivar_map(hw, -1, IXGBE_IVAR_OTHER_CAUSES_INDEX,
+				   intr_handle->max_intr - 1);
+		break;
+	case ixgbe_mac_82599EB:
+	case ixgbe_mac_X540:
+		ixgbe_set_ivar_map(hw, -1, 1, intr_handle->max_intr - 1);
+		break;
+	default:
+		break;
+	}
+	IXGBE_WRITE_REG(hw, IXGBE_EITR(queue_id),
+			IXGBE_MIN_INTER_INTERRUPT_INTERVAL_DEFAULT & 0xFFF);
+
+	/* set up to autoclear timer, and the vectors */
+	mask = IXGBE_EIMS_ENABLE_MASK;
+	mask &= ~(IXGBE_EIMS_OTHER |
+		  IXGBE_EIMS_MAILBOX |
+		  IXGBE_EIMS_LSC);
+
+	IXGBE_WRITE_REG(hw, IXGBE_EIAC, mask);
+#endif
+}
+
 static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev,
 	uint16_t queue_idx, uint16_t tx_rate)
 {
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 755b674..d813f65 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -117,6 +117,9 @@
 	ETH_RSS_IPV6_TCP_EX | \
 	ETH_RSS_IPV6_UDP_EX)
 
+#define IXGBE_VF_IRQ_ENABLE_MASK        3          /* vf irq enable mask */
+#define IXGBE_VF_MAXMSIVECTOR           1
+
 /*
  * Information about the fdir mode.
  */
@@ -330,6 +333,7 @@ uint32_t ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev,
 		uint16_t rx_queue_id);
 
 int ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
+int ixgbevf_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
 int ixgbe_dev_rx_init(struct rte_eth_dev *dev);
 
-- 
1.8.1.4

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v14 10/13] ethdev: add rx intr enable, disable and ctl functions
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (5 preceding siblings ...)
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 08/13] eal/bsd: dummy for new intr definition Cunming Liang
@ 2015-07-17  6:16  3%     ` Cunming Liang
  2015-07-17  6:16  1%     ` [dpdk-dev] [PATCH v14 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
                       ` (3 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch adds two dev_ops functions to enable and disable rx queue interrupts.
In addtion, it adds rte_eth_dev_rx_intr_ctl/rx_intr_q to support per port or per queue rx intr event set.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v9 changes
 - remove unnecessary check after rte_eth_dev_is_valid_port.
   the same as http://www.dpdk.org/dev/patchwork/patch/4784

v8 changes
 - add addtion check for EEXIT

v7 changes
 - remove rx_intr_vec_get
 - add rx_intr_ctl and rx_intr_ctl_q

v6 changes
 - add rx_intr_vec_get to retrieve the vector num of the queue.

v5 changes
 - Rebase the patchset onto the HEAD

v4 changes
 - Export interrupt enable/disable functions for shared libraries
 - Put new functions at the end of eth_dev_ops to avoid breaking ABI

v3 changes
 - Add return value for interrupt enable/disable functions

 lib/librte_ether/rte_ethdev.c          | 109 +++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h          | 154 +++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ether_version.map |   4 +
 3 files changed, 267 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ddf3658..d7aa840 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3006,6 +3006,115 @@ _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
 	}
 	rte_spinlock_unlock(&rte_eth_dev_cb_lock);
 }
+
+#ifdef RTE_NEXT_ABI
+int
+rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
+{
+	uint32_t vec;
+	struct rte_eth_dev *dev;
+	struct rte_intr_handle *intr_handle;
+	uint16_t qid;
+	int rc;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%u\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	intr_handle = &dev->pci_dev->intr_handle;
+	if (!intr_handle->intr_vec) {
+		PMD_DEBUG_TRACE("RX Intr vector unset\n");
+		return -EPERM;
+	}
+
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		vec = intr_handle->intr_vec[qid];
+		rc = rte_intr_rx_ctl(intr_handle, epfd, op, vec, data);
+		if (rc && rc != -EEXIST) {
+			PMD_DEBUG_TRACE("p %u q %u rx ctl error"
+					" op %d epfd %d vec %u\n",
+					port_id, qid, op, epfd, vec);
+		}
+	}
+
+	return 0;
+}
+
+int
+rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			  int epfd, int op, void *data)
+{
+	uint32_t vec;
+	struct rte_eth_dev *dev;
+	struct rte_intr_handle *intr_handle;
+	int rc;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%u\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_rx_queues) {
+		PMD_DEBUG_TRACE("Invalid RX queue_id=%u\n", queue_id);
+		return -EINVAL;
+	}
+
+	intr_handle = &dev->pci_dev->intr_handle;
+	if (!intr_handle->intr_vec) {
+		PMD_DEBUG_TRACE("RX Intr vector unset\n");
+		return -EPERM;
+	}
+
+	vec = intr_handle->intr_vec[queue_id];
+	rc = rte_intr_rx_ctl(intr_handle, epfd, op, vec, data);
+	if (rc && rc != -EEXIST) {
+		PMD_DEBUG_TRACE("p %u q %u rx ctl error"
+				" op %d epfd %d vec %u\n",
+				port_id, queue_id, op, epfd, vec);
+		return rc;
+	}
+
+	return 0;
+}
+
+int
+rte_eth_dev_rx_intr_enable(uint8_t port_id,
+			   uint16_t queue_id)
+{
+	struct rte_eth_dev *dev;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
+	return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
+}
+
+int
+rte_eth_dev_rx_intr_disable(uint8_t port_id,
+			    uint16_t queue_id)
+{
+	struct rte_eth_dev *dev;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
+	return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
+}
+#endif
+
 #ifdef RTE_NIC_BYPASS
 int rte_eth_dev_bypass_init(uint8_t port_id)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index d76bbb3..602bd2b 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -834,6 +834,10 @@ struct rte_eth_fdir {
 struct rte_intr_conf {
 	/** enable/disable lsc interrupt. 0 (default) - disable, 1 enable */
 	uint16_t lsc;
+#ifdef RTE_NEXT_ABI
+	/** enable/disable rxq interrupt. 0 (default) - disable, 1 enable */
+	uint16_t rxq;
+#endif
 };
 
 /**
@@ -1042,6 +1046,14 @@ typedef int (*eth_tx_queue_setup_t)(struct rte_eth_dev *dev,
 				    const struct rte_eth_txconf *tx_conf);
 /**< @internal Setup a transmit queue of an Ethernet device. */
 
+typedef int (*eth_rx_enable_intr_t)(struct rte_eth_dev *dev,
+				    uint16_t rx_queue_id);
+/**< @internal Enable interrupt of a receive queue of an Ethernet device. */
+
+typedef int (*eth_rx_disable_intr_t)(struct rte_eth_dev *dev,
+				    uint16_t rx_queue_id);
+/**< @internal Disable interrupt of a receive queue of an Ethernet device. */
+
 typedef void (*eth_queue_release_t)(void *queue);
 /**< @internal Release memory resources allocated by given RX/TX queue. */
 
@@ -1351,6 +1363,12 @@ struct eth_dev_ops {
 	eth_queue_release_t        rx_queue_release;/**< Release RX queue.*/
 	eth_rx_queue_count_t       rx_queue_count; /**< Get Rx queue count. */
 	eth_rx_descriptor_done_t   rx_descriptor_done;  /**< Check rxd DD bit */
+#ifdef RTE_NEXT_ABI
+	/**< Enable Rx queue interrupt. */
+	eth_rx_enable_intr_t       rx_queue_intr_enable;
+	/**< Disable Rx queue interrupt.*/
+	eth_rx_disable_intr_t      rx_queue_intr_disable;
+#endif
 	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue.*/
 	eth_queue_release_t        tx_queue_release;/**< Release TX queue.*/
 	eth_dev_led_on_t           dev_led_on;    /**< Turn on LED. */
@@ -2907,6 +2925,142 @@ void _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
 				enum rte_eth_event_type event);
 
 /**
+ * When there is no rx packet coming in Rx Queue for a long time, we can
+ * sleep lcore related to RX Queue for power saving, and enable rx interrupt
+ * to be triggered when rx packect arrives.
+ *
+ * The rte_eth_dev_rx_intr_enable() function enables rx queue
+ * interrupt on specific rx queue of a port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ *     that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_eth_dev_rx_intr_enable(uint8_t port_id, uint16_t queue_id);
+#else
+static inline int
+rte_eth_dev_rx_intr_enable(uint8_t port_id, uint16_t queue_id)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	return -ENOTSUP;
+}
+#endif
+
+/**
+ * When lcore wakes up from rx interrupt indicating packet coming, disable rx
+ * interrupt and returns to polling mode.
+ *
+ * The rte_eth_dev_rx_intr_disable() function disables rx queue
+ * interrupt on specific rx queue of a port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ *     that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_eth_dev_rx_intr_disable(uint8_t port_id, uint16_t queue_id);
+#else
+static inline int
+rte_eth_dev_rx_intr_disable(uint8_t port_id, uint16_t queue_id)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	return -ENOTSUP;
+}
+#endif
+
+/**
+ * RX Interrupt control per port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ *   Using RTE_EPOLL_PER_THREAD allows to use per thread epoll instance.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data);
+#else
+static inline int
+rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(data);
+	return -1;
+}
+#endif
+
+/**
+ * RX Interrupt control per queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ *   Using RTE_EPOLL_PER_THREAD allows to use per thread epoll instance.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			  int epfd, int op, void *data);
+#else
+static inline int
+rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			  int epfd, int op, void *data)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(data);
+	return -1;
+}
+#endif
+
+/**
  * Turn on the LED on the Ethernet device.
  * This function turns on the LED on the Ethernet device.
  *
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 39baf11..fa09d75 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -109,6 +109,10 @@ DPDK_2.0 {
 DPDK_2.1 {
 	global:
 
+	rte_eth_dev_rx_intr_ctl;
+	rte_eth_dev_rx_intr_ctl_q;
+	rte_eth_dev_rx_intr_disable;
+	rte_eth_dev_rx_intr_enable;
 	rte_eth_dev_set_mc_addr_list;
 	rte_eth_timesync_disable;
 	rte_eth_timesync_enable;
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v14 08/13] eal/bsd: dummy for new intr definition
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (4 preceding siblings ...)
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 06/13] eal/linux: standalone intr event fd create support Cunming Liang
@ 2015-07-17  6:16  8%     ` Cunming Liang
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
                       ` (4 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

To make bsd compiling happy with new intr changes.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v13 changes
 - version map cleanup for v2.1

v12 changes
 - fix unused variables compiling warning

v8 changes
 - add stub for new function

v7 changes
 - remove stub 'linux only' function from source file

 lib/librte_eal/bsdapp/eal/eal_interrupts.c         | 28 +++++++
 .../bsdapp/eal/include/exec-env/rte_interrupts.h   | 85 ++++++++++++++++++++++
 lib/librte_eal/bsdapp/eal/rte_eal_version.map      |  5 ++
 3 files changed, 118 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_interrupts.c b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
index 26a55c7..a550ece 100644
--- a/lib/librte_eal/bsdapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
@@ -68,3 +68,31 @@ rte_eal_intr_init(void)
 {
 	return 0;
 }
+
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(vec);
+	RTE_SET_USED(data);
+
+	return -ENOTSUP;
+}
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(nb_efd);
+
+	return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+}
diff --git a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
index d4c388f..eaf5410 100644
--- a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
@@ -38,6 +38,8 @@
 #ifndef _RTE_LINUXAPP_INTERRUPTS_H_
 #define _RTE_LINUXAPP_INTERRUPTS_H_
 
+#include <rte_common.h>
+
 enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_UNKNOWN = 0,
 	RTE_INTR_HANDLE_UIO,      /**< uio device handle */
@@ -50,6 +52,89 @@ struct rte_intr_handle {
 	int fd;                          /**< file descriptor */
 	int uio_cfg_fd;                  /**< UIO config file descriptor */
 	enum rte_intr_handle_type type;  /**< handle type */
+#ifdef RTE_NEXT_ABI
+	/**
+	 * RTE_NEXT_ABI will be removed from v2.2.
+	 * It's only used to avoid ABI(unannounced) broken in v2.1.
+	 * Make sure being aware of the impact before turning on the feature.
+	 */
+	int max_intr;                    /**< max interrupt requested */
+	uint32_t nb_efd;                 /**< number of available efds */
+	int *intr_vec;               /**< intr vector number array */
+#endif
 };
 
+/**
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {ADD, DEL}.
+ * @param vec
+ *   RX intr vector number added to the epoll instance wait list.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data);
+
+/**
+ * It enables the fastpath event fds if it's necessary.
+ * It creates event fds when multi-vectors allowed,
+ * otherwise it multiplexes the single event fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param nb_vec
+ *   Number of interrupt vector trying to enable.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd);
+
+/**
+ * It disable the fastpath event fds.
+ * It deletes registered eventfds and closes the open fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle);
+
+/**
+ * The fastpath interrupt is enabled or not.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+static inline int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 0;
+}
+
+/**
+ * The interrupt handle instance allows other cause or not.
+ * Other cause stands for none fastpath interrupt.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+static inline int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 1;
+}
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index e537b42..b527ad4 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -116,4 +116,9 @@ DPDK_2.1 {
 	global:
 
 	rte_memzone_free;
+	rte_intr_allow_others;
+	rte_intr_dp_is_en;
+	rte_intr_efd_enable;
+	rte_intr_efd_disable;
+	rte_intr_rx_ctl;
 } DPDK_2.0;
-- 
1.8.1.4

^ permalink raw reply	[relevance 8%]

* [dpdk-dev] [PATCH v14 06/13] eal/linux: standalone intr event fd create support
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (3 preceding siblings ...)
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
@ 2015-07-17  6:16  3%     ` Cunming Liang
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 08/13] eal/bsd: dummy for new intr definition Cunming Liang
                       ` (5 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch exposes intr event fd create and release for PMD.
The device driver can assign the number of event associated with interrupt vector.
It also provides misc functions to check 1) allows other slowpath intr(e.g. lsc);
2) intr event on fastpath is enabled or not.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - minor changes on API decription comments

v13 changes
 - version map cleanup for v2.1

v11 changes
 - typo cleanup

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 57 ++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h | 87 ++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |  4 +
 3 files changed, 148 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index b18ab86..0266d98 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -44,6 +44,7 @@
 #include <sys/epoll.h>
 #include <sys/signalfd.h>
 #include <sys/ioctl.h>
+#include <sys/eventfd.h>
 
 #include <rte_common.h>
 #include <rte_interrupts.h>
@@ -68,6 +69,7 @@
 #include "eal_vfio.h"
 
 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
+#define NB_OTHER_INTR               1
 
 static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
 
@@ -1121,4 +1123,59 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
 
 	return rc;
 }
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	uint32_t i;
+	int fd;
+	uint32_t n = RTE_MIN(nb_efd, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+
+	if (intr_handle->type == RTE_INTR_HANDLE_VFIO_MSIX) {
+		for (i = 0; i < n; i++) {
+			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+			if (fd < 0) {
+				RTE_LOG(ERR, EAL,
+					"cannot setup eventfd,"
+					"error %i (%s)\n",
+					errno, strerror(errno));
+				return -1;
+			}
+			intr_handle->efds[i] = fd;
+		}
+		intr_handle->nb_efd   = n;
+		intr_handle->max_intr = NB_OTHER_INTR + n;
+	} else {
+		intr_handle->efds[0]  = intr_handle->fd;
+		intr_handle->nb_efd   = RTE_MIN(nb_efd, 1U);
+		intr_handle->max_intr = NB_OTHER_INTR;
+	}
+
+	return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	uint32_t i;
+	struct rte_epoll_event *rev;
+
+	for (i = 0; i < intr_handle->nb_efd; i++) {
+		rev = &intr_handle->elist[i];
+		if (rev->status == RTE_EPOLL_INVALID)
+			continue;
+		if (rte_epoll_ctl(rev->epfd, EPOLL_CTL_DEL, rev->fd, rev)) {
+			/* force free if the entry valid */
+			eal_epoll_data_safe_free(rev);
+			rev->status = RTE_EPOLL_INVALID;
+		}
+	}
+
+	if (intr_handle->max_intr > intr_handle->nb_efd) {
+		for (i = 0; i < intr_handle->nb_efd; i++)
+			close(intr_handle->efds[i]);
+	}
+	intr_handle->nb_efd = 0;
+	intr_handle->max_intr = 0;
+}
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index 918246f..3f17f29 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -191,4 +191,91 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
 }
 #endif
 
+/**
+ * It enables the packet I/O interrupt event if it's necessary.
+ * It creates event fd for each interrupt vector when MSIX is used,
+ * otherwise it multiplexes a single event fd.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param nb_vec
+ *   Number of interrupt vector trying to enable.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd);
+#else
+static inline int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(nb_efd);
+	return 0;
+}
+#endif
+
+/**
+ * It disables the packet I/O interrupt event.
+ * It deletes registered eventfds and closes the open fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+#ifdef RTE_NEXT_ABI
+extern void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle);
+#else
+static inline void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+}
+#endif
+
+/**
+ * The packet I/O interrupt on datapath is enabled or not.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+#ifdef RTE_NEXT_ABI
+static inline int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	return !(!intr_handle->nb_efd);
+}
+#else
+static inline int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 0;
+}
+#endif
+
+/**
+ * The interrupt handle instance allows other causes or not.
+ * Other causes stand for any none packet I/O interrupts.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+#ifdef RTE_NEXT_ABI
+static inline int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	return !!(intr_handle->max_intr - intr_handle->nb_efd);
+}
+#else
+static inline int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 1;
+}
+#endif
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 1cd4cc5..a0d9cb2 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -117,6 +117,10 @@ DPDK_2.1 {
 
 	rte_epoll_ctl;
 	rte_epoll_wait;
+	rte_intr_allow_others;
+	rte_intr_dp_is_en;
+	rte_intr_efd_enable;
+	rte_intr_efd_disable;
 	rte_intr_rx_ctl;
 	rte_intr_tls_epfd;
 	rte_memzone_free;
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v14 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (2 preceding siblings ...)
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
@ 2015-07-17  6:16  3%     ` Cunming Liang
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 06/13] eal/linux: standalone intr event fd create support Cunming Liang
                       ` (6 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch assigns event fds to each vfio msix interrupt vector by ioctl.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - reword commit comments

v8 changes
 - move eventfd creation out of the setup_interrupts to a standalone function

v7 changes
 - cleanup unnecessary code change
 - split event and intr operation to other patches

 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 56 ++++++++++------------------
 1 file changed, 20 insertions(+), 36 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index cca2efd..b18ab86 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -128,6 +128,9 @@ static pthread_t intr_thread;
 #ifdef VFIO_PRESENT
 
 #define IRQ_SET_BUF_LEN  (sizeof(struct vfio_irq_set) + sizeof(int))
+/* irq set buffer length for queue interrupts and LSC interrupt */
+#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
+			      sizeof(int) * (RTE_MAX_RXTX_INTR_VEC_ID + 1))
 
 /* enable legacy (INTx) interrupts */
 static int
@@ -245,23 +248,6 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {
 						intr_handle->fd);
 		return -1;
 	}
-
-	/* manually trigger interrupt to enable it */
-	memset(irq_set, 0, len);
-	len = sizeof(struct vfio_irq_set);
-	irq_set->argsz = len;
-	irq_set->count = 1;
-	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
-	irq_set->start = 0;
-
-	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-	if (ret) {
-		RTE_LOG(ERR, EAL, "Error triggering MSI interrupts for fd %d\n",
-						intr_handle->fd);
-		return -1;
-	}
 	return 0;
 }
 
@@ -294,7 +280,7 @@ vfio_disable_msi(struct rte_intr_handle *intr_handle) {
 static int
 vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 	int len, ret;
-	char irq_set_buf[IRQ_SET_BUF_LEN];
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
 	struct vfio_irq_set *irq_set;
 	int *fd_ptr;
 
@@ -302,12 +288,26 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 
 	irq_set = (struct vfio_irq_set *) irq_set_buf;
 	irq_set->argsz = len;
+#ifdef RTE_NEXT_ABI
+	if (!intr_handle->max_intr)
+		intr_handle->max_intr = 1;
+	else if (intr_handle->max_intr > RTE_MAX_RXTX_INTR_VEC_ID)
+		intr_handle->max_intr = RTE_MAX_RXTX_INTR_VEC_ID + 1;
+
+	irq_set->count = intr_handle->max_intr;
+#else
 	irq_set->count = 1;
+#endif
 	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
 	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
 	irq_set->start = 0;
 	fd_ptr = (int *) &irq_set->data;
-	*fd_ptr = intr_handle->fd;
+#ifdef RTE_NEXT_ABI
+	memcpy(fd_ptr, intr_handle->efds, sizeof(intr_handle->efds));
+	fd_ptr[intr_handle->max_intr - 1] = intr_handle->fd;
+#else
+	fd_ptr[0] = intr_handle->fd;
+#endif
 
 	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
 
@@ -317,22 +317,6 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 		return -1;
 	}
 
-	/* manually trigger interrupt to enable it */
-	memset(irq_set, 0, len);
-	len = sizeof(struct vfio_irq_set);
-	irq_set->argsz = len;
-	irq_set->count = 1;
-	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-	irq_set->start = 0;
-
-	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-	if (ret) {
-		RTE_LOG(ERR, EAL, "Error triggering MSI-X interrupts for fd %d\n",
-						intr_handle->fd);
-		return -1;
-	}
 	return 0;
 }
 
@@ -340,7 +324,7 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 static int
 vfio_disable_msix(struct rte_intr_handle *intr_handle) {
 	struct vfio_irq_set *irq_set;
-	char irq_set_buf[IRQ_SET_BUF_LEN];
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
 	int len, ret;
 
 	len = sizeof(struct vfio_irq_set);
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v14 03/13] eal/linux: add API to set rx interrupt event monitor
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
@ 2015-07-17  6:16  2%     ` Cunming Liang
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
                       ` (7 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch adds 'rte_intr_rx_ctl' to add or delete interrupt vector events monitor on specified epoll instance.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v12 changes:
 - fix awkward line split in using RTE_LOG

v10 changes:
 - add RTE_INTR_HANDLE_UIO_INTX for uio_pci_generic

v8 changes
 - fix EWOULDBLOCK and EINTR processing
 - add event status check

v7 changes
 - rename rte_intr_rx_set to rte_intr_rx_ctl.
 - rte_intr_rx_ctl uses rte_epoll_ctl to register epoll event instance.
 - the intr rx event instance includes a intr process callback.

v6 changes
 - split rte_intr_wait_rx_pkt into two function, wait and set.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal.
 - rte_intr_rx_wait to support multiplexing.
 - allow epfd as input to support flexible event fd combination.

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 105 +++++++++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  38 ++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   1 +
 3 files changed, 144 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 5fe5b99..4e34abc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -897,6 +897,51 @@ rte_eal_intr_init(void)
 	return -ret;
 }
 
+#ifdef RTE_NEXT_ABI
+static void
+eal_intr_proc_rxtx_intr(int fd, const struct rte_intr_handle *intr_handle)
+{
+	union rte_intr_read_buffer buf;
+	int bytes_read = 1;
+
+	switch (intr_handle->type) {
+	case RTE_INTR_HANDLE_UIO:
+	case RTE_INTR_HANDLE_UIO_INTX:
+		bytes_read = sizeof(buf.uio_intr_count);
+		break;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		bytes_read = sizeof(buf.vfio_intr_count);
+		break;
+#endif
+	default:
+		bytes_read = 1;
+		RTE_LOG(INFO, EAL, "unexpected intr type\n");
+		break;
+	}
+
+	/**
+	 * read out to clear the ready-to-be-read flag
+	 * for epoll_wait.
+	 */
+	do {
+		bytes_read = read(fd, &buf, bytes_read);
+		if (bytes_read < 0) {
+			if (errno == EINTR || errno == EWOULDBLOCK ||
+			    errno == EAGAIN)
+				continue;
+			RTE_LOG(ERR, EAL,
+				"Error reading from fd %d: %s\n",
+				fd, strerror(errno));
+		} else if (bytes_read == 0)
+			RTE_LOG(ERR, EAL, "Read nothing from fd %d\n", fd);
+		return;
+	} while (1);
+}
+#endif
+
 static int
 eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
 			struct rte_epoll_event *events)
@@ -1033,3 +1078,63 @@ rte_epoll_ctl(int epfd, int op, int fd,
 
 	return 0;
 }
+
+#ifdef RTE_NEXT_ABI
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
+		int op, unsigned int vec, void *data)
+{
+	struct rte_epoll_event *rev;
+	struct rte_epoll_data *epdata;
+	int epfd_op;
+	int rc = 0;
+
+	if (!intr_handle || intr_handle->nb_efd == 0 ||
+	    vec >= intr_handle->nb_efd) {
+		RTE_LOG(ERR, EAL, "Wrong intr vector number.\n");
+		return -EPERM;
+	}
+
+	switch (op) {
+	case RTE_INTR_EVENT_ADD:
+		epfd_op = EPOLL_CTL_ADD;
+		rev = &intr_handle->elist[vec];
+		if (rev->status != RTE_EPOLL_INVALID) {
+			RTE_LOG(INFO, EAL, "Event already been added.\n");
+			return -EEXIST;
+		}
+
+		/* attach to intr vector fd */
+		epdata = &rev->epdata;
+		epdata->event  = EPOLLIN | EPOLLPRI | EPOLLET;
+		epdata->data   = data;
+		epdata->cb_fun = (rte_intr_event_cb_t)eal_intr_proc_rxtx_intr;
+		epdata->cb_arg = (void *)intr_handle;
+		rc = rte_epoll_ctl(epfd, epfd_op, intr_handle->efds[vec], rev);
+		if (!rc)
+			RTE_LOG(DEBUG, EAL,
+				"efd %d associated with vec %d added on epfd %d"
+				"\n", rev->fd, vec, epfd);
+		else
+			rc = -EPERM;
+		break;
+	case RTE_INTR_EVENT_DEL:
+		epfd_op = EPOLL_CTL_DEL;
+		rev = &intr_handle->elist[vec];
+		if (rev->status == RTE_EPOLL_INVALID) {
+			RTE_LOG(INFO, EAL, "Event does not exist.\n");
+			return -EPERM;
+		}
+
+		rc = rte_epoll_ctl(rev->epfd, epfd_op, rev->fd, rev);
+		if (rc)
+			rc = -EPERM;
+		break;
+	default:
+		RTE_LOG(ERR, EAL, "event op type mismatch\n");
+		rc = -EPERM;
+	}
+
+	return rc;
+}
+#endif
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index b55b4ee..918246f 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -38,6 +38,10 @@
 #ifndef _RTE_LINUXAPP_INTERRUPTS_H_
 #define _RTE_LINUXAPP_INTERRUPTS_H_
 
+#ifndef RTE_NEXT_ABI
+#include <rte_common.h>
+#endif
+
 #define RTE_MAX_RXTX_INTR_VEC_ID     32
 
 enum rte_intr_handle_type {
@@ -153,4 +157,38 @@ rte_epoll_ctl(int epfd, int op, int fd,
 int
 rte_intr_tls_epfd(void);
 
+/**
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {ADD, DEL}.
+ * @param vec
+ *   RX intr vector number added to the epoll instance wait list.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data);
+#else
+static inline int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(vec);
+	RTE_SET_USED(data);
+	return -ENOTSUP;
+}
+#endif
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 3c4c710..1cd4cc5 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -117,6 +117,7 @@ DPDK_2.1 {
 
 	rte_epoll_ctl;
 	rte_epoll_wait;
+	rte_intr_rx_ctl;
 	rte_intr_tls_epfd;
 	rte_memzone_free;
 } DPDK_2.0;
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v14 02/13] eal/linux: add rte_epoll_wait/ctl support
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
@ 2015-07-17  6:16  2%     ` Cunming Liang
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
                       ` (8 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch adds 'rte_epoll_wait' and 'rte_epoll_ctl' for async event wakeup.
It defines 'struct rte_epoll_event' as the event param.
When the event fds add to a specified epoll instance, 'eptrs' will hold the rte_epoll_event object pointer.
The 'op' uses the same enum as epoll_wait/ctl does.
The epoll event support to carry a raw user data and to register a callback which is executed during wakeup.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v11 changes
 - cleanup spelling error

v9 changes
 - rework on coding style

v8 changes
 - support delete event in safety during the wakeup execution
 - add EINTR process during epoll_wait

v7 changes
 - split v6[4/8] into two patches, one for epoll event(this one)
   another for rx intr(next patch)
 - introduce rte_epoll_event definition
 - rte_epoll_wait/ctl for more generic RTE epoll API

v6 changes
 - split rte_intr_wait_rx_pkt into two function, wait and set.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal.
 - rte_intr_rx_wait to support multiplexing.
 - allow epfd as input to support flexible event fd combination.

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 139 +++++++++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  80 ++++++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   3 +
 3 files changed, 222 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index b5f369e..5fe5b99 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -69,6 +69,8 @@
 
 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
 
+static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
+
 /**
  * union for pipe fds.
  */
@@ -894,3 +896,140 @@ rte_eal_intr_init(void)
 
 	return -ret;
 }
+
+static int
+eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
+			struct rte_epoll_event *events)
+{
+	unsigned int i, count = 0;
+	struct rte_epoll_event *rev;
+
+	for (i = 0; i < n; i++) {
+		rev = evs[i].data.ptr;
+		if (!rev || !rte_atomic32_cmpset(&rev->status, RTE_EPOLL_VALID,
+						 RTE_EPOLL_EXEC))
+			continue;
+
+		events[count].status        = RTE_EPOLL_VALID;
+		events[count].fd            = rev->fd;
+		events[count].epfd          = rev->epfd;
+		events[count].epdata.event  = rev->epdata.event;
+		events[count].epdata.data   = rev->epdata.data;
+		if (rev->epdata.cb_fun)
+			rev->epdata.cb_fun(rev->fd,
+					   rev->epdata.cb_arg);
+
+		rte_compiler_barrier();
+		rev->status = RTE_EPOLL_VALID;
+		count++;
+	}
+	return count;
+}
+
+static inline int
+eal_init_tls_epfd(void)
+{
+	int pfd = epoll_create(255);
+
+	if (pfd < 0) {
+		RTE_LOG(ERR, EAL,
+			"Cannot create epoll instance\n");
+		return -1;
+	}
+	return pfd;
+}
+
+int
+rte_intr_tls_epfd(void)
+{
+	if (RTE_PER_LCORE(_epfd) == -1)
+		RTE_PER_LCORE(_epfd) = eal_init_tls_epfd();
+
+	return RTE_PER_LCORE(_epfd);
+}
+
+int
+rte_epoll_wait(int epfd, struct rte_epoll_event *events,
+	       int maxevents, int timeout)
+{
+	struct epoll_event evs[maxevents];
+	int rc;
+
+	if (!events) {
+		RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
+		return -1;
+	}
+
+	/* using per thread epoll fd */
+	if (epfd == RTE_EPOLL_PER_THREAD)
+		epfd = rte_intr_tls_epfd();
+
+	while (1) {
+		rc = epoll_wait(epfd, evs, maxevents, timeout);
+		if (likely(rc > 0)) {
+			/* epoll_wait has at least one fd ready to read */
+			rc = eal_epoll_process_event(evs, rc, events);
+			break;
+		} else if (rc < 0) {
+			if (errno == EINTR)
+				continue;
+			/* epoll_wait fail */
+			RTE_LOG(ERR, EAL, "epoll_wait returns with fail %s\n",
+				strerror(errno));
+			rc = -1;
+			break;
+		}
+	}
+
+	return rc;
+}
+
+static inline void
+eal_epoll_data_safe_free(struct rte_epoll_event *ev)
+{
+	while (!rte_atomic32_cmpset(&ev->status, RTE_EPOLL_VALID,
+				    RTE_EPOLL_INVALID))
+		while (ev->status != RTE_EPOLL_VALID)
+			rte_pause();
+	memset(&ev->epdata, 0, sizeof(ev->epdata));
+	ev->fd = -1;
+	ev->epfd = -1;
+}
+
+int
+rte_epoll_ctl(int epfd, int op, int fd,
+	      struct rte_epoll_event *event)
+{
+	struct epoll_event ev;
+
+	if (!event) {
+		RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
+		return -1;
+	}
+
+	/* using per thread epoll fd */
+	if (epfd == RTE_EPOLL_PER_THREAD)
+		epfd = rte_intr_tls_epfd();
+
+	if (op == EPOLL_CTL_ADD) {
+		event->status = RTE_EPOLL_VALID;
+		event->fd = fd;  /* ignore fd in event */
+		event->epfd = epfd;
+		ev.data.ptr = (void *)event;
+	}
+
+	ev.events = event->epdata.event;
+	if (epoll_ctl(epfd, op, fd, &ev) < 0) {
+		RTE_LOG(ERR, EAL, "Error op %d fd %d epoll_ctl, %s\n",
+			op, fd, strerror(errno));
+		if (op == EPOLL_CTL_ADD)
+			/* rollback status when CTL_ADD fail */
+			event->status = RTE_EPOLL_INVALID;
+		return -1;
+	}
+
+	if (op == EPOLL_CTL_DEL && event->status != RTE_EPOLL_INVALID)
+		eal_epoll_data_safe_free(event);
+
+	return 0;
+}
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index 12b33c9..b55b4ee 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -51,6 +51,32 @@ enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_MAX
 };
 
+#define RTE_INTR_EVENT_ADD            1UL
+#define RTE_INTR_EVENT_DEL            2UL
+
+typedef void (*rte_intr_event_cb_t)(int fd, void *arg);
+
+struct rte_epoll_data {
+	uint32_t event;               /**< event type */
+	void *data;                   /**< User data */
+	rte_intr_event_cb_t cb_fun;   /**< IN: callback fun */
+	void *cb_arg;	              /**< IN: callback arg */
+};
+
+enum {
+	RTE_EPOLL_INVALID = 0,
+	RTE_EPOLL_VALID,
+	RTE_EPOLL_EXEC,
+};
+
+/** interrupt epoll event obj, taken by epoll_event.ptr */
+struct rte_epoll_event {
+	volatile uint32_t status;  /**< OUT: event status */
+	int fd;                    /**< OUT: event fd */
+	int epfd;       /**< OUT: epoll instance the ev associated with */
+	struct rte_epoll_data epdata;
+};
+
 /** Handle for interrupts. */
 struct rte_intr_handle {
 	union {
@@ -69,8 +95,62 @@ struct rte_intr_handle {
 	uint32_t max_intr;             /**< max interrupt requested */
 	uint32_t nb_efd;               /**< number of available efd(event fd) */
 	int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
+	struct rte_epoll_event elist[RTE_MAX_RXTX_INTR_VEC_ID];
+				       /**< intr vector epoll event */
 	int *intr_vec;                 /**< intr vector number array */
 #endif
 };
 
+#define RTE_EPOLL_PER_THREAD        -1  /**< to hint using per thread epfd */
+
+/**
+ * It waits for events on the epoll instance.
+ *
+ * @param epfd
+ *   Epoll instance fd on which the caller wait for events.
+ * @param events
+ *   Memory area contains the events that will be available for the caller.
+ * @param maxevents
+ *   Up to maxevents are returned, must greater than zero.
+ * @param timeout
+ *   Specifying a timeout of -1 causes a block indefinitely.
+ *   Specifying a timeout equal to zero cause to return immediately.
+ * @return
+ *   - On success, returns the number of available event.
+ *   - On failure, a negative value.
+ */
+int
+rte_epoll_wait(int epfd, struct rte_epoll_event *events,
+	       int maxevents, int timeout);
+
+/**
+ * It performs control operations on epoll instance referred by the epfd.
+ * It requests that the operation op be performed for the target fd.
+ *
+ * @param epfd
+ *   Epoll instance fd on which the caller perform control operations.
+ * @param op
+ *   The operation be performed for the target fd.
+ * @param fd
+ *   The target fd on which the control ops perform.
+ * @param event
+ *   Describes the object linked to the fd.
+ *   Note: The caller must take care the object deletion after CTL_DEL.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_epoll_ctl(int epfd, int op, int fd,
+	      struct rte_epoll_event *event);
+
+/**
+ * The function returns the per thread epoll instance.
+ *
+ * @return
+ *   epfd the epoll instance referred to.
+ */
+int
+rte_intr_tls_epfd(void);
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index e537b42..3c4c710 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -115,5 +115,8 @@ DPDK_2.0 {
 DPDK_2.1 {
 	global:
 
+	rte_epoll_ctl;
+	rte_epoll_wait;
+	rte_intr_tls_epfd;
 	rte_memzone_free;
 } DPDK_2.0;
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
@ 2015-07-17  6:16  8%     ` Cunming Liang
  2015-07-19 23:31  0%       ` Thomas Monjalon
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
                       ` (9 subsequent siblings)
  10 siblings, 1 reply; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch adds interrupt vectors support in rte_intr_handle.
'vec_en' is set when interrupt vectors are detected and associated event fds are set.
Those event fds are stored in efds[].
'intr_vec' is reserved for device driver to initialize the vector mapping table.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v7 changes:
 - add eptrs[], it's used to store the register rte_epoll_event instances.
 - add vec_en, to log the vector capability status.

v6 changes:
 - add mapping table between irq vector number and queue id.

v5 changes:
 - Create this new patch file for changed struct rte_intr_handle that
   other patches depend on, to avoid breaking git bisect.

 .../linuxapp/eal/include/exec-env/rte_interrupts.h          | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index bdeb3fc..12b33c9 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -38,6 +38,8 @@
 #ifndef _RTE_LINUXAPP_INTERRUPTS_H_
 #define _RTE_LINUXAPP_INTERRUPTS_H_
 
+#define RTE_MAX_RXTX_INTR_VEC_ID     32
+
 enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_UNKNOWN = 0,
 	RTE_INTR_HANDLE_UIO,          /**< uio device handle */
@@ -58,6 +60,17 @@ struct rte_intr_handle {
 	};
 	int fd;	 /**< interrupt event file descriptor */
 	enum rte_intr_handle_type type;  /**< handle type */
+#ifdef RTE_NEXT_ABI
+	/**
+	 * RTE_NEXT_ABI will be removed from v2.2.
+	 * It's only used to avoid ABI(unannounced) broken in v2.1.
+	 * Make sure being aware of the impact before turning on the feature.
+	 */
+	uint32_t max_intr;             /**< max interrupt requested */
+	uint32_t nb_efd;               /**< number of available efd(event fd) */
+	int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
+	int *intr_vec;                 /**< intr vector number array */
+#endif
 };
 
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
-- 
1.8.1.4

^ permalink raw reply	[relevance 8%]

* [dpdk-dev] [PATCH v14 00/13] Interrupt mode PMD
    2015-07-09 13:58  3%   ` David Marchand
@ 2015-07-17  6:16  4%   ` Cunming Liang
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
                       ` (10 more replies)
  1 sibling, 11 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map
 - minor comments rework

v13 changes
 - version map cleanup for v2.1
 - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility

Patch series v12
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Danny Zhou <danny.zhou@intel.com>

v12 changes
 - bsd cleanup for unused variable warning
 - fix awkward line split in debug message

v11 changes
 - typo cleanup and check kernel style

v10 changes
 - code rework to return actual error code
 - bug fix for lsc when using uio_pci_generic

v9 changes
 - code rework to fix open comment
 - bug fix for igb lsc when both lsc and rxq are enabled in vfio-msix
 - new patch to turn off the feature by default so as to avoid v2.1 abi broken

v8 changes
 - remove condition check for only vfio-msix
 - add multiplex intr support when only one intr vector allowed
 - lsc and rxq interrupt runtime enable decision
 - add safe event delete while the event wakeup execution happens

v7 changes
 - decouple epoll event and intr operation
 - add condition check in the case intr vector is disabled
 - renaming some APIs

v6 changes
 - split rte_intr_wait_rx_pkt into two APIs 'wait' and 'set'.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set.
 - using vector number instead of queue_id as interrupt API params.
 - patch reorder and split.

v5 changes
 - Rebase the patchset onto the HEAD
 - Isolate ethdev from EAL for new-added wait-for-rx interrupt function
 - Export wait-for-rx interrupt function for shared libraries
 - Split-off a new patch file for changed struct rte_intr_handle that
   other patches depend on, to avoid breaking git bisect
 - Change sample applicaiton to accomodate EAL function spec change
   accordingly

v4 changes
 - Export interrupt enable/disable functions for shared libraries
 - Adjust position of new-added structure fields and functions to
   avoid breaking ABI

v3 changes
 - Add return value for interrupt enable/disable functions
 - Move spinlok from PMD to L3fwd-power
 - Remove unnecessary variables in e1000_mac_info
 - Fix miscelleous review comments

v2 changes
 - Fix compilation issue in Makefile for missed header file.
 - Consolidate internal and community review comments of v1 patch set.

The patch series introduce low-latency one-shot rx interrupt into DPDK with
polling and interrupt mode switch control example.

DPDK userspace interrupt notification and handling mechanism is based on UIO
with below limitation:
1) It is designed to handle LSC interrupt only with inefficient suspended
   pthread wakeup procedure (e.g. UIO wakes up LSC interrupt handling thread
   which then wakes up DPDK polling thread). In this way, it introduces
   non-deterministic wakeup latency for DPDK polling thread as well as packet
   latency if it is used to handle Rx interrupt.
2) UIO only supports a single interrupt vector which has to been shared by
   LSC interrupt and interrupts assigned to dedicated rx queues.

This patchset includes below features:
1) Enable one-shot rx queue interrupt in ixgbe PMD(PF & VF) and igb PMD(PF only)
.
2) Build on top of the VFIO mechanism instead of UIO, so it could support
   up to 64 interrupt vectors for rx queue interrupts.
3) Have 1 DPDK polling thread handle per Rx queue interrupt with a dedicated
   VFIO eventfd, which eliminates non-deterministic pthread wakeup latency in
   user space.
4) Demonstrate interrupts control APIs and userspace NAIP-like polling/interrupt
   switch algorithms in L3fwd-power example.

Known limitations:
1) It does not work for UIO due to a single interrupt eventfd shared by LSC
   and rx queue interrupt handlers causes a mess. [FIXED]
2) LSC interrupt is not supported by VF driver, so it is by default disabled
   in L3fwd-power now. Feel free to turn in on if you want to support both LSC
   and rx queue interrupts on a PF.

Cunming Liang (13):
  eal/linux: add interrupt vectors support in intr_handle
  eal/linux: add rte_epoll_wait/ctl support
  eal/linux: add API to set rx interrupt event monitor
  eal/linux: fix comments typo on vfio msi
  eal/linux: map eventfd to VFIO MSI-X intr vector
  eal/linux: standalone intr event fd create support
  eal/linux: fix lsc read error in uio_pci_generic
  eal/bsd: dummy for new intr definition
  eal/bsd: fix inappropriate linuxapp referred in bsd
  ethdev: add rx intr enable, disable and ctl functions
  ixgbe: enable rx queue interrupts for both PF and VF
  igb: enable rx queue interrupts for PF
  l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode
    switch

 drivers/net/e1000/igb_ethdev.c                     | 311 ++++++++++--
 drivers/net/ixgbe/ixgbe_ethdev.c                   | 527 ++++++++++++++++++++-
 drivers/net/ixgbe/ixgbe_ethdev.h                   |   4 +
 examples/l3fwd-power/main.c                        | 202 ++++++--
 lib/librte_eal/bsdapp/eal/eal_interrupts.c         |  28 ++
 .../bsdapp/eal/include/exec-env/rte_interrupts.h   |  91 +++-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map      |   5 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 362 ++++++++++++--
 .../linuxapp/eal/include/exec-env/rte_interrupts.h | 218 +++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   8 +
 lib/librte_ether/rte_ethdev.c                      | 109 +++++
 lib/librte_ether/rte_ethdev.h                      | 154 ++++++
 lib/librte_ether/rte_ether_version.map             |   4 +
 13 files changed, 1895 insertions(+), 128 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v13 00/14] Interrupt mode PMD
  2015-07-09 13:58  3%   ` David Marchand
@ 2015-07-17  6:04  0%     ` Liang, Cunming
  0 siblings, 0 replies; 200+ results
From: Liang, Cunming @ 2015-07-17  6:04 UTC (permalink / raw)
  To: David Marchand; +Cc: Stephen Hemminger, dev, Wang, Liang-min


> -----Original Message-----
> From: David Marchand [mailto:david.marchand@6wind.com] 
> Sent: Thursday, July 09, 2015 9:59 PM
> To: Liang, Cunming
> Cc: dev@dpdk.org; Stephen Hemminger; Thomas Monjalon; Zhou, Danny; Wang, Liang-min; Richardson, Bruce; Liu, Yong; Neil Horman
> Subject: Re: [PATCH v13 00/14] Interrupt mode PMD

> On Fri, Jun 19, 2015 at 6:00 AM, Cunming Liang <cunming.liang@intel.com> wrote:
> v13 changes
> - version map cleanup for v2.1
> - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility
>
> Please, this patchset ends with a patch that deals with ABI compatibility while it should do so on a per-patch basis.
> Besides, some patches are introducing stuff that is reworked in other patches without a clear reason.
> 
> Can you rework this to ease review and ensure patch atomicity ?
> 
> Thanks.
> 
> -- 
> David Marchand

Will split it, thanks.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 10:26  7% [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code John McNamara
  2015-07-13 10:42  7% ` Neil Horman
@ 2015-07-16 22:22  4% ` Vlad Zolotarov
  1 sibling, 0 replies; 200+ results
From: Vlad Zolotarov @ 2015-07-16 22:22 UTC (permalink / raw)
  To: John McNamara, dev



On 07/13/15 13:26, John McNamara wrote:
> Fix for ABI breakage introduced in LRO addition. Moves
> lro bitfield to the end of the struct/member.
>
> Fixes: 8eecb3295aed (ixgbe: add LRO support)
>
> Signed-off-by: John McNamara <john.mcnamara@intel.com>
> ---
>   lib/librte_ether/rte_ethdev.h | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 79bde89..1c3ace1 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
>   	uint8_t port_id;           /**< Device [external] port identifier. */
>   	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
>   		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
> -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
>   		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
> -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */

Acked-by: Vlad Zolotarov <vladz@cloudius-systems.com>

>   };
>   
>   /**

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v5 4/4] rte_sched: hide structure of port hierarchy
  @ 2015-07-16 21:34  3% ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2015-07-16 21:34 UTC (permalink / raw)
  To: Cristian Dumitrescu; +Cc: dev

Right now the scheduler hierarchy is encoded as a bitfield
that is visible as part of the ABI. This creates an barrier
limiting future expansion of the hierarchy.

As a transistional step. hide the actual layout of the hierarchy
and mark the exposed structure as deprecated. This will allow for
expansion in later release.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/librte_sched/rte_sched.c           | 54 ++++++++++++++++++++++++++++++++++
 lib/librte_sched/rte_sched.h           | 54 ++++++++++------------------------
 lib/librte_sched/rte_sched_version.map |  9 ++++++
 3 files changed, 79 insertions(+), 38 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index ec565d2..4593af8 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -184,6 +184,21 @@ enum grinder_state {
 	e_GRINDER_READ_MBUF
 };
 
+/*
+ * Path through the scheduler hierarchy used by the scheduler enqueue
+ * operation to identify the destination queue for the current
+ * packet. Stored in the field pkt.hash.sched of struct rte_mbuf of
+ * each packet, typically written by the classification stage and read
+ * by scheduler enqueue.
+ */
+struct __rte_sched_port_hierarchy {
+	uint32_t queue:2;                /**< Queue ID (0 .. 3) */
+	uint32_t traffic_class:2;        /**< Traffic class ID (0 .. 3)*/
+	uint32_t pipe:20;                /**< Pipe ID */
+	uint32_t subport:6;              /**< Subport ID */
+	uint32_t color:2;                /**< Color */
+};
+
 struct rte_sched_grinder {
 	/* Pipe cache */
 	uint16_t pcache_qmask[RTE_SCHED_GRINDER_PCACHE_SIZE];
@@ -910,6 +925,45 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 	return 0;
 }
 
+void
+rte_sched_port_pkt_write(struct rte_mbuf *pkt,
+			 uint32_t subport, uint32_t pipe, uint32_t traffic_class,
+			 uint32_t queue, enum rte_meter_color color)
+{
+	struct __rte_sched_port_hierarchy *sched
+		= (struct __rte_sched_port_hierarchy *) &pkt->hash.sched;
+
+	sched->color = (uint32_t) color;
+	sched->subport = subport;
+	sched->pipe = pipe;
+	sched->traffic_class = traffic_class;
+	sched->queue = queue;
+}
+
+void
+rte_sched_port_pkt_read_tree_path(const struct rte_mbuf *pkt,
+				  uint32_t *subport, uint32_t *pipe,
+				  uint32_t *traffic_class, uint32_t *queue)
+{
+	const struct __rte_sched_port_hierarchy *sched
+		= (const struct __rte_sched_port_hierarchy *) &pkt->hash.sched;
+
+	*subport = sched->subport;
+	*pipe = sched->pipe;
+	*traffic_class = sched->traffic_class;
+	*queue = sched->queue;
+}
+
+
+enum rte_meter_color
+rte_sched_port_pkt_read_color(const struct rte_mbuf *pkt)
+{
+	const struct __rte_sched_port_hierarchy *sched
+		= (const struct __rte_sched_port_hierarchy *) &pkt->hash.sched;
+
+	return (enum rte_meter_color) sched->color;
+}
+
 int
 rte_sched_subport_read_stats(struct rte_sched_port *port,
 	uint32_t subport_id,
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 729f8c8..1ead267 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -195,17 +195,19 @@ struct rte_sched_port_params {
 #endif
 };
 
-/** Path through the scheduler hierarchy used by the scheduler enqueue operation to
-identify the destination queue for the current packet. Stored in the field hash.sched
-of struct rte_mbuf of each packet, typically written by the classification stage and read by
-scheduler enqueue.*/
+/*
+ * Path through scheduler hierarchy
+ *
+ * Note: direct access to internal bitfields is deprecated to allow for future expansion.
+ * Use rte_sched_port_pkt_read/write API instead
+ */
 struct rte_sched_port_hierarchy {
 	uint32_t queue:2;                /**< Queue ID (0 .. 3) */
 	uint32_t traffic_class:2;        /**< Traffic class ID (0 .. 3)*/
 	uint32_t pipe:20;                /**< Pipe ID */
 	uint32_t subport:6;              /**< Subport ID */
 	uint32_t color:2;                /**< Color */
-};
+} __attribute__ ((deprecated));
 
 /*
  * Configuration
@@ -328,11 +330,6 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
 	struct rte_sched_queue_stats *stats,
 	uint16_t *qlen);
 
-/*
- * Run-time
- *
- ***/
-
 /**
  * Scheduler hierarchy path write to packet descriptor. Typically called by the
  * packet classification stage.
@@ -350,18 +347,10 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
  * @param color
  *   Packet color set
  */
-static inline void
+void
 rte_sched_port_pkt_write(struct rte_mbuf *pkt,
-	uint32_t subport, uint32_t pipe, uint32_t traffic_class, uint32_t queue, enum rte_meter_color color)
-{
-	struct rte_sched_port_hierarchy *sched = (struct rte_sched_port_hierarchy *) &pkt->hash.sched;
-
-	sched->color = (uint32_t) color;
-	sched->subport = subport;
-	sched->pipe = pipe;
-	sched->traffic_class = traffic_class;
-	sched->queue = queue;
-}
+			 uint32_t subport, uint32_t pipe, uint32_t traffic_class,
+			 uint32_t queue, enum rte_meter_color color);
 
 /**
  * Scheduler hierarchy path read from packet descriptor (struct rte_mbuf). Typically
@@ -380,24 +369,13 @@ rte_sched_port_pkt_write(struct rte_mbuf *pkt,
  *   Queue ID within pipe traffic class (0 .. 3)
  *
  */
-static inline void
-rte_sched_port_pkt_read_tree_path(struct rte_mbuf *pkt, uint32_t *subport, uint32_t *pipe, uint32_t *traffic_class, uint32_t *queue)
-{
-	struct rte_sched_port_hierarchy *sched = (struct rte_sched_port_hierarchy *) &pkt->hash.sched;
-
-	*subport = sched->subport;
-	*pipe = sched->pipe;
-	*traffic_class = sched->traffic_class;
-	*queue = sched->queue;
-}
-
-static inline enum rte_meter_color
-rte_sched_port_pkt_read_color(struct rte_mbuf *pkt)
-{
-	struct rte_sched_port_hierarchy *sched = (struct rte_sched_port_hierarchy *) &pkt->hash.sched;
+void
+rte_sched_port_pkt_read_tree_path(const struct rte_mbuf *pkt,
+				  uint32_t *subport, uint32_t *pipe,
+				  uint32_t *traffic_class, uint32_t *queue);
 
-	return (enum rte_meter_color) sched->color;
-}
+enum rte_meter_color
+rte_sched_port_pkt_read_color(const struct rte_mbuf *pkt);
 
 /**
  * Hierarchical scheduler port enqueue. Writes up to n_pkts to port scheduler and
diff --git a/lib/librte_sched/rte_sched_version.map b/lib/librte_sched/rte_sched_version.map
index 9f74e8b..3aa159a 100644
--- a/lib/librte_sched/rte_sched_version.map
+++ b/lib/librte_sched/rte_sched_version.map
@@ -20,3 +20,12 @@ DPDK_2.0 {
 
 	local: *;
 };
+
+DPDK_2.1 {
+	global:
+
+	rte_sched_port_pkt_write;
+	rte_sched_port_pkt_read_tree_path;
+	rte_sched_port_pkt_read_color;
+
+} DPDK_2.0;
-- 
2.1.4

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched
  2015-07-16 21:21 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched Stephen Hemminger
  2015-07-16 21:25  4% ` Dumitrescu, Cristian
@ 2015-07-16 21:28  4% ` Neil Horman
  2015-07-23 10:18  4% ` Dumitrescu, Cristian
  2 siblings, 0 replies; 200+ results
From: Neil Horman @ 2015-07-16 21:28 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

On Thu, Jul 16, 2015 at 02:21:39PM -0700, Stephen Hemminger wrote:
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  doc/guides/rel_notes/abi.rst | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
> index 9e98d62..a4d100b 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/abi.rst
> @@ -34,3 +34,12 @@ Deprecation Notices
>    creates a dummy/empty malloc library to fulfill binaries with dynamic linking
>    dependencies on librte_malloc.so. Such dummy library will not be created from
>    release 2.2 so binaries will need to be rebuilt.
> +
> +* librte_sched (rte_sched.h): The scheduler hierarchy structure
> +  (rte_sched_port_hierarchy) will change to allow for a larger number of subport
> +  entries. The number of available traffic_classes and queues may also change.
> +  The mbuf structure element for sched hierarchy will also change from a single
> +  32 bit to a 64 bit structure.
> +
> +* librte_sched (rte_sched.h): The scheduler statistics structure will change
> +  to allow keeping track of RED actions.
> -- 
> 2.1.4
> 
> 
ACK

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched
  2015-07-16 21:21 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched Stephen Hemminger
@ 2015-07-16 21:25  4% ` Dumitrescu, Cristian
  2015-07-16 21:28  4% ` Neil Horman
  2015-07-23 10:18  4% ` Dumitrescu, Cristian
  2 siblings, 0 replies; 200+ results
From: Dumitrescu, Cristian @ 2015-07-16 21:25 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev



> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, July 16, 2015 10:22 PM
> To: Dumitrescu, Cristian
> Cc: dev@dpdk.org; Stephen Hemminger
> Subject: [PATCH] doc: announce ABI change for librte_sched
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  doc/guides/rel_notes/abi.rst | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
> index 9e98d62..a4d100b 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/abi.rst
> @@ -34,3 +34,12 @@ Deprecation Notices
>    creates a dummy/empty malloc library to fulfill binaries with dynamic linking
>    dependencies on librte_malloc.so. Such dummy library will not be created
> from
>    release 2.2 so binaries will need to be rebuilt.
> +
> +* librte_sched (rte_sched.h): The scheduler hierarchy structure
> +  (rte_sched_port_hierarchy) will change to allow for a larger number of
> subport
> +  entries. The number of available traffic_classes and queues may also
> change.
> +  The mbuf structure element for sched hierarchy will also change from a
> single
> +  32 bit to a 64 bit structure.
> +
> +* librte_sched (rte_sched.h): The scheduler statistics structure will change
> +  to allow keeping track of RED actions.
> --
> 2.1.4

Hi Stephen,

Agree with both, how about the new clear flag to the stats read function, shall we add a separate note on this as well?

Thanks,
Cristian

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched
@ 2015-07-16 21:21 14% Stephen Hemminger
  2015-07-16 21:25  4% ` Dumitrescu, Cristian
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Stephen Hemminger @ 2015-07-16 21:21 UTC (permalink / raw)
  To: Cristian Dumitrescu; +Cc: dev

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 doc/guides/rel_notes/abi.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9e98d62..a4d100b 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -34,3 +34,12 @@ Deprecation Notices
   creates a dummy/empty malloc library to fulfill binaries with dynamic linking
   dependencies on librte_malloc.so. Such dummy library will not be created from
   release 2.2 so binaries will need to be rebuilt.
+
+* librte_sched (rte_sched.h): The scheduler hierarchy structure
+  (rte_sched_port_hierarchy) will change to allow for a larger number of subport
+  entries. The number of available traffic_classes and queues may also change.
+  The mbuf structure element for sched hierarchy will also change from a single
+  32 bit to a 64 bit structure.
+
+* librte_sched (rte_sched.h): The scheduler statistics structure will change
+  to allow keeping track of RED actions.
-- 
2.1.4

^ permalink raw reply	[relevance 14%]

* [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
@ 2015-07-16 17:07 14% Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-16 17:07 UTC (permalink / raw)
  To: dev


Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/abi.rst |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9e98d62..194e8c6 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -34,3 +34,8 @@ Deprecation Notices
   creates a dummy/empty malloc library to fulfill binaries with dynamic linking
   dependencies on librte_malloc.so. Such dummy library will not be created from
   release 2.2 so binaries will need to be rebuilt.
+
+* librte_pipeline (rte_pipeline.h): The prototype for the pipeline input port,
+  output port and table action handlers will be updated: the pipeline parameter
+  will be added, the packets mask parameter will be either removed (for input
+  port action handler) or made input-only.
-- 
1.7.4.1

^ permalink raw reply	[relevance 14%]

* [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
@ 2015-07-16 16:59 14% Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-16 16:59 UTC (permalink / raw)
  To: dev


Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/abi.rst |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9e98d62..aa7c036 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -34,3 +34,11 @@ Deprecation Notices
   creates a dummy/empty malloc library to fulfill binaries with dynamic linking
   dependencies on librte_malloc.so. Such dummy library will not be created from
   release 2.2 so binaries will need to be rebuilt.
+
+* librte_table (rte_table_lpm.h): A new parameter to hold the table name (const
+  char *name) will be added to the LPM table parameter structure
+  (struct rte_table_lpm_params).
+
+* librte_table (rte_table_acl.h): Structures rte_table_acl_rule_add_params and
+  rte_table_acl_rule_delete_params will change to store an array of rules as
+  opposed to a single rule.
-- 
1.7.4.1

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
  2015-07-16 15:27 14% [dpdk-dev] [PATCH v2] " Cristian Dumitrescu
@ 2015-07-16 15:51  4% ` Singh, Jasvinder
  2015-07-17  7:56  4% ` Gajdzica, MaciejX T
  2015-07-17  8:08  4% ` Mrzyglod, DanielX T
  2 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-16 15:51 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 4:27 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
> 
> v2 changes:
> -text simplification
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
@ 2015-07-16 15:27 14% Cristian Dumitrescu
  2015-07-16 15:51  4% ` Singh, Jasvinder
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-16 15:27 UTC (permalink / raw)
  To: dev

v2 changes: 
-text simplification

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/abi.rst |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9e98d62..6c064e2 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -34,3 +34,8 @@ Deprecation Notices
   creates a dummy/empty malloc library to fulfill binaries with dynamic linking
   dependencies on librte_malloc.so. Such dummy library will not be created from
   release 2.2 so binaries will need to be rebuilt.
+
+* librte_port (rte_port.h): Macros to access the packet meta-data stored within
+  the packet buffer will be adjusted to cover the packet mbuf structure as well,
+  as currently they are able to access any packet buffer location except the
+  packet mbuf structure.
-- 
1.7.4.1

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
  2015-07-16 12:25  4% ` Thomas Monjalon
@ 2015-07-16 15:09  4%   ` Dumitrescu, Cristian
  0 siblings, 0 replies; 200+ results
From: Dumitrescu, Cristian @ 2015-07-16 15:09 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Thursday, July 16, 2015 1:26 PM
> To: Dumitrescu, Cristian
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
> 
> 2015-07-16 13:19, Cristian Dumitrescu:
> > +* librte_port (rte_port.h): Macros to access the packet meta-data stored
> within
> > +  the packet buffer will be adjusted to cover the packet mbuf structure as
> well,
> > +  as currently they are able to access any packet buffer location except the
> > +  packet mbuf structure. The consequence is that applications currently
> using
> > +  these macros will have to adjust the value of the offset parameter of
> these
> > +  macros by increasing it with sizeof(struc rte_mbuf). The affected macros
> are:
> > +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>_PTR and
> > +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>. In terms of code changes,
> most likely
> > +  only the definition of RTE_MBUF_METADATA_UINT8_PTR macro will be
> changed from
> > +  ``(&((uint8_t *) &(mbuf)[1])[offset])`` to
> > +  ``(&((uint8_t *) (mbuf))[offset])``.
> 
> Cristian,
> General comment: you are too verbose :)
> Specifically on this patch: same comment ;)

No problem, will simplify and resend. Thanks, Thomas.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
  2015-07-16 12:19 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_port Cristian Dumitrescu
                   ` (2 preceding siblings ...)
  2015-07-16 12:30  4% ` Mrzyglod, DanielX T
@ 2015-07-16 12:49  4% ` Singh, Jasvinder
  3 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-16 12:49 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 1:20 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
  2015-07-16 11:36 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile Cristian Dumitrescu
  2015-07-16 11:50  4% ` Gajdzica, MaciejX T
  2015-07-16 12:28  4% ` Mrzyglod, DanielX T
@ 2015-07-16 12:49  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-16 12:49 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 12:37 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
  2015-07-16 11:36 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile Cristian Dumitrescu
  2015-07-16 11:50  4% ` Gajdzica, MaciejX T
@ 2015-07-16 12:28  4% ` Mrzyglod, DanielX T
  2015-07-16 12:49  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-16 12:28 UTC (permalink / raw)
  To: dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 1:37 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---
>  doc/guides/rel_notes/abi.rst |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
> index 126f73e..942f3ea 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/abi.rst
> @@ -29,3 +29,8 @@ Deprecation Notices
>    and several ``PKT_RX_`` flags will be removed, to support unified packet type
>    from release 2.1. Those changes may be enabled in the upcoming release 2.1
>    with CONFIG_RTE_NEXT_ABI.
> +
> +* librte_cfgfile (rte_cfgfile.h): In order to allow for longer names and values,
> +  the value of macros CFG_NAME_LEN and CFG_NAME_VAL will be increased.
> Most
> +  likely, the new values will be 64 and 256, respectively.
> +
> --
> 1.7.4.1

Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com> 

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
  2015-07-16 12:19 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_port Cristian Dumitrescu
  2015-07-16 12:25  4% ` Gajdzica, MaciejX T
  2015-07-16 12:25  4% ` Thomas Monjalon
@ 2015-07-16 12:30  4% ` Mrzyglod, DanielX T
  2015-07-16 12:49  4% ` Singh, Jasvinder
  3 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-16 12:30 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 2:20 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---
>  doc/guides/rel_notes/abi.rst |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
> index 9e98d62..271e08e 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/abi.rst
> @@ -34,3 +34,15 @@ Deprecation Notices
>    creates a dummy/empty malloc library to fulfill binaries with dynamic linking
>    dependencies on librte_malloc.so. Such dummy library will not be created from
>    release 2.2 so binaries will need to be rebuilt.
> +
> +* librte_port (rte_port.h): Macros to access the packet meta-data stored within
> +  the packet buffer will be adjusted to cover the packet mbuf structure as well,
> +  as currently they are able to access any packet buffer location except the
> +  packet mbuf structure. The consequence is that applications currently using
> +  these macros will have to adjust the value of the offset parameter of these
> +  macros by increasing it with sizeof(struc rte_mbuf). The affected macros are:
> +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>_PTR and
> +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>. In terms of code changes, most
> likely
> +  only the definition of RTE_MBUF_METADATA_UINT8_PTR macro will be
> changed from
> +  ``(&((uint8_t *) &(mbuf)[1])[offset])`` to
> +  ``(&((uint8_t *) (mbuf))[offset])``.
> --
> 1.7.4.1

> 1.7.4.1


Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
  2015-07-16 12:19 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_port Cristian Dumitrescu
  2015-07-16 12:25  4% ` Gajdzica, MaciejX T
@ 2015-07-16 12:25  4% ` Thomas Monjalon
  2015-07-16 15:09  4%   ` Dumitrescu, Cristian
  2015-07-16 12:30  4% ` Mrzyglod, DanielX T
  2015-07-16 12:49  4% ` Singh, Jasvinder
  3 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-16 12:25 UTC (permalink / raw)
  To: Cristian Dumitrescu; +Cc: dev

2015-07-16 13:19, Cristian Dumitrescu:
> +* librte_port (rte_port.h): Macros to access the packet meta-data stored within
> +  the packet buffer will be adjusted to cover the packet mbuf structure as well,
> +  as currently they are able to access any packet buffer location except the
> +  packet mbuf structure. The consequence is that applications currently using
> +  these macros will have to adjust the value of the offset parameter of these
> +  macros by increasing it with sizeof(struc rte_mbuf). The affected macros are:
> +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>_PTR and
> +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>. In terms of code changes, most likely
> +  only the definition of RTE_MBUF_METADATA_UINT8_PTR macro will be changed from
> +  ``(&((uint8_t *) &(mbuf)[1])[offset])`` to
> +  ``(&((uint8_t *) (mbuf))[offset])``.

Cristian,
General comment: you are too verbose :)
Specifically on this patch: same comment ;)

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
  2015-07-16 12:19 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_port Cristian Dumitrescu
@ 2015-07-16 12:25  4% ` Gajdzica, MaciejX T
  2015-07-16 12:25  4% ` Thomas Monjalon
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-16 12:25 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 2:20 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
@ 2015-07-16 12:19 14% Cristian Dumitrescu
  2015-07-16 12:25  4% ` Gajdzica, MaciejX T
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-16 12:19 UTC (permalink / raw)
  To: dev


Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/abi.rst |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9e98d62..271e08e 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -34,3 +34,15 @@ Deprecation Notices
   creates a dummy/empty malloc library to fulfill binaries with dynamic linking
   dependencies on librte_malloc.so. Such dummy library will not be created from
   release 2.2 so binaries will need to be rebuilt.
+
+* librte_port (rte_port.h): Macros to access the packet meta-data stored within
+  the packet buffer will be adjusted to cover the packet mbuf structure as well,
+  as currently they are able to access any packet buffer location except the
+  packet mbuf structure. The consequence is that applications currently using
+  these macros will have to adjust the value of the offset parameter of these
+  macros by increasing it with sizeof(struc rte_mbuf). The affected macros are:
+  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>_PTR and
+  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>. In terms of code changes, most likely
+  only the definition of RTE_MBUF_METADATA_UINT8_PTR macro will be changed from
+  ``(&((uint8_t *) &(mbuf)[1])[offset])`` to
+  ``(&((uint8_t *) (mbuf))[offset])``.
-- 
1.7.4.1

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
  2015-07-16 11:36 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile Cristian Dumitrescu
@ 2015-07-16 11:50  4% ` Gajdzica, MaciejX T
  2015-07-16 12:28  4% ` Mrzyglod, DanielX T
  2015-07-16 12:49  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-16 11:50 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 1:37 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
@ 2015-07-16 11:36 14% Cristian Dumitrescu
  2015-07-16 11:50  4% ` Gajdzica, MaciejX T
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-16 11:36 UTC (permalink / raw)
  To: dev


Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/abi.rst |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 126f73e..942f3ea 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -29,3 +29,8 @@ Deprecation Notices
   and several ``PKT_RX_`` flags will be removed, to support unified packet type
   from release 2.1. Those changes may be enabled in the upcoming release 2.1
   with CONFIG_RTE_NEXT_ABI.
+
+* librte_cfgfile (rte_cfgfile.h): In order to allow for longer names and values,
+  the value of macros CFG_NAME_LEN and CFG_NAME_VAL will be increased. Most
+  likely, the new values will be 64 and 256, respectively.
+
-- 
1.7.4.1

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps
  2015-07-15 13:11  3% [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Maryam Tahhan
  2015-07-15 13:11  9% ` [dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in stats structs Maryam Tahhan
@ 2015-07-16  7:54  0% ` Olivier MATZ
  1 sibling, 0 replies; 200+ results
From: Olivier MATZ @ 2015-07-16  7:54 UTC (permalink / raw)
  To: Maryam Tahhan, dev

Hi Maryam,

On 07/15/2015 03:11 PM, Maryam Tahhan wrote:
> This patch set implements xstats_get() and xstats_reset() in dev_ops for
> ixgbe to expose detailed error statistics to DPDK applications. The
> dump_cfg application was extended to demonstrate the usage of
> retrieving statistics for DPDK interfaces and renamed to proc_info
> in order reflect this new functionality. This patch set also removes non
> generic statistics from the statistics strings at the ethdev level and
> marks the relevant registers as depricated in struct rte_eth_stats.
>
> v2:
>   - Fixed patch dependencies.
>   - Broke down patches into smaller logical changes.
>
> v3:
>   - Removes non-generic stats fields in rte_stats_strings and deprecates
>     the fields related to them in struct rte_eth_stats.
>   - Modifies rte_eth_xstats_get() to return generic stats and extended
>     stats.
>
> v4:
>   - Replace count use in the loop in ixgbe_dev_xstats_get() function
>     definition with i.
>   - Breakdown "ixgbe: add NIC specific stats removed from ethdev" into
>     two patches, one that adds the stats and another that extends
>     ierrors to include more error stats.
>   - Remove second call to ixgbe_dev_xstats_get() from
>     rte_eth_xstats_get().
>
> v5:
>   - Added documentation for proc_info.
>   - Fixed proc_info copyright year.
>   - Display queue stats for all devices in proc_info.
>
> v6:
>   - Modified the driver implementation of ixgbe_dev_xstats_get() so that
>     it doesn't worry about the generic stats written by the generic layer.
>
> Maryam Tahhan (9):
>    ixgbe: move stats register reads to a new function
>    ixgbe: add functions to get and reset xstats
>    ethdev: expose extended error stats
>    ethdev: remove HW specific stats in stats structs
>    ixgbe: add NIC specific stats removed from ethdev
>    ixgbe: return more errors in ierrors
>    app: remove dump_cfg
>    app: add a new app proc_info
>    doc: Add documentation for proc_info
>
>   MAINTAINERS                            |   4 +
>   app/Makefile                           |   2 +-
>   app/dump_cfg/Makefile                  |  45 -----
>   app/dump_cfg/main.c                    |  92 ---------
>   app/proc_info/Makefile                 |  45 +++++
>   app/proc_info/main.c                   | 354 +++++++++++++++++++++++++++++++++
>   doc/guides/rel_notes/abi.rst           |  12 ++
>   doc/guides/sample_app_ug/index.rst     |   1 +
>   doc/guides/sample_app_ug/proc_info.rst |  71 +++++++
>   drivers/net/ixgbe/ixgbe_ethdev.c       | 193 ++++++++++++++----
>   lib/librte_ether/rte_ethdev.c          |  40 ++--
>   lib/librte_ether/rte_ethdev.h          |  30 ++-
>   mk/rte.sdktest.mk                      |   4 +-
>   13 files changed, 685 insertions(+), 208 deletions(-)
>   delete mode 100644 app/dump_cfg/Makefile
>   delete mode 100644 app/dump_cfg/main.c
>   create mode 100644 app/proc_info/Makefile
>   create mode 100644 app/proc_info/main.c
>   create mode 100644 doc/guides/sample_app_ug/proc_info.rst
>   mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v10 00/19] unified packet type
  2015-07-15 23:00  0%   ` [dpdk-dev] [PATCH v10 00/19] unified packet type Thomas Monjalon
@ 2015-07-15 23:51  0%     ` Zhang, Helin
  0 siblings, 0 replies; 200+ results
From: Zhang, Helin @ 2015-07-15 23:51 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, July 15, 2015 4:01 PM
> To: Zhang, Helin
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v10 00/19] unified packet type
> 
> 2015-07-10 00:31, Helin Zhang:
> > Currently only 6 bits which are stored in ol_flags are used to
> > indicate the packet types. This is not enough, as some NIC hardware
> > can recognize quite a lot of packet types, e.g i40e hardware can
> > recognize more than 150 packet types. Hiding those packet types hides
> > hardware offload capabilities which could be quite useful for improving
> performance and for end users.
> > So an unified packet types are needed to support all possible PMDs. A
> > 16 bits packet_type in mbuf structure can be changed to 32 bits and
> > used for this purpose. In addition, all packet types stored in ol_flag
> > field should be deleted at all, and 6 bits of ol_flags can be save as the benifit.
> >
> > Initially, 32 bits of packet_type can be divided into several sub
> > fields to indicate different packet type information of a packet. The
> > initial design is to divide those bits into fields for L2 types, L3
> > types, L4 types, tunnel types, inner L2 types, inner L3 types and
> > inner L4 types. All PMDs should translate the offloaded packet types
> > into these 7 fields of information, for user applications.
> >
> > To avoid breaking ABI compatibility, currently all the code changes
> > for unified packet type are disabled at compile time by default. Users
> > can enable it manually by defining the macro of RTE_NEXT_ABI. The code
> > changes will be valid by default in a future release, and the old
> > version will be deleted accordingly, after the ABI change process is done.
> 
> Applied with fixes for cxgbe and mlx4, thanks everyone
> 
> The macro RTE_ETH_IS_TUNNEL_PKT may need to take RTE_PTYPE_INNER_*
> into account.

Thank you so much!
Thanks to all the contributors!

Helin

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v3] doc: announce ABI changes planned for unified packet type
  2015-07-09  0:56  4%   ` Wu, Jingjing
@ 2015-07-15 23:37  4%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-15 23:37 UTC (permalink / raw)
  To: Zhang, Helin; +Cc: dev

> > The significant ABI changes of all shared libraries are planned to support
> > unified packet type which will be taken effect from release 2.2. Here
> > announces that ABI changes in detail.
> > 
> > Signed-off-by: Helin Zhang <helin.zhang@intel.com>
> Acked-by: Jingjing Wu <jingjing.wu@intel.com>

Applied with rewording to take new NEXT_ABI option into account.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v10 00/19] unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (18 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 19/19] mbuf: remove old packet type bit masks Helin Zhang
@ 2015-07-15 23:00  0%   ` Thomas Monjalon
  2015-07-15 23:51  0%     ` Zhang, Helin
  19 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-15 23:00 UTC (permalink / raw)
  To: Helin Zhang; +Cc: dev

2015-07-10 00:31, Helin Zhang:
> Currently only 6 bits which are stored in ol_flags are used to indicate the
> packet types. This is not enough, as some NIC hardware can recognize quite
> a lot of packet types, e.g i40e hardware can recognize more than 150 packet
> types. Hiding those packet types hides hardware offload capabilities which
> could be quite useful for improving performance and for end users.
> So an unified packet types are needed to support all possible PMDs. A 16
> bits packet_type in mbuf structure can be changed to 32 bits and used for
> this purpose. In addition, all packet types stored in ol_flag field should
> be deleted at all, and 6 bits of ol_flags can be save as the benifit.
> 
> Initially, 32 bits of packet_type can be divided into several sub fields to
> indicate different packet type information of a packet. The initial design
> is to divide those bits into fields for L2 types, L3 types, L4 types, tunnel
> types, inner L2 types, inner L3 types and inner L4 types. All PMDs should
> translate the offloaded packet types into these 7 fields of information, for
> user applications.
> 
> To avoid breaking ABI compatibility, currently all the code changes for
> unified packet type are disabled at compile time by default. Users can enable
> it manually by defining the macro of RTE_NEXT_ABI. The code changes will be
> valid by default in a future release, and the old version will be deleted
> accordingly, after the ABI change process is done.

Applied with fixes for cxgbe and mlx4, thanks everyone

The macro RTE_ETH_IS_TUNNEL_PKT may need to take RTE_PTYPE_INNER_* into account.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v10 2/9] eal: memzone allocated by malloc
  2015-07-15 16:32  3%     ` [dpdk-dev] [PATCH v10 0/9] Dynamic memzones Sergio Gonzalez Monroy
@ 2015-07-15 16:32  1%       ` Sergio Gonzalez Monroy
  2015-07-15 16:32 19%       ` [dpdk-dev] [PATCH v10 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
  1 sibling, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15 16:32 UTC (permalink / raw)
  To: dev

In the current memory hierarchy, memsegs are groups of physically
contiguous hugepages, memzones are slices of memsegs and malloc further
slices memzones into smaller memory chunks.

This patch modifies malloc so it partitions memsegs instead of memzones.
Thus memzones would call malloc internally for memory allocation while
maintaining its ABI.

It would be possible to free memzones and therefore any other structure
based on memzones, ie. mempools

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c        | 290 ++++++----------------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   2 +-
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/malloc_elem.c               |  68 +++--
 lib/librte_eal/common/malloc_elem.h               |  14 +-
 lib/librte_eal/common/malloc_heap.c               | 161 ++++++------
 lib/librte_eal/common/malloc_heap.h               |   6 +-
 lib/librte_eal/common/rte_malloc.c                |  10 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c          |   2 +-
 9 files changed, 226 insertions(+), 330 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 9c1da71..31bf6d8 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -50,15 +50,15 @@
 #include <rte_string_fns.h>
 #include <rte_common.h>
 
+#include "malloc_heap.h"
+#include "malloc_elem.h"
 #include "eal_private.h"
 
-/* internal copy of free memory segments */
-static struct rte_memseg *free_memseg = NULL;
-
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
 	const struct rte_mem_config *mcfg;
+	const struct rte_memzone *mz;
 	unsigned i = 0;
 
 	/* get pointer to global configuration */
@@ -68,62 +68,50 @@ memzone_lookup_thread_unsafe(const char *name)
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE && mcfg->memzone[i].addr != NULL; i++) {
-		if (!strncmp(name, mcfg->memzone[i].name, RTE_MEMZONE_NAMESIZE))
+	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
+		mz = &mcfg->memzone[i];
+		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
 			return &mcfg->memzone[i];
 	}
 
 	return NULL;
 }
 
-/*
- * Helper function for memzone_reserve_aligned_thread_unsafe().
- * Calculate address offset from the start of the segment.
- * Align offset in that way that it satisfy istart alignmnet and
- * buffer of the  requested length would not cross specified boundary.
- */
-static inline phys_addr_t
-align_phys_boundary(const struct rte_memseg *ms, size_t len, size_t align,
-	size_t bound)
+/* This function will return the greatest free block if a heap has been
+ * specified. If no heap has been specified, it will return the heap and
+ * length of the greatest free block available in all heaps */
+static size_t
+find_heap_max_free_elem(int *s, unsigned align)
 {
-	phys_addr_t addr_offset, bmask, end, start;
-	size_t step;
-
-	step = RTE_MAX(align, bound);
-	bmask = ~((phys_addr_t)bound - 1);
-
-	/* calculate offset to closest alignment */
-	start = RTE_ALIGN_CEIL(ms->phys_addr, align);
-	addr_offset = start - ms->phys_addr;
+	struct rte_mem_config *mcfg;
+	struct rte_malloc_socket_stats stats;
+	int i, socket = *s;
+	size_t len = 0;
 
-	while (addr_offset + len < ms->len) {
+	/* get pointer to global configuration */
+	mcfg = rte_eal_get_configuration()->mem_config;
 
-		/* check, do we meet boundary condition */
-		end = start + len - (len != 0);
-		if ((start & bmask) == (end & bmask))
-			break;
+	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+		if ((socket != SOCKET_ID_ANY) && (socket != i))
+			continue;
 
-		/* calculate next offset */
-		start = RTE_ALIGN_CEIL(start + 1, step);
-		addr_offset = start - ms->phys_addr;
+		malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+		if (stats.greatest_free_size > len) {
+			len = stats.greatest_free_size;
+			*s = i;
+		}
 	}
 
-	return addr_offset;
+	return (len - MALLOC_ELEM_OVERHEAD - align);
 }
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, uint64_t size_mask, unsigned align,
-		unsigned bound)
+		int socket_id, unsigned flags, unsigned align, unsigned bound)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-	int memseg_idx = -1;
-	uint64_t addr_offset, seg_offset = 0;
 	size_t requested_len;
-	size_t memseg_len = 0;
-	phys_addr_t memseg_physaddr;
-	void *memseg_addr;
+	int socket, i;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -155,7 +143,6 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	if (align < RTE_CACHE_LINE_SIZE)
 		align = RTE_CACHE_LINE_SIZE;
 
-
 	/* align length on cache boundary. Check for overflow before doing so */
 	if (len > SIZE_MAX - RTE_CACHE_LINE_MASK) {
 		rte_errno = EINVAL; /* requested size too big */
@@ -169,108 +156,65 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	requested_len = RTE_MAX((size_t)RTE_CACHE_LINE_SIZE,  len);
 
 	/* check that boundary condition is valid */
-	if (bound != 0 &&
-			(requested_len > bound || !rte_is_power_of_2(bound))) {
+	if (bound != 0 && (requested_len > bound || !rte_is_power_of_2(bound))) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
 
-	/* find the smallest segment matching requirements */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		/* last segment */
-		if (free_memseg[i].addr == NULL)
-			break;
-
-		/* empty segment, skip it */
-		if (free_memseg[i].len == 0)
-			continue;
-
-		/* bad socket ID */
-		if (socket_id != SOCKET_ID_ANY &&
-		    free_memseg[i].socket_id != SOCKET_ID_ANY &&
-		    socket_id != free_memseg[i].socket_id)
-			continue;
-
-		/*
-		 * calculate offset to closest alignment that
-		 * meets boundary conditions.
-		 */
-		addr_offset = align_phys_boundary(free_memseg + i,
-			requested_len, align, bound);
+	if ((socket_id != SOCKET_ID_ANY) && (socket_id >= RTE_MAX_NUMA_NODES)) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
 
-		/* check len */
-		if ((requested_len + addr_offset) > free_memseg[i].len)
-			continue;
+	if (!rte_eal_has_hugepages())
+		socket_id = SOCKET_ID_ANY;
 
-		if ((size_mask & free_memseg[i].hugepage_sz) == 0)
-			continue;
+	if (len == 0) {
+		if (bound != 0)
+			requested_len = bound;
+		else
+			requested_len = find_heap_max_free_elem(&socket_id, align);
+	}
 
-		/* this segment is the best until now */
-		if (memseg_idx == -1) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
-		}
-		/* find the biggest contiguous zone */
-		else if (len == 0) {
-			if (free_memseg[i].len > memseg_len) {
-				memseg_idx = i;
-				memseg_len = free_memseg[i].len;
-				seg_offset = addr_offset;
-			}
-		}
-		/*
-		 * find the smallest (we already checked that current
-		 * zone length is > len
-		 */
-		else if (free_memseg[i].len + align < memseg_len ||
-				(free_memseg[i].len <= memseg_len + align &&
-				addr_offset < seg_offset)) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
+	if (socket_id == SOCKET_ID_ANY)
+		socket = malloc_get_numa_socket();
+	else
+		socket = socket_id;
+
+	/* allocate memory on heap */
+	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
+			requested_len, flags, align, bound);
+
+	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
+		/* try other heaps */
+		for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+			if (socket == i)
+				continue;
+
+			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
+					NULL, requested_len, flags, align, bound);
+			if (mz_addr != NULL)
+				break;
 		}
 	}
 
-	/* no segment found */
-	if (memseg_idx == -1) {
+	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
 		return NULL;
 	}
 
-	/* save aligned physical and virtual addresses */
-	memseg_physaddr = free_memseg[memseg_idx].phys_addr + seg_offset;
-	memseg_addr = RTE_PTR_ADD(free_memseg[memseg_idx].addr,
-			(uintptr_t) seg_offset);
-
-	/* if we are looking for a biggest memzone */
-	if (len == 0) {
-		if (bound == 0)
-			requested_len = memseg_len - seg_offset;
-		else
-			requested_len = RTE_ALIGN_CEIL(memseg_physaddr + 1,
-				bound) - memseg_physaddr;
-	}
-
-	/* set length to correct value */
-	len = (size_t)seg_offset + requested_len;
-
-	/* update our internal state */
-	free_memseg[memseg_idx].len -= len;
-	free_memseg[memseg_idx].phys_addr += len;
-	free_memseg[memseg_idx].addr =
-		(char *)free_memseg[memseg_idx].addr + len;
+	const struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
 	struct rte_memzone *mz = &mcfg->memzone[mcfg->memzone_idx++];
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
-	mz->phys_addr = memseg_physaddr;
-	mz->addr = memseg_addr;
-	mz->len = requested_len;
-	mz->hugepage_sz = free_memseg[memseg_idx].hugepage_sz;
-	mz->socket_id = free_memseg[memseg_idx].socket_id;
+	mz->phys_addr = rte_malloc_virt2phy(mz_addr);
+	mz->addr = mz_addr;
+	mz->len = (requested_len == 0 ? elem->size : requested_len);
+	mz->hugepage_sz = elem->ms->hugepage_sz;
+	mz->socket_id = elem->ms->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = memseg_idx;
+	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -282,26 +226,6 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
-	uint64_t size_mask = 0;
-
-	if (flags & RTE_MEMZONE_256KB)
-		size_mask |= RTE_PGSIZE_256K;
-	if (flags & RTE_MEMZONE_2MB)
-		size_mask |= RTE_PGSIZE_2M;
-	if (flags & RTE_MEMZONE_16MB)
-		size_mask |= RTE_PGSIZE_16M;
-	if (flags & RTE_MEMZONE_256MB)
-		size_mask |= RTE_PGSIZE_256M;
-	if (flags & RTE_MEMZONE_512MB)
-		size_mask |= RTE_PGSIZE_512M;
-	if (flags & RTE_MEMZONE_1GB)
-		size_mask |= RTE_PGSIZE_1G;
-	if (flags & RTE_MEMZONE_4GB)
-		size_mask |= RTE_PGSIZE_4G;
-	if (flags & RTE_MEMZONE_16GB)
-		size_mask |= RTE_PGSIZE_16G;
-	if (!size_mask)
-		size_mask = UINT64_MAX;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -309,18 +233,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, size_mask, align, bound);
-
-	/*
-	 * If we failed to allocate the requested page size, and the
-	 * RTE_MEMZONE_SIZE_HINT_ONLY flag is specified, try allocating
-	 * again.
-	 */
-	if (!mz && rte_errno == ENOMEM && size_mask != UINT64_MAX &&
-	    flags & RTE_MEMZONE_SIZE_HINT_ONLY) {
-		mz = memzone_reserve_aligned_thread_unsafe(
-			name, len, socket_id, UINT64_MAX, align, bound);
-	}
+		name, len, socket_id, flags, align, bound);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -412,45 +325,6 @@ rte_memzone_dump(FILE *f)
 }
 
 /*
- * called by init: modify the free memseg list to have cache-aligned
- * addresses and cache-aligned lengths
- */
-static int
-memseg_sanitize(struct rte_memseg *memseg)
-{
-	unsigned phys_align;
-	unsigned virt_align;
-	unsigned off;
-
-	phys_align = memseg->phys_addr & RTE_CACHE_LINE_MASK;
-	virt_align = (unsigned long)memseg->addr & RTE_CACHE_LINE_MASK;
-
-	/*
-	 * sanity check: phys_addr and addr must have the same
-	 * alignment
-	 */
-	if (phys_align != virt_align)
-		return -1;
-
-	/* memseg is really too small, don't bother with it */
-	if (memseg->len < (2 * RTE_CACHE_LINE_SIZE)) {
-		memseg->len = 0;
-		return 0;
-	}
-
-	/* align start address */
-	off = (RTE_CACHE_LINE_SIZE - phys_align) & RTE_CACHE_LINE_MASK;
-	memseg->phys_addr += off;
-	memseg->addr = (char *)memseg->addr + off;
-	memseg->len -= off;
-
-	/* align end address */
-	memseg->len &= ~((uint64_t)RTE_CACHE_LINE_MASK);
-
-	return 0;
-}
-
-/*
  * Init the memzone subsystem
  */
 int
@@ -458,14 +332,10 @@ rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memseg *memseg;
-	unsigned i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* mirror the runtime memsegs from config */
-	free_memseg = mcfg->free_memseg;
-
 	/* secondary processes don't need to initialise anything */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
@@ -478,33 +348,13 @@ rte_eal_memzone_init(void)
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* fill in uninitialized free_memsegs */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (memseg[i].addr == NULL)
-			break;
-		if (free_memseg[i].addr != NULL)
-			continue;
-		memcpy(&free_memseg[i], &memseg[i], sizeof(struct rte_memseg));
-	}
-
-	/* make all zones cache-aligned */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (free_memseg[i].addr == NULL)
-			break;
-		if (memseg_sanitize(&free_memseg[i]) < 0) {
-			RTE_LOG(ERR, EAL, "%s(): Sanity check failed\n", __func__);
-			rte_rwlock_write_unlock(&mcfg->mlock);
-			return -1;
-		}
-	}
-
 	/* delete all zones */
 	mcfg->memzone_idx = 0;
 	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return 0;
+	return rte_eal_malloc_heap_init();
 }
 
 /* Walk all reserved memory zones */
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 34f5abc..055212a 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -73,7 +73,7 @@ struct rte_mem_config {
 	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
-	/* Runtime Physmem descriptors. */
+	/* Runtime Physmem descriptors - NOT USED */
 	struct rte_memseg free_memseg[RTE_MAX_MEMSEG];
 
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index 716216f..b270356 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -40,7 +40,7 @@
 #include <rte_memory.h>
 
 /* Number of free lists per heap, grouped by size. */
-#define RTE_HEAP_NUM_FREELISTS  5
+#define RTE_HEAP_NUM_FREELISTS  13
 
 /**
  * Structure to hold malloc heap
@@ -48,7 +48,6 @@
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
-	unsigned mz_count;
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index a5e1248..b54ee33 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -37,7 +37,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_launch.h>
 #include <rte_per_lcore.h>
@@ -56,10 +55,10 @@
  */
 void
 malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memzone *mz, size_t size)
+		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
 {
 	elem->heap = heap;
-	elem->mz = mz;
+	elem->ms = ms;
 	elem->prev = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
@@ -70,12 +69,12 @@ malloc_elem_init(struct malloc_elem *elem,
 }
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
 {
-	malloc_elem_init(elem, prev->heap, prev->mz, 0);
+	malloc_elem_init(elem, prev->heap, prev->ms, 0);
 	elem->prev = prev;
 	elem->state = ELEM_BUSY; /* mark busy so its never merged */
 }
@@ -86,12 +85,24 @@ malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
  * fit, return NULL.
  */
 static void *
-elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
+elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	const uintptr_t end_pt = (uintptr_t)elem +
+	const size_t bmask = ~(bound - 1);
+	uintptr_t end_pt = (uintptr_t)elem +
 			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	const uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	const uintptr_t new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+	uintptr_t new_elem_start;
+
+	/* check boundary */
+	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+			return NULL;
+	}
+
+	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
 
 	/* if the new start point is before the exist start, it won't fit */
 	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
@@ -102,9 +113,10 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
  * alignment request from the current element
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
+		size_t bound)
 {
-	return elem_start_pt(elem, size, align) != NULL;
+	return elem_start_pt(elem, size, align, bound) != NULL;
 }
 
 /*
@@ -115,10 +127,10 @@ static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
 	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
-	const unsigned old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
-	const unsigned new_elem_size = elem->size - old_elem_size;
+	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
+	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->mz, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
 	next_elem->prev = split_pt;
 	elem->size = old_elem_size;
@@ -168,8 +180,9 @@ malloc_elem_free_list_index(size_t size)
 void
 malloc_elem_free_list_insert(struct malloc_elem *elem)
 {
-	size_t idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
+	size_t idx;
 
+	idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
 	elem->state = ELEM_FREE;
 	LIST_INSERT_HEAD(&elem->heap->free_head[idx], elem, free_list);
 }
@@ -190,12 +203,26 @@ elem_free_list_remove(struct malloc_elem *elem)
  * is not done here, as it's done there previously.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align);
-	const unsigned old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	const size_t trailer_size = elem->size - old_elem_size - size -
+		MALLOC_ELEM_OVERHEAD;
+
+	elem_free_list_remove(elem);
 
-	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE){
+	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split it, too much free space after elem */
+		struct malloc_elem *new_free_elem =
+				RTE_PTR_ADD(new_elem, size + MALLOC_ELEM_OVERHEAD);
+
+		split_elem(elem, new_free_elem);
+		malloc_elem_free_list_insert(new_free_elem);
+	}
+
+	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* don't split it, pad the element instead */
 		elem->state = ELEM_BUSY;
 		elem->pad = old_elem_size;
@@ -208,8 +235,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 			new_elem->size = elem->size - elem->pad;
 			set_header(new_elem);
 		}
-		/* remove element from free list */
-		elem_free_list_remove(elem);
 
 		return new_elem;
 	}
@@ -219,7 +244,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 	 * Re-insert original element, in case its new size makes it
 	 * belong on a different list.
 	 */
-	elem_free_list_remove(elem);
 	split_elem(elem, new_elem);
 	new_elem->state = ELEM_BUSY;
 	malloc_elem_free_list_insert(elem);
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9790b1a..e05d2ea 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -47,9 +47,9 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memzone */
+	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
-	const struct rte_memzone *mz;
+	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -136,11 +136,11 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memzone *mz,
+		const struct rte_memseg *ms,
 		size_t size);
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem,
@@ -151,14 +151,16 @@ malloc_elem_mkend(struct malloc_elem *elem,
  * of the requested size and with the requested alignment
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
  * is much larger than the data block requested, we split the element in two.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_alloc(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 8861d27..21d8914 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -39,7 +39,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -54,123 +53,125 @@
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
-/* since the memzone size starts with a digit, it will appear unquoted in
- * rte_config.h, so quote it so it can be passed to rte_str_to_size */
-#define MALLOC_MEMZONE_SIZE RTE_STR(RTE_MALLOC_MEMZONE_SIZE)
-
-/*
- * returns the configuration setting for the memzone size as a size_t value
- */
-static inline size_t
-get_malloc_memzone_size(void)
+static unsigned
+check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 {
-	return rte_str_to_size(MALLOC_MEMZONE_SIZE);
+	unsigned check_flag = 0;
+
+	if (!(flags & ~RTE_MEMZONE_SIZE_HINT_ONLY))
+		return 1;
+
+	switch (hugepage_sz) {
+	case RTE_PGSIZE_256K:
+		check_flag = RTE_MEMZONE_256KB;
+		break;
+	case RTE_PGSIZE_2M:
+		check_flag = RTE_MEMZONE_2MB;
+		break;
+	case RTE_PGSIZE_16M:
+		check_flag = RTE_MEMZONE_16MB;
+		break;
+	case RTE_PGSIZE_256M:
+		check_flag = RTE_MEMZONE_256MB;
+		break;
+	case RTE_PGSIZE_512M:
+		check_flag = RTE_MEMZONE_512MB;
+		break;
+	case RTE_PGSIZE_1G:
+		check_flag = RTE_MEMZONE_1GB;
+		break;
+	case RTE_PGSIZE_4G:
+		check_flag = RTE_MEMZONE_4GB;
+		break;
+	case RTE_PGSIZE_16G:
+		check_flag = RTE_MEMZONE_16GB;
+	}
+
+	return (check_flag & flags);
 }
 
 /*
- * reserve an extra memory zone and make it available for use by a particular
- * heap. This reserves the zone and sets a dummy malloc_elem header at the end
+ * Expand the heap with a memseg.
+ * This reserves the zone and sets a dummy malloc_elem header at the end
  * to prevent overflow. The rest of the zone is added to free list as a single
  * large free block
  */
-static int
-malloc_heap_add_memzone(struct malloc_heap *heap, size_t size, unsigned align)
+static void
+malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	const unsigned mz_flags = 0;
-	const size_t block_size = get_malloc_memzone_size();
-	/* ensure the data we want to allocate will fit in the memzone */
-	const size_t min_size = size + align + MALLOC_ELEM_OVERHEAD * 2;
-	const struct rte_memzone *mz = NULL;
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned numa_socket = heap - mcfg->malloc_heaps;
-
-	size_t mz_size = min_size;
-	if (mz_size < block_size)
-		mz_size = block_size;
-
-	char mz_name[RTE_MEMZONE_NAMESIZE];
-	snprintf(mz_name, sizeof(mz_name), "MALLOC_S%u_HEAP_%u",
-		     numa_socket, heap->mz_count++);
-
-	/* try getting a block. if we fail and we don't need as big a block
-	 * as given in the config, we can shrink our request and try again
-	 */
-	do {
-		mz = rte_memzone_reserve(mz_name, mz_size, numa_socket,
-					 mz_flags);
-		if (mz == NULL)
-			mz_size /= 2;
-	} while (mz == NULL && mz_size > min_size);
-	if (mz == NULL)
-		return -1;
-
 	/* allocate the memory block headers, one at end, one at start */
-	struct malloc_elem *start_elem = (struct malloc_elem *)mz->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(mz->addr,
-			mz_size - MALLOC_ELEM_OVERHEAD);
+	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
+	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
+			ms->len - MALLOC_ELEM_OVERHEAD);
 	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
+	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
 
-	const unsigned elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
-	malloc_elem_init(start_elem, heap, mz, elem_size);
+	malloc_elem_init(start_elem, heap, ms, elem_size);
 	malloc_elem_mkend(end_elem, start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
-	/* increase heap total size by size of new memzone */
-	heap->total_size+=mz_size - MALLOC_ELEM_OVERHEAD;
-	return 0;
+	heap->total_size += elem_size;
 }
 
 /*
  * Iterates through the freelist for a heap to find a free element
  * which can store data of the required size and with the requested alignment.
+ * If size is 0, find the biggest available elem.
  * Returns null on failure, or pointer to element on success.
  */
 static struct malloc_elem *
-find_suitable_element(struct malloc_heap *heap, size_t size, unsigned align)
+find_suitable_element(struct malloc_heap *heap, size_t size,
+		unsigned flags, size_t align, size_t bound)
 {
 	size_t idx;
-	struct malloc_elem *elem;
+	struct malloc_elem *elem, *alt_elem = NULL;
 
 	for (idx = malloc_elem_free_list_index(size);
-		idx < RTE_HEAP_NUM_FREELISTS; idx++)
-	{
+			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
-			!!elem; elem = LIST_NEXT(elem, free_list))
-		{
-			if (malloc_elem_can_hold(elem, size, align))
-				return elem;
+				!!elem; elem = LIST_NEXT(elem, free_list)) {
+			if (malloc_elem_can_hold(elem, size, align, bound)) {
+				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+					return elem;
+				if (alt_elem == NULL)
+					alt_elem = elem;
+			}
 		}
 	}
+
+	if ((alt_elem != NULL) && (flags & RTE_MEMZONE_SIZE_HINT_ONLY))
+		return alt_elem;
+
 	return NULL;
 }
 
 /*
- * Main function called by malloc to allocate a block of memory from the
- * heap. It locks the free list, scans it, and adds a new memzone if the
- * scan fails. Once the new memzone is added, it re-scans and should return
+ * Main function to allocate a block of memory from the heap.
+ * It locks the free list, scans it, and adds a new memseg if the
+ * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
 void *
 malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned align)
+		const char *type __attribute__((unused)), size_t size, unsigned flags,
+		size_t align, size_t bound)
 {
+	struct malloc_elem *elem;
+
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
+
 	rte_spinlock_lock(&heap->lock);
-	struct malloc_elem *elem = find_suitable_element(heap, size, align);
-	if (elem == NULL){
-		if ((malloc_heap_add_memzone(heap, size, align)) == 0)
-			elem = find_suitable_element(heap, size, align);
-	}
 
-	if (elem != NULL){
-		elem = malloc_elem_alloc(elem, size, align);
+	elem = find_suitable_element(heap, size, flags, align, bound);
+	if (elem != NULL) {
+		elem = malloc_elem_alloc(elem, size, align, bound);
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
 	rte_spinlock_unlock(&heap->lock);
-	return elem == NULL ? NULL : (void *)(&elem[1]);
 
+	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
 /*
@@ -206,3 +207,21 @@ malloc_heap_get_stats(const struct malloc_heap *heap,
 	socket_stats->alloc_count = heap->alloc_count;
 	return 0;
 }
+
+int
+rte_eal_malloc_heap_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned ms_cnt;
+	struct rte_memseg *ms;
+
+	if (mcfg == NULL)
+		return -1;
+
+	for (ms = &mcfg->memseg[0], ms_cnt = 0;
+			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
+			ms_cnt++, ms++)
+		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
+
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index a47136d..3ccbef0 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -53,15 +53,15 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap, const char *type,
-		size_t size, unsigned align);
+malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
+		unsigned flags, size_t align, size_t bound);
 
 int
 malloc_heap_get_stats(const struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
 int
-rte_eal_heap_memzone_init(void);
+rte_eal_malloc_heap_init(void);
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index c313a57..47deb00 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -39,7 +39,6 @@
 
 #include <rte_memcpy.h>
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_branch_prediction.h>
@@ -77,6 +76,9 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 	if (size == 0 || (align && !rte_is_power_of_2(align)))
 		return NULL;
 
+	if (!rte_eal_has_hugepages())
+		socket_arg = SOCKET_ID_ANY;
+
 	if (socket_arg == SOCKET_ID_ANY)
 		socket = malloc_get_numa_socket();
 	else
@@ -87,7 +89,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 		return NULL;
 
 	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, align == 0 ? 1 : align);
+				size, 0, align == 0 ? 1 : align, 0);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -98,7 +100,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 			continue;
 
 		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, align == 0 ? 1 : align);
+					size, 0, align == 0 ? 1 : align, 0);
 		if (ret != NULL)
 			return ret;
 	}
@@ -256,5 +258,5 @@ rte_malloc_virt2phy(const void *addr)
 	const struct malloc_elem *elem = malloc_elem_from_data(addr);
 	if (elem == NULL)
 		return 0;
-	return elem->mz->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->mz->addr);
+	return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 4fd63bb..80ee78f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1071,7 +1071,7 @@ rte_eal_hugepage_init(void)
 		mcfg->memseg[0].addr = addr;
 		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
 		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = SOCKET_ID_ANY;
+		mcfg->memseg[0].socket_id = 0;
 		return 0;
 	}
 
-- 
1.9.3

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v10 0/9] Dynamic memzones
  2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2015-07-15  8:26  1%     ` [dpdk-dev] [PATCH v9 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
  2015-07-15  8:26 19%     ` [dpdk-dev] [PATCH v9 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
@ 2015-07-15 16:32  3%     ` Sergio Gonzalez Monroy
  2015-07-15 16:32  1%       ` [dpdk-dev] [PATCH v10 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
  2015-07-15 16:32 19%       ` [dpdk-dev] [PATCH v10 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
  2 siblings, 2 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15 16:32 UTC (permalink / raw)
  To: dev

Current implemetation allows reserving/creating memzones but not the opposite
(unreserve/free). This affects mempools and other memzone based objects.

>From my point of view, implementing free functionality for memzones would look
like malloc over memsegs.
Thus, this approach moves malloc inside eal (which in turn removes a circular
dependency), where malloc heaps are composed of memsegs.
We keep both malloc and memzone APIs as they are, but memzones allocate its
memory by calling malloc_heap_alloc.
Some extra functionality is required in malloc to allow for boundary constrained
memory requests.
In summary, currently malloc is based on memzones, and with this approach
memzones are based on malloc.

v10:
 - Convert PNG to SVG
 - Fix issue with --no-huge by forcing SOCKET_ID_ANY
 - Rework some parts of the code

v9:
 - Fix incorrect size_t type that results in 32bits compilation error.

v8:
 - Rebase against current HEAD to factor for changes made by new Tile-Gx arch

v7:
 - Create a separated maintainer section for memory allocation

v6:
 - Fix bad patch for rte_memzone_free

v5:
 - Fix rte_memzone_free
 - Improve rte_memzone_free unit test

v4:
 - Rebase and fix couple of merge issues

v3:
 - Create dummy librte_malloc
 - Add deprecation notice
 - Rework some of the code
 - Doc update
 - checkpatch

v2:
 - New rte_memzone_free
 - Support memzone len = 0
 - Add all available memsegs to malloc heap at init
 - Update memzone/malloc unit tests


v6 Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Sergio Gonzalez Monroy (9):
  eal: move librte_malloc to eal/common
  eal: memzone allocated by malloc
  app/test: update malloc/memzone unit tests
  config: remove CONFIG_RTE_MALLOC_MEMZONE_SIZE
  eal: remove free_memseg and references to it
  eal: new rte_memzone_free
  app/test: rte_memzone_free unit test
  doc: announce ABI change of librte_malloc
  doc: update malloc documentation

 MAINTAINERS                                       |   22 +-
 app/test/test_malloc.c                            |   86 --
 app/test/test_memzone.c                           |  456 ++-------
 config/common_bsdapp                              |    8 +-
 config/common_linuxapp                            |    8 +-
 doc/guides/prog_guide/env_abstraction_layer.rst   |  220 ++++-
 doc/guides/prog_guide/img/malloc_heap.png         |  Bin 81329 -> 0 bytes
 doc/guides/prog_guide/img/malloc_heap.svg         | 1018 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                   |    1 -
 doc/guides/prog_guide/malloc_lib.rst              |  233 -----
 doc/guides/prog_guide/overview.rst                |   11 +-
 doc/guides/rel_notes/abi.rst                      |    6 +-
 drivers/net/af_packet/Makefile                    |    1 -
 drivers/net/bonding/Makefile                      |    1 -
 drivers/net/e1000/Makefile                        |    2 +-
 drivers/net/enic/Makefile                         |    2 +-
 drivers/net/fm10k/Makefile                        |    2 +-
 drivers/net/i40e/Makefile                         |    2 +-
 drivers/net/ixgbe/Makefile                        |    2 +-
 drivers/net/mlx4/Makefile                         |    1 -
 drivers/net/null/Makefile                         |    1 -
 drivers/net/pcap/Makefile                         |    1 -
 drivers/net/virtio/Makefile                       |    2 +-
 drivers/net/vmxnet3/Makefile                      |    2 +-
 drivers/net/xenvirt/Makefile                      |    2 +-
 lib/Makefile                                      |    2 +-
 lib/librte_acl/Makefile                           |    2 +-
 lib/librte_eal/bsdapp/eal/Makefile                |    4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map     |   19 +
 lib/librte_eal/common/Makefile                    |    1 +
 lib/librte_eal/common/eal_common_memzone.c        |  352 +++----
 lib/librte_eal/common/include/rte_eal_memconfig.h |    5 +-
 lib/librte_eal/common/include/rte_malloc.h        |  342 +++++++
 lib/librte_eal/common/include/rte_malloc_heap.h   |    3 +-
 lib/librte_eal/common/include/rte_memzone.h       |   11 +
 lib/librte_eal/common/malloc_elem.c               |  344 +++++++
 lib/librte_eal/common/malloc_elem.h               |  192 ++++
 lib/librte_eal/common/malloc_heap.c               |  227 +++++
 lib/librte_eal/common/malloc_heap.h               |   70 ++
 lib/librte_eal/common/rte_malloc.c                |  262 ++++++
 lib/librte_eal/linuxapp/eal/Makefile              |    4 +-
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c         |   17 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c          |    2 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |   19 +
 lib/librte_hash/Makefile                          |    2 +-
 lib/librte_lpm/Makefile                           |    2 +-
 lib/librte_malloc/Makefile                        |    6 +-
 lib/librte_malloc/malloc_elem.c                   |  320 -------
 lib/librte_malloc/malloc_elem.h                   |  190 ----
 lib/librte_malloc/malloc_heap.c                   |  208 -----
 lib/librte_malloc/malloc_heap.h                   |   70 --
 lib/librte_malloc/rte_malloc.c                    |  228 +----
 lib/librte_malloc/rte_malloc.h                    |  342 -------
 lib/librte_malloc/rte_malloc_version.map          |   16 -
 lib/librte_mempool/Makefile                       |    2 -
 lib/librte_port/Makefile                          |    1 -
 lib/librte_ring/Makefile                          |    3 +-
 lib/librte_table/Makefile                         |    1 -
 58 files changed, 2988 insertions(+), 2371 deletions(-)
 delete mode 100644 doc/guides/prog_guide/img/malloc_heap.png
 create mode 100755 doc/guides/prog_guide/img/malloc_heap.svg
 delete mode 100644 doc/guides/prog_guide/malloc_lib.rst
 create mode 100644 lib/librte_eal/common/include/rte_malloc.h
 create mode 100644 lib/librte_eal/common/malloc_elem.c
 create mode 100644 lib/librte_eal/common/malloc_elem.h
 create mode 100644 lib/librte_eal/common/malloc_heap.c
 create mode 100644 lib/librte_eal/common/malloc_heap.h
 create mode 100644 lib/librte_eal/common/rte_malloc.c
 delete mode 100644 lib/librte_malloc/malloc_elem.c
 delete mode 100644 lib/librte_malloc/malloc_elem.h
 delete mode 100644 lib/librte_malloc/malloc_heap.c
 delete mode 100644 lib/librte_malloc/malloc_heap.h
 delete mode 100644 lib/librte_malloc/rte_malloc.h

-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 8/9] doc: announce ABI change of librte_malloc
  2015-07-15 16:32  3%     ` [dpdk-dev] [PATCH v10 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2015-07-15 16:32  1%       ` [dpdk-dev] [PATCH v10 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
@ 2015-07-15 16:32 19%       ` Sergio Gonzalez Monroy
  1 sibling, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15 16:32 UTC (permalink / raw)
  To: dev

Announce the creation of dummy malloc library for 2.1 and removal of
such library, now integrated in librte_eal, for 2.2 release.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 doc/guides/rel_notes/abi.rst | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 931e785..76e0ae2 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -16,7 +16,6 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
-
 * Significant ABI changes are planned for struct rte_eth_dev to support up to
   1024 queues per port. This change will be in release 2.2.
   There is no backward compatibility planned from release 2.2.
@@ -24,3 +23,8 @@ Deprecation Notices
 
 * The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
   deprecated and will be removed with version 2.2.
+
+* librte_malloc library has been integrated into librte_eal. The 2.1 release
+  creates a dummy/empty malloc library to fulfill binaries with dynamic linking
+  dependencies on librte_malloc.so. Such dummy library will not be created from
+  release 2.2 so binaries will need to be rebuilt.
-- 
1.9.3

^ permalink raw reply	[relevance 19%]

* [dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in stats structs
  2015-07-15 13:11  3% [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Maryam Tahhan
@ 2015-07-15 13:11  9% ` Maryam Tahhan
  2015-07-16  7:54  0% ` [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Olivier MATZ
  1 sibling, 0 replies; 200+ results
From: Maryam Tahhan @ 2015-07-15 13:11 UTC (permalink / raw)
  To: dev

Remove non generic stats in rte_stats_strings and mark the relevant
fields in struct rte_eth_stats as deprecated.

Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
---
 doc/guides/rel_notes/abi.rst  | 12 ++++++++++++
 lib/librte_ether/rte_ethdev.c |  9 ---------
 lib/librte_ether/rte_ethdev.h | 30 ++++++++++++++++++++----------
 3 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 931e785..d5bf625 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -24,3 +24,15 @@ Deprecation Notices
 
 * The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
   deprecated and will be removed with version 2.2.
+
+* The following fields have been deprecated in rte_eth_stats:
+  * uint64_t imissed
+  * uint64_t ibadcrc
+  * uint64_t ibadlen
+  * uint64_t imcasts
+  * uint64_t fdirmatch
+  * uint64_t fdirmiss
+  * uint64_t tx_pause_xon
+  * uint64_t rx_pause_xon
+  * uint64_t tx_pause_xoff
+  * uint64_t rx_pause_xoff
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7689328..c8f0e9a 100755
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -142,17 +142,8 @@ static const struct rte_eth_xstats_name_off rte_stats_strings[] = {
 	{"rx_bytes", offsetof(struct rte_eth_stats, ibytes)},
 	{"tx_bytes", offsetof(struct rte_eth_stats, obytes)},
 	{"tx_errors", offsetof(struct rte_eth_stats, oerrors)},
-	{"rx_missed_errors", offsetof(struct rte_eth_stats, imissed)},
-	{"rx_crc_errors", offsetof(struct rte_eth_stats, ibadcrc)},
-	{"rx_bad_length_errors", offsetof(struct rte_eth_stats, ibadlen)},
 	{"rx_errors", offsetof(struct rte_eth_stats, ierrors)},
 	{"alloc_rx_buff_failed", offsetof(struct rte_eth_stats, rx_nombuf)},
-	{"fdir_match", offsetof(struct rte_eth_stats, fdirmatch)},
-	{"fdir_miss", offsetof(struct rte_eth_stats, fdirmiss)},
-	{"tx_flow_control_xon", offsetof(struct rte_eth_stats, tx_pause_xon)},
-	{"rx_flow_control_xon", offsetof(struct rte_eth_stats, rx_pause_xon)},
-	{"tx_flow_control_xoff", offsetof(struct rte_eth_stats, tx_pause_xoff)},
-	{"rx_flow_control_xoff", offsetof(struct rte_eth_stats, rx_pause_xoff)},
 };
 #define RTE_NB_STATS (sizeof(rte_stats_strings) / sizeof(rte_stats_strings[0]))
 
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index d76bbb3..a862027 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -193,19 +193,29 @@ struct rte_eth_stats {
 	uint64_t opackets;  /**< Total number of successfully transmitted packets.*/
 	uint64_t ibytes;    /**< Total number of successfully received bytes. */
 	uint64_t obytes;    /**< Total number of successfully transmitted bytes. */
-	uint64_t imissed;   /**< Total of RX missed packets (e.g full FIFO). */
-	uint64_t ibadcrc;   /**< Total of RX packets with CRC error. */
-	uint64_t ibadlen;   /**< Total of RX packets with bad length. */
+	/**< Deprecated; Total of RX missed packets (e.g full FIFO). */
+	uint64_t imissed;
+	/**< Deprecated; Total of RX packets with CRC error. */
+	uint64_t ibadcrc;
+	/**< Deprecated; Total of RX packets with bad length. */
+	uint64_t ibadlen;
 	uint64_t ierrors;   /**< Total number of erroneous received packets. */
 	uint64_t oerrors;   /**< Total number of failed transmitted packets. */
-	uint64_t imcasts;   /**< Total number of multicast received packets. */
+	uint64_t imcasts;
+	/**< Deprecated; Total number of multicast received packets. */
 	uint64_t rx_nombuf; /**< Total number of RX mbuf allocation failures. */
-	uint64_t fdirmatch; /**< Total number of RX packets matching a filter. */
-	uint64_t fdirmiss;  /**< Total number of RX packets not matching any filter. */
-	uint64_t tx_pause_xon;  /**< Total nb. of XON pause frame sent. */
-	uint64_t rx_pause_xon;  /**< Total nb. of XON pause frame received. */
-	uint64_t tx_pause_xoff; /**< Total nb. of XOFF pause frame sent. */
-	uint64_t rx_pause_xoff; /**< Total nb. of XOFF pause frame received. */
+	uint64_t fdirmatch;
+	/**< Deprecated; Total number of RX packets matching a filter. */
+	uint64_t fdirmiss;
+	/**< Deprecated; Total number of RX packets not matching any filter. */
+	uint64_t tx_pause_xon;
+	 /**< Deprecated; Total nb. of XON pause frame sent. */
+	uint64_t rx_pause_xon;
+	/**< Deprecated; Total nb. of XON pause frame received. */
+	uint64_t tx_pause_xoff;
+	/**< Deprecated; Total nb. of XOFF pause frame sent. */
+	uint64_t rx_pause_xoff;
+	/**< Deprecated; Total nb. of XOFF pause frame received. */
 	uint64_t q_ipackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
 	/**< Total number of queue RX packets. */
 	uint64_t q_opackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
-- 
2.4.3

^ permalink raw reply	[relevance 9%]

* [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps
@ 2015-07-15 13:11  3% Maryam Tahhan
  2015-07-15 13:11  9% ` [dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in stats structs Maryam Tahhan
  2015-07-16  7:54  0% ` [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Olivier MATZ
  0 siblings, 2 replies; 200+ results
From: Maryam Tahhan @ 2015-07-15 13:11 UTC (permalink / raw)
  To: dev

This patch set implements xstats_get() and xstats_reset() in dev_ops for
ixgbe to expose detailed error statistics to DPDK applications. The
dump_cfg application was extended to demonstrate the usage of
retrieving statistics for DPDK interfaces and renamed to proc_info
in order reflect this new functionality. This patch set also removes non
generic statistics from the statistics strings at the ethdev level and
marks the relevant registers as depricated in struct rte_eth_stats.

v2:
 - Fixed patch dependencies.
 - Broke down patches into smaller logical changes.

v3:
 - Removes non-generic stats fields in rte_stats_strings and deprecates
   the fields related to them in struct rte_eth_stats.
 - Modifies rte_eth_xstats_get() to return generic stats and extended
   stats.
 
v4:
 - Replace count use in the loop in ixgbe_dev_xstats_get() function
   definition with i.
 - Breakdown "ixgbe: add NIC specific stats removed from ethdev" into
   two patches, one that adds the stats and another that extends
   ierrors to include more error stats.
 - Remove second call to ixgbe_dev_xstats_get() from
   rte_eth_xstats_get().

v5:
 - Added documentation for proc_info.
 - Fixed proc_info copyright year.
 - Display queue stats for all devices in proc_info.

v6:
 - Modified the driver implementation of ixgbe_dev_xstats_get() so that
   it doesn't worry about the generic stats written by the generic layer.

Maryam Tahhan (9):
  ixgbe: move stats register reads to a new function
  ixgbe: add functions to get and reset xstats
  ethdev: expose extended error stats
  ethdev: remove HW specific stats in stats structs
  ixgbe: add NIC specific stats removed from ethdev
  ixgbe: return more errors in ierrors
  app: remove dump_cfg
  app: add a new app proc_info
  doc: Add documentation for proc_info

 MAINTAINERS                            |   4 +
 app/Makefile                           |   2 +-
 app/dump_cfg/Makefile                  |  45 -----
 app/dump_cfg/main.c                    |  92 ---------
 app/proc_info/Makefile                 |  45 +++++
 app/proc_info/main.c                   | 354 +++++++++++++++++++++++++++++++++
 doc/guides/rel_notes/abi.rst           |  12 ++
 doc/guides/sample_app_ug/index.rst     |   1 +
 doc/guides/sample_app_ug/proc_info.rst |  71 +++++++
 drivers/net/ixgbe/ixgbe_ethdev.c       | 193 ++++++++++++++----
 lib/librte_ether/rte_ethdev.c          |  40 ++--
 lib/librte_ether/rte_ethdev.h          |  30 ++-
 mk/rte.sdktest.mk                      |   4 +-
 13 files changed, 685 insertions(+), 208 deletions(-)
 delete mode 100644 app/dump_cfg/Makefile
 delete mode 100644 app/dump_cfg/main.c
 create mode 100644 app/proc_info/Makefile
 create mode 100644 app/proc_info/main.c
 create mode 100644 doc/guides/sample_app_ug/proc_info.rst
 mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c

-- 
2.4.3

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types Helin Zhang
@ 2015-07-15 10:19  0%     ` Olivier MATZ
  0 siblings, 0 replies; 200+ results
From: Olivier MATZ @ 2015-07-15 10:19 UTC (permalink / raw)
  To: Helin Zhang, dev

On 07/09/2015 06:31 PM, Helin Zhang wrote:
> As there are only 6 bit flags in ol_flags for indicating packet
> types, which is not enough to describe all the possible packet
> types hardware can recognize. For example, i40e hardware can
> recognize more than 150 packet types. Unified packet type is
> composed of L2 type, L3 type, L4 type, tunnel type, inner L2 type,
> inner L3 type and inner L4 type fields, and can be stored in
> 'struct rte_mbuf' of 32 bits field 'packet_type'.
> To avoid breaking ABI compatibility, all the changes would be
> enabled by RTE_NEXT_ABI, which is disabled by default.
>
> Signed-off-by: Helin Zhang <helin.zhang@intel.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v9 8/9] doc: announce ABI change of librte_malloc
  2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2015-07-15  8:26  1%     ` [dpdk-dev] [PATCH v9 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
@ 2015-07-15  8:26 19%     ` Sergio Gonzalez Monroy
  2015-07-15 16:32  3%     ` [dpdk-dev] [PATCH v10 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2 siblings, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15  8:26 UTC (permalink / raw)
  To: dev

Announce the creation of dummy malloc library for 2.1 and removal of
such library, now integrated in librte_eal, for 2.2 release.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 doc/guides/rel_notes/abi.rst | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 931e785..76e0ae2 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -16,7 +16,6 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
-
 * Significant ABI changes are planned for struct rte_eth_dev to support up to
   1024 queues per port. This change will be in release 2.2.
   There is no backward compatibility planned from release 2.2.
@@ -24,3 +23,8 @@ Deprecation Notices
 
 * The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
   deprecated and will be removed with version 2.2.
+
+* librte_malloc library has been integrated into librte_eal. The 2.1 release
+  creates a dummy/empty malloc library to fulfill binaries with dynamic linking
+  dependencies on librte_malloc.so. Such dummy library will not be created from
+  release 2.2 so binaries will need to be rebuilt.
-- 
1.9.3

^ permalink raw reply	[relevance 19%]

* [dpdk-dev] [PATCH v9 2/9] eal: memzone allocated by malloc
  2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
@ 2015-07-15  8:26  1%     ` Sergio Gonzalez Monroy
  2015-07-15  8:26 19%     ` [dpdk-dev] [PATCH v9 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
  2015-07-15 16:32  3%     ` [dpdk-dev] [PATCH v10 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2 siblings, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15  8:26 UTC (permalink / raw)
  To: dev

In the current memory hierarchy, memsegs are groups of physically
contiguous hugepages, memzones are slices of memsegs and malloc further
slices memzones into smaller memory chunks.

This patch modifies malloc so it partitions memsegs instead of memzones.
Thus memzones would call malloc internally for memory allocation while
maintaining its ABI.

It would be possible to free memzones and therefore any other structure
based on memzones, ie. mempools

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c        | 289 +++++-----------------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   2 +-
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/malloc_elem.c               |  68 +++--
 lib/librte_eal/common/malloc_elem.h               |  14 +-
 lib/librte_eal/common/malloc_heap.c               | 161 ++++++------
 lib/librte_eal/common/malloc_heap.h               |   6 +-
 lib/librte_eal/common/rte_malloc.c                |   7 +-
 8 files changed, 220 insertions(+), 330 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 9c1da71..fd7e73f 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -50,15 +50,15 @@
 #include <rte_string_fns.h>
 #include <rte_common.h>
 
+#include "malloc_heap.h"
+#include "malloc_elem.h"
 #include "eal_private.h"
 
-/* internal copy of free memory segments */
-static struct rte_memseg *free_memseg = NULL;
-
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
 	const struct rte_mem_config *mcfg;
+	const struct rte_memzone *mz;
 	unsigned i = 0;
 
 	/* get pointer to global configuration */
@@ -68,62 +68,62 @@ memzone_lookup_thread_unsafe(const char *name)
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE && mcfg->memzone[i].addr != NULL; i++) {
-		if (!strncmp(name, mcfg->memzone[i].name, RTE_MEMZONE_NAMESIZE))
+	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
+		mz = &mcfg->memzone[i];
+		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
 			return &mcfg->memzone[i];
 	}
 
 	return NULL;
 }
 
-/*
- * Helper function for memzone_reserve_aligned_thread_unsafe().
- * Calculate address offset from the start of the segment.
- * Align offset in that way that it satisfy istart alignmnet and
- * buffer of the  requested length would not cross specified boundary.
- */
-static inline phys_addr_t
-align_phys_boundary(const struct rte_memseg *ms, size_t len, size_t align,
-	size_t bound)
+/* Find the heap with the greatest free block size */
+static void
+find_heap_max_free_elem(int *s, size_t *len, unsigned align)
 {
-	phys_addr_t addr_offset, bmask, end, start;
-	size_t step;
+	struct rte_mem_config *mcfg;
+	struct rte_malloc_socket_stats stats;
+	unsigned i;
 
-	step = RTE_MAX(align, bound);
-	bmask = ~((phys_addr_t)bound - 1);
+	/* get pointer to global configuration */
+	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* calculate offset to closest alignment */
-	start = RTE_ALIGN_CEIL(ms->phys_addr, align);
-	addr_offset = start - ms->phys_addr;
+	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+		malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+		if (stats.greatest_free_size > *len) {
+			*len = stats.greatest_free_size;
+			*s = i;
+		}
+	}
+	*len -= (MALLOC_ELEM_OVERHEAD + align);
+}
 
-	while (addr_offset + len < ms->len) {
+/* Find a heap that can allocate the requested size */
+static void
+find_heap_suitable(int *s, size_t len, unsigned align)
+{
+	struct rte_mem_config *mcfg;
+	struct rte_malloc_socket_stats stats;
+	unsigned i;
 
-		/* check, do we meet boundary condition */
-		end = start + len - (len != 0);
-		if ((start & bmask) == (end & bmask))
-			break;
+	/* get pointer to global configuration */
+	mcfg = rte_eal_get_configuration()->mem_config;
 
-		/* calculate next offset */
-		start = RTE_ALIGN_CEIL(start + 1, step);
-		addr_offset = start - ms->phys_addr;
+	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+		malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+		if (stats.greatest_free_size >= len + MALLOC_ELEM_OVERHEAD + align) {
+			*s = i;
+			break;
+		}
 	}
-
-	return addr_offset;
 }
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, uint64_t size_mask, unsigned align,
-		unsigned bound)
+		int socket_id, unsigned flags, unsigned align, unsigned bound)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-	int memseg_idx = -1;
-	uint64_t addr_offset, seg_offset = 0;
 	size_t requested_len;
-	size_t memseg_len = 0;
-	phys_addr_t memseg_physaddr;
-	void *memseg_addr;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -155,7 +155,6 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	if (align < RTE_CACHE_LINE_SIZE)
 		align = RTE_CACHE_LINE_SIZE;
 
-
 	/* align length on cache boundary. Check for overflow before doing so */
 	if (len > SIZE_MAX - RTE_CACHE_LINE_MASK) {
 		rte_errno = EINVAL; /* requested size too big */
@@ -169,108 +168,50 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	requested_len = RTE_MAX((size_t)RTE_CACHE_LINE_SIZE,  len);
 
 	/* check that boundary condition is valid */
-	if (bound != 0 &&
-			(requested_len > bound || !rte_is_power_of_2(bound))) {
+	if (bound != 0 && (requested_len > bound || !rte_is_power_of_2(bound))) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
 
-	/* find the smallest segment matching requirements */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		/* last segment */
-		if (free_memseg[i].addr == NULL)
-			break;
+	if (len == 0) {
+		if (bound != 0)
+			requested_len = bound;
+		else
+			requested_len = 0;
+	}
 
-		/* empty segment, skip it */
-		if (free_memseg[i].len == 0)
-			continue;
-
-		/* bad socket ID */
-		if (socket_id != SOCKET_ID_ANY &&
-		    free_memseg[i].socket_id != SOCKET_ID_ANY &&
-		    socket_id != free_memseg[i].socket_id)
-			continue;
-
-		/*
-		 * calculate offset to closest alignment that
-		 * meets boundary conditions.
-		 */
-		addr_offset = align_phys_boundary(free_memseg + i,
-			requested_len, align, bound);
-
-		/* check len */
-		if ((requested_len + addr_offset) > free_memseg[i].len)
-			continue;
-
-		if ((size_mask & free_memseg[i].hugepage_sz) == 0)
-			continue;
-
-		/* this segment is the best until now */
-		if (memseg_idx == -1) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
-		}
-		/* find the biggest contiguous zone */
-		else if (len == 0) {
-			if (free_memseg[i].len > memseg_len) {
-				memseg_idx = i;
-				memseg_len = free_memseg[i].len;
-				seg_offset = addr_offset;
-			}
-		}
-		/*
-		 * find the smallest (we already checked that current
-		 * zone length is > len
-		 */
-		else if (free_memseg[i].len + align < memseg_len ||
-				(free_memseg[i].len <= memseg_len + align &&
-				addr_offset < seg_offset)) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
+	if (socket_id == SOCKET_ID_ANY) {
+		if (requested_len == 0)
+			find_heap_max_free_elem(&socket_id, &requested_len, align);
+		else
+			find_heap_suitable(&socket_id, requested_len, align);
+
+		if (socket_id == SOCKET_ID_ANY) {
+			rte_errno = ENOMEM;
+			return NULL;
 		}
 	}
 
-	/* no segment found */
-	if (memseg_idx == -1) {
+	/* allocate memory on heap */
+	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket_id], NULL,
+			requested_len, flags, align, bound);
+	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
 		return NULL;
 	}
 
-	/* save aligned physical and virtual addresses */
-	memseg_physaddr = free_memseg[memseg_idx].phys_addr + seg_offset;
-	memseg_addr = RTE_PTR_ADD(free_memseg[memseg_idx].addr,
-			(uintptr_t) seg_offset);
-
-	/* if we are looking for a biggest memzone */
-	if (len == 0) {
-		if (bound == 0)
-			requested_len = memseg_len - seg_offset;
-		else
-			requested_len = RTE_ALIGN_CEIL(memseg_physaddr + 1,
-				bound) - memseg_physaddr;
-	}
-
-	/* set length to correct value */
-	len = (size_t)seg_offset + requested_len;
-
-	/* update our internal state */
-	free_memseg[memseg_idx].len -= len;
-	free_memseg[memseg_idx].phys_addr += len;
-	free_memseg[memseg_idx].addr =
-		(char *)free_memseg[memseg_idx].addr + len;
+	const struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
 	struct rte_memzone *mz = &mcfg->memzone[mcfg->memzone_idx++];
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
-	mz->phys_addr = memseg_physaddr;
-	mz->addr = memseg_addr;
-	mz->len = requested_len;
-	mz->hugepage_sz = free_memseg[memseg_idx].hugepage_sz;
-	mz->socket_id = free_memseg[memseg_idx].socket_id;
+	mz->phys_addr = rte_malloc_virt2phy(mz_addr);
+	mz->addr = mz_addr;
+	mz->len = (requested_len == 0 ? elem->size : requested_len);
+	mz->hugepage_sz = elem->ms->hugepage_sz;
+	mz->socket_id = elem->ms->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = memseg_idx;
+	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -282,26 +223,6 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
-	uint64_t size_mask = 0;
-
-	if (flags & RTE_MEMZONE_256KB)
-		size_mask |= RTE_PGSIZE_256K;
-	if (flags & RTE_MEMZONE_2MB)
-		size_mask |= RTE_PGSIZE_2M;
-	if (flags & RTE_MEMZONE_16MB)
-		size_mask |= RTE_PGSIZE_16M;
-	if (flags & RTE_MEMZONE_256MB)
-		size_mask |= RTE_PGSIZE_256M;
-	if (flags & RTE_MEMZONE_512MB)
-		size_mask |= RTE_PGSIZE_512M;
-	if (flags & RTE_MEMZONE_1GB)
-		size_mask |= RTE_PGSIZE_1G;
-	if (flags & RTE_MEMZONE_4GB)
-		size_mask |= RTE_PGSIZE_4G;
-	if (flags & RTE_MEMZONE_16GB)
-		size_mask |= RTE_PGSIZE_16G;
-	if (!size_mask)
-		size_mask = UINT64_MAX;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -309,18 +230,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, size_mask, align, bound);
-
-	/*
-	 * If we failed to allocate the requested page size, and the
-	 * RTE_MEMZONE_SIZE_HINT_ONLY flag is specified, try allocating
-	 * again.
-	 */
-	if (!mz && rte_errno == ENOMEM && size_mask != UINT64_MAX &&
-	    flags & RTE_MEMZONE_SIZE_HINT_ONLY) {
-		mz = memzone_reserve_aligned_thread_unsafe(
-			name, len, socket_id, UINT64_MAX, align, bound);
-	}
+		name, len, socket_id, flags, align, bound);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -412,45 +322,6 @@ rte_memzone_dump(FILE *f)
 }
 
 /*
- * called by init: modify the free memseg list to have cache-aligned
- * addresses and cache-aligned lengths
- */
-static int
-memseg_sanitize(struct rte_memseg *memseg)
-{
-	unsigned phys_align;
-	unsigned virt_align;
-	unsigned off;
-
-	phys_align = memseg->phys_addr & RTE_CACHE_LINE_MASK;
-	virt_align = (unsigned long)memseg->addr & RTE_CACHE_LINE_MASK;
-
-	/*
-	 * sanity check: phys_addr and addr must have the same
-	 * alignment
-	 */
-	if (phys_align != virt_align)
-		return -1;
-
-	/* memseg is really too small, don't bother with it */
-	if (memseg->len < (2 * RTE_CACHE_LINE_SIZE)) {
-		memseg->len = 0;
-		return 0;
-	}
-
-	/* align start address */
-	off = (RTE_CACHE_LINE_SIZE - phys_align) & RTE_CACHE_LINE_MASK;
-	memseg->phys_addr += off;
-	memseg->addr = (char *)memseg->addr + off;
-	memseg->len -= off;
-
-	/* align end address */
-	memseg->len &= ~((uint64_t)RTE_CACHE_LINE_MASK);
-
-	return 0;
-}
-
-/*
  * Init the memzone subsystem
  */
 int
@@ -458,14 +329,10 @@ rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memseg *memseg;
-	unsigned i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* mirror the runtime memsegs from config */
-	free_memseg = mcfg->free_memseg;
-
 	/* secondary processes don't need to initialise anything */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
@@ -478,33 +345,13 @@ rte_eal_memzone_init(void)
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* fill in uninitialized free_memsegs */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (memseg[i].addr == NULL)
-			break;
-		if (free_memseg[i].addr != NULL)
-			continue;
-		memcpy(&free_memseg[i], &memseg[i], sizeof(struct rte_memseg));
-	}
-
-	/* make all zones cache-aligned */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (free_memseg[i].addr == NULL)
-			break;
-		if (memseg_sanitize(&free_memseg[i]) < 0) {
-			RTE_LOG(ERR, EAL, "%s(): Sanity check failed\n", __func__);
-			rte_rwlock_write_unlock(&mcfg->mlock);
-			return -1;
-		}
-	}
-
 	/* delete all zones */
 	mcfg->memzone_idx = 0;
 	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return 0;
+	return rte_eal_malloc_heap_init();
 }
 
 /* Walk all reserved memory zones */
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 34f5abc..055212a 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -73,7 +73,7 @@ struct rte_mem_config {
 	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
-	/* Runtime Physmem descriptors. */
+	/* Runtime Physmem descriptors - NOT USED */
 	struct rte_memseg free_memseg[RTE_MAX_MEMSEG];
 
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index 716216f..b270356 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -40,7 +40,7 @@
 #include <rte_memory.h>
 
 /* Number of free lists per heap, grouped by size. */
-#define RTE_HEAP_NUM_FREELISTS  5
+#define RTE_HEAP_NUM_FREELISTS  13
 
 /**
  * Structure to hold malloc heap
@@ -48,7 +48,6 @@
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
-	unsigned mz_count;
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index a5e1248..b54ee33 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -37,7 +37,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_launch.h>
 #include <rte_per_lcore.h>
@@ -56,10 +55,10 @@
  */
 void
 malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memzone *mz, size_t size)
+		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
 {
 	elem->heap = heap;
-	elem->mz = mz;
+	elem->ms = ms;
 	elem->prev = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
@@ -70,12 +69,12 @@ malloc_elem_init(struct malloc_elem *elem,
 }
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
 {
-	malloc_elem_init(elem, prev->heap, prev->mz, 0);
+	malloc_elem_init(elem, prev->heap, prev->ms, 0);
 	elem->prev = prev;
 	elem->state = ELEM_BUSY; /* mark busy so its never merged */
 }
@@ -86,12 +85,24 @@ malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
  * fit, return NULL.
  */
 static void *
-elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
+elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	const uintptr_t end_pt = (uintptr_t)elem +
+	const size_t bmask = ~(bound - 1);
+	uintptr_t end_pt = (uintptr_t)elem +
 			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	const uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	const uintptr_t new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+	uintptr_t new_elem_start;
+
+	/* check boundary */
+	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+			return NULL;
+	}
+
+	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
 
 	/* if the new start point is before the exist start, it won't fit */
 	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
@@ -102,9 +113,10 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
  * alignment request from the current element
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
+		size_t bound)
 {
-	return elem_start_pt(elem, size, align) != NULL;
+	return elem_start_pt(elem, size, align, bound) != NULL;
 }
 
 /*
@@ -115,10 +127,10 @@ static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
 	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
-	const unsigned old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
-	const unsigned new_elem_size = elem->size - old_elem_size;
+	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
+	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->mz, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
 	next_elem->prev = split_pt;
 	elem->size = old_elem_size;
@@ -168,8 +180,9 @@ malloc_elem_free_list_index(size_t size)
 void
 malloc_elem_free_list_insert(struct malloc_elem *elem)
 {
-	size_t idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
+	size_t idx;
 
+	idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
 	elem->state = ELEM_FREE;
 	LIST_INSERT_HEAD(&elem->heap->free_head[idx], elem, free_list);
 }
@@ -190,12 +203,26 @@ elem_free_list_remove(struct malloc_elem *elem)
  * is not done here, as it's done there previously.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align);
-	const unsigned old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	const size_t trailer_size = elem->size - old_elem_size - size -
+		MALLOC_ELEM_OVERHEAD;
+
+	elem_free_list_remove(elem);
 
-	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE){
+	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split it, too much free space after elem */
+		struct malloc_elem *new_free_elem =
+				RTE_PTR_ADD(new_elem, size + MALLOC_ELEM_OVERHEAD);
+
+		split_elem(elem, new_free_elem);
+		malloc_elem_free_list_insert(new_free_elem);
+	}
+
+	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* don't split it, pad the element instead */
 		elem->state = ELEM_BUSY;
 		elem->pad = old_elem_size;
@@ -208,8 +235,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 			new_elem->size = elem->size - elem->pad;
 			set_header(new_elem);
 		}
-		/* remove element from free list */
-		elem_free_list_remove(elem);
 
 		return new_elem;
 	}
@@ -219,7 +244,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 	 * Re-insert original element, in case its new size makes it
 	 * belong on a different list.
 	 */
-	elem_free_list_remove(elem);
 	split_elem(elem, new_elem);
 	new_elem->state = ELEM_BUSY;
 	malloc_elem_free_list_insert(elem);
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9790b1a..e05d2ea 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -47,9 +47,9 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memzone */
+	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
-	const struct rte_memzone *mz;
+	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -136,11 +136,11 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memzone *mz,
+		const struct rte_memseg *ms,
 		size_t size);
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem,
@@ -151,14 +151,16 @@ malloc_elem_mkend(struct malloc_elem *elem,
  * of the requested size and with the requested alignment
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
  * is much larger than the data block requested, we split the element in two.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_alloc(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 8861d27..21d8914 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -39,7 +39,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -54,123 +53,125 @@
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
-/* since the memzone size starts with a digit, it will appear unquoted in
- * rte_config.h, so quote it so it can be passed to rte_str_to_size */
-#define MALLOC_MEMZONE_SIZE RTE_STR(RTE_MALLOC_MEMZONE_SIZE)
-
-/*
- * returns the configuration setting for the memzone size as a size_t value
- */
-static inline size_t
-get_malloc_memzone_size(void)
+static unsigned
+check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 {
-	return rte_str_to_size(MALLOC_MEMZONE_SIZE);
+	unsigned check_flag = 0;
+
+	if (!(flags & ~RTE_MEMZONE_SIZE_HINT_ONLY))
+		return 1;
+
+	switch (hugepage_sz) {
+	case RTE_PGSIZE_256K:
+		check_flag = RTE_MEMZONE_256KB;
+		break;
+	case RTE_PGSIZE_2M:
+		check_flag = RTE_MEMZONE_2MB;
+		break;
+	case RTE_PGSIZE_16M:
+		check_flag = RTE_MEMZONE_16MB;
+		break;
+	case RTE_PGSIZE_256M:
+		check_flag = RTE_MEMZONE_256MB;
+		break;
+	case RTE_PGSIZE_512M:
+		check_flag = RTE_MEMZONE_512MB;
+		break;
+	case RTE_PGSIZE_1G:
+		check_flag = RTE_MEMZONE_1GB;
+		break;
+	case RTE_PGSIZE_4G:
+		check_flag = RTE_MEMZONE_4GB;
+		break;
+	case RTE_PGSIZE_16G:
+		check_flag = RTE_MEMZONE_16GB;
+	}
+
+	return (check_flag & flags);
 }
 
 /*
- * reserve an extra memory zone and make it available for use by a particular
- * heap. This reserves the zone and sets a dummy malloc_elem header at the end
+ * Expand the heap with a memseg.
+ * This reserves the zone and sets a dummy malloc_elem header at the end
  * to prevent overflow. The rest of the zone is added to free list as a single
  * large free block
  */
-static int
-malloc_heap_add_memzone(struct malloc_heap *heap, size_t size, unsigned align)
+static void
+malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	const unsigned mz_flags = 0;
-	const size_t block_size = get_malloc_memzone_size();
-	/* ensure the data we want to allocate will fit in the memzone */
-	const size_t min_size = size + align + MALLOC_ELEM_OVERHEAD * 2;
-	const struct rte_memzone *mz = NULL;
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned numa_socket = heap - mcfg->malloc_heaps;
-
-	size_t mz_size = min_size;
-	if (mz_size < block_size)
-		mz_size = block_size;
-
-	char mz_name[RTE_MEMZONE_NAMESIZE];
-	snprintf(mz_name, sizeof(mz_name), "MALLOC_S%u_HEAP_%u",
-		     numa_socket, heap->mz_count++);
-
-	/* try getting a block. if we fail and we don't need as big a block
-	 * as given in the config, we can shrink our request and try again
-	 */
-	do {
-		mz = rte_memzone_reserve(mz_name, mz_size, numa_socket,
-					 mz_flags);
-		if (mz == NULL)
-			mz_size /= 2;
-	} while (mz == NULL && mz_size > min_size);
-	if (mz == NULL)
-		return -1;
-
 	/* allocate the memory block headers, one at end, one at start */
-	struct malloc_elem *start_elem = (struct malloc_elem *)mz->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(mz->addr,
-			mz_size - MALLOC_ELEM_OVERHEAD);
+	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
+	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
+			ms->len - MALLOC_ELEM_OVERHEAD);
 	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
+	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
 
-	const unsigned elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
-	malloc_elem_init(start_elem, heap, mz, elem_size);
+	malloc_elem_init(start_elem, heap, ms, elem_size);
 	malloc_elem_mkend(end_elem, start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
-	/* increase heap total size by size of new memzone */
-	heap->total_size+=mz_size - MALLOC_ELEM_OVERHEAD;
-	return 0;
+	heap->total_size += elem_size;
 }
 
 /*
  * Iterates through the freelist for a heap to find a free element
  * which can store data of the required size and with the requested alignment.
+ * If size is 0, find the biggest available elem.
  * Returns null on failure, or pointer to element on success.
  */
 static struct malloc_elem *
-find_suitable_element(struct malloc_heap *heap, size_t size, unsigned align)
+find_suitable_element(struct malloc_heap *heap, size_t size,
+		unsigned flags, size_t align, size_t bound)
 {
 	size_t idx;
-	struct malloc_elem *elem;
+	struct malloc_elem *elem, *alt_elem = NULL;
 
 	for (idx = malloc_elem_free_list_index(size);
-		idx < RTE_HEAP_NUM_FREELISTS; idx++)
-	{
+			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
-			!!elem; elem = LIST_NEXT(elem, free_list))
-		{
-			if (malloc_elem_can_hold(elem, size, align))
-				return elem;
+				!!elem; elem = LIST_NEXT(elem, free_list)) {
+			if (malloc_elem_can_hold(elem, size, align, bound)) {
+				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+					return elem;
+				if (alt_elem == NULL)
+					alt_elem = elem;
+			}
 		}
 	}
+
+	if ((alt_elem != NULL) && (flags & RTE_MEMZONE_SIZE_HINT_ONLY))
+		return alt_elem;
+
 	return NULL;
 }
 
 /*
- * Main function called by malloc to allocate a block of memory from the
- * heap. It locks the free list, scans it, and adds a new memzone if the
- * scan fails. Once the new memzone is added, it re-scans and should return
+ * Main function to allocate a block of memory from the heap.
+ * It locks the free list, scans it, and adds a new memseg if the
+ * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
 void *
 malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned align)
+		const char *type __attribute__((unused)), size_t size, unsigned flags,
+		size_t align, size_t bound)
 {
+	struct malloc_elem *elem;
+
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
+
 	rte_spinlock_lock(&heap->lock);
-	struct malloc_elem *elem = find_suitable_element(heap, size, align);
-	if (elem == NULL){
-		if ((malloc_heap_add_memzone(heap, size, align)) == 0)
-			elem = find_suitable_element(heap, size, align);
-	}
 
-	if (elem != NULL){
-		elem = malloc_elem_alloc(elem, size, align);
+	elem = find_suitable_element(heap, size, flags, align, bound);
+	if (elem != NULL) {
+		elem = malloc_elem_alloc(elem, size, align, bound);
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
 	rte_spinlock_unlock(&heap->lock);
-	return elem == NULL ? NULL : (void *)(&elem[1]);
 
+	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
 /*
@@ -206,3 +207,21 @@ malloc_heap_get_stats(const struct malloc_heap *heap,
 	socket_stats->alloc_count = heap->alloc_count;
 	return 0;
 }
+
+int
+rte_eal_malloc_heap_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned ms_cnt;
+	struct rte_memseg *ms;
+
+	if (mcfg == NULL)
+		return -1;
+
+	for (ms = &mcfg->memseg[0], ms_cnt = 0;
+			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
+			ms_cnt++, ms++)
+		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
+
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index a47136d..3ccbef0 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -53,15 +53,15 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap, const char *type,
-		size_t size, unsigned align);
+malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
+		unsigned flags, size_t align, size_t bound);
 
 int
 malloc_heap_get_stats(const struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
 int
-rte_eal_heap_memzone_init(void);
+rte_eal_malloc_heap_init(void);
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index c313a57..54c2bd8 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -39,7 +39,6 @@
 
 #include <rte_memcpy.h>
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_branch_prediction.h>
@@ -87,7 +86,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 		return NULL;
 
 	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, align == 0 ? 1 : align);
+				size, 0, align == 0 ? 1 : align, 0);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -98,7 +97,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 			continue;
 
 		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, align == 0 ? 1 : align);
+					size, 0, align == 0 ? 1 : align, 0);
 		if (ret != NULL)
 			return ret;
 	}
@@ -256,5 +255,5 @@ rte_malloc_virt2phy(const void *addr)
 	const struct malloc_elem *elem = malloc_elem_from_data(addr);
 	if (elem == NULL)
 		return 0;
-	return elem->mz->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->mz->addr);
+	return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
 }
-- 
1.9.3

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v9 0/9] Dynamic memzones
  2015-07-14  8:57  4% ` [dpdk-dev] [PATCH v8 " Sergio Gonzalez Monroy
  2015-07-14  8:57  1%   ` [dpdk-dev] [PATCH v8 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
  2015-07-14  8:57 19%   ` [dpdk-dev] [PATCH v8 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
@ 2015-07-15  8:26  4%   ` Sergio Gonzalez Monroy
  2015-07-15  8:26  1%     ` [dpdk-dev] [PATCH v9 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
                       ` (2 more replies)
  2 siblings, 3 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15  8:26 UTC (permalink / raw)
  To: dev

Current implemetation allows reserving/creating memzones but not the opposite
(unreserve/free). This affects mempools and other memzone based objects.

>From my point of view, implementing free functionality for memzones would look
like malloc over memsegs.
Thus, this approach moves malloc inside eal (which in turn removes a circular
dependency), where malloc heaps are composed of memsegs.
We keep both malloc and memzone APIs as they are, but memzones allocate its
memory by calling malloc_heap_alloc.
Some extra functionality is required in malloc to allow for boundary constrained
memory requests.
In summary, currently malloc is based on memzones, and with this approach
memzones are based on malloc.

v9:
 - Fix incorrect size_t type that results in 32bits compilation error.

v8:
 - Rebase against current HEAD to factor for changes made by new Tile-Gx arch

v7:
 - Create a separated maintainer section for memory allocation

v6:
 - Fix bad patch for rte_memzone_free

v5:
 - Fix rte_memzone_free
 - Improve rte_memzone_free unit test

v4:
 - Rebase and fix couple of merge issues

v3:
 - Create dummy librte_malloc
 - Add deprecation notice
 - Rework some of the code
 - Doc update
 - checkpatch

v2:
 - New rte_memzone_free
 - Support memzone len = 0
 - Add all available memsegs to malloc heap at init
 - Update memzone/malloc unit tests


v6 Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>


Sergio Gonzalez Monroy (9):
  eal: move librte_malloc to eal/common
  eal: memzone allocated by malloc
  app/test: update malloc/memzone unit tests
  config: remove CONFIG_RTE_MALLOC_MEMZONE_SIZE
  eal: remove free_memseg and references to it
  eal: new rte_memzone_free
  app/test: rte_memzone_free unit test
  doc: announce ABI change of librte_malloc
  doc: update malloc documentation

 MAINTAINERS                                       |  22 +-
 app/test/test_malloc.c                            |  86 ----
 app/test/test_memzone.c                           | 456 ++++------------------
 config/common_bsdapp                              |   8 +-
 config/common_linuxapp                            |   8 +-
 doc/guides/prog_guide/env_abstraction_layer.rst   | 220 ++++++++++-
 doc/guides/prog_guide/img/malloc_heap.png         | Bin 81329 -> 80952 bytes
 doc/guides/prog_guide/index.rst                   |   1 -
 doc/guides/prog_guide/malloc_lib.rst              | 233 -----------
 doc/guides/prog_guide/overview.rst                |  11 +-
 doc/guides/rel_notes/abi.rst                      |   6 +-
 drivers/net/af_packet/Makefile                    |   1 -
 drivers/net/bonding/Makefile                      |   1 -
 drivers/net/e1000/Makefile                        |   2 +-
 drivers/net/enic/Makefile                         |   2 +-
 drivers/net/fm10k/Makefile                        |   2 +-
 drivers/net/i40e/Makefile                         |   2 +-
 drivers/net/ixgbe/Makefile                        |   2 +-
 drivers/net/mlx4/Makefile                         |   1 -
 drivers/net/null/Makefile                         |   1 -
 drivers/net/pcap/Makefile                         |   1 -
 drivers/net/virtio/Makefile                       |   2 +-
 drivers/net/vmxnet3/Makefile                      |   2 +-
 drivers/net/xenvirt/Makefile                      |   2 +-
 lib/Makefile                                      |   2 +-
 lib/librte_acl/Makefile                           |   2 +-
 lib/librte_eal/bsdapp/eal/Makefile                |   4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map     |  19 +
 lib/librte_eal/common/Makefile                    |   1 +
 lib/librte_eal/common/eal_common_memzone.c        | 353 +++++++----------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   5 +-
 lib/librte_eal/common/include/rte_malloc.h        | 342 ++++++++++++++++
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/include/rte_memzone.h       |  11 +
 lib/librte_eal/common/malloc_elem.c               | 344 ++++++++++++++++
 lib/librte_eal/common/malloc_elem.h               | 192 +++++++++
 lib/librte_eal/common/malloc_heap.c               | 227 +++++++++++
 lib/librte_eal/common/malloc_heap.h               |  70 ++++
 lib/librte_eal/common/rte_malloc.c                | 259 ++++++++++++
 lib/librte_eal/linuxapp/eal/Makefile              |   4 +-
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c         |  17 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  19 +
 lib/librte_hash/Makefile                          |   2 +-
 lib/librte_lpm/Makefile                           |   2 +-
 lib/librte_malloc/Makefile                        |   6 +-
 lib/librte_malloc/malloc_elem.c                   | 320 ---------------
 lib/librte_malloc/malloc_elem.h                   | 190 ---------
 lib/librte_malloc/malloc_heap.c                   | 208 ----------
 lib/librte_malloc/malloc_heap.h                   |  70 ----
 lib/librte_malloc/rte_malloc.c                    | 228 +----------
 lib/librte_malloc/rte_malloc.h                    | 342 ----------------
 lib/librte_malloc/rte_malloc_version.map          |  16 -
 lib/librte_mempool/Makefile                       |   2 -
 lib/librte_port/Makefile                          |   1 -
 lib/librte_ring/Makefile                          |   3 +-
 lib/librte_table/Makefile                         |   1 -
 56 files changed, 1965 insertions(+), 2372 deletions(-)
 delete mode 100644 doc/guides/prog_guide/malloc_lib.rst
 create mode 100644 lib/librte_eal/common/include/rte_malloc.h
 create mode 100644 lib/librte_eal/common/malloc_elem.c
 create mode 100644 lib/librte_eal/common/malloc_elem.h
 create mode 100644 lib/librte_eal/common/malloc_heap.c
 create mode 100644 lib/librte_eal/common/malloc_heap.h
 create mode 100644 lib/librte_eal/common/rte_malloc.c
 delete mode 100644 lib/librte_malloc/malloc_elem.c
 delete mode 100644 lib/librte_malloc/malloc_elem.h
 delete mode 100644 lib/librte_malloc/malloc_heap.c
 delete mode 100644 lib/librte_malloc/malloc_heap.h
 delete mode 100644 lib/librte_malloc/rte_malloc.h

-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] hash: rename unused field to "reserved"
  2015-07-13 16:38  3% ` [dpdk-dev] [PATCH v2] " Bruce Richardson
  2015-07-13 17:29  0%   ` Thomas Monjalon
@ 2015-07-15  8:08  3%   ` Olga Shern
  1 sibling, 0 replies; 200+ results
From: Olga Shern @ 2015-07-15  8:08 UTC (permalink / raw)
  To: Bruce Richardson, dev

Hi, 

I see the following compilation error :
dpdk/lib/librte_hash/rte_cuckoo_hash.c:145: error: flexible array member in otherwise empty struct
when compiling on RH6.5

Best Regards,
Olga

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
Sent: Monday, July 13, 2015 7:39 PM
To: dev@dpdk.org
Subject: [dpdk-dev] [PATCH v2] hash: rename unused field to "reserved"

The cuckoo hash has a fixed number of entries per bucket, so the configuration parameter for this is unused. We change this field in the parameters struct to "reserved" to indicate that there is now no such parameter value, while at the same time keeping ABI consistency.

Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")

Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/test_func_reentrancy.c | 1 -
 app/test/test_hash_perf.c       | 1 -
 app/test/test_hash_scaling.c    | 1 -
 drivers/net/enic/enic_clsf.c    | 2 --
 examples/l3fwd-power/main.c     | 2 --
 examples/l3fwd-vf/main.c        | 1 -
 examples/l3fwd/main.c           | 2 --
 lib/librte_hash/rte_hash.h      | 2 +-
 8 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/app/test/test_func_reentrancy.c b/app/test/test_func_reentrancy.c index 85504c0..be61773 100644
--- a/app/test/test_func_reentrancy.c
+++ b/app/test/test_func_reentrancy.c
@@ -226,7 +226,6 @@ hash_create_free(__attribute__((unused)) void *arg)
 	struct rte_hash_parameters hash_params = {
 		.name = NULL,
 		.entries = 16,
-		.bucket_entries = 4,
 		.key_len = 4,
 		.hash_func = (rte_hash_function)rte_jhash_32b,
 		.hash_func_init_val = 0,
diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c index e9a522b..a87fc80 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -100,7 +100,6 @@ int32_t positions[KEYS_TO_ADD];
 /* Parameters used for hash table in unit test functions. */  static struct rte_hash_parameters ut_params = {
 	.entries = MAX_ENTRIES,
-	.bucket_entries = BUCKET_SIZE,
 	.hash_func = rte_jhash,
 	.hash_func_init_val = 0,
 };
diff --git a/app/test/test_hash_scaling.c b/app/test/test_hash_scaling.c index 682ae94..39602cb 100644
--- a/app/test/test_hash_scaling.c
+++ b/app/test/test_hash_scaling.c
@@ -129,7 +129,6 @@ test_hash_scaling(int locking_mode)
 	uint64_t i, key;
 	struct rte_hash_parameters hash_params = {
 		.entries = num_iterations*2,
-		.bucket_entries = 16,
 		.key_len = sizeof(key),
 		.hash_func = rte_hash_crc,
 		.hash_func_init_val = 0,
diff --git a/drivers/net/enic/enic_clsf.c b/drivers/net/enic/enic_clsf.c index ca12d2d..9c2abfb 100644
--- a/drivers/net/enic/enic_clsf.c
+++ b/drivers/net/enic/enic_clsf.c
@@ -63,7 +63,6 @@
 
 #define SOCKET_0                0
 #define ENICPMD_CLSF_HASH_ENTRIES       ENICPMD_FDIR_MAX
-#define ENICPMD_CLSF_BUCKET_ENTRIES     4
 
 void enic_fdir_stats_get(struct enic *enic, struct rte_eth_fdir_stats *stats)  { @@ -245,7 +244,6 @@ int enic_clsf_init(struct enic *enic)
 	struct rte_hash_parameters hash_params = {
 		.name = "enicpmd_clsf_hash",
 		.entries = ENICPMD_CLSF_HASH_ENTRIES,
-		.bucket_entries = ENICPMD_CLSF_BUCKET_ENTRIES,
 		.key_len = RTE_HASH_KEY_LENGTH_MAX,
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c index d4eba1a..6eb459d 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -1247,7 +1247,6 @@ setup_hash(int socketid)
 	struct rte_hash_parameters ipv4_l3fwd_hash_params = {
 		.name = NULL,
 		.entries = L3FWD_HASH_ENTRIES,
-		.bucket_entries = 4,
 		.key_len = sizeof(struct ipv4_5tuple),
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
@@ -1256,7 +1255,6 @@ setup_hash(int socketid)
 	struct rte_hash_parameters ipv6_l3fwd_hash_params = {
 		.name = NULL,
 		.entries = L3FWD_HASH_ENTRIES,
-		.bucket_entries = 4,
 		.key_len = sizeof(struct ipv6_5tuple),
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
diff --git a/examples/l3fwd-vf/main.c b/examples/l3fwd-vf/main.c index ccbb02f..01f610e 100644
--- a/examples/l3fwd-vf/main.c
+++ b/examples/l3fwd-vf/main.c
@@ -251,7 +251,6 @@ static lookup_struct_t *l3fwd_lookup_struct[NB_SOCKETS];  struct rte_hash_parameters l3fwd_hash_params = {
 	.name = "l3fwd_hash_0",
 	.entries = L3FWD_HASH_ENTRIES,
-	.bucket_entries = 4,
 	.key_len = sizeof(struct ipv4_5tuple),
 	.hash_func = DEFAULT_HASH_FUNC,
 	.hash_func_init_val = 0,
diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c index 5c22ed1..def9594 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -2162,7 +2162,6 @@ setup_hash(int socketid)
     struct rte_hash_parameters ipv4_l3fwd_hash_params = {
         .name = NULL,
         .entries = L3FWD_HASH_ENTRIES,
-        .bucket_entries = 4,
         .key_len = sizeof(union ipv4_5tuple_host),
         .hash_func = ipv4_hash_crc,
         .hash_func_init_val = 0,
@@ -2171,7 +2170,6 @@ setup_hash(int socketid)
     struct rte_hash_parameters ipv6_l3fwd_hash_params = {
         .name = NULL,
         .entries = L3FWD_HASH_ENTRIES,
-        .bucket_entries = 4,
         .key_len = sizeof(union ipv6_5tuple_host),
         .hash_func = ipv6_hash_crc,
         .hash_func_init_val = 0,
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h index 68109d5..1cddc07 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -75,7 +75,7 @@ typedef uint32_t (*rte_hash_function)(const void *key, uint32_t key_len,  struct rte_hash_parameters {
 	const char *name;		/**< Name of the hash. */
 	uint32_t entries;		/**< Total hash table entries. */
-	uint32_t bucket_entries;        /**< Bucket entries. */
+	uint32_t reserved;		/**< Unused field. Should be set to 0 */
 	uint32_t key_len;		/**< Length of hash key. */
 	rte_hash_function hash_func;	/**< Primary Hash function used to calculate hash. */
 	uint32_t hash_func_init_val;	/**< Init value used by hash_func. */
--
2.4.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v8 2/9] eal: memzone allocated by malloc
  2015-07-14  8:57  4% ` [dpdk-dev] [PATCH v8 " Sergio Gonzalez Monroy
@ 2015-07-14  8:57  1%   ` Sergio Gonzalez Monroy
  2015-07-14  8:57 19%   ` [dpdk-dev] [PATCH v8 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
  2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2 siblings, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-14  8:57 UTC (permalink / raw)
  To: dev

In the current memory hierarchy, memsegs are groups of physically
contiguous hugepages, memzones are slices of memsegs and malloc further
slices memzones into smaller memory chunks.

This patch modifies malloc so it partitions memsegs instead of memzones.
Thus memzones would call malloc internally for memory allocation while
maintaining its ABI.

It would be possible to free memzones and therefore any other structure
based on memzones, ie. mempools

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c        | 289 +++++-----------------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   2 +-
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/malloc_elem.c               |  68 +++--
 lib/librte_eal/common/malloc_elem.h               |  14 +-
 lib/librte_eal/common/malloc_heap.c               | 161 ++++++------
 lib/librte_eal/common/malloc_heap.h               |   6 +-
 lib/librte_eal/common/rte_malloc.c                |   7 +-
 8 files changed, 220 insertions(+), 330 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 9c1da71..fd7e73f 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -50,15 +50,15 @@
 #include <rte_string_fns.h>
 #include <rte_common.h>
 
+#include "malloc_heap.h"
+#include "malloc_elem.h"
 #include "eal_private.h"
 
-/* internal copy of free memory segments */
-static struct rte_memseg *free_memseg = NULL;
-
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
 	const struct rte_mem_config *mcfg;
+	const struct rte_memzone *mz;
 	unsigned i = 0;
 
 	/* get pointer to global configuration */
@@ -68,62 +68,62 @@ memzone_lookup_thread_unsafe(const char *name)
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE && mcfg->memzone[i].addr != NULL; i++) {
-		if (!strncmp(name, mcfg->memzone[i].name, RTE_MEMZONE_NAMESIZE))
+	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
+		mz = &mcfg->memzone[i];
+		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
 			return &mcfg->memzone[i];
 	}
 
 	return NULL;
 }
 
-/*
- * Helper function for memzone_reserve_aligned_thread_unsafe().
- * Calculate address offset from the start of the segment.
- * Align offset in that way that it satisfy istart alignmnet and
- * buffer of the  requested length would not cross specified boundary.
- */
-static inline phys_addr_t
-align_phys_boundary(const struct rte_memseg *ms, size_t len, size_t align,
-	size_t bound)
+/* Find the heap with the greatest free block size */
+static void
+find_heap_max_free_elem(int *s, size_t *len, unsigned align)
 {
-	phys_addr_t addr_offset, bmask, end, start;
-	size_t step;
+	struct rte_mem_config *mcfg;
+	struct rte_malloc_socket_stats stats;
+	unsigned i;
 
-	step = RTE_MAX(align, bound);
-	bmask = ~((phys_addr_t)bound - 1);
+	/* get pointer to global configuration */
+	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* calculate offset to closest alignment */
-	start = RTE_ALIGN_CEIL(ms->phys_addr, align);
-	addr_offset = start - ms->phys_addr;
+	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+		malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+		if (stats.greatest_free_size > *len) {
+			*len = stats.greatest_free_size;
+			*s = i;
+		}
+	}
+	*len -= (MALLOC_ELEM_OVERHEAD + align);
+}
 
-	while (addr_offset + len < ms->len) {
+/* Find a heap that can allocate the requested size */
+static void
+find_heap_suitable(int *s, size_t len, unsigned align)
+{
+	struct rte_mem_config *mcfg;
+	struct rte_malloc_socket_stats stats;
+	unsigned i;
 
-		/* check, do we meet boundary condition */
-		end = start + len - (len != 0);
-		if ((start & bmask) == (end & bmask))
-			break;
+	/* get pointer to global configuration */
+	mcfg = rte_eal_get_configuration()->mem_config;
 
-		/* calculate next offset */
-		start = RTE_ALIGN_CEIL(start + 1, step);
-		addr_offset = start - ms->phys_addr;
+	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+		malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+		if (stats.greatest_free_size >= len + MALLOC_ELEM_OVERHEAD + align) {
+			*s = i;
+			break;
+		}
 	}
-
-	return addr_offset;
 }
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, uint64_t size_mask, unsigned align,
-		unsigned bound)
+		int socket_id, unsigned flags, unsigned align, unsigned bound)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-	int memseg_idx = -1;
-	uint64_t addr_offset, seg_offset = 0;
 	size_t requested_len;
-	size_t memseg_len = 0;
-	phys_addr_t memseg_physaddr;
-	void *memseg_addr;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -155,7 +155,6 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	if (align < RTE_CACHE_LINE_SIZE)
 		align = RTE_CACHE_LINE_SIZE;
 
-
 	/* align length on cache boundary. Check for overflow before doing so */
 	if (len > SIZE_MAX - RTE_CACHE_LINE_MASK) {
 		rte_errno = EINVAL; /* requested size too big */
@@ -169,108 +168,50 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	requested_len = RTE_MAX((size_t)RTE_CACHE_LINE_SIZE,  len);
 
 	/* check that boundary condition is valid */
-	if (bound != 0 &&
-			(requested_len > bound || !rte_is_power_of_2(bound))) {
+	if (bound != 0 && (requested_len > bound || !rte_is_power_of_2(bound))) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
 
-	/* find the smallest segment matching requirements */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		/* last segment */
-		if (free_memseg[i].addr == NULL)
-			break;
+	if (len == 0) {
+		if (bound != 0)
+			requested_len = bound;
+		else
+			requested_len = 0;
+	}
 
-		/* empty segment, skip it */
-		if (free_memseg[i].len == 0)
-			continue;
-
-		/* bad socket ID */
-		if (socket_id != SOCKET_ID_ANY &&
-		    free_memseg[i].socket_id != SOCKET_ID_ANY &&
-		    socket_id != free_memseg[i].socket_id)
-			continue;
-
-		/*
-		 * calculate offset to closest alignment that
-		 * meets boundary conditions.
-		 */
-		addr_offset = align_phys_boundary(free_memseg + i,
-			requested_len, align, bound);
-
-		/* check len */
-		if ((requested_len + addr_offset) > free_memseg[i].len)
-			continue;
-
-		if ((size_mask & free_memseg[i].hugepage_sz) == 0)
-			continue;
-
-		/* this segment is the best until now */
-		if (memseg_idx == -1) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
-		}
-		/* find the biggest contiguous zone */
-		else if (len == 0) {
-			if (free_memseg[i].len > memseg_len) {
-				memseg_idx = i;
-				memseg_len = free_memseg[i].len;
-				seg_offset = addr_offset;
-			}
-		}
-		/*
-		 * find the smallest (we already checked that current
-		 * zone length is > len
-		 */
-		else if (free_memseg[i].len + align < memseg_len ||
-				(free_memseg[i].len <= memseg_len + align &&
-				addr_offset < seg_offset)) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
+	if (socket_id == SOCKET_ID_ANY) {
+		if (requested_len == 0)
+			find_heap_max_free_elem(&socket_id, &requested_len, align);
+		else
+			find_heap_suitable(&socket_id, requested_len, align);
+
+		if (socket_id == SOCKET_ID_ANY) {
+			rte_errno = ENOMEM;
+			return NULL;
 		}
 	}
 
-	/* no segment found */
-	if (memseg_idx == -1) {
+	/* allocate memory on heap */
+	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket_id], NULL,
+			requested_len, flags, align, bound);
+	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
 		return NULL;
 	}
 
-	/* save aligned physical and virtual addresses */
-	memseg_physaddr = free_memseg[memseg_idx].phys_addr + seg_offset;
-	memseg_addr = RTE_PTR_ADD(free_memseg[memseg_idx].addr,
-			(uintptr_t) seg_offset);
-
-	/* if we are looking for a biggest memzone */
-	if (len == 0) {
-		if (bound == 0)
-			requested_len = memseg_len - seg_offset;
-		else
-			requested_len = RTE_ALIGN_CEIL(memseg_physaddr + 1,
-				bound) - memseg_physaddr;
-	}
-
-	/* set length to correct value */
-	len = (size_t)seg_offset + requested_len;
-
-	/* update our internal state */
-	free_memseg[memseg_idx].len -= len;
-	free_memseg[memseg_idx].phys_addr += len;
-	free_memseg[memseg_idx].addr =
-		(char *)free_memseg[memseg_idx].addr + len;
+	const struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
 	struct rte_memzone *mz = &mcfg->memzone[mcfg->memzone_idx++];
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
-	mz->phys_addr = memseg_physaddr;
-	mz->addr = memseg_addr;
-	mz->len = requested_len;
-	mz->hugepage_sz = free_memseg[memseg_idx].hugepage_sz;
-	mz->socket_id = free_memseg[memseg_idx].socket_id;
+	mz->phys_addr = rte_malloc_virt2phy(mz_addr);
+	mz->addr = mz_addr;
+	mz->len = (requested_len == 0 ? elem->size : requested_len);
+	mz->hugepage_sz = elem->ms->hugepage_sz;
+	mz->socket_id = elem->ms->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = memseg_idx;
+	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -282,26 +223,6 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
-	uint64_t size_mask = 0;
-
-	if (flags & RTE_MEMZONE_256KB)
-		size_mask |= RTE_PGSIZE_256K;
-	if (flags & RTE_MEMZONE_2MB)
-		size_mask |= RTE_PGSIZE_2M;
-	if (flags & RTE_MEMZONE_16MB)
-		size_mask |= RTE_PGSIZE_16M;
-	if (flags & RTE_MEMZONE_256MB)
-		size_mask |= RTE_PGSIZE_256M;
-	if (flags & RTE_MEMZONE_512MB)
-		size_mask |= RTE_PGSIZE_512M;
-	if (flags & RTE_MEMZONE_1GB)
-		size_mask |= RTE_PGSIZE_1G;
-	if (flags & RTE_MEMZONE_4GB)
-		size_mask |= RTE_PGSIZE_4G;
-	if (flags & RTE_MEMZONE_16GB)
-		size_mask |= RTE_PGSIZE_16G;
-	if (!size_mask)
-		size_mask = UINT64_MAX;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -309,18 +230,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, size_mask, align, bound);
-
-	/*
-	 * If we failed to allocate the requested page size, and the
-	 * RTE_MEMZONE_SIZE_HINT_ONLY flag is specified, try allocating
-	 * again.
-	 */
-	if (!mz && rte_errno == ENOMEM && size_mask != UINT64_MAX &&
-	    flags & RTE_MEMZONE_SIZE_HINT_ONLY) {
-		mz = memzone_reserve_aligned_thread_unsafe(
-			name, len, socket_id, UINT64_MAX, align, bound);
-	}
+		name, len, socket_id, flags, align, bound);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -412,45 +322,6 @@ rte_memzone_dump(FILE *f)
 }
 
 /*
- * called by init: modify the free memseg list to have cache-aligned
- * addresses and cache-aligned lengths
- */
-static int
-memseg_sanitize(struct rte_memseg *memseg)
-{
-	unsigned phys_align;
-	unsigned virt_align;
-	unsigned off;
-
-	phys_align = memseg->phys_addr & RTE_CACHE_LINE_MASK;
-	virt_align = (unsigned long)memseg->addr & RTE_CACHE_LINE_MASK;
-
-	/*
-	 * sanity check: phys_addr and addr must have the same
-	 * alignment
-	 */
-	if (phys_align != virt_align)
-		return -1;
-
-	/* memseg is really too small, don't bother with it */
-	if (memseg->len < (2 * RTE_CACHE_LINE_SIZE)) {
-		memseg->len = 0;
-		return 0;
-	}
-
-	/* align start address */
-	off = (RTE_CACHE_LINE_SIZE - phys_align) & RTE_CACHE_LINE_MASK;
-	memseg->phys_addr += off;
-	memseg->addr = (char *)memseg->addr + off;
-	memseg->len -= off;
-
-	/* align end address */
-	memseg->len &= ~((uint64_t)RTE_CACHE_LINE_MASK);
-
-	return 0;
-}
-
-/*
  * Init the memzone subsystem
  */
 int
@@ -458,14 +329,10 @@ rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memseg *memseg;
-	unsigned i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* mirror the runtime memsegs from config */
-	free_memseg = mcfg->free_memseg;
-
 	/* secondary processes don't need to initialise anything */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
@@ -478,33 +345,13 @@ rte_eal_memzone_init(void)
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* fill in uninitialized free_memsegs */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (memseg[i].addr == NULL)
-			break;
-		if (free_memseg[i].addr != NULL)
-			continue;
-		memcpy(&free_memseg[i], &memseg[i], sizeof(struct rte_memseg));
-	}
-
-	/* make all zones cache-aligned */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (free_memseg[i].addr == NULL)
-			break;
-		if (memseg_sanitize(&free_memseg[i]) < 0) {
-			RTE_LOG(ERR, EAL, "%s(): Sanity check failed\n", __func__);
-			rte_rwlock_write_unlock(&mcfg->mlock);
-			return -1;
-		}
-	}
-
 	/* delete all zones */
 	mcfg->memzone_idx = 0;
 	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return 0;
+	return rte_eal_malloc_heap_init();
 }
 
 /* Walk all reserved memory zones */
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 34f5abc..055212a 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -73,7 +73,7 @@ struct rte_mem_config {
 	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
-	/* Runtime Physmem descriptors. */
+	/* Runtime Physmem descriptors - NOT USED */
 	struct rte_memseg free_memseg[RTE_MAX_MEMSEG];
 
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index 716216f..b270356 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -40,7 +40,7 @@
 #include <rte_memory.h>
 
 /* Number of free lists per heap, grouped by size. */
-#define RTE_HEAP_NUM_FREELISTS  5
+#define RTE_HEAP_NUM_FREELISTS  13
 
 /**
  * Structure to hold malloc heap
@@ -48,7 +48,6 @@
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
-	unsigned mz_count;
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index a5e1248..b54ee33 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -37,7 +37,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_launch.h>
 #include <rte_per_lcore.h>
@@ -56,10 +55,10 @@
  */
 void
 malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memzone *mz, size_t size)
+		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
 {
 	elem->heap = heap;
-	elem->mz = mz;
+	elem->ms = ms;
 	elem->prev = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
@@ -70,12 +69,12 @@ malloc_elem_init(struct malloc_elem *elem,
 }
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
 {
-	malloc_elem_init(elem, prev->heap, prev->mz, 0);
+	malloc_elem_init(elem, prev->heap, prev->ms, 0);
 	elem->prev = prev;
 	elem->state = ELEM_BUSY; /* mark busy so its never merged */
 }
@@ -86,12 +85,24 @@ malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
  * fit, return NULL.
  */
 static void *
-elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
+elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	const uintptr_t end_pt = (uintptr_t)elem +
+	const size_t bmask = ~(bound - 1);
+	uintptr_t end_pt = (uintptr_t)elem +
 			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	const uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	const uintptr_t new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+	uintptr_t new_elem_start;
+
+	/* check boundary */
+	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+			return NULL;
+	}
+
+	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
 
 	/* if the new start point is before the exist start, it won't fit */
 	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
@@ -102,9 +113,10 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
  * alignment request from the current element
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
+		size_t bound)
 {
-	return elem_start_pt(elem, size, align) != NULL;
+	return elem_start_pt(elem, size, align, bound) != NULL;
 }
 
 /*
@@ -115,10 +127,10 @@ static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
 	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
-	const unsigned old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
-	const unsigned new_elem_size = elem->size - old_elem_size;
+	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
+	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->mz, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
 	next_elem->prev = split_pt;
 	elem->size = old_elem_size;
@@ -168,8 +180,9 @@ malloc_elem_free_list_index(size_t size)
 void
 malloc_elem_free_list_insert(struct malloc_elem *elem)
 {
-	size_t idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
+	size_t idx;
 
+	idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
 	elem->state = ELEM_FREE;
 	LIST_INSERT_HEAD(&elem->heap->free_head[idx], elem, free_list);
 }
@@ -190,12 +203,26 @@ elem_free_list_remove(struct malloc_elem *elem)
  * is not done here, as it's done there previously.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align);
-	const unsigned old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	const size_t trailer_size = elem->size - old_elem_size - size -
+		MALLOC_ELEM_OVERHEAD;
+
+	elem_free_list_remove(elem);
 
-	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE){
+	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split it, too much free space after elem */
+		struct malloc_elem *new_free_elem =
+				RTE_PTR_ADD(new_elem, size + MALLOC_ELEM_OVERHEAD);
+
+		split_elem(elem, new_free_elem);
+		malloc_elem_free_list_insert(new_free_elem);
+	}
+
+	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* don't split it, pad the element instead */
 		elem->state = ELEM_BUSY;
 		elem->pad = old_elem_size;
@@ -208,8 +235,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 			new_elem->size = elem->size - elem->pad;
 			set_header(new_elem);
 		}
-		/* remove element from free list */
-		elem_free_list_remove(elem);
 
 		return new_elem;
 	}
@@ -219,7 +244,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 	 * Re-insert original element, in case its new size makes it
 	 * belong on a different list.
 	 */
-	elem_free_list_remove(elem);
 	split_elem(elem, new_elem);
 	new_elem->state = ELEM_BUSY;
 	malloc_elem_free_list_insert(elem);
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9790b1a..e05d2ea 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -47,9 +47,9 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memzone */
+	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
-	const struct rte_memzone *mz;
+	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -136,11 +136,11 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memzone *mz,
+		const struct rte_memseg *ms,
 		size_t size);
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem,
@@ -151,14 +151,16 @@ malloc_elem_mkend(struct malloc_elem *elem,
  * of the requested size and with the requested alignment
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
  * is much larger than the data block requested, we split the element in two.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_alloc(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 8861d27..2496b77 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -39,7 +39,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -54,123 +53,125 @@
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
-/* since the memzone size starts with a digit, it will appear unquoted in
- * rte_config.h, so quote it so it can be passed to rte_str_to_size */
-#define MALLOC_MEMZONE_SIZE RTE_STR(RTE_MALLOC_MEMZONE_SIZE)
-
-/*
- * returns the configuration setting for the memzone size as a size_t value
- */
-static inline size_t
-get_malloc_memzone_size(void)
+static unsigned
+check_hugepage_sz(unsigned flags, size_t hugepage_sz)
 {
-	return rte_str_to_size(MALLOC_MEMZONE_SIZE);
+	unsigned check_flag = 0;
+
+	if (!(flags & ~RTE_MEMZONE_SIZE_HINT_ONLY))
+		return 1;
+
+	switch (hugepage_sz) {
+	case RTE_PGSIZE_256K:
+		check_flag = RTE_MEMZONE_256KB;
+		break;
+	case RTE_PGSIZE_2M:
+		check_flag = RTE_MEMZONE_2MB;
+		break;
+	case RTE_PGSIZE_16M:
+		check_flag = RTE_MEMZONE_16MB;
+		break;
+	case RTE_PGSIZE_256M:
+		check_flag = RTE_MEMZONE_256MB;
+		break;
+	case RTE_PGSIZE_512M:
+		check_flag = RTE_MEMZONE_512MB;
+		break;
+	case RTE_PGSIZE_1G:
+		check_flag = RTE_MEMZONE_1GB;
+		break;
+	case RTE_PGSIZE_4G:
+		check_flag = RTE_MEMZONE_4GB;
+		break;
+	case RTE_PGSIZE_16G:
+		check_flag = RTE_MEMZONE_16GB;
+	}
+
+	return (check_flag & flags);
 }
 
 /*
- * reserve an extra memory zone and make it available for use by a particular
- * heap. This reserves the zone and sets a dummy malloc_elem header at the end
+ * Expand the heap with a memseg.
+ * This reserves the zone and sets a dummy malloc_elem header at the end
  * to prevent overflow. The rest of the zone is added to free list as a single
  * large free block
  */
-static int
-malloc_heap_add_memzone(struct malloc_heap *heap, size_t size, unsigned align)
+static void
+malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	const unsigned mz_flags = 0;
-	const size_t block_size = get_malloc_memzone_size();
-	/* ensure the data we want to allocate will fit in the memzone */
-	const size_t min_size = size + align + MALLOC_ELEM_OVERHEAD * 2;
-	const struct rte_memzone *mz = NULL;
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned numa_socket = heap - mcfg->malloc_heaps;
-
-	size_t mz_size = min_size;
-	if (mz_size < block_size)
-		mz_size = block_size;
-
-	char mz_name[RTE_MEMZONE_NAMESIZE];
-	snprintf(mz_name, sizeof(mz_name), "MALLOC_S%u_HEAP_%u",
-		     numa_socket, heap->mz_count++);
-
-	/* try getting a block. if we fail and we don't need as big a block
-	 * as given in the config, we can shrink our request and try again
-	 */
-	do {
-		mz = rte_memzone_reserve(mz_name, mz_size, numa_socket,
-					 mz_flags);
-		if (mz == NULL)
-			mz_size /= 2;
-	} while (mz == NULL && mz_size > min_size);
-	if (mz == NULL)
-		return -1;
-
 	/* allocate the memory block headers, one at end, one at start */
-	struct malloc_elem *start_elem = (struct malloc_elem *)mz->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(mz->addr,
-			mz_size - MALLOC_ELEM_OVERHEAD);
+	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
+	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
+			ms->len - MALLOC_ELEM_OVERHEAD);
 	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
+	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
 
-	const unsigned elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
-	malloc_elem_init(start_elem, heap, mz, elem_size);
+	malloc_elem_init(start_elem, heap, ms, elem_size);
 	malloc_elem_mkend(end_elem, start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
-	/* increase heap total size by size of new memzone */
-	heap->total_size+=mz_size - MALLOC_ELEM_OVERHEAD;
-	return 0;
+	heap->total_size += elem_size;
 }
 
 /*
  * Iterates through the freelist for a heap to find a free element
  * which can store data of the required size and with the requested alignment.
+ * If size is 0, find the biggest available elem.
  * Returns null on failure, or pointer to element on success.
  */
 static struct malloc_elem *
-find_suitable_element(struct malloc_heap *heap, size_t size, unsigned align)
+find_suitable_element(struct malloc_heap *heap, size_t size,
+		unsigned flags, size_t align, size_t bound)
 {
 	size_t idx;
-	struct malloc_elem *elem;
+	struct malloc_elem *elem, *alt_elem = NULL;
 
 	for (idx = malloc_elem_free_list_index(size);
-		idx < RTE_HEAP_NUM_FREELISTS; idx++)
-	{
+			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
-			!!elem; elem = LIST_NEXT(elem, free_list))
-		{
-			if (malloc_elem_can_hold(elem, size, align))
-				return elem;
+				!!elem; elem = LIST_NEXT(elem, free_list)) {
+			if (malloc_elem_can_hold(elem, size, align, bound)) {
+				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+					return elem;
+				if (alt_elem == NULL)
+					alt_elem = elem;
+			}
 		}
 	}
+
+	if ((alt_elem != NULL) && (flags & RTE_MEMZONE_SIZE_HINT_ONLY))
+		return alt_elem;
+
 	return NULL;
 }
 
 /*
- * Main function called by malloc to allocate a block of memory from the
- * heap. It locks the free list, scans it, and adds a new memzone if the
- * scan fails. Once the new memzone is added, it re-scans and should return
+ * Main function to allocate a block of memory from the heap.
+ * It locks the free list, scans it, and adds a new memseg if the
+ * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
 void *
 malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned align)
+		const char *type __attribute__((unused)), size_t size, unsigned flags,
+		size_t align, size_t bound)
 {
+	struct malloc_elem *elem;
+
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
+
 	rte_spinlock_lock(&heap->lock);
-	struct malloc_elem *elem = find_suitable_element(heap, size, align);
-	if (elem == NULL){
-		if ((malloc_heap_add_memzone(heap, size, align)) == 0)
-			elem = find_suitable_element(heap, size, align);
-	}
 
-	if (elem != NULL){
-		elem = malloc_elem_alloc(elem, size, align);
+	elem = find_suitable_element(heap, size, flags, align, bound);
+	if (elem != NULL) {
+		elem = malloc_elem_alloc(elem, size, align, bound);
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
 	rte_spinlock_unlock(&heap->lock);
-	return elem == NULL ? NULL : (void *)(&elem[1]);
 
+	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
 /*
@@ -206,3 +207,21 @@ malloc_heap_get_stats(const struct malloc_heap *heap,
 	socket_stats->alloc_count = heap->alloc_count;
 	return 0;
 }
+
+int
+rte_eal_malloc_heap_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned ms_cnt;
+	struct rte_memseg *ms;
+
+	if (mcfg == NULL)
+		return -1;
+
+	for (ms = &mcfg->memseg[0], ms_cnt = 0;
+			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
+			ms_cnt++, ms++)
+		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
+
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index a47136d..3ccbef0 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -53,15 +53,15 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap, const char *type,
-		size_t size, unsigned align);
+malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
+		unsigned flags, size_t align, size_t bound);
 
 int
 malloc_heap_get_stats(const struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
 int
-rte_eal_heap_memzone_init(void);
+rte_eal_malloc_heap_init(void);
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index c313a57..54c2bd8 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -39,7 +39,6 @@
 
 #include <rte_memcpy.h>
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_branch_prediction.h>
@@ -87,7 +86,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 		return NULL;
 
 	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, align == 0 ? 1 : align);
+				size, 0, align == 0 ? 1 : align, 0);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -98,7 +97,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 			continue;
 
 		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, align == 0 ? 1 : align);
+					size, 0, align == 0 ? 1 : align, 0);
 		if (ret != NULL)
 			return ret;
 	}
@@ -256,5 +255,5 @@ rte_malloc_virt2phy(const void *addr)
 	const struct malloc_elem *elem = malloc_elem_from_data(addr);
 	if (elem == NULL)
 		return 0;
-	return elem->mz->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->mz->addr);
+	return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
 }
-- 
1.9.3

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v8 8/9] doc: announce ABI change of librte_malloc
  2015-07-14  8:57  4% ` [dpdk-dev] [PATCH v8 " Sergio Gonzalez Monroy
  2015-07-14  8:57  1%   ` [dpdk-dev] [PATCH v8 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
@ 2015-07-14  8:57 19%   ` Sergio Gonzalez Monroy
  2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2 siblings, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-14  8:57 UTC (permalink / raw)
  To: dev

Announce the creation of dummy malloc library for 2.1 and removal of
such library, now integrated in librte_eal, for 2.2 release.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 doc/guides/rel_notes/abi.rst | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 931e785..76e0ae2 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -16,7 +16,6 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
-
 * Significant ABI changes are planned for struct rte_eth_dev to support up to
   1024 queues per port. This change will be in release 2.2.
   There is no backward compatibility planned from release 2.2.
@@ -24,3 +23,8 @@ Deprecation Notices
 
 * The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
   deprecated and will be removed with version 2.2.
+
+* librte_malloc library has been integrated into librte_eal. The 2.1 release
+  creates a dummy/empty malloc library to fulfill binaries with dynamic linking
+  dependencies on librte_malloc.so. Such dummy library will not be created from
+  release 2.2 so binaries will need to be rebuilt.
-- 
1.9.3

^ permalink raw reply	[relevance 19%]

* [dpdk-dev] [PATCH v8 0/9] Dynamic memzones
  @ 2015-07-14  8:57  4% ` Sergio Gonzalez Monroy
  2015-07-14  8:57  1%   ` [dpdk-dev] [PATCH v8 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
                     ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-14  8:57 UTC (permalink / raw)
  To: dev

Current implemetation allows reserving/creating memzones but not the opposite
(unreserve/free). This affects mempools and other memzone based objects.

>From my point of view, implementing free functionality for memzones would look
like malloc over memsegs.
Thus, this approach moves malloc inside eal (which in turn removes a circular
dependency), where malloc heaps are composed of memsegs.
We keep both malloc and memzone APIs as they are, but memzones allocate its
memory by calling malloc_heap_alloc.
Some extra functionality is required in malloc to allow for boundary constrained
memory requests.
In summary, currently malloc is based on memzones, and with this approach
memzones are based on malloc.

v8:
 - Rebase against current HEAD to factor for changes made by new Tile-Gx arch

v7:
 - Create a separated maintainer section for memory allocation

v6:
 - Fix bad patch for rte_memzone_free

v5:
 - Fix rte_memzone_free
 - Improve rte_memzone_free unit test

v4:
 - Rebase and fix couple of merge issues

v3:
 - Create dummy librte_malloc
 - Add deprecation notice
 - Rework some of the code
 - Doc update
 - checkpatch

v2:
 - New rte_memzone_free
 - Support memzone len = 0
 - Add all available memsegs to malloc heap at init
 - Update memzone/malloc unit tests


v6 Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>


Sergio Gonzalez Monroy (9):
  eal: move librte_malloc to eal/common
  eal: memzone allocated by malloc
  app/test: update malloc/memzone unit tests
  config: remove CONFIG_RTE_MALLOC_MEMZONE_SIZE
  eal: remove free_memseg and references to it
  eal: new rte_memzone_free
  app/test: rte_memzone_free unit test
  doc: announce ABI change of librte_malloc
  doc: update malloc documentation

 MAINTAINERS                                       |  22 +-
 app/test/test_malloc.c                            |  86 ----
 app/test/test_memzone.c                           | 456 ++++------------------
 config/common_bsdapp                              |   8 +-
 config/common_linuxapp                            |   8 +-
 doc/guides/prog_guide/env_abstraction_layer.rst   | 220 ++++++++++-
 doc/guides/prog_guide/img/malloc_heap.png         | Bin 81329 -> 80952 bytes
 doc/guides/prog_guide/index.rst                   |   1 -
 doc/guides/prog_guide/malloc_lib.rst              | 233 -----------
 doc/guides/prog_guide/overview.rst                |  11 +-
 doc/guides/rel_notes/abi.rst                      |   6 +-
 drivers/net/af_packet/Makefile                    |   1 -
 drivers/net/bonding/Makefile                      |   1 -
 drivers/net/e1000/Makefile                        |   2 +-
 drivers/net/enic/Makefile                         |   2 +-
 drivers/net/fm10k/Makefile                        |   2 +-
 drivers/net/i40e/Makefile                         |   2 +-
 drivers/net/ixgbe/Makefile                        |   2 +-
 drivers/net/mlx4/Makefile                         |   1 -
 drivers/net/null/Makefile                         |   1 -
 drivers/net/pcap/Makefile                         |   1 -
 drivers/net/virtio/Makefile                       |   2 +-
 drivers/net/vmxnet3/Makefile                      |   2 +-
 drivers/net/xenvirt/Makefile                      |   2 +-
 lib/Makefile                                      |   2 +-
 lib/librte_acl/Makefile                           |   2 +-
 lib/librte_eal/bsdapp/eal/Makefile                |   4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map     |  19 +
 lib/librte_eal/common/Makefile                    |   1 +
 lib/librte_eal/common/eal_common_memzone.c        | 353 +++++++----------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   5 +-
 lib/librte_eal/common/include/rte_malloc.h        | 342 ++++++++++++++++
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/include/rte_memzone.h       |  11 +
 lib/librte_eal/common/malloc_elem.c               | 344 ++++++++++++++++
 lib/librte_eal/common/malloc_elem.h               | 192 +++++++++
 lib/librte_eal/common/malloc_heap.c               | 227 +++++++++++
 lib/librte_eal/common/malloc_heap.h               |  70 ++++
 lib/librte_eal/common/rte_malloc.c                | 259 ++++++++++++
 lib/librte_eal/linuxapp/eal/Makefile              |   4 +-
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c         |  17 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  19 +
 lib/librte_hash/Makefile                          |   2 +-
 lib/librte_lpm/Makefile                           |   2 +-
 lib/librte_malloc/Makefile                        |   6 +-
 lib/librte_malloc/malloc_elem.c                   | 320 ---------------
 lib/librte_malloc/malloc_elem.h                   | 190 ---------
 lib/librte_malloc/malloc_heap.c                   | 208 ----------
 lib/librte_malloc/malloc_heap.h                   |  70 ----
 lib/librte_malloc/rte_malloc.c                    | 228 +----------
 lib/librte_malloc/rte_malloc.h                    | 342 ----------------
 lib/librte_malloc/rte_malloc_version.map          |  16 -
 lib/librte_mempool/Makefile                       |   2 -
 lib/librte_port/Makefile                          |   1 -
 lib/librte_ring/Makefile                          |   3 +-
 lib/librte_table/Makefile                         |   1 -
 56 files changed, 1965 insertions(+), 2372 deletions(-)
 delete mode 100644 doc/guides/prog_guide/malloc_lib.rst
 create mode 100644 lib/librte_eal/common/include/rte_malloc.h
 create mode 100644 lib/librte_eal/common/malloc_elem.c
 create mode 100644 lib/librte_eal/common/malloc_elem.h
 create mode 100644 lib/librte_eal/common/malloc_heap.c
 create mode 100644 lib/librte_eal/common/malloc_heap.h
 create mode 100644 lib/librte_eal/common/rte_malloc.c
 delete mode 100644 lib/librte_malloc/malloc_elem.c
 delete mode 100644 lib/librte_malloc/malloc_elem.h
 delete mode 100644 lib/librte_malloc/malloc_heap.c
 delete mode 100644 lib/librte_malloc/malloc_heap.h
 delete mode 100644 lib/librte_malloc/rte_malloc.h

-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] hash: rename unused field to "reserved"
  2015-07-13 16:38  3% ` [dpdk-dev] [PATCH v2] " Bruce Richardson
@ 2015-07-13 17:29  0%   ` Thomas Monjalon
  2015-07-15  8:08  3%   ` Olga Shern
  1 sibling, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-13 17:29 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

2015-07-13 17:38, Bruce Richardson:
> The cuckoo hash has a fixed number of entries per bucket, so the
> configuration parameter for this is unused. We change this field in the
> parameters struct to "reserved" to indicate that there is now no such
> parameter value, while at the same time keeping ABI consistency.
> 
> Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")
> 
> Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Applied, thanks

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v2] hash: rename unused field to "reserved"
  2015-07-13 16:25  3% [dpdk-dev] [PATCH] hash: rename unused field to "reserved" Bruce Richardson
  2015-07-13 16:28  0% ` Bruce Richardson
@ 2015-07-13 16:38  3% ` Bruce Richardson
  2015-07-13 17:29  0%   ` Thomas Monjalon
  2015-07-15  8:08  3%   ` Olga Shern
  1 sibling, 2 replies; 200+ results
From: Bruce Richardson @ 2015-07-13 16:38 UTC (permalink / raw)
  To: dev

The cuckoo hash has a fixed number of entries per bucket, so the
configuration parameter for this is unused. We change this field in the
parameters struct to "reserved" to indicate that there is now no such
parameter value, while at the same time keeping ABI consistency.

Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")

Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/test_func_reentrancy.c | 1 -
 app/test/test_hash_perf.c       | 1 -
 app/test/test_hash_scaling.c    | 1 -
 drivers/net/enic/enic_clsf.c    | 2 --
 examples/l3fwd-power/main.c     | 2 --
 examples/l3fwd-vf/main.c        | 1 -
 examples/l3fwd/main.c           | 2 --
 lib/librte_hash/rte_hash.h      | 2 +-
 8 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/app/test/test_func_reentrancy.c b/app/test/test_func_reentrancy.c
index 85504c0..be61773 100644
--- a/app/test/test_func_reentrancy.c
+++ b/app/test/test_func_reentrancy.c
@@ -226,7 +226,6 @@ hash_create_free(__attribute__((unused)) void *arg)
 	struct rte_hash_parameters hash_params = {
 		.name = NULL,
 		.entries = 16,
-		.bucket_entries = 4,
 		.key_len = 4,
 		.hash_func = (rte_hash_function)rte_jhash_32b,
 		.hash_func_init_val = 0,
diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
index e9a522b..a87fc80 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -100,7 +100,6 @@ int32_t positions[KEYS_TO_ADD];
 /* Parameters used for hash table in unit test functions. */
 static struct rte_hash_parameters ut_params = {
 	.entries = MAX_ENTRIES,
-	.bucket_entries = BUCKET_SIZE,
 	.hash_func = rte_jhash,
 	.hash_func_init_val = 0,
 };
diff --git a/app/test/test_hash_scaling.c b/app/test/test_hash_scaling.c
index 682ae94..39602cb 100644
--- a/app/test/test_hash_scaling.c
+++ b/app/test/test_hash_scaling.c
@@ -129,7 +129,6 @@ test_hash_scaling(int locking_mode)
 	uint64_t i, key;
 	struct rte_hash_parameters hash_params = {
 		.entries = num_iterations*2,
-		.bucket_entries = 16,
 		.key_len = sizeof(key),
 		.hash_func = rte_hash_crc,
 		.hash_func_init_val = 0,
diff --git a/drivers/net/enic/enic_clsf.c b/drivers/net/enic/enic_clsf.c
index ca12d2d..9c2abfb 100644
--- a/drivers/net/enic/enic_clsf.c
+++ b/drivers/net/enic/enic_clsf.c
@@ -63,7 +63,6 @@
 
 #define SOCKET_0                0
 #define ENICPMD_CLSF_HASH_ENTRIES       ENICPMD_FDIR_MAX
-#define ENICPMD_CLSF_BUCKET_ENTRIES     4
 
 void enic_fdir_stats_get(struct enic *enic, struct rte_eth_fdir_stats *stats)
 {
@@ -245,7 +244,6 @@ int enic_clsf_init(struct enic *enic)
 	struct rte_hash_parameters hash_params = {
 		.name = "enicpmd_clsf_hash",
 		.entries = ENICPMD_CLSF_HASH_ENTRIES,
-		.bucket_entries = ENICPMD_CLSF_BUCKET_ENTRIES,
 		.key_len = RTE_HASH_KEY_LENGTH_MAX,
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index d4eba1a..6eb459d 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -1247,7 +1247,6 @@ setup_hash(int socketid)
 	struct rte_hash_parameters ipv4_l3fwd_hash_params = {
 		.name = NULL,
 		.entries = L3FWD_HASH_ENTRIES,
-		.bucket_entries = 4,
 		.key_len = sizeof(struct ipv4_5tuple),
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
@@ -1256,7 +1255,6 @@ setup_hash(int socketid)
 	struct rte_hash_parameters ipv6_l3fwd_hash_params = {
 		.name = NULL,
 		.entries = L3FWD_HASH_ENTRIES,
-		.bucket_entries = 4,
 		.key_len = sizeof(struct ipv6_5tuple),
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
diff --git a/examples/l3fwd-vf/main.c b/examples/l3fwd-vf/main.c
index ccbb02f..01f610e 100644
--- a/examples/l3fwd-vf/main.c
+++ b/examples/l3fwd-vf/main.c
@@ -251,7 +251,6 @@ static lookup_struct_t *l3fwd_lookup_struct[NB_SOCKETS];
 struct rte_hash_parameters l3fwd_hash_params = {
 	.name = "l3fwd_hash_0",
 	.entries = L3FWD_HASH_ENTRIES,
-	.bucket_entries = 4,
 	.key_len = sizeof(struct ipv4_5tuple),
 	.hash_func = DEFAULT_HASH_FUNC,
 	.hash_func_init_val = 0,
diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 5c22ed1..def9594 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -2162,7 +2162,6 @@ setup_hash(int socketid)
     struct rte_hash_parameters ipv4_l3fwd_hash_params = {
         .name = NULL,
         .entries = L3FWD_HASH_ENTRIES,
-        .bucket_entries = 4,
         .key_len = sizeof(union ipv4_5tuple_host),
         .hash_func = ipv4_hash_crc,
         .hash_func_init_val = 0,
@@ -2171,7 +2170,6 @@ setup_hash(int socketid)
     struct rte_hash_parameters ipv6_l3fwd_hash_params = {
         .name = NULL,
         .entries = L3FWD_HASH_ENTRIES,
-        .bucket_entries = 4,
         .key_len = sizeof(union ipv6_5tuple_host),
         .hash_func = ipv6_hash_crc,
         .hash_func_init_val = 0,
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index 68109d5..1cddc07 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -75,7 +75,7 @@ typedef uint32_t (*rte_hash_function)(const void *key, uint32_t key_len,
 struct rte_hash_parameters {
 	const char *name;		/**< Name of the hash. */
 	uint32_t entries;		/**< Total hash table entries. */
-	uint32_t bucket_entries;        /**< Bucket entries. */
+	uint32_t reserved;		/**< Unused field. Should be set to 0 */
 	uint32_t key_len;		/**< Length of hash key. */
 	rte_hash_function hash_func;	/**< Primary Hash function used to calculate hash. */
 	uint32_t hash_func_init_val;	/**< Init value used by hash_func. */
-- 
2.4.3

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] hash: rename unused field to "reserved"
  2015-07-13 16:25  3% [dpdk-dev] [PATCH] hash: rename unused field to "reserved" Bruce Richardson
@ 2015-07-13 16:28  0% ` Bruce Richardson
  2015-07-13 16:38  3% ` [dpdk-dev] [PATCH v2] " Bruce Richardson
  1 sibling, 0 replies; 200+ results
From: Bruce Richardson @ 2015-07-13 16:28 UTC (permalink / raw)
  To: dev

On Mon, Jul 13, 2015 at 05:25:51PM +0100, Bruce Richardson wrote:
> The cuckoo hash has a fixed number of entries per bucket, so the
> configuration parameter for this is unused. We change this field in the
> parameters struct to "reserved" to indicate that there is now no such
> parameter value, while at the same time keeping ABI consistency.
> 
> Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")
> 
> Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Self-NAK. 

Missed some extra code dependencies using this field. V2 to follow.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation
  2015-07-13 16:20  0%                   ` Thomas Monjalon
@ 2015-07-13 16:26  0%                     ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2015-07-13 16:26 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, Jul 13, 2015 at 06:20:08PM +0200, Thomas Monjalon wrote:
> 2015-07-13 17:14, Bruce Richardson:
> > On Mon, Jul 13, 2015 at 05:11:54PM +0100, Bruce Richardson wrote:
> > > On Mon, Jul 13, 2015 at 12:29:53AM +0200, Thomas Monjalon wrote:
> > > > 2015-07-11 01:18, Pablo de Lara:
> > > > > The main change when creating a new table is that the number of entries
> > > > > per bucket is fixed now, so its parameter is ignored now
> > > > > (still there to maintain the same parameters structure).
> > > > 
> > > > Why not rename the "bucket_entries" field to "reserved"?
> > > > The API of this field has changed (now ignored) so it should be reflected
> > > > without changing the ABI.
> > > 
> > > Since the hash_create function is itself already versionned to take account of the
> > > new struct parameter, there is no reason to keep the field at all, as far as I can see.
> > > We can just drop it, and let the ABI versionning handle the change.
> > > 
> > > /Bruce
> > 
> > Sorry, my mistake. It's no longer versioned in the patchset that was merged, so
> > the field does need to be kept. :-(
> 
> So do you agree to submit a patch which rename the unused field?

Yes. It should be in your inbox now... :-)

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] hash: rename unused field to "reserved"
@ 2015-07-13 16:25  3% Bruce Richardson
  2015-07-13 16:28  0% ` Bruce Richardson
  2015-07-13 16:38  3% ` [dpdk-dev] [PATCH v2] " Bruce Richardson
  0 siblings, 2 replies; 200+ results
From: Bruce Richardson @ 2015-07-13 16:25 UTC (permalink / raw)
  To: dev

The cuckoo hash has a fixed number of entries per bucket, so the
configuration parameter for this is unused. We change this field in the
parameters struct to "reserved" to indicate that there is now no such
parameter value, while at the same time keeping ABI consistency.

Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")

Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_hash/rte_hash.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index 68109d5..1cddc07 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -75,7 +75,7 @@ typedef uint32_t (*rte_hash_function)(const void *key, uint32_t key_len,
 struct rte_hash_parameters {
 	const char *name;		/**< Name of the hash. */
 	uint32_t entries;		/**< Total hash table entries. */
-	uint32_t bucket_entries;        /**< Bucket entries. */
+	uint32_t reserved;		/**< Unused field. Should be set to 0 */
 	uint32_t key_len;		/**< Length of hash key. */
 	rte_hash_function hash_func;	/**< Primary Hash function used to calculate hash. */
 	uint32_t hash_func_init_val;	/**< Init value used by hash_func. */
-- 
2.4.3

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation
  2015-07-13 16:14  0%                 ` Bruce Richardson
@ 2015-07-13 16:20  0%                   ` Thomas Monjalon
  2015-07-13 16:26  0%                     ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-13 16:20 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

2015-07-13 17:14, Bruce Richardson:
> On Mon, Jul 13, 2015 at 05:11:54PM +0100, Bruce Richardson wrote:
> > On Mon, Jul 13, 2015 at 12:29:53AM +0200, Thomas Monjalon wrote:
> > > 2015-07-11 01:18, Pablo de Lara:
> > > > The main change when creating a new table is that the number of entries
> > > > per bucket is fixed now, so its parameter is ignored now
> > > > (still there to maintain the same parameters structure).
> > > 
> > > Why not rename the "bucket_entries" field to "reserved"?
> > > The API of this field has changed (now ignored) so it should be reflected
> > > without changing the ABI.
> > 
> > Since the hash_create function is itself already versionned to take account of the
> > new struct parameter, there is no reason to keep the field at all, as far as I can see.
> > We can just drop it, and let the ABI versionning handle the change.
> > 
> > /Bruce
> 
> Sorry, my mistake. It's no longer versioned in the patchset that was merged, so
> the field does need to be kept. :-(

So do you agree to submit a patch which rename the unused field?

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation
  2015-07-13 16:11  3%               ` Bruce Richardson
@ 2015-07-13 16:14  0%                 ` Bruce Richardson
  2015-07-13 16:20  0%                   ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2015-07-13 16:14 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, Jul 13, 2015 at 05:11:54PM +0100, Bruce Richardson wrote:
> On Mon, Jul 13, 2015 at 12:29:53AM +0200, Thomas Monjalon wrote:
> > 2015-07-11 01:18, Pablo de Lara:
> > > The main change when creating a new table is that the number of entries
> > > per bucket is fixed now, so its parameter is ignored now
> > > (still there to maintain the same parameters structure).
> > 
> > Why not rename the "bucket_entries" field to "reserved"?
> > The API of this field has changed (now ignored) so it should be reflected
> > without changing the ABI.
> 
> Since the hash_create function is itself already versionned to take account of the
> new struct parameter, there is no reason to keep the field at all, as far as I can see.
> We can just drop it, and let the ABI versionning handle the change.
> 
> /Bruce

Sorry, my mistake. It's no longer versioned in the patchset that was merged, so
the field does need to be kept. :-(

/Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation
  2015-07-12 22:29  3%             ` Thomas Monjalon
@ 2015-07-13 16:11  3%               ` Bruce Richardson
  2015-07-13 16:14  0%                 ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2015-07-13 16:11 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, Jul 13, 2015 at 12:29:53AM +0200, Thomas Monjalon wrote:
> 2015-07-11 01:18, Pablo de Lara:
> > The main change when creating a new table is that the number of entries
> > per bucket is fixed now, so its parameter is ignored now
> > (still there to maintain the same parameters structure).
> 
> Why not rename the "bucket_entries" field to "reserved"?
> The API of this field has changed (now ignored) so it should be reflected
> without changing the ABI.

Since the hash_create function is itself already versionned to take account of the
new struct parameter, there is no reason to keep the field at all, as far as I can see.
We can just drop it, and let the ABI versionning handle the change.

/Bruce

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf Helin Zhang
@ 2015-07-13 15:53  0%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-13 15:53 UTC (permalink / raw)
  To: Helin Zhang; +Cc: dev

2015-07-10 00:31, Helin Zhang:
> To avoid breaking ABI compatibility, all the changes would be enabled by RTE_NEXT_ABI,
> which is disabled by default.

It is enabled by default.
This comment will be removed from all patches of the series.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v5 4/9] ethdev: remove HW specific stats in stats structs
  2015-07-13 14:17  3% [dpdk-dev] [PATCH v5 0/9] Expose IXGBE extended stats to DPDK apps Maryam Tahhan
@ 2015-07-13 14:17  9% ` Maryam Tahhan
  0 siblings, 0 replies; 200+ results
From: Maryam Tahhan @ 2015-07-13 14:17 UTC (permalink / raw)
  To: dev

Remove non generic stats in rte_stats_strings and mark the relevant
fields in struct rte_eth_stats as deprecated.

Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
---
 doc/guides/rel_notes/abi.rst  | 12 ++++++++++++
 lib/librte_ether/rte_ethdev.c |  9 ---------
 lib/librte_ether/rte_ethdev.h | 30 ++++++++++++++++++++----------
 3 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 931e785..d5bf625 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -24,3 +24,15 @@ Deprecation Notices
 
 * The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
   deprecated and will be removed with version 2.2.
+
+* The following fields have been deprecated in rte_eth_stats:
+  * uint64_t imissed
+  * uint64_t ibadcrc
+  * uint64_t ibadlen
+  * uint64_t imcasts
+  * uint64_t fdirmatch
+  * uint64_t fdirmiss
+  * uint64_t tx_pause_xon
+  * uint64_t rx_pause_xon
+  * uint64_t tx_pause_xoff
+  * uint64_t rx_pause_xoff
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 2e62f43..976ce5f 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -142,17 +142,8 @@ static const struct rte_eth_xstats_name_off rte_stats_strings[] = {
 	{"rx_bytes", offsetof(struct rte_eth_stats, ibytes)},
 	{"tx_bytes", offsetof(struct rte_eth_stats, obytes)},
 	{"tx_errors", offsetof(struct rte_eth_stats, oerrors)},
-	{"rx_missed_errors", offsetof(struct rte_eth_stats, imissed)},
-	{"rx_crc_errors", offsetof(struct rte_eth_stats, ibadcrc)},
-	{"rx_bad_length_errors", offsetof(struct rte_eth_stats, ibadlen)},
 	{"rx_errors", offsetof(struct rte_eth_stats, ierrors)},
 	{"alloc_rx_buff_failed", offsetof(struct rte_eth_stats, rx_nombuf)},
-	{"fdir_match", offsetof(struct rte_eth_stats, fdirmatch)},
-	{"fdir_miss", offsetof(struct rte_eth_stats, fdirmiss)},
-	{"tx_flow_control_xon", offsetof(struct rte_eth_stats, tx_pause_xon)},
-	{"rx_flow_control_xon", offsetof(struct rte_eth_stats, rx_pause_xon)},
-	{"tx_flow_control_xoff", offsetof(struct rte_eth_stats, tx_pause_xoff)},
-	{"rx_flow_control_xoff", offsetof(struct rte_eth_stats, rx_pause_xoff)},
 };
 #define RTE_NB_STATS (sizeof(rte_stats_strings) / sizeof(rte_stats_strings[0]))
 
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index d76bbb3..a862027 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -193,19 +193,29 @@ struct rte_eth_stats {
 	uint64_t opackets;  /**< Total number of successfully transmitted packets.*/
 	uint64_t ibytes;    /**< Total number of successfully received bytes. */
 	uint64_t obytes;    /**< Total number of successfully transmitted bytes. */
-	uint64_t imissed;   /**< Total of RX missed packets (e.g full FIFO). */
-	uint64_t ibadcrc;   /**< Total of RX packets with CRC error. */
-	uint64_t ibadlen;   /**< Total of RX packets with bad length. */
+	/**< Deprecated; Total of RX missed packets (e.g full FIFO). */
+	uint64_t imissed;
+	/**< Deprecated; Total of RX packets with CRC error. */
+	uint64_t ibadcrc;
+	/**< Deprecated; Total of RX packets with bad length. */
+	uint64_t ibadlen;
 	uint64_t ierrors;   /**< Total number of erroneous received packets. */
 	uint64_t oerrors;   /**< Total number of failed transmitted packets. */
-	uint64_t imcasts;   /**< Total number of multicast received packets. */
+	uint64_t imcasts;
+	/**< Deprecated; Total number of multicast received packets. */
 	uint64_t rx_nombuf; /**< Total number of RX mbuf allocation failures. */
-	uint64_t fdirmatch; /**< Total number of RX packets matching a filter. */
-	uint64_t fdirmiss;  /**< Total number of RX packets not matching any filter. */
-	uint64_t tx_pause_xon;  /**< Total nb. of XON pause frame sent. */
-	uint64_t rx_pause_xon;  /**< Total nb. of XON pause frame received. */
-	uint64_t tx_pause_xoff; /**< Total nb. of XOFF pause frame sent. */
-	uint64_t rx_pause_xoff; /**< Total nb. of XOFF pause frame received. */
+	uint64_t fdirmatch;
+	/**< Deprecated; Total number of RX packets matching a filter. */
+	uint64_t fdirmiss;
+	/**< Deprecated; Total number of RX packets not matching any filter. */
+	uint64_t tx_pause_xon;
+	 /**< Deprecated; Total nb. of XON pause frame sent. */
+	uint64_t rx_pause_xon;
+	/**< Deprecated; Total nb. of XON pause frame received. */
+	uint64_t tx_pause_xoff;
+	/**< Deprecated; Total nb. of XOFF pause frame sent. */
+	uint64_t rx_pause_xoff;
+	/**< Deprecated; Total nb. of XOFF pause frame received. */
 	uint64_t q_ipackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
 	/**< Total number of queue RX packets. */
 	uint64_t q_opackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
-- 
2.4.3

^ permalink raw reply	[relevance 9%]

* [dpdk-dev] [PATCH v5 0/9] Expose IXGBE extended stats to DPDK apps
@ 2015-07-13 14:17  3% Maryam Tahhan
  2015-07-13 14:17  9% ` [dpdk-dev] [PATCH v5 4/9] ethdev: remove HW specific stats in stats structs Maryam Tahhan
  0 siblings, 1 reply; 200+ results
From: Maryam Tahhan @ 2015-07-13 14:17 UTC (permalink / raw)
  To: dev

This patch set implements xstats_get() and xstats_reset() in dev_ops for
ixgbe to expose detailed error statistics to DPDK applications. The
dump_cfg application was extended to demonstrate the usage of
retrieving statistics for DPDK interfaces and renamed to proc_info
in order reflect this new functionality. This patch set also removes non
generic statistics from the statistics strings at the ethdev level and
marks the relevant registers as depricated in struct rte_eth_stats.

v2:
 - Fixed patch dependencies.
 - Broke down patches into smaller logical changes.

v3:
 - Removes non-generic stats fields in rte_stats_strings and deprecates
   the fields related to them in struct rte_eth_stats.
 - Modifies rte_eth_xstats_get() to return generic stats and extended stats.
 
v4:
 - Replace count use in the loop in ixgbe_dev_xstats_get() function definition with i.
 - Breakdown "ixgbe: add NIC specific stats removed from ethdev" into two patches, one
   that adds the stats and another that extends ierrors to include more error stats.
 - Remove second call to ixgbe_dev_xstats_get() from rte_eth_xstats_get().

v5:
 - Added documentation for proc_info.
 - Fixed proc_info copyright year.
 - Display queue stats for all devices in proc_info.

Maryam Tahhan (9):
  ixgbe: move stats register reads to a new function
  ixgbe: add functions to get and reset xstats
  ethdev: expose extended error stats
  ethdev: remove HW specific stats in stats structs
  ixgbe: add NIC specific stats removed from ethdev
  ixgbe: return more errors in ierrors
  app: remove dump_cfg
  app: add a new app proc_info
  doc: Add documentation for proc_info

 MAINTAINERS                            |   4 +
 app/Makefile                           |   2 +-
 app/dump_cfg/Makefile                  |  45 -----
 app/dump_cfg/main.c                    |  92 ---------
 app/proc_info/Makefile                 |  45 +++++
 app/proc_info/main.c                   | 354 +++++++++++++++++++++++++++++++++
 doc/guides/rel_notes/abi.rst           |  12 ++
 doc/guides/sample_app_ug/index.rst     |   1 +
 doc/guides/sample_app_ug/proc_info.rst |  71 +++++++
 drivers/net/ixgbe/ixgbe_ethdev.c       | 194 ++++++++++++++----
 lib/librte_ether/rte_ethdev.c          |  34 ++--
 lib/librte_ether/rte_ethdev.h          |  30 ++-
 mk/rte.sdktest.mk                      |   4 +-
 13 files changed, 682 insertions(+), 206 deletions(-)
 delete mode 100644 app/dump_cfg/Makefile
 delete mode 100644 app/dump_cfg/main.c
 create mode 100644 app/proc_info/Makefile
 create mode 100644 app/proc_info/main.c
 create mode 100644 doc/guides/sample_app_ug/proc_info.rst

-- 
2.4.3

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 10:47  4%   ` Mcnamara, John
@ 2015-07-13 13:59  4%     ` Neil Horman
  2015-07-17 11:45  7%       ` Mcnamara, John
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2015-07-13 13:59 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

On Mon, Jul 13, 2015 at 10:47:03AM +0000, Mcnamara, John wrote:
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Monday, July 13, 2015 11:42 AM
> > To: Mcnamara, John
> > Cc: dev@dpdk.org; vladz@cloudius-systems.com
> > Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> > 
> > On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
> > > Fix for ABI breakage introduced in LRO addition. Moves lro bitfield to
> > > the end of the struct/member.
> > >
> > > Fixes: 8eecb3295aed (ixgbe: add LRO support)
> > >
> > > Signed-off-by: John McNamara <john.mcnamara@intel.com>
> > > ---
> > >  lib/librte_ether/rte_ethdev.h | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.h
> > > b/lib/librte_ether/rte_ethdev.h index 79bde89..1c3ace1 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
> > >  	uint8_t port_id;           /**< Device [external] port identifier.
> > */
> > >  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0).
> > */
> > >  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) /
> > OFF(0) */
> > > -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
> > >  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0).
> > */
> > > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0).
> > */
> > > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0).
> > */
> > > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> > >  };
> > >
> > >  /**
> > > --
> > > 1.8.1.4
> > >
> > >
> > I presume the ABI checker stopped complaining about this with the patch,
> > yes?
> 
> Hi Neil,
> 
> Yes, I replied about that in the previous thread.
> 
Thank you, I'll ack as soon as Chao confirms its not a problem on ppc
Neil

> John.
> -- 
> 
> 

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 10:42  7% ` Neil Horman
@ 2015-07-13 10:46  4%   ` Thomas Monjalon
  2015-07-13 10:47  4%   ` Mcnamara, John
  1 sibling, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-13 10:46 UTC (permalink / raw)
  To: Chao Zhu; +Cc: dev

2015-07-13 06:42, Neil Horman:
> On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
> > Fix for ABI breakage introduced in LRO addition. Moves
> > lro bitfield to the end of the struct/member.
> > 
> > Fixes: 8eecb3295aed (ixgbe: add LRO support)
> > 
> > Signed-off-by: John McNamara <john.mcnamara@intel.com>
> > ---
> >  lib/librte_ether/rte_ethdev.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > index 79bde89..1c3ace1 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
> >  	uint8_t port_id;           /**< Device [external] port identifier. */
> >  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
> >  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
> > -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
> >  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
> > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> >  };
> >  
> >  /**
> I presume the ABI checker stopped complaining about this with the patch, yes?
> 
> Also, it would be great if someone could check this on ppc or a ppc cross
> compile, as I recall bitfields follow endianess order.

+ Chao, IBM POWER maintainer.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 10:42  7% ` Neil Horman
  2015-07-13 10:46  4%   ` Thomas Monjalon
@ 2015-07-13 10:47  4%   ` Mcnamara, John
  2015-07-13 13:59  4%     ` Neil Horman
  1 sibling, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-13 10:47 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Monday, July 13, 2015 11:42 AM
> To: Mcnamara, John
> Cc: dev@dpdk.org; vladz@cloudius-systems.com
> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> 
> On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
> > Fix for ABI breakage introduced in LRO addition. Moves lro bitfield to
> > the end of the struct/member.
> >
> > Fixes: 8eecb3295aed (ixgbe: add LRO support)
> >
> > Signed-off-by: John McNamara <john.mcnamara@intel.com>
> > ---
> >  lib/librte_ether/rte_ethdev.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.h
> > b/lib/librte_ether/rte_ethdev.h index 79bde89..1c3ace1 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
> >  	uint8_t port_id;           /**< Device [external] port identifier.
> */
> >  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0).
> */
> >  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) /
> OFF(0) */
> > -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
> >  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0).
> */
> > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0).
> */
> > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0).
> */
> > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> >  };
> >
> >  /**
> > --
> > 1.8.1.4
> >
> >
> I presume the ABI checker stopped complaining about this with the patch,
> yes?

Hi Neil,

Yes, I replied about that in the previous thread.

John.
-- 

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 10:26  7% [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code John McNamara
@ 2015-07-13 10:42  7% ` Neil Horman
  2015-07-13 10:46  4%   ` Thomas Monjalon
  2015-07-13 10:47  4%   ` Mcnamara, John
  2015-07-16 22:22  4% ` Vlad Zolotarov
  1 sibling, 2 replies; 200+ results
From: Neil Horman @ 2015-07-13 10:42 UTC (permalink / raw)
  To: John McNamara; +Cc: dev

On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
> Fix for ABI breakage introduced in LRO addition. Moves
> lro bitfield to the end of the struct/member.
> 
> Fixes: 8eecb3295aed (ixgbe: add LRO support)
> 
> Signed-off-by: John McNamara <john.mcnamara@intel.com>
> ---
>  lib/librte_ether/rte_ethdev.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 79bde89..1c3ace1 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
>  	uint8_t port_id;           /**< Device [external] port identifier. */
>  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
>  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
> -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
>  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
> -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
>  };
>  
>  /**
> -- 
> 1.8.1.4
> 
> 
I presume the ABI checker stopped complaining about this with the patch, yes?

Also, it would be great if someone could check this on ppc or a ppc cross
compile, as I recall bitfields follow endianess order.

Neil

^ permalink raw reply	[relevance 7%]

* [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
@ 2015-07-13 10:26  7% John McNamara
  2015-07-13 10:42  7% ` Neil Horman
  2015-07-16 22:22  4% ` Vlad Zolotarov
  0 siblings, 2 replies; 200+ results
From: John McNamara @ 2015-07-13 10:26 UTC (permalink / raw)
  To: dev, vladz

Fix for ABI breakage introduced in LRO addition. Moves
lro bitfield to the end of the struct/member.

Fixes: 8eecb3295aed (ixgbe: add LRO support)

Signed-off-by: John McNamara <john.mcnamara@intel.com>
---
 lib/librte_ether/rte_ethdev.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 79bde89..1c3ace1 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
 	uint8_t port_id;           /**< Device [external] port identifier. */
 	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
 		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
-		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
 		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
-		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
+		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
+		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
 };
 
 /**
-- 
1.8.1.4

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
  2015-07-11 14:19  7%               ` Neil Horman
@ 2015-07-13 10:14  8%                 ` Mcnamara, John
  0 siblings, 0 replies; 200+ results
From: Mcnamara, John @ 2015-07-13 10:14 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Saturday, July 11, 2015 3:20 PM
> To: Mcnamara, John
> Cc: Thomas Monjalon; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
> 
> On Fri, Jul 10, 2015 at 04:07:53PM +0000, Mcnamara, John wrote:
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> > > Sent: Tuesday, July 7, 2015 2:44 PM
> > > To: Thomas Monjalon
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
> > >
> ...
>
> > Is it okay to submit a patch to move it to the end?
> >
> Assuming that fixes the problem, I think thats the only thing you can do
> right now.  I expect that will work, but I would run it through the ABI
> checker to be certain.

Hi Neil,

I ran the change through the abi checker before and after and the lro field no longer generates a Medium Severity ABI warning. I'll submit a patch to move the field.

John.
-- 

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi
  2015-07-13  9:24  4%             ` Mcnamara, John
@ 2015-07-13  9:32  7%               ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-13  9:32 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

> > When next ABI is enabled, the shared lib extension is .so.x.1.
> > That's why a double basename was introduced.
> > But the "ifeq NEXT_ABI" was forgotten, removing the .so extension when
> > NEXT_ABI is disabled.
> > It was preventing the linker from finding the .so libraries.
> > 
> > Fixes: 506f51cc0da7 ("mk: enable next abi preview")
> > 
> > Reported-by: John McNamara <john.mcnamara@intel.com>
> > Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> 
> That fixes it. Thanks.
> 
> Acked-by: John McNamara <john.mcnamara@intel.com>

Applied, thanks.
Now that next ABI can be disabled, patches using it may be applied :)

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi
  2015-07-13  9:02  8%           ` [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi Thomas Monjalon
@ 2015-07-13  9:24  4%             ` Mcnamara, John
  2015-07-13  9:32  7%               ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-13  9:24 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, July 13, 2015 10:02 AM
> To: Mcnamara, John
> Cc: dev@dpdk.org
> Subject: [PATCH] mk: fix shared lib build with stable abi
> 
> When next ABI is enabled, the shared lib extension is .so.x.1.
> That's why a double basename was introduced.
> But the "ifeq NEXT_ABI" was forgotten, removing the .so extension when
> NEXT_ABI is disabled.
> It was preventing the linker from finding the .so libraries.
> 
> Fixes: 506f51cc0da7 ("mk: enable next abi preview")
> 
> Reported-by: John McNamara <john.mcnamara@intel.com>
> Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>

That fixes it. Thanks.

Acked-by: John McNamara <john.mcnamara@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi
  2015-07-13  8:48  7%         ` Thomas Monjalon
@ 2015-07-13  9:02  8%           ` Thomas Monjalon
  2015-07-13  9:24  4%             ` Mcnamara, John
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-13  9:02 UTC (permalink / raw)
  To: John McNamara; +Cc: dev

When next ABI is enabled, the shared lib extension is .so.x.1.
That's why a double basename was introduced.
But the "ifeq NEXT_ABI" was forgotten, removing the .so
extension when NEXT_ABI is disabled.
It was preventing the linker from finding the .so libraries.

Fixes: 506f51cc0da7 ("mk: enable next abi preview")

Reported-by: John McNamara <john.mcnamara@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
 mk/rte.lib.mk | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index f15de9b..9ff5cce 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -173,7 +173,11 @@ $(RTE_OUTPUT)/lib/$(LIB): $(LIB)
 	@[ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib
 	$(Q)cp -f $(LIB) $(RTE_OUTPUT)/lib
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+ifeq ($(CONFIG_RTE_NEXT_ABI),y)
 	$(Q)ln -s -f $< $(basename $(basename $@))
+else
+	$(Q)ln -s -f $< $(basename $@)
+endif
 endif
 
 #
-- 
2.4.2

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH v3] mk: enable next abi preview
  2015-07-13  7:32  7%       ` Mcnamara, John
@ 2015-07-13  8:48  7%         ` Thomas Monjalon
  2015-07-13  9:02  8%           ` [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-13  8:48 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

2015-07-13 07:32, Mcnamara, John:
> This change to enable CONFIG_RTE_NEXT_ABI=n breaks validate-abi.sh
> because master won't compile with CONFIG_RTE_BUILD_SHARED_LIB=y and
> CONFIG_RTE_NEXT_ABI=n:

My bad. I thought I was testing both cases (next ABI and stable one) but
it appears only the "next one" was tested.

The error is trivial:
-       $(Q)ln -s -f $< $(RTE_OUTPUT)/lib/$(LIBSONAME)
+       $(Q)ln -s -f $< $(basename $(basename $@))

The double basename should apply to NEXT_ABI case only.

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH v3] mk: enable next abi preview
  2015-07-08 16:44 33%     ` [dpdk-dev] [PATCH v3] " Thomas Monjalon
@ 2015-07-13  7:32  7%       ` Mcnamara, John
  2015-07-13  8:48  7%         ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-13  7:32 UTC (permalink / raw)
  To: Thomas Monjalon, nhorman; +Cc: dev


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Wednesday, July 8, 2015 5:44 PM
> To: nhorman@tuxdriver.com
> Cc: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3] mk: enable next abi preview
> 
> --- a/scripts/validate-abi.sh
> +++ b/scripts/validate-abi.sh
> @@ -157,6 +157,7 @@ git checkout $TAG1
>  # Make sure we configure SHARED libraries  # Also turn off IGB and KNI as
> those require kernel headers to build  sed -i -e"$
> a\CONFIG_RTE_BUILD_SHARED_LIB=y" config/defconfig_$TARGET
> +sed -i -e"$ a\CONFIG_RTE_NEXT_ABI=n" config/defconfig_$TARGET
>  sed -i -e"$ a\CONFIG_RTE_EAL_IGB_UIO=n" config/defconfig_$TARGET  sed -i
> -e"$ a\CONFIG_RTE_LIBRTE_KNI=n" config/defconfig_$TARGET
> 

Hi,

This change to enable CONFIG_RTE_NEXT_ABI=n breaks validate-abi.sh because master won't compile with CONFIG_RTE_BUILD_SHARED_LIB=y and CONFIG_RTE_NEXT_ABI=n:

     make uninstall && \
     make T=x86_64-native-linuxapp-gcc install -j \
          CONFIG_RTE_BUILD_SHARED_LIB=y CONFIG_RTE_NEXT_ABI=n

Change was made in: 506f51cc0da7e057ac31e15048ba3b8015112226

John.
-- 

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash
  2015-07-11  0:18  4%         ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
    2015-07-11  0:18 14%           ` [dpdk-dev] [PATCH v7 6/7] doc: announce ABI change of librte_hash Pablo de Lara
@ 2015-07-12 22:46  0%           ` Thomas Monjalon
  2 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-12 22:46 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

2015-07-11 01:18, Pablo de Lara:
> This patchset is to replace the existing hash library with
> a more efficient and functional approach, using the Cuckoo hash
> method to deal with collisions. This method is based on using
> two different hash functions to have two possible locations
> in the hash table where an entry can be.
> So, if a bucket is full, a new entry can push one of the items
> in that bucket to its alternative location, making space for itself.
> 
> Advantages
> ~~
> - Offers the option to store more entries when the target bucket is full
>   (unlike the previous implementation)
> - Memory efficient: for storing those entries, it is not necessary to
>   request new memory, as the entries will be stored in the same table
> - Constant worst lookup time: in worst case scenario, it always takes
>   the same time to look up an entry, as there are only two possible locations
>   where an entry can be.
> - Storing data: user can store data in the hash table, unlike the
>   previous implementation, but he can still use the old API
> 
> This implementation typically offers over 90% utilization.
> Notice that API has been extended, but old API remains.
> Check documentation included to know more about this new implementation
> (including how entries are distributed as table utilization increases).
> 
> Changes in v7:
> - Fix inaccurate documentation
> 
> Changes in v6:
> - Replace datatype for functions from uintptr_t to void *
> 
> Changes in v5:
> - Fix 32-bit compilation issues
> 
> Changes in v4:
> - Unit tests enhancements are not part of this patchset anymore.
> - rte_hash structure has been made internal in another patch,
>   so it is not part of this patchset anymore.
> - Add function to iterate through the hash table, as rte_hash
>   structure has been made private.
> - Added extra_flag parameter in rte_hash_parameter to be able
>   to add new parameters in the future without breaking the ABI
> - Remove proposed lookup_bulk_with_hash function, as it is
>   not of much use with the existing hash functions
>   (there are no vector hash functions).
> - User can store 8-byte integer or pointer as data, instead
>   of variable size data, as discussed in the mailing list.
> 
> Changes in v3:
> 
> - Now user can store variable size data, instead of 32 or 64-bit size data,
>   using the new parameter "data_len" in rte_hash_parameters
> - Add lookup_bulk_with_hash function in performance  unit tests
> - Add new functions that handle data in performance unit tests
> - Remove duplicates in performance unit tests
> - Fix rte_hash_reset, which was not resetting the last entry
> 
> Changes in v2:
> 
> - Fixed issue where table could not store maximum number of entries
> - Fixed issue where lookup burst could not be more than 32 (instead of 64)
> - Remove unnecessary macros and add other useful ones
> - Added missing library dependencies
> - Used directly rte_hash_secondary instead of rte_hash_alt
> - Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
> - Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
> - Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
>   make them private
> - Corrected copyright dates
> - Added an optimized function to compare keys that are multiple of 16 bytes
> - Improved the way to use primary/secondary signatures. Now both are stored in
>   the bucket, so there is no need to calculate them if required.
>   Also, there is no need to use the MSB of a signature to differenciate between
>   an empty entry and signature 0, since we are storing both signatures,
>   which cannot be both 0.
> - Removed rte_hash_rehash, as it was a very expensive operation.
>   Therefore, the add function returns now -ENOSPC if key cannot be added
>   because of a loop.
> - Prefetched new slot for new key in add function to improve performance.
> - Made doxygen comments more clear.
> - Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
>   as we can use the lookup functions if we want to get the data before deleting.
> - Removed some unnecessary includes in rte_hash.h
> - Removed some unnecessary variables in rte_cuckoo_hash.c
> - Removed some unnecessary checks before creating a new hash table
> - Added documentation (in release notes and programmers guide)
> - Added new unit tests and replaced the performance one for hash tables
> 
> Series Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Applied, thanks

Some bugs/formatting were fixed in fly.
Some remaining comments may be addressed in further patches.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v7 6/7] doc: announce ABI change of librte_hash
  2015-07-11  0:18 14%           ` [dpdk-dev] [PATCH v7 6/7] doc: announce ABI change of librte_hash Pablo de Lara
@ 2015-07-12 22:38  8%             ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-12 22:38 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

2015-07-11 01:18, Pablo de Lara:
> Two of the macros in rte_hash.h are now deprecated, so this patch
> adds notice that they will be removed in 2.2.
[...]
> +* The Macros #RTE_HASH_BUCKET_ENTRIES_MAX and #RTE_HASH_KEY_LENGTH_MAX are deprecated and will be removed with version 2.2.

These macros have no impact on the ABI.
I suggest to rename doc/guides/rel_notes/abi.rst to deprecation.rst
and add a chapter about API in doc/guides/guidelines/versioning.rst.

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation
  @ 2015-07-12 22:29  3%             ` Thomas Monjalon
  2015-07-13 16:11  3%               ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-12 22:29 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

2015-07-11 01:18, Pablo de Lara:
> The main change when creating a new table is that the number of entries
> per bucket is fixed now, so its parameter is ignored now
> (still there to maintain the same parameters structure).

Why not rename the "bucket_entries" field to "reserved"?
The API of this field has changed (now ignored) so it should be reflected
without changing the ABI.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
  2015-07-10 16:07  4%             ` Mcnamara, John
@ 2015-07-11 14:19  7%               ` Neil Horman
  2015-07-13 10:14  8%                 ` Mcnamara, John
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2015-07-11 14:19 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

On Fri, Jul 10, 2015 at 04:07:53PM +0000, Mcnamara, John wrote:
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> > Sent: Tuesday, July 7, 2015 2:44 PM
> > To: Thomas Monjalon
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
> > 
> > On Tue, Jul 07, 2015 at 05:46:08AM -0700, Thomas Monjalon wrote:
> > > Neil, in the meantime, could you please help to check ABI breakage in
> > the HEAD?
> > >
> > Took a look, the only ABI break I see that we need to worry about is the
> > one introduced in commit 8eecb3295aed0a979def52245564d03be172a83c. It adds
> > a bitfield called lro into the existing uint8_t there, but does so in the
> > middle of the set, which pushes the other bits around, breaking ABI.  It
> > should have been added to the end.
> 
> Hi,
> 
> Is it okay to submit a patch to move it to the end?
> 
Assuming that fixes the problem, I think thats the only thing you can do right
now.  I expect that will work, but I would run it through the ABI checker to be
certain
Neil


> John.
> -- 
> 
> 
> 
> 

^ permalink raw reply	[relevance 7%]

* [dpdk-dev] [PATCH v7 6/7] doc: announce ABI change of librte_hash
  2015-07-11  0:18  4%         ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  @ 2015-07-11  0:18 14%           ` Pablo de Lara
  2015-07-12 22:38  8%             ` Thomas Monjalon
  2015-07-12 22:46  0%           ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Thomas Monjalon
  2 siblings, 1 reply; 200+ results
From: Pablo de Lara @ 2015-07-11  0:18 UTC (permalink / raw)
  To: dev

Two of the macros in rte_hash.h are now deprecated, so this patch
adds notice that they will be removed in 2.2.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9d60a74..b45017c 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -21,3 +21,4 @@ Deprecation Notices
   1024 queues per port. This change will be in release 2.2.
   There is no backward compatibility planned from release 2.2.
   All binaries will need to be rebuilt from release 2.2.
+* The Macros #RTE_HASH_BUCKET_ENTRIES_MAX and #RTE_HASH_KEY_LENGTH_MAX are deprecated and will be removed with version 2.2.
-- 
2.4.3

^ permalink raw reply	[relevance 14%]

* [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash
  2015-07-10 23:30  4%       ` [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  2015-07-10 23:30 14%         ` [dpdk-dev] [PATCH v6 6/7] doc: announce ABI change of librte_hash Pablo de Lara
@ 2015-07-11  0:18  4%         ` Pablo de Lara
                               ` (2 more replies)
  1 sibling, 3 replies; 200+ results
From: Pablo de Lara @ 2015-07-11  0:18 UTC (permalink / raw)
  To: dev

This patchset is to replace the existing hash library with
a more efficient and functional approach, using the Cuckoo hash
method to deal with collisions. This method is based on using
two different hash functions to have two possible locations
in the hash table where an entry can be.
So, if a bucket is full, a new entry can push one of the items
in that bucket to its alternative location, making space for itself.

Advantages
~~
- Offers the option to store more entries when the target bucket is full
  (unlike the previous implementation)
- Memory efficient: for storing those entries, it is not necessary to
  request new memory, as the entries will be stored in the same table
- Constant worst lookup time: in worst case scenario, it always takes
  the same time to look up an entry, as there are only two possible locations
  where an entry can be.
- Storing data: user can store data in the hash table, unlike the
  previous implementation, but he can still use the old API

This implementation typically offers over 90% utilization.
Notice that API has been extended, but old API remains.
Check documentation included to know more about this new implementation
(including how entries are distributed as table utilization increases).

Changes in v7:
- Fix inaccurate documentation

Changes in v6:
- Replace datatype for functions from uintptr_t to void *

Changes in v5:
- Fix 32-bit compilation issues

Changes in v4:
- Unit tests enhancements are not part of this patchset anymore.
- rte_hash structure has been made internal in another patch,
  so it is not part of this patchset anymore.
- Add function to iterate through the hash table, as rte_hash
  structure has been made private.
- Added extra_flag parameter in rte_hash_parameter to be able
  to add new parameters in the future without breaking the ABI
- Remove proposed lookup_bulk_with_hash function, as it is
  not of much use with the existing hash functions
  (there are no vector hash functions).
- User can store 8-byte integer or pointer as data, instead
  of variable size data, as discussed in the mailing list.

Changes in v3:

- Now user can store variable size data, instead of 32 or 64-bit size data,
  using the new parameter "data_len" in rte_hash_parameters
- Add lookup_bulk_with_hash function in performance  unit tests
- Add new functions that handle data in performance unit tests
- Remove duplicates in performance unit tests
- Fix rte_hash_reset, which was not resetting the last entry

Changes in v2:

- Fixed issue where table could not store maximum number of entries
- Fixed issue where lookup burst could not be more than 32 (instead of 64)
- Remove unnecessary macros and add other useful ones
- Added missing library dependencies
- Used directly rte_hash_secondary instead of rte_hash_alt
- Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
- Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
- Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
  make them private
- Corrected copyright dates
- Added an optimized function to compare keys that are multiple of 16 bytes
- Improved the way to use primary/secondary signatures. Now both are stored in
  the bucket, so there is no need to calculate them if required.
  Also, there is no need to use the MSB of a signature to differenciate between
  an empty entry and signature 0, since we are storing both signatures,
  which cannot be both 0.
- Removed rte_hash_rehash, as it was a very expensive operation.
  Therefore, the add function returns now -ENOSPC if key cannot be added
  because of a loop.
- Prefetched new slot for new key in add function to improve performance.
- Made doxygen comments more clear.
- Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
  as we can use the lookup functions if we want to get the data before deleting.
- Removed some unnecessary includes in rte_hash.h
- Removed some unnecessary variables in rte_cuckoo_hash.c
- Removed some unnecessary checks before creating a new hash table
- Added documentation (in release notes and programmers guide)
- Added new unit tests and replaced the performance one for hash tables

Series Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Pablo de Lara (7):
  hash: replace existing hash library with cuckoo hash implementation
  hash: add new function rte_hash_reset
  hash: add new functionality to store data in hash table
  hash: add iterate function
  MAINTAINERS: claim responsability for hash library
  doc: announce ABI change of librte_hash
  doc: update hash documentation

 MAINTAINERS                          |    1 +
 app/test/test_hash.c                 |  189 +++---
 app/test/test_hash_perf.c            |  303 ++++++---
 doc/guides/prog_guide/hash_lib.rst   |  138 +++-
 doc/guides/prog_guide/index.rst      |    4 +
 doc/guides/rel_notes/abi.rst         |    1 +
 lib/librte_hash/Makefile             |    8 +-
 lib/librte_hash/rte_cuckoo_hash.c    | 1194 ++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_hash.c           |  499 --------------
 lib/librte_hash/rte_hash.h           |  198 +++++-
 lib/librte_hash/rte_hash_version.map |   15 +
 11 files changed, 1810 insertions(+), 740 deletions(-)
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.c
 delete mode 100644 lib/librte_hash/rte_hash.c

-- 
2.4.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v6 6/7] doc: announce ABI change of librte_hash
  2015-07-10 23:30  4%       ` [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
@ 2015-07-10 23:30 14%         ` Pablo de Lara
  2015-07-11  0:18  4%         ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  1 sibling, 0 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 23:30 UTC (permalink / raw)
  To: dev

Two of the macros in rte_hash.h are now deprecated, so this patch
adds notice that they will be removed in 2.2.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9d60a74..b45017c 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -21,3 +21,4 @@ Deprecation Notices
   1024 queues per port. This change will be in release 2.2.
   There is no backward compatibility planned from release 2.2.
   All binaries will need to be rebuilt from release 2.2.
+* The Macros #RTE_HASH_BUCKET_ENTRIES_MAX and #RTE_HASH_KEY_LENGTH_MAX are deprecated and will be removed with version 2.2.
-- 
2.4.3

^ permalink raw reply	[relevance 14%]

* [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash
  2015-07-10 21:57  4%     ` [dpdk-dev] [PATCH v5 " Pablo de Lara
  2015-07-10 21:57 14%       ` [dpdk-dev] [PATCH v5 6/7] doc: announce ABI change of librte_hash Pablo de Lara
@ 2015-07-10 23:30  4%       ` Pablo de Lara
  2015-07-10 23:30 14%         ` [dpdk-dev] [PATCH v6 6/7] doc: announce ABI change of librte_hash Pablo de Lara
  2015-07-11  0:18  4%         ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  1 sibling, 2 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 23:30 UTC (permalink / raw)
  To: dev

This patchset is to replace the existing hash library with
a more efficient and functional approach, using the Cuckoo hash
method to deal with collisions. This method is based on using
two different hash functions to have two possible locations
in the hash table where an entry can be.
So, if a bucket is full, a new entry can push one of the items
in that bucket to its alternative location, making space for itself.

Advantages
~~~
- Offers the option to store more entries when the target bucket is full
  (unlike the previous implementation)
- Memory efficient: for storing those entries, it is not necessary to
  request new memory, as the entries will be stored in the same table
- Constant worst lookup time: in worst case scenario, it always takes
  the same time to look up an entry, as there are only two possible locations
  where an entry can be.
- Storing data: user can store data in the hash table, unlike the
  previous implementation, but he can still use the old API

This implementation typically offers over 90% utilization.
Notice that API has been extended, but old API remains.
Check documentation included to know more about this new implementation
(including how entries are distributed as table utilization increases).

Changes in v6:
- Replace datatype for functions from uintptr_t to void *

Changes in v5:
- Fix documentation
- Fix 32-bit compilation issues

Changes in v4:
- Unit tests enhancements are not part of this patchset anymore.
- rte_hash structure has been made internal in another patch,
  so it is not part of this patchset anymore.
- Add function to iterate through the hash table, as rte_hash
  structure has been made private.
- Added extra_flag parameter in rte_hash_parameter to be able
  to add new parameters in the future without breaking the ABI
- Remove proposed lookup_bulk_with_hash function, as it is
  not of much use with the existing hash functions
  (there are no vector hash functions).
- User can store 8-byte integer or pointer as data, instead
  of variable size data, as discussed in the mailing list.

Changes in v3:

- Now user can store variable size data, instead of 32 or 64-bit size data,
  using the new parameter "data_len" in rte_hash_parameters
- Add lookup_bulk_with_hash function in performance  unit tests
- Add new functions that handle data in performance unit tests
- Remove duplicates in performance unit tests
- Fix rte_hash_reset, which was not resetting the last entry

Changes in v2:

- Fixed issue where table could not store maximum number of entries
- Fixed issue where lookup burst could not be more than 32 (instead of 64)
- Remove unnecessary macros and add other useful ones
- Added missing library dependencies
- Used directly rte_hash_secondary instead of rte_hash_alt
- Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
- Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
- Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
  make them private
- Corrected copyright dates
- Added an optimized function to compare keys that are multiple of 16 bytes
- Improved the way to use primary/secondary signatures. Now both are stored in
  the bucket, so there is no need to calculate them if required.
  Also, there is no need to use the MSB of a signature to differenciate between
  an empty entry and signature 0, since we are storing both signatures,
  which cannot be both 0.
- Removed rte_hash_rehash, as it was a very expensive operation.
  Therefore, the add function returns now -ENOSPC if key cannot be added
  because of a loop.
- Prefetched new slot for new key in add function to improve performance.
- Made doxygen comments more clear.
- Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
  as we can use the lookup functions if we want to get the data before deleting.
- Removed some unnecessary includes in rte_hash.h
- Removed some unnecessary variables in rte_cuckoo_hash.c
- Removed some unnecessary checks before creating a new hash table
- Added documentation (in release notes and programmers guide)
- Added new unit tests and replaced the performance one for hash tables

Series Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Pablo de Lara (7):
  hash: replace existing hash library with cuckoo hash implementation
  hash: add new function rte_hash_reset
  hash: add new functionality to store data in hash table
  hash: add iterate function
  MAINTAINERS: claim responsability for hash library
  doc: announce ABI change of librte_hash
  doc: update hash documentation

 MAINTAINERS                          |    1 +
 app/test/test_hash.c                 |  189 +++---
 app/test/test_hash_perf.c            |  303 ++++++---
 doc/guides/prog_guide/hash_lib.rst   |  138 +++-
 doc/guides/prog_guide/index.rst      |    4 +
 doc/guides/rel_notes/abi.rst         |    1 +
 lib/librte_hash/Makefile             |    8 +-
 lib/librte_hash/rte_cuckoo_hash.c    | 1194 ++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_hash.c           |  499 --------------
 lib/librte_hash/rte_hash.h           |  198 +++++-
 lib/librte_hash/rte_hash_version.map |   15 +
 11 files changed, 1810 insertions(+), 740 deletions(-)
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.c
 delete mode 100644 lib/librte_hash/rte_hash.c

-- 
2.4.3

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc:announce ABI changes planned for struct rte_eth_dev to support up to 1024 queues per port
  2015-07-08 11:07  4% ` Neil Horman
@ 2015-07-10 22:14  4%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-10 22:14 UTC (permalink / raw)
  To: Jijiang Liu; +Cc: dev

> > The significant ABI change of all shared libraries is planned for struct rte_eth_dev to support up to 1024 queues per port which will be taken effect from release 2.2.
> > 
> > Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
[...]
> >  Deprecation Notices
> >  -------------------
> > +* Significant ABI changes are planned for struct rte_eth_dev to support up to 1024 queues per port. This change will be taken effect for shared libraries from release 2.2. There is no backward compatibility planned from release 2.2. All binaries will need to be rebuilt from release 2.2.

not only for shared libraries

> Acked-by: Neil Horman <nhorman@tuxdriver.com>

Wrapped and applied with above fix, thanks

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] config: revert the CONFIG_RTE_MAX_QUEUES_PER_PORT to 256
  2015-07-08  1:24  3% [dpdk-dev] [PATCH] config: revert the CONFIG_RTE_MAX_QUEUES_PER_PORT to 256 Jijiang Liu
@ 2015-07-10 21:58  0% ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-10 21:58 UTC (permalink / raw)
  To: Jijiang Liu; +Cc: dev

2015-07-08 09:24, Jijiang Liu:
> Revert the CONFIG_RTE_MAX_QUEUES_PER_PORT to 256.
> 
> The previous commit changed the size and the offsets of struct rte_eth_dev,
> so it is an ABI breakage. I revert it, and will send a deprecation notice for this.
> 
> Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>

Applied, thanks

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v5 6/7] doc: announce ABI change of librte_hash
  2015-07-10 21:57  4%     ` [dpdk-dev] [PATCH v5 " Pablo de Lara
@ 2015-07-10 21:57 14%       ` Pablo de Lara
  2015-07-10 23:30  4%       ` [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  1 sibling, 0 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 21:57 UTC (permalink / raw)
  To: dev

Two of the macros in rte_hash.h are now deprecated, so this patch
adds notice that they will be removed in 2.2.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..312348e 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,4 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
+* The Macros #RTE_HASH_BUCKET_ENTRIES_MAX and #RTE_HASH_KEY_LENGTH_MAX are deprecated and will be removed with version 2.2.
-- 
2.4.3

^ permalink raw reply	[relevance 14%]

* [dpdk-dev] [PATCH v5 0/7] Cuckoo hash - part 3 of Cuckoo hash
  2015-07-10 17:24  4%   ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of " Pablo de Lara
  2015-07-10 17:24 14%     ` [dpdk-dev] [PATCH v4 6/7] doc: announce ABI change of librte_hash Pablo de Lara
  2015-07-10 20:52  0%     ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash Bruce Richardson
@ 2015-07-10 21:57  4%     ` Pablo de Lara
  2015-07-10 21:57 14%       ` [dpdk-dev] [PATCH v5 6/7] doc: announce ABI change of librte_hash Pablo de Lara
  2015-07-10 23:30  4%       ` [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  2 siblings, 2 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 21:57 UTC (permalink / raw)
  To: dev

This patchset is to replace the existing hash library with
a more efficient and functional approach, using the Cuckoo hash
method to deal with collisions. This method is based on using
two different hash functions to have two possible locations
in the hash table where an entry can be.
So, if a bucket is full, a new entry can push one of the items
in that bucket to its alternative location, making space for itself.

Advantages
~~~~
- Offers the option to store more entries when the target bucket is full
  (unlike the previous implementation)
- Memory efficient: for storing those entries, it is not necessary to
  request new memory, as the entries will be stored in the same table
- Constant worst lookup time: in worst case scenario, it always takes
  the same time to look up an entry, as there are only two possible locations
  where an entry can be.
- Storing data: user can store data in the hash table, unlike the
  previous implementation, but he can still use the old API

This implementation tipically offers over 90% utilization.
Notice that API has been extended, but old API remains.
Check documentation included to know more about this new implementation
(including how entries are distributed as table utilization increases).

Changes in v5:
- Fix documentation
- Fix 32-bit compilation issues

Changes in v4:
- Unit tests enhancements are not part of this patchset anymore.
- rte_hash structure has been made internal in another patch,
  so it is not part of this patchset anymore.
- Add function to iterate through the hash table, as rte_hash
  structure has been made private.
- Added extra_flag parameter in rte_hash_parameter to be able
  to add new parameters in the future without breaking the ABI
- Remove proposed lookup_bulk_with_hash function, as it is
  not of much use with the existing hash functions
  (there are no vector hash functions).
- User can store 8-byte integer or pointer as data, instead
  of variable size data, as discussed in the mailing list.

Changes in v3:

- Now user can store variable size data, instead of 32 or 64-bit size data,
  using the new parameter "data_len" in rte_hash_parameters
- Add lookup_bulk_with_hash function in performance  unit tests
- Add new functions that handle data in performance unit tests
- Remove duplicates in performance unit tests
- Fix rte_hash_reset, which was not reseting the last entry

Changes in v2:

- Fixed issue where table could not store maximum number of entries
- Fixed issue where lookup burst could not be more than 32 (instead of 64)
- Remove unnecessary macros and add other useful ones
- Added missing library dependencies
- Used directly rte_hash_secondary instead of rte_hash_alt
- Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
- Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
- Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
  make them private
- Corrected copyright dates
- Added an optimized function to compare keys that are multiple of 16 bytes
- Improved the way to use primary/secondary signatures. Now both are stored in
  the bucket, so there is no need to calculate them if required.
  Also, there is no need to use the MSB of a signature to differenciate between
  an empty entry and signature 0, since we are storing both signatures,
  which cannot be both 0.
- Removed rte_hash_rehash, as it was a very expensive operation.
  Therefore, the add function returns now -ENOSPC if key cannot be added
  because of a loop.
- Prefetched new slot for new key in add function to improve performance.
- Made doxygen comments more clear.
- Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
  as we can use the lookup functions if we want to get the data before deleting.
- Removed some unnecessary includes in rte_hash.h
- Removed some unnecessary variables in rte_cuckoo_hash.c
- Removed some unnecessary checks before creating a new hash table
- Added documentation (in release notes and programmers guide)
- Added new unit tests and replaced the performance one for hash tables

Pablo de Lara (7):
  hash: replace existing hash library with cuckoo hash implementation
  hash: add new function rte_hash_reset
  hash: add new functionality to store data in hash table
  hash: add iterate function
  MAINTAINERS: claim responsability for hash library
  doc: announce ABI change of librte_hash
  doc: update hash documentation

 MAINTAINERS                          |    1 +
 app/test/test_hash.c                 |  189 +++---
 app/test/test_hash_perf.c            |  305 ++++++---
 doc/guides/prog_guide/hash_lib.rst   |  138 +++-
 doc/guides/prog_guide/index.rst      |    4 +
 doc/guides/rel_notes/abi.rst         |    1 +
 lib/librte_hash/Makefile             |    8 +-
 lib/librte_hash/rte_cuckoo_hash.c    | 1194 ++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_hash.c           |  499 --------------
 lib/librte_hash/rte_hash.h           |  198 +++++-
 lib/librte_hash/rte_hash_version.map |   15 +
 11 files changed, 1812 insertions(+), 740 deletions(-)
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.c
 delete mode 100644 lib/librte_hash/rte_hash.c

-- 
2.4.3

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash
  2015-07-10 17:24  4%   ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of " Pablo de Lara
  2015-07-10 17:24 14%     ` [dpdk-dev] [PATCH v4 6/7] doc: announce ABI change of librte_hash Pablo de Lara
@ 2015-07-10 20:52  0%     ` Bruce Richardson
  2015-07-10 21:57  4%     ` [dpdk-dev] [PATCH v5 " Pablo de Lara
  2 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2015-07-10 20:52 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

On Fri, Jul 10, 2015 at 06:24:17PM +0100, Pablo de Lara wrote:
> This patchset is to replace the existing hash library with
> a more efficient and functional approach, using the Cuckoo hash
> method to deal with collisions. This method is based on using
> two different hash functions to have two possible locations
> in the hash table where an entry can be.
> So, if a bucket is full, a new entry can push one of the items
> in that bucket to its alternative location, making space for itself.
> 
> Advantages
> ~~~~~
> - Offers the option to store more entries when the target bucket is full
>   (unlike the previous implementation)
> - Memory efficient: for storing those entries, it is not necessary to
>   request new memory, as the entries will be stored in the same table
> - Constant worst lookup time: in worst case scenario, it always takes
>   the same time to look up an entry, as there are only two possible locations
>   where an entry can be.
> - Storing data: user can store data in the hash table, unlike the
>   previous implementation, but he can still use the old API
> 
> This implementation tipically offers over 90% utilization.
> Notice that API has been extended, but old API remains.
> Check documentation included to know more about this new implementation
> (including how entries are distributed as table utilization increases).
> 
> Changes in v4:
> - Unit tests enhancements are not part of this patchset anymore.
> - rte_hash structure has been made internal in another patch,
>   so it is not part of this patchset anymore.
> - Add function to iterate through the hash table, as rte_hash
>   structure has been made private.
> - Added extra_flag parameter in rte_hash_parameter to be able
>   to add new parameters in the future without breaking the ABI
> - Remove proposed lookup_bulk_with_hash function, as it is
>   not of much use with the existing hash functions
>   (there are no vector hash functions).
> - User can store 8-byte integer or pointer as data, instead
>   of variable size data, as discussed in the mailing list.
>

Hi Pablo,

I'm getting some compile errors with this code, perhaps you could recheck e.g
32-bit and with clang.
On the plus side, I like the docs included with this set.

Regards,
/Bruce

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v4 6/7] doc: announce ABI change of librte_hash
  2015-07-10 17:24  4%   ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of " Pablo de Lara
@ 2015-07-10 17:24 14%     ` Pablo de Lara
  2015-07-10 20:52  0%     ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash Bruce Richardson
  2015-07-10 21:57  4%     ` [dpdk-dev] [PATCH v5 " Pablo de Lara
  2 siblings, 0 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 17:24 UTC (permalink / raw)
  To: dev

Two of the macros in rte_hash.h are now deprecated, so this patch
adds notice that they will be removed in 2.2.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..312348e 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,4 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
+* The Macros #RTE_HASH_BUCKET_ENTRIES_MAX and #RTE_HASH_KEY_LENGTH_MAX are deprecated and will be removed with version 2.2.
-- 
2.4.2

^ permalink raw reply	[relevance 14%]

* [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash
    2015-07-08 23:23  0%   ` Thomas Monjalon
@ 2015-07-10 17:24  4%   ` Pablo de Lara
  2015-07-10 17:24 14%     ` [dpdk-dev] [PATCH v4 6/7] doc: announce ABI change of librte_hash Pablo de Lara
                       ` (2 more replies)
  1 sibling, 3 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 17:24 UTC (permalink / raw)
  To: dev

This patchset is to replace the existing hash library with
a more efficient and functional approach, using the Cuckoo hash
method to deal with collisions. This method is based on using
two different hash functions to have two possible locations
in the hash table where an entry can be.
So, if a bucket is full, a new entry can push one of the items
in that bucket to its alternative location, making space for itself.

Advantages
~~~~~
- Offers the option to store more entries when the target bucket is full
  (unlike the previous implementation)
- Memory efficient: for storing those entries, it is not necessary to
  request new memory, as the entries will be stored in the same table
- Constant worst lookup time: in worst case scenario, it always takes
  the same time to look up an entry, as there are only two possible locations
  where an entry can be.
- Storing data: user can store data in the hash table, unlike the
  previous implementation, but he can still use the old API

This implementation tipically offers over 90% utilization.
Notice that API has been extended, but old API remains.
Check documentation included to know more about this new implementation
(including how entries are distributed as table utilization increases).

Changes in v4:
- Unit tests enhancements are not part of this patchset anymore.
- rte_hash structure has been made internal in another patch,
  so it is not part of this patchset anymore.
- Add function to iterate through the hash table, as rte_hash
  structure has been made private.
- Added extra_flag parameter in rte_hash_parameter to be able
  to add new parameters in the future without breaking the ABI
- Remove proposed lookup_bulk_with_hash function, as it is
  not of much use with the existing hash functions
  (there are no vector hash functions).
- User can store 8-byte integer or pointer as data, instead
  of variable size data, as discussed in the mailing list.

Changes in v3:

- Now user can store variable size data, instead of 32 or 64-bit size data,
  using the new parameter "data_len" in rte_hash_parameters
- Add lookup_bulk_with_hash function in performance  unit tests
- Add new functions that handle data in performance unit tests
- Remove duplicates in performance unit tests
- Fix rte_hash_reset, which was not reseting the last entry

Changes in v2:

- Fixed issue where table could not store maximum number of entries
- Fixed issue where lookup burst could not be more than 32 (instead of 64)
- Remove unnecessary macros and add other useful ones
- Added missing library dependencies
- Used directly rte_hash_secondary instead of rte_hash_alt
- Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
- Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
- Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
  make them private
- Corrected copyright dates
- Added an optimized function to compare keys that are multiple of 16 bytes
- Improved the way to use primary/secondary signatures. Now both are stored in
  the bucket, so there is no need to calculate them if required.
  Also, there is no need to use the MSB of a signature to differenciate between
  an empty entry and signature 0, since we are storing both signatures,
  which cannot be both 0.
- Removed rte_hash_rehash, as it was a very expensive operation.
  Therefore, the add function returns now -ENOSPC if key cannot be added
  because of a loop.
- Prefetched new slot for new key in add function to improve performance.
- Made doxygen comments more clear.
- Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
  as we can use the lookup functions if we want to get the data before deleting.
- Removed some unnecessary includes in rte_hash.h
- Removed some unnecessary variables in rte_cuckoo_hash.c
- Removed some unnecessary checks before creating a new hash table 
- Added documentation (in release notes and programmers guide)
- Added new unit tests and replaced the performance one for hash tables

Pablo de Lara (7):
  hash: replace existing hash library with cuckoo hash implementation
  hash: add new function rte_hash_reset
  hash: add new functionality to store data in hash table
  hash: add iterate function
  MAINTAINERS: claim responsability for hash library
  doc: announce ABI change of librte_hash
  doc: update hash documentation

 MAINTAINERS                          |    1 +
 app/test/test_hash.c                 |  189 +++---
 app/test/test_hash_perf.c            |  305 ++++++---
 doc/guides/prog_guide/hash_lib.rst   |  138 +++-
 doc/guides/prog_guide/index.rst      |    4 +
 doc/guides/rel_notes/abi.rst         |    1 +
 lib/librte_hash/Makefile             |    8 +-
 lib/librte_hash/rte_cuckoo_hash.c    | 1194 ++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_hash.c           |  499 --------------
 lib/librte_hash/rte_hash.h           |  198 +++++-
 lib/librte_hash/rte_hash_version.map |   15 +
 11 files changed, 1812 insertions(+), 740 deletions(-)
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.c
 delete mode 100644 lib/librte_hash/rte_hash.c

-- 
2.4.2

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
  2015-07-07 13:44  8%           ` Neil Horman
@ 2015-07-10 16:07  4%             ` Mcnamara, John
  2015-07-11 14:19  7%               ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-10 16:07 UTC (permalink / raw)
  To: Neil Horman, Thomas Monjalon; +Cc: dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> Sent: Tuesday, July 7, 2015 2:44 PM
> To: Thomas Monjalon
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
> 
> On Tue, Jul 07, 2015 at 05:46:08AM -0700, Thomas Monjalon wrote:
> > Neil, in the meantime, could you please help to check ABI breakage in
> the HEAD?
> >
> Took a look, the only ABI break I see that we need to worry about is the
> one introduced in commit 8eecb3295aed0a979def52245564d03be172a83c. It adds
> a bitfield called lro into the existing uint8_t there, but does so in the
> middle of the set, which pushes the other bits around, breaking ABI.  It
> should have been added to the end.

Hi,

Is it okay to submit a patch to move it to the end?

John.
-- 

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 2/3] doc: added guidelines on dpdk documentation
  @ 2015-07-10 15:45  3%   ` John McNamara
  0 siblings, 0 replies; 200+ results
From: John McNamara @ 2015-07-10 15:45 UTC (permalink / raw)
  To: dev

Added guidelines on the purpose and structure of the DPDK
documentation, how to build it and guidelines for creating it.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/guidelines/documentation.rst | 579 ++++++++++++++++++++++++++++++++
 doc/guides/guidelines/index.rst         |   2 +
 2 files changed, 581 insertions(+)
 create mode 100644 doc/guides/guidelines/documentation.rst

diff --git a/doc/guides/guidelines/documentation.rst b/doc/guides/guidelines/documentation.rst
new file mode 100644
index 0000000..b47f028
--- /dev/null
+++ b/doc/guides/guidelines/documentation.rst
@@ -0,0 +1,579 @@
+.. doc_guidelines:
+
+DPDK Documentation Guidelines
+=============================
+
+This document outlines the guidelines for writing the DPDK Guides and API documentation in RST and Doxygen format.
+
+It also explains the structure of the DPDK documentation and shows how to build the Html and PDF versions of the documents.
+
+
+Structure of the Documentation
+------------------------------
+
+The DPDK source code repository contains input files to build the API documentation and User Guides.
+
+The main directories that contain files related to documentation are shown below::
+
+   lib
+   |-- librte_acl
+   |-- librte_cfgfile
+   |-- librte_cmdline
+   |-- librte_compat
+   |-- librte_eal
+   |   |-- ...
+   ...
+   doc
+   |-- api
+   +-- guides
+       |-- freebsd_gsg
+       |-- linux_gsg
+       |-- prog_guide
+       |-- sample_app_ug
+       |-- guidelines
+       |-- testpmd_app_ug
+       |-- rel_notes
+       |-- nics
+       |-- xen
+       |-- ...
+
+
+The API documentation is built from `Doxygen <http://www.stack.nl/~dimitri/doxygen/>`_ comments in the header files.
+These files are mainly in the ``lib/lib_rte_*`` directories although some of the Poll Mode Drivers in ``drivers/net``
+are also documented with Doxygen.
+
+The configuration files that are used to control the Doxygen output are in the ``doc/api`` directory.
+
+The user guides such as *The Programmers Guide* and the *FreeBSD* and *Linux Getting Started* Guides are generated
+from RST markup text files using the `Sphinx <http://sphinx-doc.org/index.html>`_ Documentation Generator.
+
+These files are included in the ``doc/guides/`` directory.
+The output is controlled by the ``doc/guides/conf.py`` file.
+
+
+Role of the Documentation
+-------------------------
+
+The following items outline the roles of the different parts of the documentation and when they need to be updated or
+added to by the developer.
+
+* **Release Notes**
+
+  The Release Notes document which features have been added in the current and previous releases of DPDK and highlight
+  any known issues.
+  The Releases Notes also contain notifications of features that will change ABI compatibility in the next major release.
+
+  Developers should update the Release Notes to add a short description of new or updated features.
+  Developers should also update the Release Notes to add ABI announcements if necessary,
+  (see :doc:`/guidelines/versioning` for details).
+
+* **API documentation**
+
+  The API documentation explains how to use the public DPDK functions.
+  The `API index page <http://dpdk.org/doc/api/>`_ shows the generated API documentation with related groups of functions.
+
+  The API documentation should be updated via Doxygen comments when new functions are added.
+
+* **Getting Started Guides**
+
+  The Getting Started Guides show how to install and configure DPDK and how to run DPDK based applications on different OSes.
+
+  A Getting Started Guide should be added when DPDK is ported to a new OS.
+
+* **The Programmers Guide**
+
+  The Programmers Guide explains how the API components of DPDK such as the EAL, Memzone, Rings and the Hash Library work.
+  It also explains how some higher level functionality such as Packet Distributor, Packet Framework and KNI work.
+  It also shows the build system and explains how to add applications.
+
+  The Programmers Guide should be expanded when new functionality is added to DPDK.
+
+* **App Guides**
+
+  The app guides document the DPDK applications in the ``app`` directory such as ``testpmd``.
+
+  The app guides should be updated if functionality is changed or added.
+
+* **Sample App Guides**
+
+  The sample app guides document the DPDK example applications in the examples directory.
+  Generally they demonstrate a major feature such as L2 or L3 Forwarding, Multi Process or Power Management.
+  They explain the purpose of the sample application, how to run it and step through some of the code to explain the
+  major functionality.
+
+  A new sample application should be accompanied by a new sample app guide.
+  The guide for the Skeleton Forwarding app is a good starting reference.
+
+* **Network Interface Controller Drivers**
+
+  The NIC Drivers document explains the features of the individual Poll Mode Drivers, such as software requirements,
+  configuration and initialization.
+
+  New documentation should be added for new Poll Mode Drivers.
+
+* **Guidelines**
+
+  The guideline documents record community process, expectations and design directions.
+
+  They can be extended, amended or discussed by submitting a patch and getting community approval.
+
+
+Building the Documentation
+--------------------------
+
+Dependencies
+~~~~~~~~~~~~
+
+
+The following dependencies must be installed to build the documentation:
+
+* Doxygen.
+
+* Sphinx (also called python-sphinx).
+
+* TexLive (at least TexLive-core, extra Latex support and extra fonts).
+
+* Inkscape.
+
+`Doxygen`_ generates documentation from commented source code.
+It can be installed as follows:
+
+.. code-block:: console
+
+   # Ubuntu/Debian.
+   sudo apt-get -y install doxygen
+
+   # Red Hat/Fedora.
+   sudo yum     -y install doxygen
+
+`Sphinx`_ is a Python documentation tool for converting RST files to Html or to PDF (via LaTeX).
+It can be installed as follows:
+
+.. code-block:: console
+
+   # Ubuntu/Debian.
+   sudo apt-get -y install python-sphinx
+
+   # Red Hat/Fedora.
+   sudo yum     -y install python-sphinx
+
+   # Or, on any system with Python installed.
+   sudo easy_install -U sphinx
+
+For further information on getting started with Sphinx see the `Sphinx Tutorial <http://sphinx-doc.org/tutorial.html>`_.
+
+.. Note::
+
+   To get full support for Figure and Table numbering it is best to install Sphinx 1.3.1 or later.
+
+
+`Inkscape`_ is a vector based graphics program which is used to create SVG images and also to convert SVG images to PDF images.
+It can be installed as follows:
+
+.. code-block:: console
+
+   # Ubuntu/Debian.
+   sudo apt-get -y install inkscape
+
+   # Red Hat/Fedora.
+   sudo yum     -y install inkscape
+
+`TexLive <http://www.tug.org/texlive/>`_ is an installation package for Tex/LaTeX.
+It is used to generate the PDF versions of the documentation.
+The main required packages can be installed as follows:
+
+.. code-block:: console
+
+   # Ubuntu/Debian.
+   sudo apt-get -y install texlive-latex-extra texlive-fonts-extra \
+                           texlive-fonts-recommended
+
+
+   # Red Hat/Fedora, selective install.
+   sudo yum     -y install texlive-collection-latexextra \
+                           texlive-collection-fontsextra
+
+
+Build commands
+~~~~~~~~~~~~~~
+
+The documentation is built using the standard DPDK build system.
+Some examples are shown below:
+
+* Generate all the documentation targets::
+
+     make doc
+
+* Generate the Doxygen API documentation in Html::
+
+     make doc-api-html
+
+* Generate the guides documentation in Html::
+
+     make doc-guides-html
+
+* Generate the guides documentation in Pdf::
+
+     make doc-guides-pdf
+
+The output of these commands is generated in the ``build`` directory::
+
+   build/doc
+         |-- html
+         |   |-- api
+         |   +-- guides
+         |
+         +-- pdf
+             +-- guides
+
+
+.. Note::
+
+   Make sure to fix any Sphinx or Doxygen warnings when adding or updating documentation.
+
+The documentation output files can be removed as follows::
+
+   make doc-clean
+
+
+Document Guidelines
+-------------------
+
+Here are some guidelines in relation to the style of the documentation:
+
+* Document the obvious as well as the obscure since it won't always be obvious to the reader.
+  For example an instruction like "Set up 64 2MB Hugepages" is better when followed by a sample commandline or a link to
+  the appropriate section of the documentation.
+
+* Use American English spellings throughout.
+  This can be checked using the ``aspell`` utility::
+
+       aspell --lang=en_US --check doc/guides/sample_app_ug/mydoc.rst
+
+
+RST Guidelines
+--------------
+
+The RST (reStructuredText) format is a plain text markup format that can be converted to Html, PDF or other formats.
+It is most closely associated with Python but it can be used to document any language.
+It is used in DPDK to document everything apart from the API.
+
+The Sphinx documentation contains a very useful `RST Primer <http://sphinx-doc.org/rest.html#rst-primer>`_ which is a
+good place to learn the minimal set of syntax required to format a document.
+
+The official `reStructuredText <http://docutils.sourceforge.net/rst.html>`_ website contains the specification for the
+RST format and also examples of how to use it.
+However, for most developers the RST Primer is a better resource.
+
+The most common guidelines for writing RST text are detailed in the
+`Documenting Python <https://docs.python.org/devguide/documenting.html>`_ guidelines.
+The additional guidelines below reiterate or expand upon those guidelines.
+
+
+Line Length
+~~~~~~~~~~~
+
+* The recommended style for the DPDK documentation is to put sentences on separate lines.
+  This allows for easier reviewing of patches.
+  Multiple sentences which are not separated by a blank line are joined automatically into paragraphs, for example::
+
+     Here is an example sentence.
+     Long sentences over the limit shown below can be wrapped onto
+     a new line.
+     These three sentences will be joined into the same paragraph.
+
+     This is a new paragraph, since it is separated from the
+     previous paragraph by a blank line.
+
+  This would be rendered as follows:
+
+     *Here is an example sentence.
+     Long sentences over the limit shown below can be wrapped onto
+     a new line.
+     These three sentences will be joined into the same paragraph.*
+
+     *This is a new paragraph, since it is separated from the
+     previous paragraph by a blank line.*
+
+
+* Long sentences should be wrapped at 120 characters +/- 10 characters. They should be wrapped at words.
+
+* Lines in literal blocks must by less than 80 characters since they aren't wrapped by the document formatters
+  and can exceed the page width in PDF documents.
+
+
+Whitespace
+~~~~~~~~~~
+
+* Standard RST indentation is 3 spaces.
+  Code can be indented 4 spaces, especially if it is copied from source files.
+
+* No tabs.
+  Convert tabs in embedded code to 4 or 8 spaces.
+
+* No trailing whitespace.
+
+* Add 2 blank lines before each section header.
+
+* Add 1 blank line after each section header.
+
+* Add 1 blank line between each line of a list.
+
+
+Section Headers
+~~~~~~~~~~~~~~~
+
+* Section headers should use the use the following underline formats::
+
+   Level 1 Heading
+   ===============
+
+
+   Level 2 Heading
+   ---------------
+
+
+   Level 3 Heading
+   ~~~~~~~~~~~~~~~
+
+
+   Level 4 Heading
+   ^^^^^^^^^^^^^^^
+
+
+* Level 4 headings should be used sparingly.
+
+* The underlines should match the length of the text.
+
+* In general, the heading should be less than 80 characters, for conciseness.
+
+* As noted above:
+
+   * Add 2 blank lines before each section header.
+
+   * Add 1 blank line after each section header.
+
+
+Lists
+~~~~~
+
+* Bullet lists should be formatted with a leading ``*`` as follows::
+
+     * Item one.
+
+     * Item two is a long line that is wrapped and then indented to match
+       the start of the previous line.
+
+     * One space character between the bullet and the text is preferred.
+
+* Numbered lists can be formatted with a leading number but the preference is to use ``#.`` which will give automatic numbering.
+  This is more convenient when adding or removing items::
+
+     #. Item one.
+
+     #. Item two is a long line that is wrapped and then indented
+        to match the start of the e first line.
+
+     #. Item two is a long line that is wrapped and then indented to match
+        the start of the previous line.
+
+* Definition lists can be written with or without a bullet::
+
+     * Item one.
+
+       Some text about item one.
+
+     * Item two.
+
+       Some text about item two.
+
+* All lists, and sub-lists, must be separated from the preceding text by a blank line.
+  This is a syntax requirement.
+
+* All list items should be separated by a blank line for readability.
+
+
+Code and Literal block sections
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+* Inline text that is required to be rendered with a fixed width font should be enclosed in backquotes like this:
+  \`\`text\`\`, so that it appears like this: ``text``.
+
+* Fixed width, literal blocks of texts should be indented at least 3 spaces and prefixed with ``::`` like this::
+
+     Here is some fixed width text::
+
+        0x0001 0x0001 0x00FF 0x00FF
+
+* It is also possible to specify an encoding for a literal block using the ``.. code-block::`` directive so that syntax
+  highlighting can be applied.
+  Examples of supported highlighting are::
+
+     .. code-block:: console
+     .. code-block:: c
+     .. code-block:: python
+     .. code-block:: diff
+     .. code-block:: none
+
+  That can be applied as follows::
+
+      .. code-block:: c
+
+         #include<stdio.h>
+
+         int main() {
+
+            printf("Hello World\n");
+
+            return 0;
+         }
+
+  Which would be rendered as:
+
+  .. code-block:: c
+
+      #include<stdio.h>
+
+      int main() {
+
+         printf("Hello World\n");
+
+         return 0;
+      }
+
+
+* The default encoding for a literal block using the simplified ``::``
+  directive is ``none``.
+
+* Lines in literal blocks must be less than 80 characters since they can exceed the page width when converted to PDF documentation.
+  For long literal lines that exceed that limit try to wrap the text at sensible locations.
+  For example a long command line could be documented like this and still work if copied directly from the docs::
+
+     build/app/testpmd -c7 -n3 --vdev=eth_pcap0,iface=eth0     \
+                               --vdev=eth_pcap1,iface=eth1     \
+                               -- -i --nb-cores=2 --nb-ports=2 \
+                                  --total-num-mbufs=2048
+
+* Long lines that cannot be wrapped, such as application output, should be truncated to be less than 80 characters.
+
+
+Images
+~~~~~~
+
+* All images should be in SVG scalar graphics format.
+  They should be true SVG XML files and should not include binary formats embedded in a SVG wrapper.
+
+* The DPDK documentation contains some legacy images in PNG format.
+  These will be converted to SVG in time.
+
+* `Inkscape <inkscape.org>`_ is the recommended graphics editor for creating the images.
+  Use some of the older images in ``doc/guides/prog_guide/img/`` as a template, for example ``mbuf1.svg``
+  or ``ring-enqueue.svg``.
+
+* The SVG images should include a copyright notice, as an XML comment.
+
+* Images in the documentation should be formatted as follows:
+
+   * The image should be preceded by a label in the format ``.. _figure_XXXX:`` with a leading underscore and
+     where ``XXXX`` is a unique descriptive name.
+
+   * Images should be included using the ``.. figure::`` directive and the file type should be set to ``*`` (not ``.svg``).
+     This allows the format of the image to be changed if required, without updating the documentation.
+
+   * Images must have a caption as part of the ``.. figure::`` directive.
+
+* Here is an example of the previous three guidelines::
+
+     .. _figure_mempool:
+
+     .. figure:: img/mempool.*
+
+        A mempool in memory with its associated ring.
+
+.. _mock_label:
+
+* Images can then be linked to using the ``:numref:`` directive::
+
+     The mempool layout is shown in :numref:`figure_mempool`.
+
+  This would be rendered as: *The mempool layout is shown in* :ref:`Fig 6.3 <mock_label>`.
+
+  **Note**: The ``:numref:`` directive requires Sphinx 1.3.1 or later.
+  With earlier versions it will still be rendered as a link but won't have an automatically generated number.
+
+* The caption of the image can be generated, with a link, using the ``:ref:`` directive::
+
+     :ref:`figure_mempool`
+
+  This would be rendered as: *A mempool in memory with its associated ring.*
+
+Tables
+~~~~~~
+
+* RST tables should be used sparingly.
+  They are hard to format and to edit, they are often rendered incorrectly in PDF format, and the same information
+  can usually be shown just as clearly with a definition or bullet list.
+
+* Tables in the documentation should be formatted as follows:
+
+   * The table should be preceded by a label in the format ``.. _table_XXXX:`` with a leading underscore and where
+     ``XXXX`` is a unique descriptive name.
+
+   * Tables should be included using the ``.. table::`` directive and must have a caption.
+
+* Here is an example of the previous two guidelines::
+
+     .. _table_qos_pipes:
+
+     .. table:: Sample configuration for QOS pipes.
+
+        +----------+----------+----------+
+        | Header 1 | Header 2 | Header 3 |
+        |          |          |          |
+        +==========+==========+==========+
+        | Text     | Text     | Text     |
+        +----------+----------+----------+
+        | ...      | ...      | ...      |
+        +----------+----------+----------+
+
+* Tables can be linked to using the ``:numref:`` and ``:ref:`` directives, as shown in the previous section for images.
+  For example::
+
+     The QOS configuration is shown in :numref:`table_qos_pipes`.
+
+* Tables should not include merged cells since they are not supported by the PDF renderer.
+
+
+.. _links:
+
+Hyperlinks
+~~~~~~~~~~
+
+* Links to external websites can be plain URLs.
+  The following is rendered as http://dpdk.org::
+
+     http://dpdk.org
+
+* They can contain alternative text.
+  The following is rendered as `Check out DPDK <http://dpdk.org>`_::
+
+     `Check out DPDK <http://dpdk.org>`_
+
+* An internal link can be generated by placing labels in the document with the format ``.. _label_name``.
+
+* The following links to the top of this section: :ref:`links`::
+
+     .. _links:
+
+     Hyperlinks
+     ~~~~~~~~~~
+
+     * The following links to the top of this section: :ref:`links`:
+
+.. Note::
+
+   The label must have a leading underscore but the reference to it must omit it.
+   This is a frequent cause of errors and warnings.
+
+* The use of a label is preferred since it works across files and will still work if the header text changes.
+
diff --git a/doc/guides/guidelines/index.rst b/doc/guides/guidelines/index.rst
index bfb9fa3..1050d99 100644
--- a/doc/guides/guidelines/index.rst
+++ b/doc/guides/guidelines/index.rst
@@ -8,3 +8,5 @@ Guidelines
     coding_style
     design
     versioning
+    documentation
+
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] hash: move rte_hash structure to C file and make it internal
  2015-07-08 13:21  3%   ` Bruce Richardson
  2015-07-08 16:57  0%     ` Matthew Hall
@ 2015-07-10 10:27  0%     ` Thomas Monjalon
  1 sibling, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-10 10:27 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

2015-07-08 14:21, Bruce Richardson:
> On Wed, Jul 08, 2015 at 12:27:34PM +0100, Pablo de Lara wrote:
> > rte_hash structure should not be a public structure,
> > and therefore it should be moved to the C file and be declared
> > as internal. rte_hash_hash implementation is also moved
> > to the C file, as it uses the structure.
> > 
> > This patch also removes part of a unit test that was checking
> > a field of the structure.
> > 
> > Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> 
> Irrespective of whether or not we change the underlying hash table implementation
> this looks a good change to me. The rte_hash structure should not be used directly
> by any applications - the APIs all take pointers to the structure,
> so there should be no ABI breakage from this, I think.
> 
> Therefore:
> 
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Applied, thanks

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v2] doc: announce ABI change of rte_eth_fdir_filter, rte_eth_fdir_masks
  2015-07-09  2:47 21% [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks Wenzhuo Lu
  2015-07-09  7:32  4% ` Thomas Monjalon
@ 2015-07-10  2:24 21% ` Wenzhuo Lu
  1 sibling, 0 replies; 200+ results
From: Wenzhuo Lu @ 2015-07-10  2:24 UTC (permalink / raw)
  To: dev

For x550 supports 2 new flow director modes, MAC VLAN and Cloud. The MAC VLAN
mode means the MAC and VLAN are monitored. The Cloud mode is for VxLAN and
NVGRE, and the tunnel type, TNI/VNI, inner MAC and inner VLAN are monitored. So,
there're a few new lookup fields for these 2 new modes, like MAC, tunnel type,
TNI/VNI. We have to change the ABI to support these new lookup fields.

v2 changes:
* Correct the names of the structures.
* Wrap the words.
* Add explanation for the new modes.

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
---
 doc/guides/rel_notes/abi.rst | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..37bcad9 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,13 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
+* The ABI changes are planned for struct rte_eth_fdir_filter and
+  rte_eth_fdir_masks in order to support new flow director modes,
+  MAC VLAN and Cloud, on x550. The MAC VLAN mode means the MAC and
+  VLAN are monitored. The Cloud mode is for VxLAN and NVGRE, and
+  the tunnel type, TNI/VNI, inner MAC and inner VLAN are monitored.
+  The upcoming release 2.1 will not contain these ABI changes, but
+  release 2.2 will, and no backwards compatibility is planed due to
+  this change.
+  Binaries using this library build prior to version 2.2 will require
+  updating and recompilation.
-- 
1.9.3

^ permalink raw reply	[relevance 21%]

* Re: [dpdk-dev] [PATCH v4 0/7] ethdev: add support for ieee1588 timestamping
  2015-07-09 13:30  3% [dpdk-dev] [PATCH v4 0/7] ethdev: add support for ieee1588 timestamping John McNamara
@ 2015-07-10  0:43  0% ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-10  0:43 UTC (permalink / raw)
  To: John McNamara; +Cc: dev

2015-07-09 14:30, John McNamara:
> This patchset adds ethdev API to enable and read IEEE1588/802.1AS PTP
> timestamps from devices that support it. The following functions are added:
> 
>     rte_eth_timesync_enable()
>     rte_eth_timesync_disable()
>     rte_eth_timesync_read_rx_timestamp()
>     rte_eth_timesync_read_tx_timestamp()
> 
> The "ieee1588" forwarding mode in testpmd is also refactored to demonstrate
> the new API and to clean up the code.
> 
> Adds support for igb, ixgbe and i40e.
> 
> V4:
> * Added timesync field to end of mbuf to pass IEEE1588 registers and flags.
>   Removed previous ABI deprecation notice.
> 
> V3:
> * Fixed issued with version.map.
> 
> V2:
> * Added i40e support.
> 
> * Renamed ethdev functions from rte_eth_ieee15888_*() to rte_eth_timesync_*()
>   since 802.1AS can be supported through the same interfaces.
> 
> V1:
> * Initial version for igb and ixgbe.
> 
> 
> John McNamara (7):
>   ethdev: add support for ieee1588 timestamping
>   mbuf: add field for ieee1588 timesync index
>   e1000: add support for ieee1588 timestamping
>   ixgbe: add support for ieee1588 timestamping
>   i40e: add support for ieee1588 timestamping
>   app/testpmd: refactor ieee1588 forwarding
>   doc: document ieee1588 forwarding mode

Was previously acked by Wenzhuo except the new mbuf field.
Applied, thanks

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] hash: move rte_hash structure to C file and make it internal
  2015-07-09  8:12  3%       ` Bruce Richardson
@ 2015-07-09 20:42  3%         ` Matthew Hall
  0 siblings, 0 replies; 200+ results
From: Matthew Hall @ 2015-07-09 20:42 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Thu, Jul 09, 2015 at 09:12:23AM +0100, Bruce Richardson wrote:
> Thanks for the feedback Matthew. Can you suggest a function prototype for such
> a walk operation that would make it useful for you. While we can keep the
> hash structure public, I'd prefer if we could avoid it, as it makes making changes
> hard due to ABI issues.
> 
> /Bruce

Hi Bruce,

I understand about the ABI issues. Hence my suggestion of an iterator if the 
structs are opaque. The names could be something like these:

rte_hash_iterate(_safe)
rte_hash_foreach(_safe)

If required due to the implementation, the safe version would be similar to 
what's seen in BSD queue.h, where you can do a slower iteration that allows 
removing a current entry without corrupting the table or iterator.

Then the function would look something like this:

rte_hash_iterate(rte_hash_t* h, rte_hash_callback_t callback, void* data)
rte_hash_iterate(rte_hash_t* h, rte_hash_callback_t callback, void* data)
rte_hash_iterate(rte_hash_t* h, rte_hash_callback_t callback, void* data)

rte_hash_callback_t would be a typedef of a function pointer for a callback 
function, something like this on the function depending how it works inside 
the hash:

int application_hash_callback(void* key, void* value, void* data)
int application_hash_callback(void* key, rte_hash_entry_t* entry, void* data)
int application_hash_callback(void* key, void* key, void* value, void* data)

The data pointer will contain the same pointer passed to the iterator. If the 
iteration function returns non-zero, iteration could be discontinued, as the 
client code found what it wanted already.

Threading synchronization responsibility will fall on the client as before. 
The iterator should say if it's thread-safe for read-only, read-write, or 
unsafe for anything, etc.

Matthew.

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 07/19] vmxnet3: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (5 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 06/19] enic: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 08/19] fm10k: " Helin Zhang
                     ` (12 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/vmxnet3/vmxnet3_rxtx.c | 8 ++++++++
 1 file changed, 8 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index a1eac45..25ae2f6 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -649,9 +649,17 @@ vmxnet3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 			struct ipv4_hdr *ip = (struct ipv4_hdr *)(eth + 1);
 
 			if (((ip->version_ihl & 0xf) << 2) > (int)sizeof(struct ipv4_hdr))
+#ifdef RTE_NEXT_ABI
+				rxm->packet_type = RTE_PTYPE_L3_IPV4_EXT;
+#else
 				rxm->ol_flags |= PKT_RX_IPV4_HDR_EXT;
+#endif
 			else
+#ifdef RTE_NEXT_ABI
+				rxm->packet_type = RTE_PTYPE_L3_IPV4;
+#else
 				rxm->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
 
 			if (!rcd->cnc) {
 				if (!rcd->ipc)
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 19/19] mbuf: remove old packet type bit masks
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (17 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 18/19] examples/tep_termination: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-15 23:00  0%   ` [dpdk-dev] [PATCH v10 00/19] unified packet type Thomas Monjalon
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

As unified packet types are used instead, those old bit masks and
the relevant macros for packet type indication need to be removed.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 lib/librte_mbuf/rte_mbuf.c | 4 ++++
 lib/librte_mbuf/rte_mbuf.h | 4 ++++
 2 files changed, 8 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.
* Redefined the bit masks for packet RX offload flags.

v5 changes:
* Rolled back the bit masks of RX flags, for ABI compatibility.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index f506517..4320dd4 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -251,14 +251,18 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
 	/* case PKT_RX_HBUF_OVERFLOW: return "PKT_RX_HBUF_OVERFLOW"; */
 	/* case PKT_RX_RECIP_ERR: return "PKT_RX_RECIP_ERR"; */
 	/* case PKT_RX_MAC_ERR: return "PKT_RX_MAC_ERR"; */
+#ifndef RTE_NEXT_ABI
 	case PKT_RX_IPV4_HDR: return "PKT_RX_IPV4_HDR";
 	case PKT_RX_IPV4_HDR_EXT: return "PKT_RX_IPV4_HDR_EXT";
 	case PKT_RX_IPV6_HDR: return "PKT_RX_IPV6_HDR";
 	case PKT_RX_IPV6_HDR_EXT: return "PKT_RX_IPV6_HDR_EXT";
+#endif /* RTE_NEXT_ABI */
 	case PKT_RX_IEEE1588_PTP: return "PKT_RX_IEEE1588_PTP";
 	case PKT_RX_IEEE1588_TMST: return "PKT_RX_IEEE1588_TMST";
+#ifndef RTE_NEXT_ABI
 	case PKT_RX_TUNNEL_IPV4_HDR: return "PKT_RX_TUNNEL_IPV4_HDR";
 	case PKT_RX_TUNNEL_IPV6_HDR: return "PKT_RX_TUNNEL_IPV6_HDR";
+#endif /* RTE_NEXT_ABI */
 	default: return NULL;
 	}
 }
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 3a17d95..b90c73f 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -92,14 +92,18 @@ extern "C" {
 #define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer overflow. */
 #define PKT_RX_RECIP_ERR     (0ULL << 0)  /**< Hardware processing error. */
 #define PKT_RX_MAC_ERR       (0ULL << 0)  /**< MAC error. */
+#ifndef RTE_NEXT_ABI
 #define PKT_RX_IPV4_HDR      (1ULL << 5)  /**< RX packet with IPv4 header. */
 #define PKT_RX_IPV4_HDR_EXT  (1ULL << 6)  /**< RX packet with extended IPv4 header. */
 #define PKT_RX_IPV6_HDR      (1ULL << 7)  /**< RX packet with IPv6 header. */
 #define PKT_RX_IPV6_HDR_EXT  (1ULL << 8)  /**< RX packet with extended IPv6 header. */
+#endif /* RTE_NEXT_ABI */
 #define PKT_RX_IEEE1588_PTP  (1ULL << 9)  /**< RX IEEE1588 L2 Ethernet PT Packet. */
 #define PKT_RX_IEEE1588_TMST (1ULL << 10) /**< RX IEEE1588 L2/L4 timestamped packet.*/
+#ifndef RTE_NEXT_ABI
 #define PKT_RX_TUNNEL_IPV4_HDR (1ULL << 11) /**< RX tunnel packet with IPv4 header.*/
 #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with IPv6 header. */
+#endif /* RTE_NEXT_ABI */
 #define PKT_RX_FDIR_ID       (1ULL << 13) /**< FD id reported if FDIR match. */
 #define PKT_RX_FDIR_FLX      (1ULL << 14) /**< Flexible bytes reported if FDIR match. */
 #define PKT_RX_QINQ_PKT      (1ULL << 15)  /**< RX packet with double VLAN stripped. */
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 17/19] examples/l3fwd: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (15 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 16/19] examples/l3fwd-power: " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 18/19] examples/tep_termination: " Helin Zhang
                     ` (2 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/l3fwd/main.c | 123 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 120 insertions(+), 3 deletions(-)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v3 changes:
* Minor bug fixes and enhancements.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 5c22ed1..b1bcb35 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -939,7 +939,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, struct lcore_conf *qcon
 
 	eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
 	if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
 		/* Handle IPv4 headers.*/
 		ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
 						   sizeof(struct ether_hdr));
@@ -970,8 +974,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, struct lcore_conf *qcon
 		ether_addr_copy(&ports_eth_addr[dst_port], &eth_hdr->s_addr);
 
 		send_single_packet(m, dst_port);
-
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+#else
 	} else {
+#endif
 		/* Handle IPv6 headers.*/
 		struct ipv6_hdr *ipv6_hdr;
 
@@ -990,8 +997,13 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, struct lcore_conf *qcon
 		ether_addr_copy(&ports_eth_addr[dst_port], &eth_hdr->s_addr);
 
 		send_single_packet(m, dst_port);
+#ifdef RTE_NEXT_ABI
+	} else
+		/* Free the mbuf that contains non-IPV4/IPV6 packet */
+		rte_pktmbuf_free(m);
+#else
 	}
-
+#endif
 }
 
 #ifdef DO_RFC_1812_CHECKS
@@ -1015,12 +1027,19 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, struct lcore_conf *qcon
  * to BAD_PORT value.
  */
 static inline __attribute__((always_inline)) void
+#ifdef RTE_NEXT_ABI
+rfc1812_process(struct ipv4_hdr *ipv4_hdr, uint16_t *dp, uint32_t ptype)
+#else
 rfc1812_process(struct ipv4_hdr *ipv4_hdr, uint16_t *dp, uint32_t flags)
+#endif
 {
 	uint8_t ihl;
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(ptype)) {
+#else
 	if ((flags & PKT_RX_IPV4_HDR) != 0) {
-
+#endif
 		ihl = ipv4_hdr->version_ihl - IPV4_MIN_VER_IHL;
 
 		ipv4_hdr->time_to_live--;
@@ -1050,11 +1069,19 @@ get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 	struct ipv6_hdr *ipv6_hdr;
 	struct ether_hdr *eth_hdr;
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
+#else
 	if (pkt->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
 		if (rte_lpm_lookup(qconf->ipv4_lookup_struct, dst_ipv4,
 				&next_hop) != 0)
 			next_hop = portid;
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
+#else
 	} else if (pkt->ol_flags & PKT_RX_IPV6_HDR) {
+#endif
 		eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
 		ipv6_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
 		if (rte_lpm6_lookup(qconf->ipv6_lookup_struct,
@@ -1088,12 +1115,52 @@ process_packet(struct lcore_conf *qconf, struct rte_mbuf *pkt,
 	ve = val_eth[dp];
 
 	dst_port[0] = dp;
+#ifdef RTE_NEXT_ABI
+	rfc1812_process(ipv4_hdr, dst_port, pkt->packet_type);
+#else
 	rfc1812_process(ipv4_hdr, dst_port, pkt->ol_flags);
+#endif
 
 	te =  _mm_blend_epi16(te, ve, MASK_ETH);
 	_mm_store_si128((__m128i *)eth_hdr, te);
 }
 
+#ifdef RTE_NEXT_ABI
+/*
+ * Read packet_type and destination IPV4 addresses from 4 mbufs.
+ */
+static inline void
+processx4_step1(struct rte_mbuf *pkt[FWDSTEP],
+		__m128i *dip,
+		uint32_t *ipv4_flag)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct ether_hdr *eth_hdr;
+	uint32_t x0, x1, x2, x3;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt[0], struct ether_hdr *);
+	ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+	x0 = ipv4_hdr->dst_addr;
+	ipv4_flag[0] = pkt[0]->packet_type & RTE_PTYPE_L3_IPV4;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt[1], struct ether_hdr *);
+	ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+	x1 = ipv4_hdr->dst_addr;
+	ipv4_flag[0] &= pkt[1]->packet_type;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt[2], struct ether_hdr *);
+	ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+	x2 = ipv4_hdr->dst_addr;
+	ipv4_flag[0] &= pkt[2]->packet_type;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt[3], struct ether_hdr *);
+	ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+	x3 = ipv4_hdr->dst_addr;
+	ipv4_flag[0] &= pkt[3]->packet_type;
+
+	dip[0] = _mm_set_epi32(x3, x2, x1, x0);
+}
+#else /* RTE_NEXT_ABI */
 /*
  * Read ol_flags and destination IPV4 addresses from 4 mbufs.
  */
@@ -1126,14 +1193,24 @@ processx4_step1(struct rte_mbuf *pkt[FWDSTEP], __m128i *dip, uint32_t *flag)
 
 	dip[0] = _mm_set_epi32(x3, x2, x1, x0);
 }
+#endif /* RTE_NEXT_ABI */
 
 /*
  * Lookup into LPM for destination port.
  * If lookup fails, use incoming port (portid) as destination port.
  */
 static inline void
+#ifdef RTE_NEXT_ABI
+processx4_step2(const struct lcore_conf *qconf,
+		__m128i dip,
+		uint32_t ipv4_flag,
+		uint8_t portid,
+		struct rte_mbuf *pkt[FWDSTEP],
+		uint16_t dprt[FWDSTEP])
+#else
 processx4_step2(const struct lcore_conf *qconf, __m128i dip, uint32_t flag,
 	uint8_t portid, struct rte_mbuf *pkt[FWDSTEP], uint16_t dprt[FWDSTEP])
+#endif /* RTE_NEXT_ABI */
 {
 	rte_xmm_t dst;
 	const  __m128i bswap_mask = _mm_set_epi8(12, 13, 14, 15, 8, 9, 10, 11,
@@ -1143,7 +1220,11 @@ processx4_step2(const struct lcore_conf *qconf, __m128i dip, uint32_t flag,
 	dip = _mm_shuffle_epi8(dip, bswap_mask);
 
 	/* if all 4 packets are IPV4. */
+#ifdef RTE_NEXT_ABI
+	if (likely(ipv4_flag)) {
+#else
 	if (likely(flag != 0)) {
+#endif
 		rte_lpm_lookupx4(qconf->ipv4_lookup_struct, dip, dprt, portid);
 	} else {
 		dst.x = dip;
@@ -1193,6 +1274,16 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 	_mm_store_si128(p[2], te[2]);
 	_mm_store_si128(p[3], te[3]);
 
+#ifdef RTE_NEXT_ABI
+	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[0] + 1),
+		&dst_port[0], pkt[0]->packet_type);
+	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[1] + 1),
+		&dst_port[1], pkt[1]->packet_type);
+	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[2] + 1),
+		&dst_port[2], pkt[2]->packet_type);
+	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[3] + 1),
+		&dst_port[3], pkt[3]->packet_type);
+#else /* RTE_NEXT_ABI */
 	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[0] + 1),
 		&dst_port[0], pkt[0]->ol_flags);
 	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[1] + 1),
@@ -1201,6 +1292,7 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 		&dst_port[2], pkt[2]->ol_flags);
 	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[3] + 1),
 		&dst_port[3], pkt[3]->ol_flags);
+#endif /* RTE_NEXT_ABI */
 }
 
 /*
@@ -1387,7 +1479,11 @@ main_loop(__attribute__((unused)) void *dummy)
 	uint16_t *lp;
 	uint16_t dst_port[MAX_PKT_BURST];
 	__m128i dip[MAX_PKT_BURST / FWDSTEP];
+#ifdef RTE_NEXT_ABI
+	uint32_t ipv4_flag[MAX_PKT_BURST / FWDSTEP];
+#else
 	uint32_t flag[MAX_PKT_BURST / FWDSTEP];
+#endif
 	uint16_t pnum[MAX_PKT_BURST + 1];
 #endif
 
@@ -1457,6 +1553,18 @@ main_loop(__attribute__((unused)) void *dummy)
 				 */
 				int32_t n = RTE_ALIGN_FLOOR(nb_rx, 4);
 				for (j = 0; j < n ; j+=4) {
+#ifdef RTE_NEXT_ABI
+					uint32_t pkt_type =
+						pkts_burst[j]->packet_type &
+						pkts_burst[j+1]->packet_type &
+						pkts_burst[j+2]->packet_type &
+						pkts_burst[j+3]->packet_type;
+					if (pkt_type & RTE_PTYPE_L3_IPV4) {
+						simple_ipv4_fwd_4pkts(
+						&pkts_burst[j], portid, qconf);
+					} else if (pkt_type &
+						RTE_PTYPE_L3_IPV6) {
+#else /* RTE_NEXT_ABI */
 					uint32_t ol_flag = pkts_burst[j]->ol_flags
 							& pkts_burst[j+1]->ol_flags
 							& pkts_burst[j+2]->ol_flags
@@ -1465,6 +1573,7 @@ main_loop(__attribute__((unused)) void *dummy)
 						simple_ipv4_fwd_4pkts(&pkts_burst[j],
 									portid, qconf);
 					} else if (ol_flag & PKT_RX_IPV6_HDR) {
+#endif /* RTE_NEXT_ABI */
 						simple_ipv6_fwd_4pkts(&pkts_burst[j],
 									portid, qconf);
 					} else {
@@ -1489,13 +1598,21 @@ main_loop(__attribute__((unused)) void *dummy)
 			for (j = 0; j != k; j += FWDSTEP) {
 				processx4_step1(&pkts_burst[j],
 					&dip[j / FWDSTEP],
+#ifdef RTE_NEXT_ABI
+					&ipv4_flag[j / FWDSTEP]);
+#else
 					&flag[j / FWDSTEP]);
+#endif
 			}
 
 			k = RTE_ALIGN_FLOOR(nb_rx, FWDSTEP);
 			for (j = 0; j != k; j += FWDSTEP) {
 				processx4_step2(qconf, dip[j / FWDSTEP],
+#ifdef RTE_NEXT_ABI
+					ipv4_flag[j / FWDSTEP], portid,
+#else
 					flag[j / FWDSTEP], portid,
+#endif
 					&pkts_burst[j], &dst_port[j]);
 			}
 
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 16/19] examples/l3fwd-power: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (14 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 15/19] examples/l3fwd-acl: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 17/19] examples/l3fwd: " Helin Zhang
                     ` (3 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/l3fwd-power/main.c | 8 ++++++++
 1 file changed, 8 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index d4eba1a..dbbebdd 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -635,7 +635,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,
 
 	eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
 	if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
 		/* Handle IPv4 headers.*/
 		ipv4_hdr =
 			rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
@@ -670,8 +674,12 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,
 		ether_addr_copy(&ports_eth_addr[dst_port], &eth_hdr->s_addr);
 
 		send_single_packet(m, dst_port);
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+#else
 	}
 	else {
+#endif
 		/* Handle IPv6 headers.*/
 #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
 		struct ipv6_hdr *ipv6_hdr;
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 18/19] examples/tep_termination: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (16 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 17/19] examples/l3fwd: " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 19/19] mbuf: remove old packet type bit masks Helin Zhang
  2015-07-15 23:00  0%   ` [dpdk-dev] [PATCH v10 00/19] unified packet type Thomas Monjalon
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be enabled
by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/tep_termination/vxlan.c | 4 ++++
 1 file changed, 4 insertions(+)

v9 changes:
* Used unified packet type to check if it is a VXLAN packet, included in
  RTE_NEXT_ABI which is disabled by default.

v10 changes:
* Fixed a compile error.

diff --git a/examples/tep_termination/vxlan.c b/examples/tep_termination/vxlan.c
index b2a2f53..e98a29f 100644
--- a/examples/tep_termination/vxlan.c
+++ b/examples/tep_termination/vxlan.c
@@ -180,8 +180,12 @@ decapsulation(struct rte_mbuf *pkt)
 	 * (rfc7348) or that the rx offload flag is set (i40e only
 	 * currently)*/
 	if (udp_hdr->dst_port != rte_cpu_to_be_16(DEFAULT_VXLAN_PORT) &&
+#ifdef RTE_NEXT_ABI
+		(pkt->packet_type & RTE_PTYPE_TUNNEL_MASK) == 0)
+#else
 			(pkt->ol_flags & (PKT_RX_TUNNEL_IPV4_HDR |
 				PKT_RX_TUNNEL_IPV6_HDR)) == 0)
+#endif
 		return -1;
 	outer_header_len = info.outer_l2_len + info.outer_l3_len
 		+ sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr);
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 13/19] examples/ip_fragmentation: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (11 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 12/19] app/test: Remove useless code Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 14/19] examples/ip_reassembly: " Helin Zhang
                     ` (6 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/ip_fragmentation/main.c | 9 +++++++++
 1 file changed, 9 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/ip_fragmentation/main.c b/examples/ip_fragmentation/main.c
index 0922ba6..b71d05f 100644
--- a/examples/ip_fragmentation/main.c
+++ b/examples/ip_fragmentation/main.c
@@ -283,7 +283,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 	len = qconf->tx_mbufs[port_out].len;
 
 	/* if this is an IPv4 packet */
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
 	if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
 		struct ipv4_hdr *ip_hdr;
 		uint32_t ip_dst;
 		/* Read the lookup key (i.e. ip_dst) from the input packet */
@@ -317,9 +321,14 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 			if (unlikely (len2 < 0))
 				return;
 		}
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+		/* if this is an IPv6 packet */
+#else
 	}
 	/* if this is an IPv6 packet */
 	else if (m->ol_flags & PKT_RX_IPV6_HDR) {
+#endif
 		struct ipv6_hdr *ip_hdr;
 
 		ipv6 = 1;
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 14/19] examples/ip_reassembly: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (12 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 13/19] examples/ip_fragmentation: replace bit mask based packet type with unified packet type Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 15/19] examples/l3fwd-acl: " Helin Zhang
                     ` (5 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/ip_reassembly/main.c | 9 +++++++++
 1 file changed, 9 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
index 9ecb6f9..f1c47ad 100644
--- a/examples/ip_reassembly/main.c
+++ b/examples/ip_reassembly/main.c
@@ -356,7 +356,11 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 	dst_port = portid;
 
 	/* if packet is IPv4 */
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
 	if (m->ol_flags & (PKT_RX_IPV4_HDR)) {
+#endif
 		struct ipv4_hdr *ip_hdr;
 		uint32_t ip_dst;
 
@@ -396,9 +400,14 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 		}
 
 		eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv4);
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+		/* if packet is IPv6 */
+#else
 	}
 	/* if packet is IPv6 */
 	else if (m->ol_flags & (PKT_RX_IPV6_HDR | PKT_RX_IPV6_HDR_EXT)) {
+#endif
 		struct ipv6_extension_fragment *frag_hdr;
 		struct ipv6_hdr *ip_hdr;
 
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 15/19] examples/l3fwd-acl: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (13 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 14/19] examples/ip_reassembly: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 16/19] examples/l3fwd-power: " Helin Zhang
                     ` (4 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/l3fwd-acl/main.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/l3fwd-acl/main.c b/examples/l3fwd-acl/main.c
index 29cb25e..b2bdf2f 100644
--- a/examples/l3fwd-acl/main.c
+++ b/examples/l3fwd-acl/main.c
@@ -645,10 +645,13 @@ prepare_one_packet(struct rte_mbuf **pkts_in, struct acl_search_t *acl,
 	struct ipv4_hdr *ipv4_hdr;
 	struct rte_mbuf *pkt = pkts_in[index];
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
+#else
 	int type = pkt->ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV6_HDR);
 
 	if (type == PKT_RX_IPV4_HDR) {
-
+#endif
 		ipv4_hdr = rte_pktmbuf_mtod_offset(pkt, struct ipv4_hdr *,
 						   sizeof(struct ether_hdr));
 
@@ -667,9 +670,11 @@ prepare_one_packet(struct rte_mbuf **pkts_in, struct acl_search_t *acl,
 			/* Not a valid IPv4 packet */
 			rte_pktmbuf_free(pkt);
 		}
-
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
+#else
 	} else if (type == PKT_RX_IPV6_HDR) {
-
+#endif
 		/* Fill acl structure */
 		acl->data_ipv6[acl->num_ipv6] = MBUF_IPV6_2PROTO(pkt);
 		acl->m_ipv6[(acl->num_ipv6)++] = pkt;
@@ -687,17 +692,22 @@ prepare_one_packet(struct rte_mbuf **pkts_in, struct acl_search_t *acl,
 {
 	struct rte_mbuf *pkt = pkts_in[index];
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
+#else
 	int type = pkt->ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV6_HDR);
 
 	if (type == PKT_RX_IPV4_HDR) {
-
+#endif
 		/* Fill acl structure */
 		acl->data_ipv4[acl->num_ipv4] = MBUF_IPV4_2PROTO(pkt);
 		acl->m_ipv4[(acl->num_ipv4)++] = pkt;
 
-
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
+#else
 	} else if (type == PKT_RX_IPV6_HDR) {
-
+#endif
 		/* Fill acl structure */
 		acl->data_ipv6[acl->num_ipv6] = MBUF_IPV6_2PROTO(pkt);
 		acl->m_ipv6[(acl->num_ipv6)++] = pkt;
@@ -745,10 +755,17 @@ send_one_packet(struct rte_mbuf *m, uint32_t res)
 		/* in the ACL list, drop it */
 #ifdef L3FWDACL_DEBUG
 		if ((res & ACL_DENY_SIGNATURE) != 0) {
+#ifdef RTE_NEXT_ABI
+			if (RTE_ETH_IS_IPV4_HDR(m->packet_type))
+				dump_acl4_rule(m, res);
+			else if (RTE_ETH_IS_IPV6_HDR(m->packet_type))
+				dump_acl6_rule(m, res);
+#else
 			if (m->ol_flags & PKT_RX_IPV4_HDR)
 				dump_acl4_rule(m, res);
 			else
 				dump_acl6_rule(m, res);
+#endif /* RTE_NEXT_ABI */
 		}
 #endif
 		rte_pktmbuf_free(m);
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 08/19] fm10k: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (6 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 07/19] vmxnet3: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 09/19] cxgbe: " Helin Zhang
                     ` (11 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/fm10k/fm10k_rxtx.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

v4 changes:
* Supported unified packet type of fm10k from v4.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
index 7d5e32c..b5fa2e6 100644
--- a/drivers/net/fm10k/fm10k_rxtx.c
+++ b/drivers/net/fm10k/fm10k_rxtx.c
@@ -68,12 +68,37 @@ static inline void dump_rxd(union fm10k_rx_desc *rxd)
 static inline void
 rx_desc_to_ol_flags(struct rte_mbuf *m, const union fm10k_rx_desc *d)
 {
+#ifdef RTE_NEXT_ABI
+	static const uint32_t
+		ptype_table[FM10K_RXD_PKTTYPE_MASK >> FM10K_RXD_PKTTYPE_SHIFT]
+			__rte_cache_aligned = {
+		[FM10K_PKTTYPE_OTHER] = RTE_PTYPE_L2_ETHER,
+		[FM10K_PKTTYPE_IPV4] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4,
+		[FM10K_PKTTYPE_IPV4_EX] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4_EXT,
+		[FM10K_PKTTYPE_IPV6] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6,
+		[FM10K_PKTTYPE_IPV6_EX] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT,
+		[FM10K_PKTTYPE_IPV4 | FM10K_PKTTYPE_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP,
+		[FM10K_PKTTYPE_IPV6 | FM10K_PKTTYPE_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_TCP,
+		[FM10K_PKTTYPE_IPV4 | FM10K_PKTTYPE_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_UDP,
+		[FM10K_PKTTYPE_IPV6 | FM10K_PKTTYPE_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_UDP,
+	};
+
+	m->packet_type = ptype_table[(d->w.pkt_info & FM10K_RXD_PKTTYPE_MASK)
+						>> FM10K_RXD_PKTTYPE_SHIFT];
+#else /* RTE_NEXT_ABI */
 	uint16_t ptype;
 	static const uint16_t pt_lut[] = { 0,
 		PKT_RX_IPV4_HDR, PKT_RX_IPV4_HDR_EXT,
 		PKT_RX_IPV6_HDR, PKT_RX_IPV6_HDR_EXT,
 		0, 0, 0
 	};
+#endif /* RTE_NEXT_ABI */
 
 	if (d->w.pkt_info & FM10K_RXD_RSSTYPE_MASK)
 		m->ol_flags |= PKT_RX_RSS_HASH;
@@ -97,9 +122,11 @@ rx_desc_to_ol_flags(struct rte_mbuf *m, const union fm10k_rx_desc *d)
 	if (unlikely(d->d.staterr & FM10K_RXD_STATUS_RXE))
 		m->ol_flags |= PKT_RX_RECIP_ERR;
 
+#ifndef RTE_NEXT_ABI
 	ptype = (d->d.data & FM10K_RXD_PKTTYPE_MASK_L3) >>
 						FM10K_RXD_PKTTYPE_SHIFT;
 	m->ol_flags |= pt_lut[(uint8_t)ptype];
+#endif
 }
 
 uint16_t
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 12/19] app/test: Remove useless code
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (10 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 11/19] app/testpmd: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 13/19] examples/ip_fragmentation: replace bit mask based packet type with unified packet type Helin Zhang
                     ` (7 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

Severl useless code lines are added accidently, which blocks packet
type unification. They should be deleted at all.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 app/test/packet_burst_generator.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

v4 changes:
* Removed several useless code lines which block packet type unification.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/app/test/packet_burst_generator.c b/app/test/packet_burst_generator.c
index 28d9e25..d9d808b 100644
--- a/app/test/packet_burst_generator.c
+++ b/app/test/packet_burst_generator.c
@@ -273,19 +273,21 @@ nomore_mbuf:
 		if (ipv4) {
 			pkt->vlan_tci  = ETHER_TYPE_IPv4;
 			pkt->l3_len = sizeof(struct ipv4_hdr);
-
+#ifndef RTE_NEXT_ABI
 			if (vlan_enabled)
 				pkt->ol_flags = PKT_RX_IPV4_HDR | PKT_RX_VLAN_PKT;
 			else
 				pkt->ol_flags = PKT_RX_IPV4_HDR;
+#endif
 		} else {
 			pkt->vlan_tci  = ETHER_TYPE_IPv6;
 			pkt->l3_len = sizeof(struct ipv6_hdr);
-
+#ifndef RTE_NEXT_ABI
 			if (vlan_enabled)
 				pkt->ol_flags = PKT_RX_IPV6_HDR | PKT_RX_VLAN_PKT;
 			else
 				pkt->ol_flags = PKT_RX_IPV6_HDR;
+#endif
 		}
 
 		pkts_burst[nb_pkt] = pkt;
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 10/19] app/test-pipeline: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (8 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 09/19] cxgbe: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 11/19] app/testpmd: " Helin Zhang
                     ` (9 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 app/test-pipeline/pipeline_hash.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/app/test-pipeline/pipeline_hash.c b/app/test-pipeline/pipeline_hash.c
index 4598ad4..aa3f9e5 100644
--- a/app/test-pipeline/pipeline_hash.c
+++ b/app/test-pipeline/pipeline_hash.c
@@ -459,20 +459,33 @@ app_main_loop_rx_metadata(void) {
 			signature = RTE_MBUF_METADATA_UINT32_PTR(m, 0);
 			key = RTE_MBUF_METADATA_UINT8_PTR(m, 32);
 
+#ifdef RTE_NEXT_ABI
+			if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
 			if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
 				ip_hdr = (struct ipv4_hdr *)
 					&m_data[sizeof(struct ether_hdr)];
 				ip_dst = ip_hdr->dst_addr;
 
 				k32 = (uint32_t *) key;
 				k32[0] = ip_dst & 0xFFFFFF00;
+#ifdef RTE_NEXT_ABI
+			} else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+#else
 			} else {
+#endif
 				ipv6_hdr = (struct ipv6_hdr *)
 					&m_data[sizeof(struct ether_hdr)];
 				ipv6_dst = ipv6_hdr->dst_addr;
 
 				memcpy(key, ipv6_dst, 16);
+#ifdef RTE_NEXT_ABI
+			} else
+				continue;
+#else
 			}
+#endif
 
 			*signature = test_hash(key, 0, 0);
 		}
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 06/19] enic: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (4 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 05/19] i40e: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 07/19] vmxnet3: " Helin Zhang
                     ` (13 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/enic/enic_main.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 15313c2..f47e96c 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -423,7 +423,11 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
 		rx_pkt->pkt_len = bytes_written;
 
 		if (ipv4) {
+#ifdef RTE_NEXT_ABI
+			rx_pkt->packet_type = RTE_PTYPE_L3_IPV4;
+#else
 			rx_pkt->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
 			if (!csum_not_calc) {
 				if (unlikely(!ipv4_csum_ok))
 					rx_pkt->ol_flags |= PKT_RX_IP_CKSUM_BAD;
@@ -432,7 +436,11 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
 					rx_pkt->ol_flags |= PKT_RX_L4_CKSUM_BAD;
 			}
 		} else if (ipv6)
+#ifdef RTE_NEXT_ABI
+			rx_pkt->packet_type = RTE_PTYPE_L3_IPV6;
+#else
 			rx_pkt->ol_flags |= PKT_RX_IPV6_HDR;
+#endif
 	} else {
 		/* Header split */
 		if (sop && !eop) {
@@ -445,7 +453,11 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
 				*rx_pkt_bucket = rx_pkt;
 				rx_pkt->pkt_len = bytes_written;
 				if (ipv4) {
+#ifdef RTE_NEXT_ABI
+					rx_pkt->packet_type = RTE_PTYPE_L3_IPV4;
+#else
 					rx_pkt->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
 					if (!csum_not_calc) {
 						if (unlikely(!ipv4_csum_ok))
 							rx_pkt->ol_flags |=
@@ -457,13 +469,22 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
 							    PKT_RX_L4_CKSUM_BAD;
 					}
 				} else if (ipv6)
+#ifdef RTE_NEXT_ABI
+					rx_pkt->packet_type = RTE_PTYPE_L3_IPV6;
+#else
 					rx_pkt->ol_flags |= PKT_RX_IPV6_HDR;
+#endif
 			} else {
 				/* Payload */
 				hdr_rx_pkt = *rx_pkt_bucket;
 				hdr_rx_pkt->pkt_len += bytes_written;
 				if (ipv4) {
+#ifdef RTE_NEXT_ABI
+					hdr_rx_pkt->packet_type =
+						RTE_PTYPE_L3_IPV4;
+#else
 					hdr_rx_pkt->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
 					if (!csum_not_calc) {
 						if (unlikely(!ipv4_csum_ok))
 							hdr_rx_pkt->ol_flags |=
@@ -475,7 +496,12 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
 							    PKT_RX_L4_CKSUM_BAD;
 					}
 				} else if (ipv6)
+#ifdef RTE_NEXT_ABI
+					hdr_rx_pkt->packet_type =
+						RTE_PTYPE_L3_IPV6;
+#else
 					hdr_rx_pkt->ol_flags |= PKT_RX_IPV6_HDR;
+#endif
 
 			}
 		}
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 11/19] app/testpmd: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (9 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 10/19] app/test-pipeline: " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 12/19] app/test: Remove useless code Helin Zhang
                     ` (8 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
 app/test-pmd/csumonly.c |  14 ++++
 app/test-pmd/rxonly.c   | 183 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 197 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v4 changes:
* Added printing logs of packet types of each received packet in rxonly mode.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 4287940..1bf3485 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -202,8 +202,14 @@ parse_ethernet(struct ether_hdr *eth_hdr, struct testpmd_offload_info *info)
 
 /* Parse a vxlan header */
 static void
+#ifdef RTE_NEXT_ABI
+parse_vxlan(struct udp_hdr *udp_hdr,
+	    struct testpmd_offload_info *info,
+	    uint32_t pkt_type)
+#else
 parse_vxlan(struct udp_hdr *udp_hdr, struct testpmd_offload_info *info,
 	uint64_t mbuf_olflags)
+#endif
 {
 	struct ether_hdr *eth_hdr;
 
@@ -211,8 +217,12 @@ parse_vxlan(struct udp_hdr *udp_hdr, struct testpmd_offload_info *info,
 	 * (rfc7348) or that the rx offload flag is set (i40e only
 	 * currently) */
 	if (udp_hdr->dst_port != _htons(4789) &&
+#ifdef RTE_NEXT_ABI
+		RTE_ETH_IS_TUNNEL_PKT(pkt_type) == 0)
+#else
 		(mbuf_olflags & (PKT_RX_TUNNEL_IPV4_HDR |
 			PKT_RX_TUNNEL_IPV6_HDR)) == 0)
+#endif
 		return;
 
 	info->is_tunnel = 1;
@@ -549,7 +559,11 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 				struct udp_hdr *udp_hdr;
 				udp_hdr = (struct udp_hdr *)((char *)l3_hdr +
 					info.l3_len);
+#ifdef RTE_NEXT_ABI
+				parse_vxlan(udp_hdr, &info, m->packet_type);
+#else
 				parse_vxlan(udp_hdr, &info, m->ol_flags);
+#endif
 			} else if (info.l4_proto == IPPROTO_GRE) {
 				struct simple_gre_hdr *gre_hdr;
 				gre_hdr = (struct simple_gre_hdr *)
diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index 4a9f86e..632056d 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -91,7 +91,11 @@ pkt_burst_receive(struct fwd_stream *fs)
 	uint64_t ol_flags;
 	uint16_t nb_rx;
 	uint16_t i, packet_type;
+#ifdef RTE_NEXT_ABI
+	uint16_t is_encapsulation;
+#else
 	uint64_t is_encapsulation;
+#endif
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -135,8 +139,12 @@ pkt_burst_receive(struct fwd_stream *fs)
 		ol_flags = mb->ol_flags;
 		packet_type = mb->packet_type;
 
+#ifdef RTE_NEXT_ABI
+		is_encapsulation = RTE_ETH_IS_TUNNEL_PKT(packet_type);
+#else
 		is_encapsulation = ol_flags & (PKT_RX_TUNNEL_IPV4_HDR |
 				PKT_RX_TUNNEL_IPV6_HDR);
+#endif
 
 		print_ether_addr("  src=", &eth_hdr->s_addr);
 		print_ether_addr(" - dst=", &eth_hdr->d_addr);
@@ -163,6 +171,177 @@ pkt_burst_receive(struct fwd_stream *fs)
 		if (ol_flags & PKT_RX_QINQ_PKT)
 			printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x",
 					mb->vlan_tci, mb->vlan_tci_outer);
+#ifdef RTE_NEXT_ABI
+		if (mb->packet_type) {
+			uint32_t ptype;
+
+			/* (outer) L2 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_L2_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_L2_ETHER:
+				printf(" - (outer) L2 type: ETHER");
+				break;
+			case RTE_PTYPE_L2_ETHER_TIMESYNC:
+				printf(" - (outer) L2 type: ETHER_Timesync");
+				break;
+			case RTE_PTYPE_L2_ETHER_ARP:
+				printf(" - (outer) L2 type: ETHER_ARP");
+				break;
+			case RTE_PTYPE_L2_ETHER_LLDP:
+				printf(" - (outer) L2 type: ETHER_LLDP");
+				break;
+			default:
+				printf(" - (outer) L2 type: Unknown");
+				break;
+			}
+
+			/* (outer) L3 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_L3_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_L3_IPV4:
+				printf(" - (outer) L3 type: IPV4");
+				break;
+			case RTE_PTYPE_L3_IPV4_EXT:
+				printf(" - (outer) L3 type: IPV4_EXT");
+				break;
+			case RTE_PTYPE_L3_IPV6:
+				printf(" - (outer) L3 type: IPV6");
+				break;
+			case RTE_PTYPE_L3_IPV4_EXT_UNKNOWN:
+				printf(" - (outer) L3 type: IPV4_EXT_UNKNOWN");
+				break;
+			case RTE_PTYPE_L3_IPV6_EXT:
+				printf(" - (outer) L3 type: IPV6_EXT");
+				break;
+			case RTE_PTYPE_L3_IPV6_EXT_UNKNOWN:
+				printf(" - (outer) L3 type: IPV6_EXT_UNKNOWN");
+				break;
+			default:
+				printf(" - (outer) L3 type: Unknown");
+				break;
+			}
+
+			/* (outer) L4 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_L4_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_L4_TCP:
+				printf(" - (outer) L4 type: TCP");
+				break;
+			case RTE_PTYPE_L4_UDP:
+				printf(" - (outer) L4 type: UDP");
+				break;
+			case RTE_PTYPE_L4_FRAG:
+				printf(" - (outer) L4 type: L4_FRAG");
+				break;
+			case RTE_PTYPE_L4_SCTP:
+				printf(" - (outer) L4 type: SCTP");
+				break;
+			case RTE_PTYPE_L4_ICMP:
+				printf(" - (outer) L4 type: ICMP");
+				break;
+			case RTE_PTYPE_L4_NONFRAG:
+				printf(" - (outer) L4 type: L4_NONFRAG");
+				break;
+			default:
+				printf(" - (outer) L4 type: Unknown");
+				break;
+			}
+
+			/* packet tunnel type */
+			ptype = mb->packet_type & RTE_PTYPE_TUNNEL_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_TUNNEL_IP:
+				printf(" - Tunnel type: IP");
+				break;
+			case RTE_PTYPE_TUNNEL_GRE:
+				printf(" - Tunnel type: GRE");
+				break;
+			case RTE_PTYPE_TUNNEL_VXLAN:
+				printf(" - Tunnel type: VXLAN");
+				break;
+			case RTE_PTYPE_TUNNEL_NVGRE:
+				printf(" - Tunnel type: NVGRE");
+				break;
+			case RTE_PTYPE_TUNNEL_GENEVE:
+				printf(" - Tunnel type: GENEVE");
+				break;
+			case RTE_PTYPE_TUNNEL_GRENAT:
+				printf(" - Tunnel type: GRENAT");
+				break;
+			default:
+				printf(" - Tunnel type: Unknown");
+				break;
+			}
+
+			/* inner L2 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_INNER_L2_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_INNER_L2_ETHER:
+				printf(" - Inner L2 type: ETHER");
+				break;
+			case RTE_PTYPE_INNER_L2_ETHER_VLAN:
+				printf(" - Inner L2 type: ETHER_VLAN");
+				break;
+			default:
+				printf(" - Inner L2 type: Unknown");
+				break;
+			}
+
+			/* inner L3 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_INNER_INNER_L3_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_INNER_L3_IPV4:
+				printf(" - Inner L3 type: IPV4");
+				break;
+			case RTE_PTYPE_INNER_L3_IPV4_EXT:
+				printf(" - Inner L3 type: IPV4_EXT");
+				break;
+			case RTE_PTYPE_INNER_L3_IPV6:
+				printf(" - Inner L3 type: IPV6");
+				break;
+			case RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN:
+				printf(" - Inner L3 type: IPV4_EXT_UNKNOWN");
+				break;
+			case RTE_PTYPE_INNER_L3_IPV6_EXT:
+				printf(" - Inner L3 type: IPV6_EXT");
+				break;
+			case RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN:
+				printf(" - Inner L3 type: IPV6_EXT_UNKOWN");
+				break;
+			default:
+				printf(" - Inner L3 type: Unknown");
+				break;
+			}
+
+			/* inner L4 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_INNER_L4_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_INNER_L4_TCP:
+				printf(" - Inner L4 type: TCP");
+				break;
+			case RTE_PTYPE_INNER_L4_UDP:
+				printf(" - Inner L4 type: UDP");
+				break;
+			case RTE_PTYPE_INNER_L4_FRAG:
+				printf(" - Inner L4 type: L4_FRAG");
+				break;
+			case RTE_PTYPE_INNER_L4_SCTP:
+				printf(" - Inner L4 type: SCTP");
+				break;
+			case RTE_PTYPE_INNER_L4_ICMP:
+				printf(" - Inner L4 type: ICMP");
+				break;
+			case RTE_PTYPE_INNER_L4_NONFRAG:
+				printf(" - Inner L4 type: L4_NONFRAG");
+				break;
+			default:
+				printf(" - Inner L4 type: Unknown");
+				break;
+			}
+			printf("\n");
+		} else
+			printf("Unknown packet type\n");
+#endif /* RTE_NEXT_ABI */
 		if (is_encapsulation) {
 			struct ipv4_hdr *ipv4_hdr;
 			struct ipv6_hdr *ipv6_hdr;
@@ -176,7 +355,11 @@ pkt_burst_receive(struct fwd_stream *fs)
 			l2_len  = sizeof(struct ether_hdr);
 
 			 /* Do not support ipv4 option field */
+#ifdef RTE_NEXT_ABI
+			if (RTE_ETH_IS_IPV4_HDR(packet_type)) {
+#else
 			if (ol_flags & PKT_RX_TUNNEL_IPV4_HDR) {
+#endif
 				l3_len = sizeof(struct ipv4_hdr);
 				ipv4_hdr = rte_pktmbuf_mtod_offset(mb,
 								   struct ipv4_hdr *,
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 09/19] cxgbe: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (7 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 08/19] fm10k: " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 10/19] app/test-pipeline: " Helin Zhang
                     ` (10 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be enabled
by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/cxgbe/sge.c | 8 ++++++++
 1 file changed, 8 insertions(+)

v9 changes:
* Added unified packet type support in newly added cxgbe driver.

diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 359296e..fdae0b4 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1326,14 +1326,22 @@ int t4_ethrx_handler(struct sge_rspq *q, const __be64 *rsp,
 
 	mbuf->port = pkt->iff;
 	if (pkt->l2info & htonl(F_RXF_IP)) {
+#ifdef RTE_NEXT_ABI
+		mbuf->packet_type = RTE_PTYPE_L3_IPV4;
+#else
 		mbuf->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
 		if (unlikely(!csum_ok))
 			mbuf->ol_flags |= PKT_RX_IP_CKSUM_BAD;
 
 		if ((pkt->l2info & htonl(F_RXF_UDP | F_RXF_TCP)) && !csum_ok)
 			mbuf->ol_flags |= PKT_RX_L4_CKSUM_BAD;
 	} else if (pkt->l2info & htonl(F_RXF_IP6)) {
+#ifdef RTE_NEXT_ABI
+		mbuf->packet_type = RTE_PTYPE_L3_IPV6;
+#else
 		mbuf->ol_flags |= PKT_RX_IPV6_HDR;
+#endif
 	}
 
 	mbuf->port = pkt->iff;
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 05/19] i40e: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (3 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 04/19] ixgbe: " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 06/19] enic: " Helin Zhang
                     ` (14 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/i40e/i40e_rxtx.c | 554 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 554 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 88b015d..c667bbc 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -176,6 +176,540 @@ i40e_rxd_error_to_pkt_flags(uint64_t qword)
 	return flags;
 }
 
+#ifdef RTE_NEXT_ABI
+/* For each value it means, datasheet of hardware can tell more details */
+static inline uint32_t
+i40e_rxd_pkt_type_mapping(uint8_t ptype)
+{
+	static const uint32_t ptype_table[UINT8_MAX] __rte_cache_aligned = {
+		/* L2 types */
+		/* [0] reserved */
+		[1] = RTE_PTYPE_L2_ETHER,
+		[2] = RTE_PTYPE_L2_ETHER_TIMESYNC,
+		/* [3] - [5] reserved */
+		[6] = RTE_PTYPE_L2_ETHER_LLDP,
+		/* [7] - [10] reserved */
+		[11] = RTE_PTYPE_L2_ETHER_ARP,
+		/* [12] - [21] reserved */
+
+		/* Non tunneled IPv4 */
+		[22] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_FRAG,
+		[23] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_NONFRAG,
+		[24] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_UDP,
+		/* [25] reserved */
+		[26] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_TCP,
+		[27] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_SCTP,
+		[28] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_ICMP,
+
+		/* IPv4 --> IPv4 */
+		[29] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[30] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[31] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [32] reserved */
+		[33] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[34] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[35] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> IPv6 */
+		[36] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[37] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[38] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [39] reserved */
+		[40] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[41] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[42] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN */
+		[43] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> IPv4 */
+		[44] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[45] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[46] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [47] reserved */
+		[48] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[49] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[50] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> IPv6 */
+		[51] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[52] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[53] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [54] reserved */
+		[55] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[56] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[57] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC */
+		[58] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC --> IPv4 */
+		[59] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[60] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[61] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [62] reserved */
+		[63] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[64] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[65] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC --> IPv6 */
+		[66] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[67] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[68] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [69] reserved */
+		[70] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[71] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[72] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC/VLAN */
+		[73] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC/VLAN --> IPv4 */
+		[74] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[75] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[76] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [77] reserved */
+		[78] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[79] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[80] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC/VLAN --> IPv6 */
+		[81] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[82] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[83] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [84] reserved */
+		[85] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[86] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[87] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* Non tunneled IPv6 */
+		[88] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_FRAG,
+		[89] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_NONFRAG,
+		[90] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_UDP,
+		/* [91] reserved */
+		[92] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_TCP,
+		[93] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_SCTP,
+		[94] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_ICMP,
+
+		/* IPv6 --> IPv4 */
+		[95] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[96] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[97] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [98] reserved */
+		[99] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[100] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[101] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> IPv6 */
+		[102] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[103] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[104] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [105] reserved */
+		[106] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[107] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[108] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN */
+		[109] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> IPv4 */
+		[110] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[111] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[112] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [113] reserved */
+		[114] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[115] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[116] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> IPv6 */
+		[117] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[118] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[119] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [120] reserved */
+		[121] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[122] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[123] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC */
+		[124] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC --> IPv4 */
+		[125] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[126] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[127] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [128] reserved */
+		[129] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[130] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[131] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC --> IPv6 */
+		[132] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[133] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[134] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [135] reserved */
+		[136] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[137] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[138] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC/VLAN */
+		[139] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC/VLAN --> IPv4 */
+		[140] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[141] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[142] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [143] reserved */
+		[144] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[145] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[146] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC/VLAN --> IPv6 */
+		[147] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[148] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[149] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [150] reserved */
+		[151] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[152] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[153] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* All others reserved */
+	};
+
+	return ptype_table[ptype];
+}
+#else /* RTE_NEXT_ABI */
 /* Translate pkt types to pkt flags */
 static inline uint64_t
 i40e_rxd_ptype_to_pkt_flags(uint64_t qword)
@@ -443,6 +977,7 @@ i40e_rxd_ptype_to_pkt_flags(uint64_t qword)
 
 	return ip_ptype_map[ptype];
 }
+#endif /* RTE_NEXT_ABI */
 
 #define I40E_RX_DESC_EXT_STATUS_FLEXBH_MASK   0x03
 #define I40E_RX_DESC_EXT_STATUS_FLEXBH_FD_ID  0x01
@@ -730,11 +1265,18 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
 			i40e_rxd_to_vlan_tci(mb, &rxdp[j]);
 			pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
 			pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
+#ifdef RTE_NEXT_ABI
+			mb->packet_type =
+				i40e_rxd_pkt_type_mapping((uint8_t)((qword1 &
+						I40E_RXD_QW1_PTYPE_MASK) >>
+						I40E_RXD_QW1_PTYPE_SHIFT));
+#else
 			pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
 
 			mb->packet_type = (uint16_t)((qword1 &
 					I40E_RXD_QW1_PTYPE_MASK) >>
 					I40E_RXD_QW1_PTYPE_SHIFT);
+#endif /* RTE_NEXT_ABI */
 			if (pkt_flags & PKT_RX_RSS_HASH)
 				mb->hash.rss = rte_le_to_cpu_32(\
 					rxdp[j].wb.qword0.hi_dword.rss);
@@ -971,9 +1513,15 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 		i40e_rxd_to_vlan_tci(rxm, &rxd);
 		pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
 		pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
+#ifdef RTE_NEXT_ABI
+		rxm->packet_type =
+			i40e_rxd_pkt_type_mapping((uint8_t)((qword1 &
+			I40E_RXD_QW1_PTYPE_MASK) >> I40E_RXD_QW1_PTYPE_SHIFT));
+#else
 		pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
 		rxm->packet_type = (uint16_t)((qword1 & I40E_RXD_QW1_PTYPE_MASK) >>
 				I40E_RXD_QW1_PTYPE_SHIFT);
+#endif /* RTE_NEXT_ABI */
 		if (pkt_flags & PKT_RX_RSS_HASH)
 			rxm->hash.rss =
 				rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss);
@@ -1129,10 +1677,16 @@ i40e_recv_scattered_pkts(void *rx_queue,
 		i40e_rxd_to_vlan_tci(first_seg, &rxd);
 		pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
 		pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
+#ifdef RTE_NEXT_ABI
+		first_seg->packet_type =
+			i40e_rxd_pkt_type_mapping((uint8_t)((qword1 &
+			I40E_RXD_QW1_PTYPE_MASK) >> I40E_RXD_QW1_PTYPE_SHIFT));
+#else
 		pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
 		first_seg->packet_type = (uint16_t)((qword1 &
 					I40E_RXD_QW1_PTYPE_MASK) >>
 					I40E_RXD_QW1_PTYPE_SHIFT);
+#endif /* RTE_NEXT_ABI */
 		if (pkt_flags & PKT_RX_RSS_HASH)
 			rxm->hash.rss =
 				rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss);
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 04/19] ixgbe: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (2 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 03/19] e1000: replace bit mask based packet type with unified packet type Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 05/19] i40e: " Helin Zhang
                     ` (15 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet type among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.
Note that around 2.5% performance drop (64B) was observed of doing
4 ports (1 port per 82599 card) IO forwarding on the same SNB core.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 163 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index b1db57f..9e99e80 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -859,6 +859,110 @@ end_of_tx:
  *  RX functions
  *
  **********************************************************************/
+#ifdef RTE_NEXT_ABI
+#define IXGBE_PACKET_TYPE_IPV4              0X01
+#define IXGBE_PACKET_TYPE_IPV4_TCP          0X11
+#define IXGBE_PACKET_TYPE_IPV4_UDP          0X21
+#define IXGBE_PACKET_TYPE_IPV4_SCTP         0X41
+#define IXGBE_PACKET_TYPE_IPV4_EXT          0X03
+#define IXGBE_PACKET_TYPE_IPV4_EXT_SCTP     0X43
+#define IXGBE_PACKET_TYPE_IPV6              0X04
+#define IXGBE_PACKET_TYPE_IPV6_TCP          0X14
+#define IXGBE_PACKET_TYPE_IPV6_UDP          0X24
+#define IXGBE_PACKET_TYPE_IPV6_EXT          0X0C
+#define IXGBE_PACKET_TYPE_IPV6_EXT_TCP      0X1C
+#define IXGBE_PACKET_TYPE_IPV6_EXT_UDP      0X2C
+#define IXGBE_PACKET_TYPE_IPV4_IPV6         0X05
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_TCP     0X15
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_UDP     0X25
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_EXT     0X0D
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_TCP 0X1D
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_UDP 0X2D
+#define IXGBE_PACKET_TYPE_MAX               0X80
+#define IXGBE_PACKET_TYPE_MASK              0X7F
+#define IXGBE_PACKET_TYPE_SHIFT             0X04
+static inline uint32_t
+ixgbe_rxd_pkt_info_to_pkt_type(uint16_t pkt_info)
+{
+	static const uint32_t
+		ptype_table[IXGBE_PACKET_TYPE_MAX] __rte_cache_aligned = {
+		[IXGBE_PACKET_TYPE_IPV4] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4,
+		[IXGBE_PACKET_TYPE_IPV4_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4_EXT,
+		[IXGBE_PACKET_TYPE_IPV6] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6,
+		[IXGBE_PACKET_TYPE_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT,
+		[IXGBE_PACKET_TYPE_IPV4_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP,
+		[IXGBE_PACKET_TYPE_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_TCP,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_TCP,
+		[IXGBE_PACKET_TYPE_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_TCP,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_TCP,
+		[IXGBE_PACKET_TYPE_IPV4_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_UDP,
+		[IXGBE_PACKET_TYPE_IPV6_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_UDP,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_UDP,
+		[IXGBE_PACKET_TYPE_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_UDP,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_UDP,
+		[IXGBE_PACKET_TYPE_IPV4_SCTP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_SCTP,
+		[IXGBE_PACKET_TYPE_IPV4_EXT_SCTP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4_EXT | RTE_PTYPE_L4_SCTP,
+	};
+	if (unlikely(pkt_info & IXGBE_RXDADV_PKTTYPE_ETQF))
+		return RTE_PTYPE_UNKNOWN;
+
+	pkt_info = (pkt_info >> IXGBE_PACKET_TYPE_SHIFT) &
+				IXGBE_PACKET_TYPE_MASK;
+
+	return ptype_table[pkt_info];
+}
+
+static inline uint64_t
+ixgbe_rxd_pkt_info_to_pkt_flags(uint16_t pkt_info)
+{
+	static uint64_t ip_rss_types_map[16] __rte_cache_aligned = {
+		0, PKT_RX_RSS_HASH, PKT_RX_RSS_HASH, PKT_RX_RSS_HASH,
+		0, PKT_RX_RSS_HASH, 0, PKT_RX_RSS_HASH,
+		PKT_RX_RSS_HASH, 0, 0, 0,
+		0, 0, 0,  PKT_RX_FDIR,
+	};
+#ifdef RTE_LIBRTE_IEEE1588
+	static uint64_t ip_pkt_etqf_map[8] = {
+		0, 0, 0, PKT_RX_IEEE1588_PTP,
+		0, 0, 0, 0,
+	};
+
+	if (likely(pkt_info & IXGBE_RXDADV_PKTTYPE_ETQF))
+		return ip_pkt_etqf_map[(pkt_info >> 4) & 0X07] |
+				ip_rss_types_map[pkt_info & 0XF];
+	else
+		return ip_rss_types_map[pkt_info & 0XF];
+#else
+	return ip_rss_types_map[pkt_info & 0XF];
+#endif
+}
+#else /* RTE_NEXT_ABI */
 static inline uint64_t
 rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
 {
@@ -894,6 +998,7 @@ rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
 #endif
 	return pkt_flags | ip_rss_types_map[hl_tp_rs & 0xF];
 }
+#endif /* RTE_NEXT_ABI */
 
 static inline uint64_t
 rx_desc_status_to_pkt_flags(uint32_t rx_status)
@@ -949,7 +1054,13 @@ ixgbe_rx_scan_hw_ring(struct ixgbe_rx_queue *rxq)
 	struct rte_mbuf *mb;
 	uint16_t pkt_len;
 	uint64_t pkt_flags;
+#ifdef RTE_NEXT_ABI
+	int nb_dd;
+	uint32_t s[LOOK_AHEAD];
+	uint16_t pkt_info[LOOK_AHEAD];
+#else
 	int s[LOOK_AHEAD], nb_dd;
+#endif /* RTE_NEXT_ABI */
 	int i, j, nb_rx = 0;
 
 
@@ -972,6 +1083,12 @@ ixgbe_rx_scan_hw_ring(struct ixgbe_rx_queue *rxq)
 		for (j = LOOK_AHEAD-1; j >= 0; --j)
 			s[j] = rxdp[j].wb.upper.status_error;
 
+#ifdef RTE_NEXT_ABI
+		for (j = LOOK_AHEAD-1; j >= 0; --j)
+			pkt_info[j] = rxdp[j].wb.lower.lo_dword.
+						hs_rss.pkt_info;
+#endif /* RTE_NEXT_ABI */
+
 		/* Compute how many status bits were set */
 		nb_dd = 0;
 		for (j = 0; j < LOOK_AHEAD; ++j)
@@ -988,12 +1105,22 @@ ixgbe_rx_scan_hw_ring(struct ixgbe_rx_queue *rxq)
 			mb->vlan_tci = rte_le_to_cpu_16(rxdp[j].wb.upper.vlan);
 
 			/* convert descriptor fields to rte mbuf flags */
+#ifdef RTE_NEXT_ABI
+			pkt_flags = rx_desc_status_to_pkt_flags(s[j]);
+			pkt_flags |= rx_desc_error_to_pkt_flags(s[j]);
+			pkt_flags |=
+				ixgbe_rxd_pkt_info_to_pkt_flags(pkt_info[j]);
+			mb->ol_flags = pkt_flags;
+			mb->packet_type =
+				ixgbe_rxd_pkt_info_to_pkt_type(pkt_info[j]);
+#else /* RTE_NEXT_ABI */
 			pkt_flags  = rx_desc_hlen_type_rss_to_pkt_flags(
 					rxdp[j].wb.lower.lo_dword.data);
 			/* reuse status field from scan list */
 			pkt_flags |= rx_desc_status_to_pkt_flags(s[j]);
 			pkt_flags |= rx_desc_error_to_pkt_flags(s[j]);
 			mb->ol_flags = pkt_flags;
+#endif /* RTE_NEXT_ABI */
 
 			if (likely(pkt_flags & PKT_RX_RSS_HASH))
 				mb->hash.rss = rxdp[j].wb.lower.hi_dword.rss;
@@ -1210,7 +1337,11 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 	union ixgbe_adv_rx_desc rxd;
 	uint64_t dma_addr;
 	uint32_t staterr;
+#ifdef RTE_NEXT_ABI
+	uint32_t pkt_info;
+#else
 	uint32_t hlen_type_rss;
+#endif
 	uint16_t pkt_len;
 	uint16_t rx_id;
 	uint16_t nb_rx;
@@ -1328,6 +1459,19 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		rxm->data_len = pkt_len;
 		rxm->port = rxq->port_id;
 
+#ifdef RTE_NEXT_ABI
+		pkt_info = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.hs_rss.
+								pkt_info);
+		/* Only valid if PKT_RX_VLAN_PKT set in pkt_flags */
+		rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
+
+		pkt_flags = rx_desc_status_to_pkt_flags(staterr);
+		pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
+		pkt_flags = pkt_flags |
+			ixgbe_rxd_pkt_info_to_pkt_flags(pkt_info);
+		rxm->ol_flags = pkt_flags;
+		rxm->packet_type = ixgbe_rxd_pkt_info_to_pkt_type(pkt_info);
+#else /* RTE_NEXT_ABI */
 		hlen_type_rss = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.data);
 		/* Only valid if PKT_RX_VLAN_PKT set in pkt_flags */
 		rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
@@ -1336,6 +1480,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
 		pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
 		rxm->ol_flags = pkt_flags;
+#endif /* RTE_NEXT_ABI */
 
 		if (likely(pkt_flags & PKT_RX_RSS_HASH))
 			rxm->hash.rss = rxd.wb.lower.hi_dword.rss;
@@ -1409,6 +1554,23 @@ ixgbe_fill_cluster_head_buf(
 	uint8_t port_id,
 	uint32_t staterr)
 {
+#ifdef RTE_NEXT_ABI
+	uint16_t pkt_info;
+	uint64_t pkt_flags;
+
+	head->port = port_id;
+
+	/* The vlan_tci field is only valid when PKT_RX_VLAN_PKT is
+	 * set in the pkt_flags field.
+	 */
+	head->vlan_tci = rte_le_to_cpu_16(desc->wb.upper.vlan);
+	pkt_info = rte_le_to_cpu_32(desc->wb.lower.lo_dword.hs_rss.pkt_info);
+	pkt_flags = rx_desc_status_to_pkt_flags(staterr);
+	pkt_flags |= rx_desc_error_to_pkt_flags(staterr);
+	pkt_flags |= ixgbe_rxd_pkt_info_to_pkt_flags(pkt_info);
+	head->ol_flags = pkt_flags;
+	head->packet_type = ixgbe_rxd_pkt_info_to_pkt_type(pkt_info);
+#else /* RTE_NEXT_ABI */
 	uint32_t hlen_type_rss;
 	uint64_t pkt_flags;
 
@@ -1424,6 +1586,7 @@ ixgbe_fill_cluster_head_buf(
 	pkt_flags |= rx_desc_status_to_pkt_flags(staterr);
 	pkt_flags |= rx_desc_error_to_pkt_flags(staterr);
 	head->ol_flags = pkt_flags;
+#endif /* RTE_NEXT_ABI */
 
 	if (likely(pkt_flags & PKT_RX_RSS_HASH))
 		head->hash.rss = rte_le_to_cpu_32(desc->wb.lower.hi_dword.rss);
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-13 15:53  0%     ` Thomas Monjalon
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types Helin Zhang
                     ` (18 subsequent siblings)
  19 siblings, 1 reply; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

In order to unify the packet type, the field of 'packet_type' in 'struct rte_mbuf'
needs to be extended from 16 to 32 bits. Accordingly, some fields in 'struct rte_mbuf'
are re-organized to support this change for Vector PMD. As 'struct rte_kni_mbuf' for
KNI should be right mapped to 'struct rte_mbuf', it should be modified accordingly.
In ixgbe PMD driver, corresponding changes are added for the mbuf changes, especially
the bit masks of packet type for 'ol_flags' are replaced by unified packet type. In
addition, more packet types (UDP, TCP and SCTP) are supported in vectorized ixgbe PMD.
To avoid breaking ABI compatibility, all the changes would be enabled by RTE_NEXT_ABI,
which is disabled by default.
Note that around 2% performance drop (64B) was observed of doing 4 ports (1 port per
82599 card) IO forwarding on the same SNB core.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 drivers/net/ixgbe/ixgbe_rxtx_vec.c                 | 75 +++++++++++++++++++++-
 .../linuxapp/eal/include/exec-env/rte_kni_common.h |  6 ++
 lib/librte_mbuf/rte_mbuf.h                         | 26 ++++++++
 3 files changed, 105 insertions(+), 2 deletions(-)

v2 changes:
* Enlarged the packet_type field from 16 bits to 32 bits.
* Redefined the packet type sub-fields.
* Updated the 'struct rte_kni_mbuf' for KNI according to the mbuf changes.

v3 changes:
* Put the mbuf layout changes into a single patch.
* Disabled vector ixgbe PMD by default, as mbuf layout changed.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.
* Integrated with changes of QinQ stripping/insertion.

v8 changes:
* Moved the field of 'vlan_tci_outer' in 'struct rte_mbuf' to the end
  of the 1st cache line, to avoid breaking any vectorized PMD storing.

v9 changes:
* Put the mbuf changes and vector PMD changes together, as they are
  tightly relevant.

diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
index 912d3b4..d3ac74a 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
@@ -134,6 +134,12 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
  */
 #ifdef RTE_IXGBE_RX_OLFLAGS_ENABLE
 
+#ifdef RTE_NEXT_ABI
+#define OLFLAGS_MASK_V  (((uint64_t)PKT_RX_VLAN_PKT << 48) | \
+			((uint64_t)PKT_RX_VLAN_PKT << 32) | \
+			((uint64_t)PKT_RX_VLAN_PKT << 16) | \
+			((uint64_t)PKT_RX_VLAN_PKT))
+#else
 #define OLFLAGS_MASK     ((uint16_t)(PKT_RX_VLAN_PKT | PKT_RX_IPV4_HDR |\
 				     PKT_RX_IPV4_HDR_EXT | PKT_RX_IPV6_HDR |\
 				     PKT_RX_IPV6_HDR_EXT))
@@ -142,11 +148,26 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
 			  ((uint64_t)OLFLAGS_MASK << 16) | \
 			  ((uint64_t)OLFLAGS_MASK))
 #define PTYPE_SHIFT    (1)
+#endif /* RTE_NEXT_ABI */
+
 #define VTAG_SHIFT     (3)
 
 static inline void
 desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
 {
+#ifdef RTE_NEXT_ABI
+	__m128i vtag0, vtag1;
+	union {
+		uint16_t e[4];
+		uint64_t dword;
+	} vol;
+
+	vtag0 = _mm_unpackhi_epi16(descs[0], descs[1]);
+	vtag1 = _mm_unpackhi_epi16(descs[2], descs[3]);
+	vtag1 = _mm_unpacklo_epi32(vtag0, vtag1);
+	vtag1 = _mm_srli_epi16(vtag1, VTAG_SHIFT);
+	vol.dword = _mm_cvtsi128_si64(vtag1) & OLFLAGS_MASK_V;
+#else
 	__m128i ptype0, ptype1, vtag0, vtag1;
 	union {
 		uint16_t e[4];
@@ -166,6 +187,7 @@ desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
 
 	ptype1 = _mm_or_si128(ptype1, vtag1);
 	vol.dword = _mm_cvtsi128_si64(ptype1) & OLFLAGS_MASK_V;
+#endif /* RTE_NEXT_ABI */
 
 	rx_pkts[0]->ol_flags = vol.e[0];
 	rx_pkts[1]->ol_flags = vol.e[1];
@@ -196,6 +218,18 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 	int pos;
 	uint64_t var;
 	__m128i shuf_msk;
+#ifdef RTE_NEXT_ABI
+	__m128i crc_adjust = _mm_set_epi16(
+				0, 0, 0,    /* ignore non-length fields */
+				-rxq->crc_len, /* sub crc on data_len */
+				0,          /* ignore high-16bits of pkt_len */
+				-rxq->crc_len, /* sub crc on pkt_len */
+				0, 0            /* ignore pkt_type field */
+			);
+	__m128i dd_check, eop_check;
+	__m128i desc_mask = _mm_set_epi32(0xFFFFFFFF, 0xFFFFFFFF,
+					  0xFFFFFFFF, 0xFFFF07F0);
+#else
 	__m128i crc_adjust = _mm_set_epi16(
 				0, 0, 0, 0, /* ignore non-length fields */
 				0,          /* ignore high-16bits of pkt_len */
@@ -204,6 +238,7 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 				0            /* ignore pkt_type field */
 			);
 	__m128i dd_check, eop_check;
+#endif /* RTE_NEXT_ABI */
 
 	if (unlikely(nb_pkts < RTE_IXGBE_VPMD_RX_BURST))
 		return 0;
@@ -232,6 +267,18 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 	eop_check = _mm_set_epi64x(0x0000000200000002LL, 0x0000000200000002LL);
 
 	/* mask to shuffle from desc. to mbuf */
+#ifdef RTE_NEXT_ABI
+	shuf_msk = _mm_set_epi8(
+		7, 6, 5, 4,  /* octet 4~7, 32bits rss */
+		15, 14,      /* octet 14~15, low 16 bits vlan_macip */
+		13, 12,      /* octet 12~13, 16 bits data_len */
+		0xFF, 0xFF,  /* skip high 16 bits pkt_len, zero out */
+		13, 12,      /* octet 12~13, low 16 bits pkt_len */
+		0xFF, 0xFF,  /* skip high 16 bits pkt_type */
+		1,           /* octet 1, 8 bits pkt_type field */
+		0            /* octet 0, 4 bits offset 4 pkt_type field */
+		);
+#else
 	shuf_msk = _mm_set_epi8(
 		7, 6, 5, 4,  /* octet 4~7, 32bits rss */
 		0xFF, 0xFF,  /* skip high 16 bits vlan_macip, zero out */
@@ -241,18 +288,28 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		13, 12,      /* octet 12~13, 16 bits data_len */
 		0xFF, 0xFF   /* skip pkt_type field */
 		);
+#endif /* RTE_NEXT_ABI */
 
 	/* Cache is empty -> need to scan the buffer rings, but first move
 	 * the next 'n' mbufs into the cache */
 	sw_ring = &rxq->sw_ring[rxq->rx_tail];
 
-	/*
-	 * A. load 4 packet in one loop
+#ifdef RTE_NEXT_ABI
+	/* A. load 4 packet in one loop
+	 * [A*. mask out 4 unused dirty field in desc]
 	 * B. copy 4 mbuf point from swring to rx_pkts
 	 * C. calc the number of DD bits among the 4 packets
 	 * [C*. extract the end-of-packet bit, if requested]
 	 * D. fill info. from desc to mbuf
 	 */
+#else
+	/* A. load 4 packet in one loop
+	 * B. copy 4 mbuf point from swring to rx_pkts
+	 * C. calc the number of DD bits among the 4 packets
+	 * [C*. extract the end-of-packet bit, if requested]
+	 * D. fill info. from desc to mbuf
+	 */
+#endif /* RTE_NEXT_ABI */
 	for (pos = 0, nb_pkts_recd = 0; pos < RTE_IXGBE_VPMD_RX_BURST;
 			pos += RTE_IXGBE_DESCS_PER_LOOP,
 			rxdp += RTE_IXGBE_DESCS_PER_LOOP) {
@@ -289,6 +346,16 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		/* B.2 copy 2 mbuf point into rx_pkts  */
 		_mm_storeu_si128((__m128i *)&rx_pkts[pos+2], mbp2);
 
+#ifdef RTE_NEXT_ABI
+		/* A* mask out 0~3 bits RSS type */
+		descs[3] = _mm_and_si128(descs[3], desc_mask);
+		descs[2] = _mm_and_si128(descs[2], desc_mask);
+
+		/* A* mask out 0~3 bits RSS type */
+		descs[1] = _mm_and_si128(descs[1], desc_mask);
+		descs[0] = _mm_and_si128(descs[0], desc_mask);
+#endif /* RTE_NEXT_ABI */
+
 		/* avoid compiler reorder optimization */
 		rte_compiler_barrier();
 
@@ -301,7 +368,11 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		/* C.1 4=>2 filter staterr info only */
 		sterr_tmp1 = _mm_unpackhi_epi32(descs[1], descs[0]);
 
+#ifdef RTE_NEXT_ABI
+		/* set ol_flags with vlan packet type */
+#else
 		/* set ol_flags with packet type and vlan tag */
+#endif /* RTE_NEXT_ABI */
 		desc_to_olflags_v(descs, &rx_pkts[pos]);
 
 		/* D.2 pkt 3,4 set in_port/nb_seg and remove crc */
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index 1e55c2d..e9f38bd 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -117,9 +117,15 @@ struct rte_kni_mbuf {
 	uint16_t data_off;      /**< Start address of data in segment buffer. */
 	char pad1[4];
 	uint64_t ol_flags;      /**< Offload features. */
+#ifdef RTE_NEXT_ABI
+	char pad2[4];
+	uint32_t pkt_len;       /**< Total pkt len: sum of all segment data_len. */
+	uint16_t data_len;      /**< Amount of data in segment buffer. */
+#else
 	char pad2[2];
 	uint16_t data_len;      /**< Amount of data in segment buffer. */
 	uint32_t pkt_len;       /**< Total pkt len: sum of all segment data_len. */
+#endif
 
 	/* fields on second cache line */
 	char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 80419df..ac29da3 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -276,6 +276,28 @@ struct rte_mbuf {
 	/* remaining bytes are set on RX when pulling packet from descriptor */
 	MARKER rx_descriptor_fields1;
 
+#ifdef RTE_NEXT_ABI
+	/*
+	 * The packet type, which is the combination of outer/inner L2, L3, L4
+	 * and tunnel types.
+	 */
+	union {
+		uint32_t packet_type; /**< L2/L3/L4 and tunnel information. */
+		struct {
+			uint32_t l2_type:4; /**< (Outer) L2 type. */
+			uint32_t l3_type:4; /**< (Outer) L3 type. */
+			uint32_t l4_type:4; /**< (Outer) L4 type. */
+			uint32_t tun_type:4; /**< Tunnel type. */
+			uint32_t inner_l2_type:4; /**< Inner L2 type. */
+			uint32_t inner_l3_type:4; /**< Inner L3 type. */
+			uint32_t inner_l4_type:4; /**< Inner L4 type. */
+		};
+	};
+
+	uint32_t pkt_len;         /**< Total pkt len: sum of all segments. */
+	uint16_t data_len;        /**< Amount of data in segment buffer. */
+	uint16_t vlan_tci;        /**< VLAN Tag Control Identifier (CPU order) */
+#else /* RTE_NEXT_ABI */
 	/**
 	 * The packet type, which is used to indicate ordinary packet and also
 	 * tunneled packet format, i.e. each number is represented a type of
@@ -287,6 +309,7 @@ struct rte_mbuf {
 	uint32_t pkt_len;         /**< Total pkt len: sum of all segments. */
 	uint16_t vlan_tci;        /**< VLAN Tag Control Identifier (CPU order) */
 	uint16_t vlan_tci_outer;  /**< Outer VLAN Tag Control Identifier (CPU order) */
+#endif /* RTE_NEXT_ABI */
 	union {
 		uint32_t rss;     /**< RSS hash result if RSS enabled */
 		struct {
@@ -307,6 +330,9 @@ struct rte_mbuf {
 	} hash;                   /**< hash information */
 
 	uint32_t seqn; /**< Sequence number. See also rte_reorder_insert() */
+#ifdef RTE_NEXT_ABI
+	uint16_t vlan_tci_outer;  /**< Outer VLAN Tag Control Identifier (CPU order) */
+#endif /* RTE_NEXT_ABI */
 
 	/* second cache line - fields only used in slow path or on TX */
 	MARKER cacheline1 __rte_cache_aligned;
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 03/19] e1000: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 04/19] ixgbe: " Helin Zhang
                     ` (16 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/e1000/igb_rxtx.c | 104 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 43d6703..165144c 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -590,6 +590,101 @@ eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
  *  RX functions
  *
  **********************************************************************/
+#ifdef RTE_NEXT_ABI
+#define IGB_PACKET_TYPE_IPV4              0X01
+#define IGB_PACKET_TYPE_IPV4_TCP          0X11
+#define IGB_PACKET_TYPE_IPV4_UDP          0X21
+#define IGB_PACKET_TYPE_IPV4_SCTP         0X41
+#define IGB_PACKET_TYPE_IPV4_EXT          0X03
+#define IGB_PACKET_TYPE_IPV4_EXT_SCTP     0X43
+#define IGB_PACKET_TYPE_IPV6              0X04
+#define IGB_PACKET_TYPE_IPV6_TCP          0X14
+#define IGB_PACKET_TYPE_IPV6_UDP          0X24
+#define IGB_PACKET_TYPE_IPV6_EXT          0X0C
+#define IGB_PACKET_TYPE_IPV6_EXT_TCP      0X1C
+#define IGB_PACKET_TYPE_IPV6_EXT_UDP      0X2C
+#define IGB_PACKET_TYPE_IPV4_IPV6         0X05
+#define IGB_PACKET_TYPE_IPV4_IPV6_TCP     0X15
+#define IGB_PACKET_TYPE_IPV4_IPV6_UDP     0X25
+#define IGB_PACKET_TYPE_IPV4_IPV6_EXT     0X0D
+#define IGB_PACKET_TYPE_IPV4_IPV6_EXT_TCP 0X1D
+#define IGB_PACKET_TYPE_IPV4_IPV6_EXT_UDP 0X2D
+#define IGB_PACKET_TYPE_MAX               0X80
+#define IGB_PACKET_TYPE_MASK              0X7F
+#define IGB_PACKET_TYPE_SHIFT             0X04
+static inline uint32_t
+igb_rxd_pkt_info_to_pkt_type(uint16_t pkt_info)
+{
+	static const uint32_t
+		ptype_table[IGB_PACKET_TYPE_MAX] __rte_cache_aligned = {
+		[IGB_PACKET_TYPE_IPV4] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4,
+		[IGB_PACKET_TYPE_IPV4_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4_EXT,
+		[IGB_PACKET_TYPE_IPV6] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6,
+		[IGB_PACKET_TYPE_IPV4_IPV6] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6,
+		[IGB_PACKET_TYPE_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT,
+		[IGB_PACKET_TYPE_IPV4_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT,
+		[IGB_PACKET_TYPE_IPV4_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP,
+		[IGB_PACKET_TYPE_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_TCP,
+		[IGB_PACKET_TYPE_IPV4_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_TCP,
+		[IGB_PACKET_TYPE_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_TCP,
+		[IGB_PACKET_TYPE_IPV4_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_TCP,
+		[IGB_PACKET_TYPE_IPV4_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_UDP,
+		[IGB_PACKET_TYPE_IPV6_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_UDP,
+		[IGB_PACKET_TYPE_IPV4_IPV6_UDP] =  RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_UDP,
+		[IGB_PACKET_TYPE_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_UDP,
+		[IGB_PACKET_TYPE_IPV4_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_UDP,
+		[IGB_PACKET_TYPE_IPV4_SCTP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_SCTP,
+		[IGB_PACKET_TYPE_IPV4_EXT_SCTP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4_EXT | RTE_PTYPE_L4_SCTP,
+	};
+	if (unlikely(pkt_info & E1000_RXDADV_PKTTYPE_ETQF))
+		return RTE_PTYPE_UNKNOWN;
+
+	pkt_info = (pkt_info >> IGB_PACKET_TYPE_SHIFT) & IGB_PACKET_TYPE_MASK;
+
+	return ptype_table[pkt_info];
+}
+
+static inline uint64_t
+rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
+{
+	uint64_t pkt_flags = ((hl_tp_rs & 0x0F) == 0) ?  0 : PKT_RX_RSS_HASH;
+
+#if defined(RTE_LIBRTE_IEEE1588)
+	static uint32_t ip_pkt_etqf_map[8] = {
+		0, 0, 0, PKT_RX_IEEE1588_PTP,
+		0, 0, 0, 0,
+	};
+
+	pkt_flags |= ip_pkt_etqf_map[(hl_tp_rs >> 4) & 0x07];
+#endif
+
+	return pkt_flags;
+}
+#else /* RTE_NEXT_ABI */
 static inline uint64_t
 rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
 {
@@ -617,6 +712,7 @@ rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
 #endif
 	return pkt_flags | (((hl_tp_rs & 0x0F) == 0) ?  0 : PKT_RX_RSS_HASH);
 }
+#endif /* RTE_NEXT_ABI */
 
 static inline uint64_t
 rx_desc_status_to_pkt_flags(uint32_t rx_status)
@@ -790,6 +886,10 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
 		pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
 		rxm->ol_flags = pkt_flags;
+#ifdef RTE_NEXT_ABI
+		rxm->packet_type = igb_rxd_pkt_info_to_pkt_type(rxd.wb.lower.
+						lo_dword.hs_rss.pkt_info);
+#endif
 
 		/*
 		 * Store the mbuf address into the next entry of the array
@@ -1024,6 +1124,10 @@ eth_igb_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
 		pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
 		first_seg->ol_flags = pkt_flags;
+#ifdef RTE_NEXT_ABI
+		first_seg->packet_type = igb_rxd_pkt_info_to_pkt_type(rxd.wb.
+					lower.lo_dword.hs_rss.pkt_info);
+#endif
 
 		/* Prefetch data of first segment, if configured to do so. */
 		rte_packet_prefetch((char *)first_seg->buf_addr +
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-15 10:19  0%     ` Olivier MATZ
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 03/19] e1000: replace bit mask based packet type with unified packet type Helin Zhang
                     ` (17 subsequent siblings)
  19 siblings, 1 reply; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

As there are only 6 bit flags in ol_flags for indicating packet
types, which is not enough to describe all the possible packet
types hardware can recognize. For example, i40e hardware can
recognize more than 150 packet types. Unified packet type is
composed of L2 type, L3 type, L4 type, tunnel type, inner L2 type,
inner L3 type and inner L4 type fields, and can be stored in
'struct rte_mbuf' of 32 bits field 'packet_type'.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 lib/librte_mbuf/rte_mbuf.h | 486 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 486 insertions(+)

v3 changes:
* Put the definitions of unified packet type into a single patch.

v4 changes:
* Added detailed description of each packet types.

v5 changes:
* Re-worded the commit logs.
* Added more detailed description for all packet types, together with examples.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.
* Corrected the packet type explanation of RTE_PTYPE_L2_ETHER.

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ac29da3..3a17d95 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -202,6 +202,492 @@ extern "C" {
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG       (1ULL << 63) /**< Mbuf contains control data */
 
+#ifdef RTE_NEXT_ABI
+/*
+ * 32 bits are divided into several fields to mark packet types. Note that
+ * each field is indexical.
+ * - Bit 3:0 is for L2 types.
+ * - Bit 7:4 is for L3 or outer L3 (for tunneling case) types.
+ * - Bit 11:8 is for L4 or outer L4 (for tunneling case) types.
+ * - Bit 15:12 is for tunnel types.
+ * - Bit 19:16 is for inner L2 types.
+ * - Bit 23:20 is for inner L3 types.
+ * - Bit 27:24 is for inner L4 types.
+ * - Bit 31:28 is reserved.
+ *
+ * To be compatible with Vector PMD, RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV4_EXT,
+ * RTE_PTYPE_L3_IPV6, RTE_PTYPE_L3_IPV6_EXT, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP
+ * and RTE_PTYPE_L4_SCTP should be kept as below in a contiguous 7 bits.
+ *
+ * Note that L3 types values are selected for checking IPV4/IPV6 header from
+ * performance point of view. Reading annotations of RTE_ETH_IS_IPV4_HDR and
+ * RTE_ETH_IS_IPV6_HDR is needed for any future changes of L3 type values.
+ *
+ * Note that the packet types of the same packet recognized by different
+ * hardware may be different, as different hardware may have different
+ * capability of packet type recognition.
+ *
+ * examples:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=0x29
+ * | 'version'=6, 'next header'=0x3A
+ * | 'ICMPv6 header'>
+ * will be recognized on i40e hardware as packet type combination of,
+ * RTE_PTYPE_L2_ETHER |
+ * RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+ * RTE_PTYPE_TUNNEL_IP |
+ * RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+ * RTE_PTYPE_INNER_L4_ICMP.
+ *
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=0x2F
+ * | 'GRE header'
+ * | 'version'=6, 'next header'=0x11
+ * | 'UDP header'>
+ * will be recognized on i40e hardware as packet type combination of,
+ * RTE_PTYPE_L2_ETHER |
+ * RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+ * RTE_PTYPE_TUNNEL_GRENAT |
+ * RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+ * RTE_PTYPE_INNER_L4_UDP.
+ */
+#define RTE_PTYPE_UNKNOWN                   0x00000000
+/**
+ * Ethernet packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * Packet format:
+ * <'ether type'=[0x0800|0x86DD]>
+ */
+#define RTE_PTYPE_L2_ETHER                  0x00000001
+/**
+ * Ethernet packet type for time sync.
+ *
+ * Packet format:
+ * <'ether type'=0x88F7>
+ */
+#define RTE_PTYPE_L2_ETHER_TIMESYNC         0x00000002
+/**
+ * ARP (Address Resolution Protocol) packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0806>
+ */
+#define RTE_PTYPE_L2_ETHER_ARP              0x00000003
+/**
+ * LLDP (Link Layer Discovery Protocol) packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x88CC>
+ */
+#define RTE_PTYPE_L2_ETHER_LLDP             0x00000004
+/**
+ * Mask of layer 2 packet types.
+ * It is used for outer packet for tunneling cases.
+ */
+#define RTE_PTYPE_L2_MASK                   0x0000000f
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for outer packet for tunneling cases, and does not contain any
+ * header option.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=5>
+ */
+#define RTE_PTYPE_L3_IPV4                   0x00000010
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for outer packet for tunneling cases, and contains header
+ * options.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=[6-15], 'options'>
+ */
+#define RTE_PTYPE_L3_IPV4_EXT               0x00000030
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for outer packet for tunneling cases, and does not contain any
+ * extension header.
+ *
+ * Packet format:
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=0x3B>
+ */
+#define RTE_PTYPE_L3_IPV6                   0x00000040
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for outer packet for tunneling cases, and may or maynot contain
+ * header options.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=[5-15], <'options'>>
+ */
+#define RTE_PTYPE_L3_IPV4_EXT_UNKNOWN       0x00000090
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for outer packet for tunneling cases, and contains extension
+ * headers.
+ *
+ * Packet format:
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=[0x0|0x2B|0x2C|0x32|0x33|0x3C|0x87],
+ *   'extension headers'>
+ */
+#define RTE_PTYPE_L3_IPV6_EXT               0x000000c0
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for outer packet for tunneling cases, and may or maynot contain
+ * extension headers.
+ *
+ * Packet format:
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=[0x3B|0x0|0x2B|0x2C|0x32|0x33|0x3C|0x87],
+ *   <'extension headers'>>
+ */
+#define RTE_PTYPE_L3_IPV6_EXT_UNKNOWN       0x000000e0
+/**
+ * Mask of layer 3 packet types.
+ * It is used for outer packet for tunneling cases.
+ */
+#define RTE_PTYPE_L3_MASK                   0x000000f0
+/**
+ * TCP (Transmission Control Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=6, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=6>
+ */
+#define RTE_PTYPE_L4_TCP                    0x00000100
+/**
+ * UDP (User Datagram Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=17, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=17>
+ */
+#define RTE_PTYPE_L4_UDP                    0x00000200
+/**
+ * Fragmented IP (Internet Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * It refers to those packets of any IP types, which can be recognized as
+ * fragmented. A fragmented packet cannot be recognized as any other L4 types
+ * (RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP, RTE_PTYPE_L4_SCTP, RTE_PTYPE_L4_ICMP,
+ * RTE_PTYPE_L4_NONFRAG).
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'MF'=1>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=44>
+ */
+#define RTE_PTYPE_L4_FRAG                   0x00000300
+/**
+ * SCTP (Stream Control Transmission Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=132, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=132>
+ */
+#define RTE_PTYPE_L4_SCTP                   0x00000400
+/**
+ * ICMP (Internet Control Message Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=1, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=1>
+ */
+#define RTE_PTYPE_L4_ICMP                   0x00000500
+/**
+ * Non-fragmented IP (Internet Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * It refers to those packets of any IP types, while cannot be recognized as
+ * any of above L4 types (RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP,
+ * RTE_PTYPE_L4_FRAG, RTE_PTYPE_L4_SCTP, RTE_PTYPE_L4_ICMP).
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'!=[6|17|132|1], 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'!=[6|17|44|132|1]>
+ */
+#define RTE_PTYPE_L4_NONFRAG                0x00000600
+/**
+ * Mask of layer 4 packet types.
+ * It is used for outer packet for tunneling cases.
+ */
+#define RTE_PTYPE_L4_MASK                   0x00000f00
+/**
+ * IP (Internet Protocol) in IP (Internet Protocol) tunneling packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=[4|41]>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=[4|41]>
+ */
+#define RTE_PTYPE_TUNNEL_IP                 0x00001000
+/**
+ * GRE (Generic Routing Encapsulation) tunneling packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=47>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=47>
+ */
+#define RTE_PTYPE_TUNNEL_GRE                0x00002000
+/**
+ * VXLAN (Virtual eXtensible Local Area Network) tunneling packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=17
+ * | 'destination port'=4798>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=17
+ * | 'destination port'=4798>
+ */
+#define RTE_PTYPE_TUNNEL_VXLAN              0x00003000
+/**
+ * NVGRE (Network Virtualization using Generic Routing Encapsulation) tunneling
+ * packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=47
+ * | 'protocol type'=0x6558>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=47
+ * | 'protocol type'=0x6558'>
+ */
+#define RTE_PTYPE_TUNNEL_NVGRE              0x00004000
+/**
+ * GENEVE (Generic Network Virtualization Encapsulation) tunneling packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=17
+ * | 'destination port'=6081>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=17
+ * | 'destination port'=6081>
+ */
+#define RTE_PTYPE_TUNNEL_GENEVE             0x00005000
+/**
+ * Tunneling packet type of Teredo, VXLAN (Virtual eXtensible Local Area
+ * Network) or GRE (Generic Routing Encapsulation) could be recognized as this
+ * packet type, if they can not be recognized independently as of hardware
+ * capability.
+ */
+#define RTE_PTYPE_TUNNEL_GRENAT             0x00006000
+/**
+ * Mask of tunneling packet types.
+ */
+#define RTE_PTYPE_TUNNEL_MASK               0x0000f000
+/**
+ * Ethernet packet type.
+ * It is used for inner packet type only.
+ *
+ * Packet format (inner only):
+ * <'ether type'=[0x800|0x86DD]>
+ */
+#define RTE_PTYPE_INNER_L2_ETHER            0x00010000
+/**
+ * Ethernet packet type with VLAN (Virtual Local Area Network) tag.
+ *
+ * Packet format (inner only):
+ * <'ether type'=[0x800|0x86DD], vlan=[1-4095]>
+ */
+#define RTE_PTYPE_INNER_L2_ETHER_VLAN       0x00020000
+/**
+ * Mask of inner layer 2 packet types.
+ */
+#define RTE_PTYPE_INNER_L2_MASK             0x000f0000
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for inner packet only, and does not contain any header option.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=5>
+ */
+#define RTE_PTYPE_INNER_L3_IPV4             0x00100000
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for inner packet only, and contains header options.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=[6-15], 'options'>
+ */
+#define RTE_PTYPE_INNER_L3_IPV4_EXT         0x00200000
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for inner packet only, and does not contain any extension header.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=0x3B>
+ */
+#define RTE_PTYPE_INNER_L3_IPV6             0x00300000
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for inner packet only, and may or maynot contain header options.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=[5-15], <'options'>>
+ */
+#define RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN 0x00400000
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for inner packet only, and contains extension headers.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=[0x0|0x2B|0x2C|0x32|0x33|0x3C|0x87],
+ *   'extension headers'>
+ */
+#define RTE_PTYPE_INNER_L3_IPV6_EXT         0x00500000
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for inner packet only, and may or maynot contain extension
+ * headers.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=[0x3B|0x0|0x2B|0x2C|0x32|0x33|0x3C|0x87],
+ *   <'extension headers'>>
+ */
+#define RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN 0x00600000
+/**
+ * Mask of inner layer 3 packet types.
+ */
+#define RTE_PTYPE_INNER_INNER_L3_MASK       0x00f00000
+/**
+ * TCP (Transmission Control Protocol) packet type.
+ * It is used for inner packet only.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=6, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=6>
+ */
+#define RTE_PTYPE_INNER_L4_TCP              0x01000000
+/**
+ * UDP (User Datagram Protocol) packet type.
+ * It is used for inner packet only.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=17, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=17>
+ */
+#define RTE_PTYPE_INNER_L4_UDP              0x02000000
+/**
+ * Fragmented IP (Internet Protocol) packet type.
+ * It is used for inner packet only, and may or maynot have layer 4 packet.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'MF'=1>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=44>
+ */
+#define RTE_PTYPE_INNER_L4_FRAG             0x03000000
+/**
+ * SCTP (Stream Control Transmission Protocol) packet type.
+ * It is used for inner packet only.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=132, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=132>
+ */
+#define RTE_PTYPE_INNER_L4_SCTP             0x04000000
+/**
+ * ICMP (Internet Control Message Protocol) packet type.
+ * It is used for inner packet only.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=1, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=1>
+ */
+#define RTE_PTYPE_INNER_L4_ICMP             0x05000000
+/**
+ * Non-fragmented IP (Internet Protocol) packet type.
+ * It is used for inner packet only, and may or maynot have other unknown layer
+ * 4 packet types.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'!=[6|17|132|1], 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'!=[6|17|44|132|1]>
+ */
+#define RTE_PTYPE_INNER_L4_NONFRAG          0x06000000
+/**
+ * Mask of inner layer 4 packet types.
+ */
+#define RTE_PTYPE_INNER_L4_MASK             0x0f000000
+
+/**
+ * Check if the (outer) L3 header is IPv4. To avoid comparing IPv4 types one by
+ * one, bit 4 is selected to be used for IPv4 only. Then checking bit 4 can
+ * determin if it is an IPV4 packet.
+ */
+#define  RTE_ETH_IS_IPV4_HDR(ptype) ((ptype) & RTE_PTYPE_L3_IPV4)
+
+/**
+ * Check if the (outer) L3 header is IPv4. To avoid comparing IPv4 types one by
+ * one, bit 6 is selected to be used for IPv4 only. Then checking bit 6 can
+ * determin if it is an IPV4 packet.
+ */
+#define  RTE_ETH_IS_IPV6_HDR(ptype) ((ptype) & RTE_PTYPE_L3_IPV6)
+
+/* Check if it is a tunneling packet */
+#define RTE_ETH_IS_TUNNEL_PKT(ptype) ((ptype) & RTE_PTYPE_TUNNEL_MASK)
+#endif /* RTE_NEXT_ABI */
+
 /**
  * Get the name of a RX offload flag
  *
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 00/19] unified packet type
  @ 2015-07-09 16:31  4% ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf Helin Zhang
                     ` (19 more replies)
  0 siblings, 20 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

Currently only 6 bits which are stored in ol_flags are used to indicate the
packet types. This is not enough, as some NIC hardware can recognize quite
a lot of packet types, e.g i40e hardware can recognize more than 150 packet
types. Hiding those packet types hides hardware offload capabilities which
could be quite useful for improving performance and for end users.
So an unified packet types are needed to support all possible PMDs. A 16
bits packet_type in mbuf structure can be changed to 32 bits and used for
this purpose. In addition, all packet types stored in ol_flag field should
be deleted at all, and 6 bits of ol_flags can be save as the benifit.

Initially, 32 bits of packet_type can be divided into several sub fields to
indicate different packet type information of a packet. The initial design
is to divide those bits into fields for L2 types, L3 types, L4 types, tunnel
types, inner L2 types, inner L3 types and inner L4 types. All PMDs should
translate the offloaded packet types into these 7 fields of information, for
user applications.

To avoid breaking ABI compatibility, currently all the code changes for
unified packet type are disabled at compile time by default. Users can enable
it manually by defining the macro of RTE_NEXT_ABI. The code changes will be
valid by default in a future release, and the old version will be deleted
accordingly, after the ABI change process is done.

Note that this patch set should be integrated after another patch set
for '[PATCH v3 0/7] support i40e QinQ stripping and insertion', to clearly
solve the conflict during integration. As both patch sets modified 'struct
rte_mbuf', and the final layout of the 'struct rte_mbuf' is key to vectorized
ixgbe PMD.

Its v8 version was acked by Konstantin Ananyev <konstantin.ananyev@intel.com>

v2 changes:
* Enlarged the packet_type field from 16 bits to 32 bits.
* Redefined the packet type sub-fields.
* Updated the 'struct rte_kni_mbuf' for KNI according to the mbuf changes.
* Used redefined packet types and enlarged packet_type field for all PMDs
  and corresponding applications.
* Removed changes in bond and its relevant application, as there is no need
  at all according to the recent bond changes.

v3 changes:
* Put the mbuf layout changes into a single patch.
* Put vector ixgbe changes right after mbuf changes.
* Disabled vector ixgbe PMD by default, as mbuf layout changed, and then
  re-enabled it after vector ixgbe PMD updated.
* Put the definitions of unified packet type into a single patch.
* Minor bug fixes and enhancements in l3fwd example.

v4 changes:
* Added detailed description of each packet types.
* Supported unified packet type of fm10k.
* Added printing logs of packet types of each received packet for rxonly
  mode in testpmd.
* Removed several useless code lines which block packet type unification from
  app/test/packet_burst_generator.c.

v5 changes:
* Added more detailed description for each packet types, together with examples.
* Rolled back the macro definitions of RX packet flags, for ABI compitability.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.
* Integrated with patch set for '[PATCH v3 0/7] support i40e QinQ stripping
  and insertion', to clearly solve the conflicts during merging.

v8 changes:
* Moved the field of 'vlan_tci_outer' in 'struct rte_mbuf' to the end of the 1st
  cache line, to avoid breaking any vectorized PMD storing, as fields of
  'packet_type, pkt_len, data_len, vlan_tci, rss' should be in an contiguous 128
  bits.

v9 changes:
* Put the mbuf changes and vector PMD changes together, as they are
  tightly relevant.
* Renamed MAC to ETHER in packet type names.
* Corrected the packet type explanation of RTE_PTYPE_L2_ETHER.
* Reworked newly added cxgbe driver and tep_termination example application to
  support unified packet type, which is disabled by default.

v10 changes:
* Fixed a compile error in tep_termination, when RTE_NEXT_ABI is enabled.

Helin Zhang (19):
  mbuf: redefine packet_type in rte_mbuf
  mbuf: add definitions of unified packet types
  e1000: replace bit mask based packet type with unified packet type
  ixgbe: replace bit mask based packet type with unified packet type
  i40e: replace bit mask based packet type with unified packet type
  enic: replace bit mask based packet type with unified packet type
  vmxnet3: replace bit mask based packet type with unified packet type
  fm10k: replace bit mask based packet type with unified packet type
  cxgbe: replace bit mask based packet type with unified packet type
  app/test-pipeline: replace bit mask based packet type with unified
    packet type
  app/testpmd: replace bit mask based packet type with unified packet
    type
  app/test: Remove useless code
  examples/ip_fragmentation: replace bit mask based packet type with
    unified packet type
  examples/ip_reassembly: replace bit mask based packet type with
    unified packet type
  examples/l3fwd-acl: replace bit mask based packet type with unified
    packet type
  examples/l3fwd-power: replace bit mask based packet type with unified
    packet type
  examples/l3fwd: replace bit mask based packet type with unified packet
    type
  examples/tep_termination: replace bit mask based packet type with
    unified packet type
  mbuf: remove old packet type bit masks

 app/test-pipeline/pipeline_hash.c                  |  13 +
 app/test-pmd/csumonly.c                            |  14 +
 app/test-pmd/rxonly.c                              | 183 +++++++
 app/test/packet_burst_generator.c                  |   6 +-
 drivers/net/cxgbe/sge.c                            |   8 +
 drivers/net/e1000/igb_rxtx.c                       | 104 ++++
 drivers/net/enic/enic_main.c                       |  26 +
 drivers/net/fm10k/fm10k_rxtx.c                     |  27 +
 drivers/net/i40e/i40e_rxtx.c                       | 554 +++++++++++++++++++++
 drivers/net/ixgbe/ixgbe_rxtx.c                     | 163 ++++++
 drivers/net/ixgbe/ixgbe_rxtx_vec.c                 |  75 ++-
 drivers/net/vmxnet3/vmxnet3_rxtx.c                 |   8 +
 examples/ip_fragmentation/main.c                   |   9 +
 examples/ip_reassembly/main.c                      |   9 +
 examples/l3fwd-acl/main.c                          |  29 +-
 examples/l3fwd-power/main.c                        |   8 +
 examples/l3fwd/main.c                              | 123 ++++-
 examples/tep_termination/vxlan.c                   |   4 +
 .../linuxapp/eal/include/exec-env/rte_kni_common.h |   6 +
 lib/librte_mbuf/rte_mbuf.c                         |   4 +
 lib/librte_mbuf/rte_mbuf.h                         | 516 +++++++++++++++++++
 21 files changed, 1876 insertions(+), 13 deletions(-)

-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v3 7/7] abi: announce mbuf addition for ieee1588 in DPDK 2.2
  2015-07-09 15:51  7%         ` Thomas Monjalon
@ 2015-07-09 16:01  4%           ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2015-07-09 16:01 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Thu, Jul 09, 2015 at 05:51:16PM +0200, Thomas Monjalon wrote:
> 2015-07-08 14:10, Bruce Richardson:
> > On Mon, Jul 06, 2015 at 03:16:01PM +0200, Thomas Monjalon wrote:
> > > 2015-07-02 16:16, John McNamara:
> > > > --- a/doc/guides/rel_notes/abi.rst
> > > > +++ b/doc/guides/rel_notes/abi.rst
> > > >  Deprecation Notices
> > > >  -------------------
> > > > +
> > > > +* In DPDK 2.1 the IEEE1588/802.1AS support in the i40e driver makes use of the
> > > > +  ``udata64`` field in the mbuf to pass the timesync register index to the
> > > > +  user. In DPDK 2.2 this will be moved to a new field in the mbuf.
> > > 
> > > We need more acknowledgements for this decision, as stated here:
> > > http://dpdk.org/browse/dpdk/tree/doc/guides/guidelines/versioning.rst#n51
> > 
> > Why can't this new field just be added at the end of cache line 1 (the second
> > cache line) of the mbuf? That would avoid any ABI breakage and would mean we
> > can just put the change in in this release, instead of waiting.
> 
> Are you sure that (because of __rte_cache_aligned) the size of the structure
> is never increased with this new field?
> Please confirm your opinion.

This is checked at compile time by the test app.

 930 static int
 931 test_mbuf(void)
 932 {
 933         RTE_BUILD_BUG_ON(sizeof(struct rte_mbuf) != RTE_CACHE_LINE_SIZE * 2);
 ....

So if a change does result in an increase the mbuf size, by causing overflow in
either the first or the second cache line, we will get compiler errors in the
build because of it. Therefore, such changes are pretty easy to test by compiling
up on our supported targets.

/Bruce

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v3 7/7] abi: announce mbuf addition for ieee1588 in DPDK 2.2
  2015-07-08 13:10  7%       ` Bruce Richardson
@ 2015-07-09 15:51  7%         ` Thomas Monjalon
  2015-07-09 16:01  4%           ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-09 15:51 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

2015-07-08 14:10, Bruce Richardson:
> On Mon, Jul 06, 2015 at 03:16:01PM +0200, Thomas Monjalon wrote:
> > 2015-07-02 16:16, John McNamara:
> > > --- a/doc/guides/rel_notes/abi.rst
> > > +++ b/doc/guides/rel_notes/abi.rst
> > >  Deprecation Notices
> > >  -------------------
> > > +
> > > +* In DPDK 2.1 the IEEE1588/802.1AS support in the i40e driver makes use of the
> > > +  ``udata64`` field in the mbuf to pass the timesync register index to the
> > > +  user. In DPDK 2.2 this will be moved to a new field in the mbuf.
> > 
> > We need more acknowledgements for this decision, as stated here:
> > http://dpdk.org/browse/dpdk/tree/doc/guides/guidelines/versioning.rst#n51
> 
> Why can't this new field just be added at the end of cache line 1 (the second
> cache line) of the mbuf? That would avoid any ABI breakage and would mean we
> can just put the change in in this release, instead of waiting.

Are you sure that (because of __rte_cache_aligned) the size of the structure
is never increased with this new field?
Please confirm your opinion.

A comment to explain ABI compatibility in the commit message of the v4 is
also welcome.

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH v13 00/14] Interrupt mode PMD
  @ 2015-07-09 13:58  3%   ` David Marchand
  2015-07-17  6:04  0%     ` Liang, Cunming
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
  1 sibling, 1 reply; 200+ results
From: David Marchand @ 2015-07-09 13:58 UTC (permalink / raw)
  To: Cunming Liang; +Cc: Stephen Hemminger, dev, liang-min.wang

On Fri, Jun 19, 2015 at 6:00 AM, Cunming Liang <cunming.liang@intel.com>
wrote:

> v13 changes
>  - version map cleanup for v2.1
>  - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility
>

Please, this patchset ends with a patch that deals with ABI compatibility
while it should do so on a per-patch basis.
Besides, some patches are introducing stuff that is reworked in other
patches without a clear reason.

Can you rework this to ease review and ensure patch atomicity ?

Thanks.

-- 
David Marchand

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v4 0/7] ethdev: add support for ieee1588 timestamping
@ 2015-07-09 13:30  3% John McNamara
  2015-07-10  0:43  0% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: John McNamara @ 2015-07-09 13:30 UTC (permalink / raw)
  To: dev

This patchset adds ethdev API to enable and read IEEE1588/802.1AS PTP
timestamps from devices that support it. The following functions are added:

    rte_eth_timesync_enable()
    rte_eth_timesync_disable()
    rte_eth_timesync_read_rx_timestamp()
    rte_eth_timesync_read_tx_timestamp()

The "ieee1588" forwarding mode in testpmd is also refactored to demonstrate
the new API and to clean up the code.

Adds support for igb, ixgbe and i40e.

V4:
* Added timesync field to end of mbuf to pass IEEE1588 registers and flags.
  Removed previous ABI deprecation notice.

V3:
* Fixed issued with version.map.

V2:
* Added i40e support.

* Renamed ethdev functions from rte_eth_ieee15888_*() to rte_eth_timesync_*()
  since 802.1AS can be supported through the same interfaces.

V1:
* Initial version for igb and ixgbe.


John McNamara (7):
  ethdev: add support for ieee1588 timestamping
  mbuf: add field for ieee1588 timesync index
  e1000: add support for ieee1588 timestamping
  ixgbe: add support for ieee1588 timestamping
  i40e: add support for ieee1588 timestamping
  app/testpmd: refactor ieee1588 forwarding
  doc: document ieee1588 forwarding mode

 app/test-pmd/ieee1588fwd.c                  | 466 ++--------------------------
 doc/guides/testpmd_app_ug/run_app.rst       |   2 +-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |   2 +
 drivers/net/e1000/igb_ethdev.c              | 115 +++++++
 drivers/net/i40e/i40e_ethdev.c              | 143 +++++++++
 drivers/net/i40e/i40e_rxtx.c                |  40 ++-
 drivers/net/ixgbe/ixgbe_ethdev.c            | 122 ++++++++
 lib/librte_ether/rte_ethdev.c               |  70 ++++-
 lib/librte_ether/rte_ethdev.h               |  90 +++++-
 lib/librte_ether/rte_ether_version.map      |   4 +
 lib/librte_mbuf/rte_mbuf.h                  |   3 +
 11 files changed, 614 insertions(+), 443 deletions(-)

--
1.8.1.4

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks
  2015-07-09  7:32  4% ` Thomas Monjalon
@ 2015-07-09  8:39  4%   ` Lu, Wenzhuo
  0 siblings, 0 replies; 200+ results
From: Lu, Wenzhuo @ 2015-07-09  8:39 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Thursday, July 9, 2015 3:33 PM
> To: Lu, Wenzhuo
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter,
> rte_fdir_masks
> 
> 2015-07-09 10:47, Wenzhuo Lu:
> > +* The ABI changes are planned for struct rte_fdir_filter and rte_fdir_masks in
> order to support new flow director modes, MAC VLAN and Cloud on x550. The
> upcoming release 2.1 will not contain these ABI changes, but release 2.2 will,
> and no backwards compatibility is planed due to this change. Binaries using this
> library build prior to version 2.2 will require updating and recompilation.
> 
> Please wrap it.
> What means Cloud for flow director? If you think about a tunnel type, please
> name it.
Thanks for the comments, I'll send a V2.
Wenzhuo

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v5 04/11] config: remove RTE_LIBNAME definition.
  @ 2015-07-09  8:25  5% ` Zhigang Lu
  0 siblings, 0 replies; 200+ results
From: Zhigang Lu @ 2015-07-09  8:25 UTC (permalink / raw)
  To: dev; +Cc: Cyril Chemparathy

From: Cyril Chemparathy <cchemparathy@ezchip.com>

The library name is now being pinned to "dpdk" instead of intel_dpdk,
powerpc_dpdk, etc.  As a result, we no longer need this config item.
This patch removes it.

Signed-off-by: Cyril Chemparathy <cchemparathy@ezchip.com>
Signed-off-by: Zhigang Lu <zlu@ezchip.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 config/common_bsdapp                        | 1 -
 config/common_linuxapp                      | 1 -
 config/defconfig_ppc_64-power8-linuxapp-gcc | 2 --
 mk/rte.vars.mk                              | 5 +----
 4 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index dfa61a3..7112f1c 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -87,7 +87,6 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 # Combine to one single library
 #
 CONFIG_RTE_BUILD_COMBINE_LIBS=n
-CONFIG_RTE_LIBNAME=intel_dpdk
 
 #
 # Use newest code breaking previous ABI
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 1732b70..46297cd 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -87,7 +87,6 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 # Combine to one single library
 #
 CONFIG_RTE_BUILD_COMBINE_LIBS=n
-CONFIG_RTE_LIBNAME="intel_dpdk"
 
 #
 # Use newest code breaking previous ABI
diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
index d97a885..f1af518 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -39,8 +39,6 @@ CONFIG_RTE_ARCH_64=y
 CONFIG_RTE_TOOLCHAIN="gcc"
 CONFIG_RTE_TOOLCHAIN_GCC=y
 
-CONFIG_RTE_LIBNAME="powerpc_dpdk"
-
 # Note: Power doesn't have this support
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
 
diff --git a/mk/rte.vars.mk b/mk/rte.vars.mk
index 0469064..f87cf4b 100644
--- a/mk/rte.vars.mk
+++ b/mk/rte.vars.mk
@@ -65,10 +65,7 @@ ifneq ($(BUILDING_RTE_SDK),)
   RTE_SDK_BIN := $(RTE_OUTPUT)
 endif
 
-RTE_LIBNAME := $(CONFIG_RTE_LIBNAME:"%"=%)
-ifeq ($(RTE_LIBNAME),)
-RTE_LIBNAME := intel_dpdk
-endif
+RTE_LIBNAME := dpdk
 
 # RTE_TARGET is deducted from config when we are building the SDK.
 # Else, when building an external app, RTE_TARGET must be specified
-- 
2.1.2

^ permalink raw reply	[relevance 5%]

* Re: [dpdk-dev] [PATCH] hash: move rte_hash structure to C file and make it internal
  2015-07-08 16:57  0%     ` Matthew Hall
@ 2015-07-09  8:12  3%       ` Bruce Richardson
  2015-07-09 20:42  3%         ` Matthew Hall
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2015-07-09  8:12 UTC (permalink / raw)
  To: Matthew Hall; +Cc: dev

On Wed, Jul 08, 2015 at 09:57:03AM -0700, Matthew Hall wrote:
> On Wed, Jul 08, 2015 at 02:21:42PM +0100, Bruce Richardson wrote:
> > Irrespective of whether or not we change the underlying hash table implementation
> > this looks a good change to me. The rte_hash structure should not be used directly
> > by any applications - the APIs all take pointers to the structure,
> > so there should be no ABI breakage from this, I think.
> > 
> > Therefore:
> > 
> > Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> 
> Hi guys,
> 
> There are places where this will be annoying on the app side.
> 
> A lot of rte_hash, rte_lpm*, rte_table, etc. don't provide methods to iterate 
> whole structures with a callback function that includes the current structure 
> node, and a user-data pointer.
> 
> This can make it real unpleasant when you want to walk through the structure 
> and free a bunch of items it points to and so forth.
> 
> So if you're going to obfuscate things by censoring the structure contents 
> then we'd really like to be sure they have a full set of CRUD operations and 
> iteration support so one could manage the nodes individually and in bulk.
> 
> Matthew.

Thanks for the feedback Matthew. Can you suggest a function prototype for such
a walk operation that would make it useful for you. While we can keep the
hash structure public, I'd prefer if we could avoid it, as it makes making changes
hard due to ABI issues.

/Bruce

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v3 00/11] Cuckoo hash
  2015-07-08 23:23  0%   ` Thomas Monjalon
@ 2015-07-09  8:02  0%     ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2015-07-09  8:02 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Thu, Jul 09, 2015 at 01:23:54AM +0200, Thomas Monjalon wrote:
> Bruce, what is the status of this series?
>

The parts of the series can that are independent of the cuckoo hash update
have already been sent out as updates so changes to those can quicker be resolved
and merged. Pablo should be sending out a V4 of the cuckoo hash implementation
very shortly.

/Bruce

> 2015-06-28 23:25, Pablo de Lara:
> > This patchset is to replace the existing hash library with
> > a more efficient and functional approach, using the Cuckoo hash
> > method to deal with collisions. This method is based on using
> > two different hash functions to have two possible locations
> > in the hash table where an entry can be.
> > So, if a bucket is full, a new entry can push one of the items
> > in that bucket to its alternative location, making space for itself.
> > 
> > Advantages
> > ~~~~~~
> > - Offers the option to store more entries when the target bucket is full
> >   (unlike the previous implementation)
> > - Memory efficient: for storing those entries, it is not necessary to
> >   request new memory, as the entries will be stored in the same table
> > - Constant worst lookup time: in worst case scenario, it always takes
> >   the same time to look up an entry, as there are only two possible locations
> >   where an entry can be.
> > - Storing data: user can store data in the hash table, unlike the
> >   previous implementation, but he can still use the old API
> > 
> > This implementation tipically offers over 90% utilization.
> > Notice that API has been extended, but old API remains. The main
> > change in ABI is that rte_hash structure is now private and the
> > deprecation of two macros.
> > 
> > Changes in v3:
> > 
> > - Now user can store variable size data, instead of 32 or 64-bit size data,
> >   using the new parameter "data_len" in rte_hash_parameters
> > - Add lookup_bulk_with_hash function in performance  unit tests
> > - Add new functions that handle data in performance unit tests
> > - Remove duplicates in performance unit tests
> > - Fix rte_hash_reset, which was not reseting the last entry
> [...]
> > Pablo de Lara (11):
> >   eal: add const in prefetch functions
> >   hash: move rte_hash structure to C file and make it internal
> >   test/hash: enhance hash unit tests
> >   test/hash: rename new hash perf unit test back to original name
> >   hash: replace existing hash library with cuckoo hash implementation
> >   hash: add new lookup_bulk_with_hash function
> >   hash: add new function rte_hash_reset
> >   hash: add new functionality to store data in hash table
> >   MAINTAINERS: claim responsability for hash library
> >   doc: announce ABI change of librte_hash
> >   doc: update hash documentation
> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks
  2015-07-09  2:47 21% [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks Wenzhuo Lu
@ 2015-07-09  7:32  4% ` Thomas Monjalon
  2015-07-09  8:39  4%   ` Lu, Wenzhuo
  2015-07-10  2:24 21% ` [dpdk-dev] [PATCH v2] doc: announce ABI change of rte_eth_fdir_filter, rte_eth_fdir_masks Wenzhuo Lu
  1 sibling, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-09  7:32 UTC (permalink / raw)
  To: Wenzhuo Lu; +Cc: dev

2015-07-09 10:47, Wenzhuo Lu:
> +* The ABI changes are planned for struct rte_fdir_filter and rte_fdir_masks in order to support new flow director modes, MAC VLAN and Cloud on x550. The upcoming release 2.1 will not contain these ABI changes, but release 2.2 will, and no backwards compatibility is planed due to this change. Binaries using this library build prior to version 2.2 will require updating and recompilation.

Please wrap it.
What means Cloud for flow director? If you think about a tunnel type, please
name it.

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v4 04/11] config: remove RTE_LIBNAME definition.
  @ 2015-07-09  4:58  5% ` Zhigang Lu
  0 siblings, 0 replies; 200+ results
From: Zhigang Lu @ 2015-07-09  4:58 UTC (permalink / raw)
  To: dev; +Cc: Cyril Chemparathy

From: Cyril Chemparathy <cchemparathy@ezchip.com>

The library name is now being pinned to "dpdk" instead of intel_dpdk,
powerpc_dpdk, etc.  As a result, we no longer need this config item.
This patch removes it.

Signed-off-by: Zhigang Lu <zlu@ezchip.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 config/common_bsdapp                        | 1 -
 config/common_linuxapp                      | 1 -
 config/defconfig_ppc_64-power8-linuxapp-gcc | 2 --
 mk/rte.vars.mk                              | 5 +----
 4 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index dfa61a3..7112f1c 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -87,7 +87,6 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 # Combine to one single library
 #
 CONFIG_RTE_BUILD_COMBINE_LIBS=n
-CONFIG_RTE_LIBNAME=intel_dpdk
 
 #
 # Use newest code breaking previous ABI
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 1732b70..46297cd 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -87,7 +87,6 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 # Combine to one single library
 #
 CONFIG_RTE_BUILD_COMBINE_LIBS=n
-CONFIG_RTE_LIBNAME="intel_dpdk"
 
 #
 # Use newest code breaking previous ABI
diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
index d97a885..f1af518 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -39,8 +39,6 @@ CONFIG_RTE_ARCH_64=y
 CONFIG_RTE_TOOLCHAIN="gcc"
 CONFIG_RTE_TOOLCHAIN_GCC=y
 
-CONFIG_RTE_LIBNAME="powerpc_dpdk"
-
 # Note: Power doesn't have this support
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
 
diff --git a/mk/rte.vars.mk b/mk/rte.vars.mk
index 0469064..f87cf4b 100644
--- a/mk/rte.vars.mk
+++ b/mk/rte.vars.mk
@@ -65,10 +65,7 @@ ifneq ($(BUILDING_RTE_SDK),)
   RTE_SDK_BIN := $(RTE_OUTPUT)
 endif
 
-RTE_LIBNAME := $(CONFIG_RTE_LIBNAME:"%"=%)
-ifeq ($(RTE_LIBNAME),)
-RTE_LIBNAME := intel_dpdk
-endif
+RTE_LIBNAME := dpdk
 
 # RTE_TARGET is deducted from config when we are building the SDK.
 # Else, when building an external app, RTE_TARGET must be specified
-- 
2.1.2

^ permalink raw reply	[relevance 5%]

* [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks
@ 2015-07-09  2:47 21% Wenzhuo Lu
  2015-07-09  7:32  4% ` Thomas Monjalon
  2015-07-10  2:24 21% ` [dpdk-dev] [PATCH v2] doc: announce ABI change of rte_eth_fdir_filter, rte_eth_fdir_masks Wenzhuo Lu
  0 siblings, 2 replies; 200+ results
From: Wenzhuo Lu @ 2015-07-09  2:47 UTC (permalink / raw)
  To: dev

For x550 supports 2 new flow director modes, MAC VLAN and Cloud. There're several
new lookup fields for these 2 new modes, like MAC, tunnel type, TNI, VNI. So, we
have to change the ABI to support these new lookup fields.

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..eca4d2b 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,4 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
+* The ABI changes are planned for struct rte_fdir_filter and rte_fdir_masks in order to support new flow director modes, MAC VLAN and Cloud on x550. The upcoming release 2.1 will not contain these ABI changes, but release 2.2 will, and no backwards compatibility is planed due to this change. Binaries using this library build prior to version 2.2 will require updating and recompilation.
-- 
1.9.3

^ permalink raw reply	[relevance 21%]

* Re: [dpdk-dev] [PATCH v3] doc: announce ABI changes planned for unified packet type
  2015-07-07 17:45 21% ` [dpdk-dev] [PATCH v3] " Helin Zhang
  2015-07-08 15:58  4%   ` Zhang, Helin
@ 2015-07-09  0:56  4%   ` Wu, Jingjing
  2015-07-15 23:37  4%     ` Thomas Monjalon
  1 sibling, 1 reply; 200+ results
From: Wu, Jingjing @ 2015-07-09  0:56 UTC (permalink / raw)
  To: Zhang, Helin, dev

Acked-by: Jingjing Wu <jingjing.wu@intel.com>

> -----Original Message-----
> From: Zhang, Helin
> Sent: Wednesday, July 08, 2015 1:45 AM
> To: dev@dpdk.org
> Cc: Liu, Jijiang; Wu, Jingjing; nhorman@tuxdriver.com; Zhang, Helin
> Subject: [PATCH v3] doc: announce ABI changes planned for unified packet
> type
> 
> The significant ABI changes of all shared libraries are planned to support
> unified packet type which will be taken effect from release 2.2. Here
> announces that ABI changes in detail.
> 
> Signed-off-by: Helin Zhang <helin.zhang@intel.com>
> ---

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v4 3/7] hv: add basic vmbus support
  @ 2015-07-08 23:51  0%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-08 23:51 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, alexmay

2015-04-21 10:32, Stephen Hemminger:
> The hyper-v device driver forces the base EAL code to change
> to support multiple bus types. This is done changing the pci_device
> in ether driver to a generic union.
> 
> As much as possible this is done in a backwards source compatiable
> way. It will break ABI for device drivers.

> --- a/lib/librte_eal/common/eal_common_options.c
> +++ b/lib/librte_eal/common/eal_common_options.c
> @@ -80,6 +80,7 @@ eal_long_options[] = {
>  	{OPT_NO_HPET,           0, NULL, OPT_NO_HPET_NUM          },
>  	{OPT_NO_HUGE,           0, NULL, OPT_NO_HUGE_NUM          },
>  	{OPT_NO_PCI,            0, NULL, OPT_NO_PCI_NUM           },
> +	{OPT_NO_VMBUS,		0, NULL, OPT_NO_VMBUS_NUM	  },

Alignment, please.

> @@ -66,6 +66,7 @@ struct internal_config {
>  	volatile unsigned no_hugetlbfs;   /**< true to disable hugetlbfs */
>  	volatile unsigned xen_dom0_support; /**< support app running on Xen Dom0*/
>  	volatile unsigned no_pci;         /**< true to disable PCI */
> +	volatile unsigned no_vmbus;	  /**< true to disable VMBUS */
>  	volatile unsigned no_hpet;        /**< true to disable HPET */
>  	volatile unsigned vmware_tsc_map; /**< true to use VMware TSC mapping

Alignment may be better.

> +#ifdef RTE_LIBRTE_HV_PMD
> +	case RTE_BUS_VMBUS:
> +		eth_drv->vmbus_drv.devinit = rte_vmbus_dev_init;
> +		eth_drv->vmbus_drv.devuninit = rte_vmbus_dev_uninit;
> +		rte_eal_vmbus_register(&eth_drv->vmbus_drv);
> +		break;
> +#endif

Why ifdef'ing this code?

> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1477,7 +1478,10 @@ struct rte_eth_dev {
>  	struct rte_eth_dev_data *data;  /**< Pointer to device data */
>  	const struct eth_driver *driver;/**< Driver for this device */
>  	const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
> -	struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
> +	union {
> +		struct rte_pci_device *pci_dev; /**< PCI info. supplied by probig */
> +		struct rte_vmbus_device *vmbus_dev; /**< VMBUS info. supplied by probing */
> +	};
[...]
>  struct eth_driver {
> -	struct rte_pci_driver pci_drv;    /**< The PMD is also a PCI driver. */
> +	union {
> +		struct rte_pci_driver pci_drv;    /**< The PMD is also a PCI driver. */
> +		struct rte_vmbus_driver vmbus_drv;/**< The PMD is also a VMBUS drv. */
> +	};
> +	enum {
> +		RTE_BUS_PCI=0,
> +		RTE_BUS_VMBUS
> +	} bus_type;			  /**< Device bus type. */

A device may also be virtual.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v3 00/11] Cuckoo hash
  @ 2015-07-08 23:23  0%   ` Thomas Monjalon
  2015-07-09  8:02  0%     ` Bruce Richardson
  2015-07-10 17:24  4%   ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of " Pablo de Lara
  1 sibling, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-08 23:23 UTC (permalink / raw)
  To: bruce.richardson; +Cc: dev

Bruce, what is the status of this series?

2015-06-28 23:25, Pablo de Lara:
> This patchset is to replace the existing hash library with
> a more efficient and functional approach, using the Cuckoo hash
> method to deal with collisions. This method is based on using
> two different hash functions to have two possible locations
> in the hash table where an entry can be.
> So, if a bucket is full, a new entry can push one of the items
> in that bucket to its alternative location, making space for itself.
> 
> Advantages
> ~~~~~~
> - Offers the option to store more entries when the target bucket is full
>   (unlike the previous implementation)
> - Memory efficient: for storing those entries, it is not necessary to
>   request new memory, as the entries will be stored in the same table
> - Constant worst lookup time: in worst case scenario, it always takes
>   the same time to look up an entry, as there are only two possible locations
>   where an entry can be.
> - Storing data: user can store data in the hash table, unlike the
>   previous implementation, but he can still use the old API
> 
> This implementation tipically offers over 90% utilization.
> Notice that API has been extended, but old API remains. The main
> change in ABI is that rte_hash structure is now private and the
> deprecation of two macros.
> 
> Changes in v3:
> 
> - Now user can store variable size data, instead of 32 or 64-bit size data,
>   using the new parameter "data_len" in rte_hash_parameters
> - Add lookup_bulk_with_hash function in performance  unit tests
> - Add new functions that handle data in performance unit tests
> - Remove duplicates in performance unit tests
> - Fix rte_hash_reset, which was not reseting the last entry
[...]
> Pablo de Lara (11):
>   eal: add const in prefetch functions
>   hash: move rte_hash structure to C file and make it internal
>   test/hash: enhance hash unit tests
>   test/hash: rename new hash perf unit test back to original name
>   hash: replace existing hash library with cuckoo hash implementation
>   hash: add new lookup_bulk_with_hash function
>   hash: add new function rte_hash_reset
>   hash: add new functionality to store data in hash table
>   MAINTAINERS: claim responsability for hash library
>   doc: announce ABI change of librte_hash
>   doc: update hash documentation

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v3 0/6] query hash key size in byte
  @ 2015-07-08 23:17  3%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-08 23:17 UTC (permalink / raw)
  To: Zhang, Helin; +Cc: dev

> > As different hardware has different hash key sizes, querying it (in byte)
> > per port was asked by users. Otherwise there is no convenient way to know
> > the size of hash key which should be prepared.
> > 
> > v2 changes:
> > * Disabled the code changes by default, to avoid breaking ABI compatibility.
> > 
> > v3 changes:
> > * Moved the newly added element right after 'uint16_t reta_size', where it
> >   was a padding. So it will not break any ABI compatibility, and no need to
> >   disable the code changes by default.
> > 
> > Helin Zhang (6):
> >   ethdev: add an field for querying hash key size
> >   e1000: fill the hash key size
> >   fm10k: fill the hash key size
> >   i40e: fill the hash key size
> >   ixgbe: fill the hash key size
> >   app/testpmd: show the hash key size
> 
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Applied, thanks

It is assumed that inserting a new field in a padding alignment won't break
the ABI.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2 0/2] next abi option
  2015-07-08 16:50  4%   ` [dpdk-dev] [PATCH v2 0/2] next abi option Neil Horman
@ 2015-07-08 22:58  4%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-08 22:58 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

2015-07-08 12:50, Neil Horman:
> On Wed, Jul 08, 2015 at 04:55:21PM +0200, Thomas Monjalon wrote:
> > This is the second version of the NEXT_ABI policy.
> > It can now be used for shared and static libraries.
> > 
> > While updating rte.lib.mk, it appeared that some useless
> > (and not consistent) variables were used for some config
> > options. The first patch clean them.
> > 
> > Thomas Monjalon (2):
> >   mk: remove variables identical to config ones
> >   mk: enable next abi preview
> 
> For the series
> Acked-by: Neil Horman <nhorman@tuxdriver.com>

Applied, thanks Neil for your help

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] hash: move rte_hash structure to C file and make it internal
  2015-07-08 13:21  3%   ` Bruce Richardson
@ 2015-07-08 16:57  0%     ` Matthew Hall
  2015-07-09  8:12  3%       ` Bruce Richardson
  2015-07-10 10:27  0%     ` Thomas Monjalon
  1 sibling, 1 reply; 200+ results
From: Matthew Hall @ 2015-07-08 16:57 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Wed, Jul 08, 2015 at 02:21:42PM +0100, Bruce Richardson wrote:
> Irrespective of whether or not we change the underlying hash table implementation
> this looks a good change to me. The rte_hash structure should not be used directly
> by any applications - the APIs all take pointers to the structure,
> so there should be no ABI breakage from this, I think.
> 
> Therefore:
> 
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Hi guys,

There are places where this will be annoying on the app side.

A lot of rte_hash, rte_lpm*, rte_table, etc. don't provide methods to iterate 
whole structures with a callback function that includes the current structure 
node, and a user-data pointer.

This can make it real unpleasant when you want to walk through the structure 
and free a bunch of items it points to and so forth.

So if you're going to obfuscate things by censoring the structure contents 
then we'd really like to be sure they have a full set of CRUD operations and 
iteration support so one could manage the nodes individually and in bulk.

Matthew.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2 0/2] next abi option
  2015-07-08 14:55  8% ` [dpdk-dev] [PATCH v2 0/2] next abi option Thomas Monjalon
  2015-07-08 14:55  4%   ` [dpdk-dev] [PATCH v2 1/2] mk: remove variables identical to config ones Thomas Monjalon
  2015-07-08 14:55 34%   ` [dpdk-dev] [PATCH v2 2/2] mk: enable next abi preview Thomas Monjalon
@ 2015-07-08 16:50  4%   ` Neil Horman
  2015-07-08 22:58  4%     ` Thomas Monjalon
  2 siblings, 1 reply; 200+ results
From: Neil Horman @ 2015-07-08 16:50 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Wed, Jul 08, 2015 at 04:55:21PM +0200, Thomas Monjalon wrote:
> This is the second version of the NEXT_ABI policy.
> It can now be used for shared and static libraries.
> 
> While updating rte.lib.mk, it appeared that some useless
> (and not consistent) variables were used for some config
> options. The first patch clean them.
> 
> Thomas Monjalon (2):
>   mk: remove variables identical to config ones
>   mk: enable next abi preview
> 
>  config/common_bsdapp                 |  5 +++++
>  config/common_linuxapp               |  5 +++++
>  doc/guides/guidelines/versioning.rst | 12 +++++++++---
>  mk/exec-env/bsdapp/rte.vars.mk       |  4 ++--
>  mk/exec-env/linuxapp/rte.vars.mk     |  4 ++--
>  mk/rte.lib.mk                        | 16 +++++++++-------
>  mk/rte.sdkbuild.mk                   |  2 +-
>  mk/rte.sharelib.mk                   |  8 ++++----
>  mk/rte.vars.mk                       |  8 --------
>  pkg/dpdk.spec                        |  1 +
>  scripts/validate-abi.sh              |  2 ++
>  11 files changed, 40 insertions(+), 27 deletions(-)
> 
> -- 
> 2.4.2
> 
> 
For the series
Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v3] mk: enable next abi preview
  2015-07-08 14:55 34%   ` [dpdk-dev] [PATCH v2 2/2] mk: enable next abi preview Thomas Monjalon
@ 2015-07-08 16:44 33%     ` Thomas Monjalon
  2015-07-13  7:32  7%       ` Mcnamara, John
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-08 16:44 UTC (permalink / raw)
  To: nhorman; +Cc: dev

When a change makes really hard to keep ABI compatibility,
instead of waiting next release to break the ABI, it is smoother
to introduce the new code as a preview and disable it when packaging.
The flag RTE_NEXT_ABI must be used to "ifdef" the new code.
When the release is out, a dynamically linked application can use
the new shared libraries with the old ABI while developpers can prepare
their application for the next ABI by reading the deprecation notice
and easily testing the new code.
When starting the next release cycle, the "ifdefs" will be removed
and the ABI break will be marked by incrementing LIBABIVER. The map
files will also be updated.

The default value is enabled to be developer compliant.
The packagers must disable it as done in pkg/dpdk.spec.
When enabled, all shared library numbers are incremented by appending
a minor .1 to the old ABI number. In the next release, only impacted
libraries will have a major +1 increment.
The impacted libraries must provide an alternative map file to use
with this option.

The ABI policy is updated.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
v3 change:
- fix .so symbolic link to .so.x.1

 config/common_bsdapp                 |  5 +++++
 config/common_linuxapp               |  5 +++++
 doc/guides/guidelines/versioning.rst | 12 +++++++++---
 mk/rte.lib.mk                        |  9 +++++----
 pkg/dpdk.spec                        |  1 +
 scripts/validate-abi.sh              |  2 ++
 6 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 78754b2..a4e3262 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -90,6 +90,11 @@ CONFIG_RTE_BUILD_COMBINE_LIBS=n
 CONFIG_RTE_LIBNAME=intel_dpdk
 
 #
+# Use newest code breaking previous ABI
+#
+CONFIG_RTE_NEXT_ABI=y
+
+#
 # Compile Environment Abstraction Layer
 #
 CONFIG_RTE_LIBRTE_EAL=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index f5646e0..050bf35 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -90,6 +90,11 @@ CONFIG_RTE_BUILD_COMBINE_LIBS=n
 CONFIG_RTE_LIBNAME="intel_dpdk"
 
 #
+# Use newest code breaking previous ABI
+#
+CONFIG_RTE_NEXT_ABI=y
+
+#
 # Compile Environment Abstraction Layer
 #
 CONFIG_RTE_LIBRTE_EAL=y
diff --git a/doc/guides/guidelines/versioning.rst b/doc/guides/guidelines/versioning.rst
index ea789cb..8a739dd 100644
--- a/doc/guides/guidelines/versioning.rst
+++ b/doc/guides/guidelines/versioning.rst
@@ -55,12 +55,18 @@ being provided. The requirements for doing so are:
 #. At least 3 acknowledgments of the need to do so must be made on the
    dpdk.org mailing list.
 
+#. The changes (including an alternative map file) must be gated with
+   the ``RTE_NEXT_ABI`` option, and provided with a deprecation notice at the
+   same time.
+   It will become the default ABI in the next release.
+
 #. A full deprecation cycle, as explained above, must be made to offer
    downstream consumers sufficient warning of the change.
 
-#. The ``LIBABIVER`` variable in the makefile(s) where the ABI changes are
-   incorporated must be incremented in parallel with the ABI changes
-   themselves.
+#. At the beginning of the next release cycle, every ``RTE_NEXT_ABI``
+   conditions will be removed, the ``LIBABIVER`` variable in the makefile(s)
+   where the ABI is changed will be incremented, and the map files will
+   be updated.
 
 Note that the above process for ABI deprecation should not be undertaken
 lightly. ABI stability is extremely important for downstream consumers of the
diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index fff62a7..f15de9b 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -37,11 +37,13 @@ include $(RTE_SDK)/mk/internal/rte.depdirs-pre.mk
 
 # VPATH contains at least SRCDIR
 VPATH += $(SRCDIR)
-ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 LIB := $(patsubst %.a,%.so.$(LIBABIVER),$(LIB))
+ifeq ($(CONFIG_RTE_NEXT_ABI),y)
+LIB := $(LIB).1
+endif
 CPU_LDFLAGS += --version-script=$(SRCDIR)/$(EXPORT_MAP)
-
 endif
 
 
@@ -167,12 +169,11 @@ endif
 # install lib in $(RTE_OUTPUT)/lib
 #
 $(RTE_OUTPUT)/lib/$(LIB): $(LIB)
-	$(eval LIBSONAME := $(basename $(LIB)))
 	@echo "  INSTALL-LIB $(LIB)"
 	@[ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib
 	$(Q)cp -f $(LIB) $(RTE_OUTPUT)/lib
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
-	$(Q)ln -s -f $< $(RTE_OUTPUT)/lib/$(LIBSONAME)
+	$(Q)ln -s -f $< $(basename $(basename $@))
 endif
 
 #
diff --git a/pkg/dpdk.spec b/pkg/dpdk.spec
index 5f6ec6a..fb71ccc 100644
--- a/pkg/dpdk.spec
+++ b/pkg/dpdk.spec
@@ -82,6 +82,7 @@ make O=%{target} T=%{target} config
 sed -ri 's,(RTE_MACHINE=).*,\1%{machine},' %{target}/.config
 sed -ri 's,(RTE_APP_TEST=).*,\1n,'         %{target}/.config
 sed -ri 's,(RTE_BUILD_SHARED_LIB=).*,\1y,' %{target}/.config
+sed -ri 's,(RTE_NEXT_ABI=).*,\1n,'         %{target}/.config
 sed -ri 's,(LIBRTE_VHOST=).*,\1y,'         %{target}/.config
 sed -ri 's,(LIBRTE_PMD_PCAP=).*,\1y,'      %{target}/.config
 sed -ri 's,(LIBRTE_PMD_XENVIRT=).*,\1y,'   %{target}/.config
diff --git a/scripts/validate-abi.sh b/scripts/validate-abi.sh
index 1747b8b..4476433 100755
--- a/scripts/validate-abi.sh
+++ b/scripts/validate-abi.sh
@@ -157,6 +157,7 @@ git checkout $TAG1
 # Make sure we configure SHARED libraries
 # Also turn off IGB and KNI as those require kernel headers to build
 sed -i -e"$ a\CONFIG_RTE_BUILD_SHARED_LIB=y" config/defconfig_$TARGET
+sed -i -e"$ a\CONFIG_RTE_NEXT_ABI=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_EAL_IGB_UIO=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_LIBRTE_KNI=n" config/defconfig_$TARGET
 
@@ -198,6 +199,7 @@ git checkout $TAG2
 # Make sure we configure SHARED libraries
 # Also turn off IGB and KNI as those require kernel headers to build
 sed -i -e"$ a\CONFIG_RTE_BUILD_SHARED_LIB=y" config/defconfig_$TARGET
+sed -i -e"$ a\CONFIG_RTE_NEXT_ABI=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_EAL_IGB_UIO=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_LIBRTE_KNI=n" config/defconfig_$TARGET
 
-- 
2.4.2

^ permalink raw reply	[relevance 33%]

* Re: [dpdk-dev] [PATCH 1/4] eal: provide functions to access PCI config
  2015-07-08 16:11  3%     ` Stephen Hemminger
@ 2015-07-08 16:34  0%       ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-08 16:34 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Stephen Hemminger

2015-07-08 09:11, Stephen Hemminger:
> On Wed, 8 Jul 2015 15:04:16 +0000
> Thomas Monjalon <thomas.monjalon@6wind.com> wrote:
> 
> > 2015-07-07 17:08, Stephen Hemminger:
> > > --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> > > +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> > > @@ -98,3 +98,8 @@ DPDK_2.0 {
> > >  
> > >         local: *;
> > >  };
> > > +
> > > +DPDK_2.1 {
> > > +       rte_eal_pci_read_config;
> > > +       rte_eal_pci_write_config;
> > > +};  
> > 
> > DPDK_2.0 is missing to make 2.1 node inheriting from 2.0 one.
> 
> Do you mean that it is ok to add functions but keep same ABI version?

No. It's explained there:
	http://dpdk.org/browse/dpdk/tree/doc/guides/guidelines/versioning.rst#n209
"
   DPDK_2.1 {
        global:
        rte_acl_create;

   } DPDK_2.0;

The addition of the new block tells the linker that a new version node is
available (DPDK_2.1), which contains the symbol rte_acl_create, and inherits the
symbols from the DPDK_2.0 node.
"

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 1/4] eal: provide functions to access PCI config
       [not found]       ` <d35efa226d95495e804181b2246063d2@HQ1WP-EXMB11.corp.brocade.com>
@ 2015-07-08 16:11  3%     ` Stephen Hemminger
  2015-07-08 16:34  0%       ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2015-07-08 16:11 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Stephen Hemminger

On Wed, 8 Jul 2015 15:04:16 +0000
Thomas Monjalon <thomas.monjalon@6wind.com> wrote:

> 2015-07-07 17:08, Stephen Hemminger:
> > --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> > +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> > @@ -98,3 +98,8 @@ DPDK_2.0 {
> >  
> >         local: *;
> >  };
> > +
> > +DPDK_2.1 {
> > +       rte_eal_pci_read_config;
> > +       rte_eal_pci_write_config;
> > +};  
> 
> DPDK_2.0 is missing to make 2.1 node inheriting from 2.0 one.

Do you mean that it is ok to add functions but keep same ABI version?

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v3] doc: announce ABI changes planned for unified packet type
  2015-07-07 17:45 21% ` [dpdk-dev] [PATCH v3] " Helin Zhang
@ 2015-07-08 15:58  4%   ` Zhang, Helin
  2015-07-09  0:56  4%   ` Wu, Jingjing
  1 sibling, 0 replies; 200+ results
From: Zhang, Helin @ 2015-07-08 15:58 UTC (permalink / raw)
  To: nhorman; +Cc: dev

Hi Neil

Please review it and ack it, if no objections! Thank you very much in advance!

- Helin

> -----Original Message-----
> From: Zhang, Helin
> Sent: Tuesday, July 7, 2015 10:45 AM
> To: dev@dpdk.org
> Cc: Liu, Jijiang; Wu, Jingjing; nhorman@tuxdriver.com; Zhang, Helin
> Subject: [PATCH v3] doc: announce ABI changes planned for unified packet type
> 
> The significant ABI changes of all shared libraries are planned to support unified
> packet type which will be taken effect from release 2.2. Here announces that ABI
> changes in detail.
> 
> Signed-off-by: Helin Zhang <helin.zhang@intel.com>
> ---
>  doc/guides/rel_notes/abi.rst | 1 +
>  1 file changed, 1 insertion(+)
> 
> v2 changes:
> * Added 'struct rte_kni_mbuf' to the ABI change announcement.
> 
> v3 changes:
> * Re-worded the ABI change announcement, as the ABI changes will be
>   taken effect from release 2.1 for static libraries, and from
>   release 2.2 for shared libraries.
> 
> diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst index
> 110c486..9f8a9b3 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/abi.rst
> @@ -12,3 +12,4 @@ Examples of Deprecation Notices
> 
>  Deprecation Notices
>  -------------------
> +* Significant ABI changes are planned for struct rte_mbuf, struct rte_kni_mbuf,
> and several PKT_RX_ flags will be removed, to support unified packet type from
> release 2.1. The upcoming release 2.1 will have those changes for static libraries
> only, while have no change for shared libraries by default. Those changes will be
> taken effect for shared libraries from release 2.2. There is no backward
> compatibility planned from release 2.2. All binaries will need to be rebuilt from
> release 2.2.
> --
> 1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 2/2] mk: enable next abi preview
  2015-07-08 14:55  8% ` [dpdk-dev] [PATCH v2 0/2] next abi option Thomas Monjalon
  2015-07-08 14:55  4%   ` [dpdk-dev] [PATCH v2 1/2] mk: remove variables identical to config ones Thomas Monjalon
@ 2015-07-08 14:55 34%   ` Thomas Monjalon
  2015-07-08 16:44 33%     ` [dpdk-dev] [PATCH v3] " Thomas Monjalon
  2015-07-08 16:50  4%   ` [dpdk-dev] [PATCH v2 0/2] next abi option Neil Horman
  2 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-08 14:55 UTC (permalink / raw)
  To: nhorman; +Cc: dev

When a change makes really hard to keep ABI compatibility,
instead of waiting next release to break the ABI, it is smoother
to introduce the new code as a preview and disable it when packaging.
The flag RTE_NEXT_ABI must be used to "ifdef" the new code.
When the release is out, a dynamically linked application can use
the new shared libraries with the old ABI while developpers can prepare
their application for the next ABI by reading the deprecation notice
and easily testing the new code.
When starting the next release cycle, the "ifdefs" will be removed
and the ABI break will be marked by incrementing LIBABIVER. The map
files will also be updated.

The default value is enabled to be developer compliant.
The packagers must disable it as done in pkg/dpdk.spec.
When enabled, all shared library numbers are incremented by appending
a minor .1 to the old ABI number. In the next release, only impacted
libraries will have a major +1 increment.
The impacted libraries must provide an alternative map file to use
with this option.

The ABI policy is updated.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
 config/common_bsdapp                 |  5 +++++
 config/common_linuxapp               |  5 +++++
 doc/guides/guidelines/versioning.rst | 12 +++++++++---
 mk/rte.lib.mk                        |  6 ++++--
 pkg/dpdk.spec                        |  1 +
 scripts/validate-abi.sh              |  2 ++
 6 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 78754b2..a4e3262 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -90,6 +90,11 @@ CONFIG_RTE_BUILD_COMBINE_LIBS=n
 CONFIG_RTE_LIBNAME=intel_dpdk
 
 #
+# Use newest code breaking previous ABI
+#
+CONFIG_RTE_NEXT_ABI=y
+
+#
 # Compile Environment Abstraction Layer
 #
 CONFIG_RTE_LIBRTE_EAL=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index f5646e0..050bf35 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -90,6 +90,11 @@ CONFIG_RTE_BUILD_COMBINE_LIBS=n
 CONFIG_RTE_LIBNAME="intel_dpdk"
 
 #
+# Use newest code breaking previous ABI
+#
+CONFIG_RTE_NEXT_ABI=y
+
+#
 # Compile Environment Abstraction Layer
 #
 CONFIG_RTE_LIBRTE_EAL=y
diff --git a/doc/guides/guidelines/versioning.rst b/doc/guides/guidelines/versioning.rst
index ea789cb..8a739dd 100644
--- a/doc/guides/guidelines/versioning.rst
+++ b/doc/guides/guidelines/versioning.rst
@@ -55,12 +55,18 @@ being provided. The requirements for doing so are:
 #. At least 3 acknowledgments of the need to do so must be made on the
    dpdk.org mailing list.
 
+#. The changes (including an alternative map file) must be gated with
+   the ``RTE_NEXT_ABI`` option, and provided with a deprecation notice at the
+   same time.
+   It will become the default ABI in the next release.
+
 #. A full deprecation cycle, as explained above, must be made to offer
    downstream consumers sufficient warning of the change.
 
-#. The ``LIBABIVER`` variable in the makefile(s) where the ABI changes are
-   incorporated must be incremented in parallel with the ABI changes
-   themselves.
+#. At the beginning of the next release cycle, every ``RTE_NEXT_ABI``
+   conditions will be removed, the ``LIBABIVER`` variable in the makefile(s)
+   where the ABI is changed will be incremented, and the map files will
+   be updated.
 
 Note that the above process for ABI deprecation should not be undertaken
 lightly. ABI stability is extremely important for downstream consumers of the
diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index fff62a7..0ad32cb 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -37,11 +37,13 @@ include $(RTE_SDK)/mk/internal/rte.depdirs-pre.mk
 
 # VPATH contains at least SRCDIR
 VPATH += $(SRCDIR)
-ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 LIB := $(patsubst %.a,%.so.$(LIBABIVER),$(LIB))
+ifeq ($(CONFIG_RTE_NEXT_ABI),y)
+LIB := $(LIB).1
+endif
 CPU_LDFLAGS += --version-script=$(SRCDIR)/$(EXPORT_MAP)
-
 endif
 
 
diff --git a/pkg/dpdk.spec b/pkg/dpdk.spec
index 5f6ec6a..fb71ccc 100644
--- a/pkg/dpdk.spec
+++ b/pkg/dpdk.spec
@@ -82,6 +82,7 @@ make O=%{target} T=%{target} config
 sed -ri 's,(RTE_MACHINE=).*,\1%{machine},' %{target}/.config
 sed -ri 's,(RTE_APP_TEST=).*,\1n,'         %{target}/.config
 sed -ri 's,(RTE_BUILD_SHARED_LIB=).*,\1y,' %{target}/.config
+sed -ri 's,(RTE_NEXT_ABI=).*,\1n,'         %{target}/.config
 sed -ri 's,(LIBRTE_VHOST=).*,\1y,'         %{target}/.config
 sed -ri 's,(LIBRTE_PMD_PCAP=).*,\1y,'      %{target}/.config
 sed -ri 's,(LIBRTE_PMD_XENVIRT=).*,\1y,'   %{target}/.config
diff --git a/scripts/validate-abi.sh b/scripts/validate-abi.sh
index 1747b8b..4476433 100755
--- a/scripts/validate-abi.sh
+++ b/scripts/validate-abi.sh
@@ -157,6 +157,7 @@ git checkout $TAG1
 # Make sure we configure SHARED libraries
 # Also turn off IGB and KNI as those require kernel headers to build
 sed -i -e"$ a\CONFIG_RTE_BUILD_SHARED_LIB=y" config/defconfig_$TARGET
+sed -i -e"$ a\CONFIG_RTE_NEXT_ABI=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_EAL_IGB_UIO=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_LIBRTE_KNI=n" config/defconfig_$TARGET
 
@@ -198,6 +199,7 @@ git checkout $TAG2
 # Make sure we configure SHARED libraries
 # Also turn off IGB and KNI as those require kernel headers to build
 sed -i -e"$ a\CONFIG_RTE_BUILD_SHARED_LIB=y" config/defconfig_$TARGET
+sed -i -e"$ a\CONFIG_RTE_NEXT_ABI=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_EAL_IGB_UIO=n" config/defconfig_$TARGET
 sed -i -e"$ a\CONFIG_RTE_LIBRTE_KNI=n" config/defconfig_$TARGET
 
-- 
2.4.2

^ permalink raw reply	[relevance 34%]

* [dpdk-dev] [PATCH v2 1/2] mk: remove variables identical to config ones
  2015-07-08 14:55  8% ` [dpdk-dev] [PATCH v2 0/2] next abi option Thomas Monjalon
@ 2015-07-08 14:55  4%   ` Thomas Monjalon
  2015-07-08 14:55 34%   ` [dpdk-dev] [PATCH v2 2/2] mk: enable next abi preview Thomas Monjalon
  2015-07-08 16:50  4%   ` [dpdk-dev] [PATCH v2 0/2] next abi option Neil Horman
  2 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-08 14:55 UTC (permalink / raw)
  To: nhorman; +Cc: dev

CONFIG_RTE_BUILD_SHARED_LIB and CONFIG_RTE_BUILD_COMBINE_LIBS does not
have quotes in their values (only y or n). That's why the variables
RTE_BUILD_SHARED_LIB and RTE_BUILD_COMBINE_LIBS are always identical to
their CONFIG_ counterpart, and are useless.
In order to have consistent naming of config options in the makefiles,
these options are removed and the "CONFIG_ prefixed" variables are used.

Fixes: e25e4d7ef16b ("mk: shared libraries")
Fixes: 4d3d79e7a5c6 ("mk: combined library")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
 mk/exec-env/bsdapp/rte.vars.mk   |  4 ++--
 mk/exec-env/linuxapp/rte.vars.mk |  4 ++--
 mk/rte.lib.mk                    | 12 ++++++------
 mk/rte.sdkbuild.mk               |  2 +-
 mk/rte.sharelib.mk               |  8 ++++----
 mk/rte.vars.mk                   |  8 --------
 6 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/mk/exec-env/bsdapp/rte.vars.mk b/mk/exec-env/bsdapp/rte.vars.mk
index aed0e18..47a673e 100644
--- a/mk/exec-env/bsdapp/rte.vars.mk
+++ b/mk/exec-env/bsdapp/rte.vars.mk
@@ -39,7 +39,7 @@
 #
 # examples for RTE_EXEC_ENV: linuxapp, bsdapp
 #
-ifeq ($(RTE_BUILD_SHARED_LIB),y)
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 EXECENV_CFLAGS  = -pthread -fPIC
 else
 EXECENV_CFLAGS  = -pthread
@@ -49,7 +49,7 @@ EXECENV_LDFLAGS =
 EXECENV_LDLIBS  = -lexecinfo
 EXECENV_ASFLAGS =
 
-ifeq ($(RTE_BUILD_SHARED_LIB),y)
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 EXECENV_LDLIBS += -lgcc_s
 endif
 
diff --git a/mk/exec-env/linuxapp/rte.vars.mk b/mk/exec-env/linuxapp/rte.vars.mk
index e5af318..5fd7d85 100644
--- a/mk/exec-env/linuxapp/rte.vars.mk
+++ b/mk/exec-env/linuxapp/rte.vars.mk
@@ -39,7 +39,7 @@
 #
 # examples for RTE_EXEC_ENV: linuxapp, bsdapp
 #
-ifeq ($(RTE_BUILD_SHARED_LIB),y)
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 EXECENV_CFLAGS  = -pthread -fPIC
 else
 EXECENV_CFLAGS  = -pthread
@@ -51,7 +51,7 @@ EXECENV_LDFLAGS = --no-as-needed
 EXECENV_LDLIBS  = -lrt -lm
 EXECENV_ASFLAGS =
 
-ifeq ($(RTE_BUILD_SHARED_LIB),y)
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 EXECENV_LDLIBS += -lgcc_s
 endif
 
diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index 25aa989..fff62a7 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -37,7 +37,7 @@ include $(RTE_SDK)/mk/internal/rte.depdirs-pre.mk
 
 # VPATH contains at least SRCDIR
 VPATH += $(SRCDIR)
-ifeq ($(RTE_BUILD_SHARED_LIB),y)
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 
 LIB := $(patsubst %.a,%.so.$(LIBABIVER),$(LIB))
 CPU_LDFLAGS += --version-script=$(SRCDIR)/$(EXPORT_MAP)
@@ -87,7 +87,7 @@ O_TO_S_DO = @set -e; \
 	$(O_TO_S) && \
 	echo $(O_TO_S_CMD) > $(call exe2cmd,$(@))
 
-ifeq ($(RTE_BUILD_SHARED_LIB),n)
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n)
 O_TO_C = $(AR) crus $(LIB_ONE) $(OBJS-y)
 O_TO_C_STR = $(subst ','\'',$(O_TO_C)) #'# fix syntax highlight
 O_TO_C_DISP = $(if $(V),"$(O_TO_C_STR)","  AR_C $(@)")
@@ -110,7 +110,7 @@ lib_dir = [ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib;
 #
 # Archive objects in .a file if needed
 #
-ifeq ($(RTE_BUILD_SHARED_LIB),y)
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 $(LIB): $(OBJS-y) $(DEP_$(LIB)) FORCE
 ifeq ($(LIBABIVER),)
 	@echo "Must Specify a $(LIB) ABI version"
@@ -130,7 +130,7 @@ endif
 		$(depfile_newer)),\
 		$(O_TO_S_DO))
 
-ifeq ($(RTE_BUILD_COMBINE_LIBS),y)
+ifeq ($(CONFIG_RTE_BUILD_COMBINE_LIBS),y)
 	$(if $(or \
         $(file_missing),\
         $(call cmdline_changed,$(O_TO_C_STR)),\
@@ -153,7 +153,7 @@ $(LIB): $(OBJS-y) $(DEP_$(LIB)) FORCE
 	    $(depfile_missing),\
 	    $(depfile_newer)),\
 	    $(O_TO_A_DO))
-ifeq ($(RTE_BUILD_COMBINE_LIBS),y)
+ifeq ($(CONFIG_RTE_BUILD_COMBINE_LIBS),y)
 	$(if $(or \
         $(file_missing),\
         $(call cmdline_changed,$(O_TO_C_STR)),\
@@ -171,7 +171,7 @@ $(RTE_OUTPUT)/lib/$(LIB): $(LIB)
 	@echo "  INSTALL-LIB $(LIB)"
 	@[ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib
 	$(Q)cp -f $(LIB) $(RTE_OUTPUT)/lib
-ifeq ($(RTE_BUILD_SHARED_LIB),y)
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 	$(Q)ln -s -f $< $(RTE_OUTPUT)/lib/$(LIBSONAME)
 endif
 
diff --git a/mk/rte.sdkbuild.mk b/mk/rte.sdkbuild.mk
index 352f738..38ec7bd 100644
--- a/mk/rte.sdkbuild.mk
+++ b/mk/rte.sdkbuild.mk
@@ -93,7 +93,7 @@ $(ROOTDIRS-y):
 	@[ -d $(BUILDDIR)/$@ ] || mkdir -p $(BUILDDIR)/$@
 	@echo "== Build $@"
 	$(Q)$(MAKE) S=$@ -f $(RTE_SRCDIR)/$@/Makefile -C $(BUILDDIR)/$@ all
-	@if [ $@ = drivers -a $(RTE_BUILD_COMBINE_LIBS) = y ]; then \
+	@if [ $@ = drivers -a $(CONFIG_RTE_BUILD_COMBINE_LIBS) = y ]; then \
 		$(MAKE) -f $(RTE_SDK)/lib/Makefile sharelib; \
 	fi
 
diff --git a/mk/rte.sharelib.mk b/mk/rte.sharelib.mk
index de53558..7bb7219 100644
--- a/mk/rte.sharelib.mk
+++ b/mk/rte.sharelib.mk
@@ -34,8 +34,8 @@ include $(RTE_SDK)/mk/internal/rte.build-pre.mk
 # VPATH contains at least SRCDIR
 VPATH += $(SRCDIR)
 
-ifeq ($(RTE_BUILD_COMBINE_LIBS),y)
-ifeq ($(RTE_BUILD_SHARED_LIB),y)
+ifeq ($(CONFIG_RTE_BUILD_COMBINE_LIBS),y)
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 LIB_ONE := lib$(RTE_LIBNAME).so
 else
 LIB_ONE := lib$(RTE_LIBNAME).a
@@ -75,8 +75,8 @@ O_TO_A_DO = @set -e; \
 # Archive objects to share library
 #
 
-ifeq ($(RTE_BUILD_COMBINE_LIBS),y)
-ifeq ($(RTE_BUILD_SHARED_LIB),y)
+ifeq ($(CONFIG_RTE_BUILD_COMBINE_LIBS),y)
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 $(LIB_ONE): FORCE
 	@[ -d $(dir $@) ] || mkdir -p $(dir $@)
 	$(O_TO_S_DO)
diff --git a/mk/rte.vars.mk b/mk/rte.vars.mk
index d2f01b6..0469064 100644
--- a/mk/rte.vars.mk
+++ b/mk/rte.vars.mk
@@ -63,14 +63,6 @@ ifneq ($(BUILDING_RTE_SDK),)
   RTE_TOOLCHAIN := $(CONFIG_RTE_TOOLCHAIN:"%"=%)
   RTE_TARGET := $(RTE_ARCH)-$(RTE_MACHINE)-$(RTE_EXEC_ENV)-$(RTE_TOOLCHAIN)
   RTE_SDK_BIN := $(RTE_OUTPUT)
-  RTE_BUILD_SHARED_LIB := $(CONFIG_RTE_BUILD_SHARED_LIB:"%"=%)
-  ifeq ($(RTE_BUILD_SHARED_LIB),)
-    RTE_BUILD_SHARED_LIB := n
-  endif
-  RTE_BUILD_COMBINE_LIBS := $(CONFIG_RTE_BUILD_COMBINE_LIBS:"%"=%)
-  ifeq ($(RTE_BUILD_COMBINE_LIBS),)
-    RTE_BUILD_COMBINE_LIBS := n
-  endif
 endif
 
 RTE_LIBNAME := $(CONFIG_RTE_LIBNAME:"%"=%)
-- 
2.4.2

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 0/2] next abi option
    @ 2015-07-08 14:55  8% ` Thomas Monjalon
  2015-07-08 14:55  4%   ` [dpdk-dev] [PATCH v2 1/2] mk: remove variables identical to config ones Thomas Monjalon
                     ` (2 more replies)
  1 sibling, 3 replies; 200+ results
From: Thomas Monjalon @ 2015-07-08 14:55 UTC (permalink / raw)
  To: nhorman; +Cc: dev

This is the second version of the NEXT_ABI policy.
It can now be used for shared and static libraries.

While updating rte.lib.mk, it appeared that some useless
(and not consistent) variables were used for some config
options. The first patch clean them.

Thomas Monjalon (2):
  mk: remove variables identical to config ones
  mk: enable next abi preview

 config/common_bsdapp                 |  5 +++++
 config/common_linuxapp               |  5 +++++
 doc/guides/guidelines/versioning.rst | 12 +++++++++---
 mk/exec-env/bsdapp/rte.vars.mk       |  4 ++--
 mk/exec-env/linuxapp/rte.vars.mk     |  4 ++--
 mk/rte.lib.mk                        | 16 +++++++++-------
 mk/rte.sdkbuild.mk                   |  2 +-
 mk/rte.sharelib.mk                   |  8 ++++----
 mk/rte.vars.mk                       |  8 --------
 pkg/dpdk.spec                        |  1 +
 scripts/validate-abi.sh              |  2 ++
 11 files changed, 40 insertions(+), 27 deletions(-)

-- 
2.4.2

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH] hash: move rte_hash structure to C file and make it internal
  @ 2015-07-08 13:21  3%   ` Bruce Richardson
  2015-07-08 16:57  0%     ` Matthew Hall
  2015-07-10 10:27  0%     ` Thomas Monjalon
  0 siblings, 2 replies; 200+ results
From: Bruce Richardson @ 2015-07-08 13:21 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

On Wed, Jul 08, 2015 at 12:27:34PM +0100, Pablo de Lara wrote:
> rte_hash structure should not be a public structure,
> and therefore it should be moved to the C file and be declared
> as internal. rte_hash_hash implementation is also moved
> to the C file, as it uses the structure.
> 
> This patch also removes part of a unit test that was checking
> a field of the structure.
> 
> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>

Irrespective of whether or not we change the underlying hash table implementation
this looks a good change to me. The rte_hash structure should not be used directly
by any applications - the APIs all take pointers to the structure,
so there should be no ABI breakage from this, I think.

Therefore:

Acked-by: Bruce Richardson <bruce.richardson@intel.com>

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v3 7/7] abi: announce mbuf addition for ieee1588 in DPDK 2.2
  @ 2015-07-08 13:10  7%       ` Bruce Richardson
  2015-07-09 15:51  7%         ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2015-07-08 13:10 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, Jul 06, 2015 at 03:16:01PM +0200, Thomas Monjalon wrote:
> 2015-07-02 16:16, John McNamara:
> > --- a/doc/guides/rel_notes/abi.rst
> > +++ b/doc/guides/rel_notes/abi.rst
> >  Deprecation Notices
> >  -------------------
> > +
> > +* In DPDK 2.1 the IEEE1588/802.1AS support in the i40e driver makes use of the
> > +  ``udata64`` field in the mbuf to pass the timesync register index to the
> > +  user. In DPDK 2.2 this will be moved to a new field in the mbuf.
> 
> We need more acknowledgements for this decision, as stated here:
> http://dpdk.org/browse/dpdk/tree/doc/guides/guidelines/versioning.rst#n51

Why can't this new field just be added at the end of cache line 1 (the second
cache line) of the mbuf? That would avoid any ABI breakage and would mean we
can just put the change in in this release, instead of waiting.

/Bruce

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] doc:announce ABI changes planned for struct rte_eth_dev to support up to 1024 queues per port
  2015-07-08  2:08 20% [dpdk-dev] [PATCH] doc:announce ABI changes planned for struct rte_eth_dev to support up to 1024 queues per port Jijiang Liu
@ 2015-07-08 11:07  4% ` Neil Horman
  2015-07-10 22:14  4%   ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2015-07-08 11:07 UTC (permalink / raw)
  To: Jijiang Liu; +Cc: dev

On Wed, Jul 08, 2015 at 10:08:25AM +0800, Jijiang Liu wrote:
> The significant ABI change of all shared libraries is planned for struct rte_eth_dev to support up to 1024 queues per port which will be taken effect from release 2.2.
> 
> Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
> ---
>  doc/guides/rel_notes/abi.rst |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
> index 110c486..ff4a810 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/abi.rst
> @@ -12,3 +12,4 @@ Examples of Deprecation Notices
>  
>  Deprecation Notices
>  -------------------
> +* Significant ABI changes are planned for struct rte_eth_dev to support up to 1024 queues per port. This change will be taken effect for shared libraries from release 2.2. There is no backward compatibility planned from release 2.2. All binaries will need to be rebuilt from release 2.2.
> -- 
> 1.7.7.6
> 
> 

Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v4 1/4] ethdev: rename rte_eth_vmdq_mirror_conf
  2015-07-08 10:31  5%       ` Mcnamara, John
@ 2015-07-08 10:35  0%         ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2015-07-08 10:35 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

On Wed, Jul 08, 2015 at 10:31:15AM +0000, Mcnamara, John wrote:
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> > Sent: Tuesday, July 7, 2015 3:51 PM
> > To: Wu, Jingjing
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v4 1/4] ethdev: rename
> > rte_eth_vmdq_mirror_conf
> > 
> > > Additional, about the validate-abi.sh, does it mean we need to fix all
> > > the problems it reports? Or we can decide case by case.
> > > Can a Low Severity problem be acceptable?
> > 
> > We have to decide case by case.
> > It makes ABI checking impossible to automate.
> > That's why any help is welcome to check the git HEAD for ABI violation.
> 
> Hi,
> 
> Automated testing for ABI breaks is tricky since it requires some interpretation of the errors and the severity level.
> 
> However, each report file contains a 2 line summary at the top that can be read and parsed. For example (with long lines wrapped):
> 
>     head -2 compat_reports/librte_mbuf.so/v2.0.0_to_ABI_CHECK/compat_report.html
> 
>     <!-- kind:binary;verdict:compatible;affected:0;added:1;removed:0;
>          type_problems_high:0;type_problems_medium:0;type_problems_low:0;
>          interface_problems_high:0;interface_problems_medium:0;
>          interface_problems_low:0;changed_constants:0;tool_version:1.99.8.5 -->
>     <!-- kind:source;verdict:compatible;affected:0;added:1;removed:0;
>          type_problems_high:0;type_problems_medium:0;type_problems_low:0;
>          interface_problems_high:0;interface_problems_medium:0;
>          interface_problems_low:0;changed_constants:0;tool_version:1.99.8.5 -->
> 
> This still requires interpretation but it can be used to search for categories of incompatibility.
> 
> To test if a patchset breaks the API it is possible to do something like the following. Say, for example that you have committed 3 commits to your local master, you could tag and run the checker like this:
> 
>     git checkout master
> 
>     git tag -f ABI_CHECK_AFTER
>     git tag -f ABI_CHECK_BEFORE HEAD~3
> 
>     ./scripts/validate-abi.sh ABI_CHECK_BEFORE ABI_CHECK_AFTER x86_64-native-linuxapp-gcc
> 
>     grep -lr "kind:binary;verdict:incompatible" compat_reports/
> 
> The grep could be extended to match other categories that you are interested in. However, it is still probably better to examine the reports manually.
> 
> You can also use something this to run a git bisect to find a commit that introduced an ABI incompatibility:
> 
>     # Tag the head and some known good point in the history.
>     git checkout master
>     git tag -f ABI_CHECK_END
>     git tag -f ABI_CHECK_START v2.0.0
> 
>     # Start git bisect with bad and good.
>     git bisect start ABI_CHECK_END ABI_CHECK_START
> 
>     # Run bisect with a script.
>     git bisect run ~/abi_bisect.sh
> 
> Where the ~/abi_bisect.sh script that searches for ABI incompatibilities looks like this:
> 
>     #!/bin/sh
> 
>     git tag -f ABI_CHECK_END
> 
>     rm -rf compat_reports/
> 
>     ./scripts/validate-abi.sh ABI_CHECK_START ABI_CHECK_END x86_64-native-linuxapp-gcc
> 
>     ! grep -lr "kind:binary;verdict:incompatible" compat_reports/
> 
> John.
> -- 

I'd mod this post up as "+1 informative", except that this isn't slashdot. :-)

Very helpful, John. Thanks.

/Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v4 1/4] ethdev: rename rte_eth_vmdq_mirror_conf
  2015-07-07 14:51  4%     ` Thomas Monjalon
  2015-07-08  0:58  0%       ` Wu, Jingjing
@ 2015-07-08 10:31  5%       ` Mcnamara, John
  2015-07-08 10:35  0%         ` Bruce Richardson
  1 sibling, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-08 10:31 UTC (permalink / raw)
  To: Thomas Monjalon, Wu, Jingjing; +Cc: dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Tuesday, July 7, 2015 3:51 PM
> To: Wu, Jingjing
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 1/4] ethdev: rename
> rte_eth_vmdq_mirror_conf
> 
> > Additional, about the validate-abi.sh, does it mean we need to fix all
> > the problems it reports? Or we can decide case by case.
> > Can a Low Severity problem be acceptable?
> 
> We have to decide case by case.
> It makes ABI checking impossible to automate.
> That's why any help is welcome to check the git HEAD for ABI violation.

Hi,

Automated testing for ABI breaks is tricky since it requires some interpretation of the errors and the severity level.

However, each report file contains a 2 line summary at the top that can be read and parsed. For example (with long lines wrapped):

    head -2 compat_reports/librte_mbuf.so/v2.0.0_to_ABI_CHECK/compat_report.html

    <!-- kind:binary;verdict:compatible;affected:0;added:1;removed:0;
         type_problems_high:0;type_problems_medium:0;type_problems_low:0;
         interface_problems_high:0;interface_problems_medium:0;
         interface_problems_low:0;changed_constants:0;tool_version:1.99.8.5 -->
    <!-- kind:source;verdict:compatible;affected:0;added:1;removed:0;
         type_problems_high:0;type_problems_medium:0;type_problems_low:0;
         interface_problems_high:0;interface_problems_medium:0;
         interface_problems_low:0;changed_constants:0;tool_version:1.99.8.5 -->

This still requires interpretation but it can be used to search for categories of incompatibility.

To test if a patchset breaks the API it is possible to do something like the following. Say, for example that you have committed 3 commits to your local master, you could tag and run the checker like this:

    git checkout master

    git tag -f ABI_CHECK_AFTER
    git tag -f ABI_CHECK_BEFORE HEAD~3

    ./scripts/validate-abi.sh ABI_CHECK_BEFORE ABI_CHECK_AFTER x86_64-native-linuxapp-gcc

    grep -lr "kind:binary;verdict:incompatible" compat_reports/

The grep could be extended to match other categories that you are interested in. However, it is still probably better to examine the reports manually.

You can also use something this to run a git bisect to find a commit that introduced an ABI incompatibility:

    # Tag the head and some known good point in the history.
    git checkout master
    git tag -f ABI_CHECK_END
    git tag -f ABI_CHECK_START v2.0.0

    # Start git bisect with bad and good.
    git bisect start ABI_CHECK_END ABI_CHECK_START

    # Run bisect with a script.
    git bisect run ~/abi_bisect.sh

Where the ~/abi_bisect.sh script that searches for ABI incompatibilities looks like this:

    #!/bin/sh

    git tag -f ABI_CHECK_END

    rm -rf compat_reports/

    ./scripts/validate-abi.sh ABI_CHECK_START ABI_CHECK_END x86_64-native-linuxapp-gcc

    ! grep -lr "kind:binary;verdict:incompatible" compat_reports/

John.
-- 

^ permalink raw reply	[relevance 5%]

* [dpdk-dev] [PATCH] doc:announce ABI changes planned for struct rte_eth_dev to support up to 1024 queues per port
@ 2015-07-08  2:08 20% Jijiang Liu
  2015-07-08 11:07  4% ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Jijiang Liu @ 2015-07-08  2:08 UTC (permalink / raw)
  To: dev

The significant ABI change of all shared libraries is planned for struct rte_eth_dev to support up to 1024 queues per port which will be taken effect from release 2.2.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
 doc/guides/rel_notes/abi.rst |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..ff4a810 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,4 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
+* Significant ABI changes are planned for struct rte_eth_dev to support up to 1024 queues per port. This change will be taken effect for shared libraries from release 2.2. There is no backward compatibility planned from release 2.2. All binaries will need to be rebuilt from release 2.2.
-- 
1.7.7.6

^ permalink raw reply	[relevance 20%]

* Re: [dpdk-dev] [PATCH v3 0/7] ethdev: add support for ieee1588 timestamping
    @ 2015-07-08  1:59  0%   ` Fu, JingguoX
  1 sibling, 0 replies; 200+ results
From: Fu, JingguoX @ 2015-07-08  1:59 UTC (permalink / raw)
  To: dev

Tested-by: Jingguo Fu <jingguox.fu@intel.com>
- Test Commit: 82be8d544253a4b5c49b778babf717e5e63f3dc1
- OS: Ubuntu 14.04
- GCC: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04)
- CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
- NIC: Intel Corporation Device [8086:1563]
- Total 1 case, 1 passed, 0 failed.

Test Case :  test_ieee1588_disable
====================================================================
1. set CONFIG_RTE_LIBRTE_IEEE1588=y, rebuild the DPDK code.
2.mount hugetlbfs
3. insmod igb_uio and bind the two ports to igb_uio driver
4. startup testpmd
	
	testpmd -c 0x6 -n 4 -- -i --txqflags 0x0

5. set ieee1588 fwd mode and start it:

	set fwd ieee1588
	start

6. send a packet to the receive port and check the output and send port status:
	
	a: "IEEE1588 PTP V2 SYNC" in pmd output
	b: "RX timestamp value ([0-9a-fA-F]+)" in pmd output
	c:  "TX timestamp value ([0-9a-fA-F]+)" in pmd output

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of John McNamara
> Sent: Thursday, July 02, 2015 23:16
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3 0/7] ethdev: add support for ieee1588
> timestamping
> 
> This patchset adds ethdev API to enable and read IEEE1588/802.1AS PTP
> timestamps from devices that support it. The following functions are added:
> 
>     rte_eth_timesync_enable()
>     rte_eth_timesync_disable()
>     rte_eth_timesync_read_rx_timestamp()
>     rte_eth_timesync_read_tx_timestamp()
> 
> The "ieee1588" forwarding mode in testpmd is also refactored to
> demonstrate the new API and to clean up the code.
> 
> Adds support for igb, ixgbe and i40e.
> 
> V3:
> * Fixed issued with version.map.
> 
> V2:
> * Added i40e support.
> 
> * Renamed ethdev functions from rte_eth_ieee15888_*() to
> rte_eth_timesync_*()
>   since 802.1AS can be supported through the same interfaces.
> 
> V1:
> * Initial version for
> 
> 
> John McNamara (7):
>   ethdev: add support for ieee1588 timestamping
>   e1000: add support for ieee1588 timestamping
>   ixgbe: add support for ieee1588 timestamping
>   i40e: add support for ieee1588 timestamping
>   app/testpmd: refactor ieee1588 forwarding
>   doc: document ieee1588 forwarding mode
>   abi: announce mbuf addition for ieee1588 in DPDK 2.2
> 
>  app/test-pmd/ieee1588fwd.c                  | 466 ++--------------------------
>  doc/guides/rel_notes/abi.rst                |   5 +
>  doc/guides/testpmd_app_ug/run_app.rst       |   2 +-
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |   2 +
>  drivers/net/e1000/igb_ethdev.c              | 115 +++++++
>  drivers/net/i40e/i40e_ethdev.c              | 143 +++++++++
>  drivers/net/i40e/i40e_rxtx.c                |  39 ++-
>  drivers/net/ixgbe/ixgbe_ethdev.c            | 122 ++++++++
>  lib/librte_ether/rte_ethdev.c               |  70 ++++-
>  lib/librte_ether/rte_ethdev.h               |  90 +++++-
>  lib/librte_ether/rte_ether_version.map      |   4 +
>  11 files changed, 615 insertions(+), 443 deletions(-)
> 
> --
> 1.8.1.4

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] config: revert the CONFIG_RTE_MAX_QUEUES_PER_PORT to 256
@ 2015-07-08  1:24  3% Jijiang Liu
  2015-07-10 21:58  0% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Jijiang Liu @ 2015-07-08  1:24 UTC (permalink / raw)
  To: dev

Revert the CONFIG_RTE_MAX_QUEUES_PER_PORT to 256.

The previous commit changed the size and the offsets of struct rte_eth_dev,
so it is an ABI breakage. I revert it, and will send a deprecation notice for this.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
 config/common_linuxapp |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index f5646e0..e883f69 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -139,7 +139,7 @@ CONFIG_RTE_LIBRTE_KVARGS=y
 CONFIG_RTE_LIBRTE_ETHER=y
 CONFIG_RTE_LIBRTE_ETHDEV_DEBUG=n
 CONFIG_RTE_MAX_ETHPORTS=32
-CONFIG_RTE_MAX_QUEUES_PER_PORT=1024
+CONFIG_RTE_MAX_QUEUES_PER_PORT=256
 CONFIG_RTE_LIBRTE_IEEE1588=n
 CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16
 CONFIG_RTE_ETHDEV_RXTX_CALLBACKS=y
-- 
1.7.7.6

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v4 1/4] ethdev: rename rte_eth_vmdq_mirror_conf
  2015-07-07 14:51  4%     ` Thomas Monjalon
@ 2015-07-08  0:58  0%       ` Wu, Jingjing
  2015-07-08 10:31  5%       ` Mcnamara, John
  1 sibling, 0 replies; 200+ results
From: Wu, Jingjing @ 2015-07-08  0:58 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

Thanks for the clarification.

Jingjing

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, July 07, 2015 10:51 PM
> To: Wu, Jingjing
> Cc: dev@dpdk.org; nhorman@tuxdriver.com
> Subject: Re: [dpdk-dev] [PATCH v4 1/4] ethdev: rename
> rte_eth_vmdq_mirror_conf
> 
> 2015-06-26 07:03, Wu, Jingjing:
> > Hi, Neil
> >
> > About this patch I have an ABI concern about it.
> > This patch just renamed a struct rte_eth_vmdq_mirror_conf to
> > rte_eth_mirror_conf, the size and its elements don't change.
> > As my understanding, it will not break the ABI. And I also tested it.
> > But when I use the script ./scripts/validate-abi.sh to check.
> > A low severity problem is reported in symbol "rte_eth_mirror_rule_set"
> >  - Change: "Base type of 2nd parameter mirror_conf has been changed
> > from struct rte_eth_vmdq_mirror_conf to struct rte_eth_mirror_conf."
> >  - Effect: "Replacement of parameter base type may indicate a change
> > in its semantic meaning."
> >
> > So, I'm not sure whether this patch meet the ABI policy?
> 
> I think it's OK.
> 
> > Additional, about the validate-abi.sh, does it mean we need to fix all
> > the problems it reports? Or we can decide case by case.
> > Can a Low Severity problem be acceptable?
> 
> We have to decide case by case.
> It makes ABI checking impossible to automate.
> That's why any help is welcome to check the git HEAD for ABI violation.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2 0/4] extend flow director to support L2_paylod type
  @ 2015-07-07 21:24  0%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-07 21:24 UTC (permalink / raw)
  To: Wu, Jingjing; +Cc: dev

> > This patch set extends flow director to support L2_paylod type in i40e driver.
> > 
> > v2 change:
> >  - remove the flow director VF filtering from this patch to avoid breaking ABI.
> > 
> > Jingjing Wu (4):
> >   ethdev: add struct rte_eth_l2_flow to support l2_payload flow type
> >   i40e: extend flow diretcor to support l2_payload flow type
> >   testpmd: extend commands
> >   doc: extend commands in testpmd
> 
> Acked-by: Helin Zhang <helin.zhang@intel.com>

Applied, thanks

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v3] doc: announce ABI changes planned for unified packet type
  @ 2015-07-07 17:45 21% ` Helin Zhang
  2015-07-08 15:58  4%   ` Zhang, Helin
  2015-07-09  0:56  4%   ` Wu, Jingjing
  0 siblings, 2 replies; 200+ results
From: Helin Zhang @ 2015-07-07 17:45 UTC (permalink / raw)
  To: dev

The significant ABI changes of all shared libraries are planned to
support unified packet type which will be taken effect from
release 2.2. Here announces that ABI changes in detail.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

v2 changes:
* Added 'struct rte_kni_mbuf' to the ABI change announcement.

v3 changes:
* Re-worded the ABI change announcement, as the ABI changes will be
  taken effect from release 2.1 for static libraries, and from
  release 2.2 for shared libraries.

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..9f8a9b3 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,4 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
+* Significant ABI changes are planned for struct rte_mbuf, struct rte_kni_mbuf, and several PKT_RX_ flags will be removed, to support unified packet type from release 2.1. The upcoming release 2.1 will have those changes for static libraries only, while have no change for shared libraries by default. Those changes will be taken effect for shared libraries from release 2.2. There is no backward compatibility planned from release 2.2. All binaries will need to be rebuilt from release 2.2.
-- 
1.9.3

^ permalink raw reply	[relevance 21%]

* Re: [dpdk-dev] [PATCH v4 1/4] ethdev: rename rte_eth_vmdq_mirror_conf
  @ 2015-07-07 14:51  4%     ` Thomas Monjalon
  2015-07-08  0:58  0%       ` Wu, Jingjing
  2015-07-08 10:31  5%       ` Mcnamara, John
  0 siblings, 2 replies; 200+ results
From: Thomas Monjalon @ 2015-07-07 14:51 UTC (permalink / raw)
  To: Wu, Jingjing; +Cc: dev

2015-06-26 07:03, Wu, Jingjing:
> Hi, Neil
> 
> About this patch I have an ABI concern about it. 
> This patch just renamed a struct rte_eth_vmdq_mirror_conf to
> rte_eth_mirror_conf, the size and its elements don't change.
> As my understanding, it will not break the ABI. And I also tested it.
> But when I use the script ./scripts/validate-abi.sh to check.
> A low severity problem is reported in symbol "rte_eth_mirror_rule_set"
>  - Change: "Base type of 2nd parameter mirror_conf has been changed
>  from struct rte_eth_vmdq_mirror_conf to struct rte_eth_mirror_conf."
>  - Effect: "Replacement of parameter base type may indicate a change
>  in its semantic meaning."
> 
> So, I'm not sure whether this patch meet the ABI policy?

I think it's OK.

> Additional, about the validate-abi.sh, does it mean we need to fix
> all the problems it reports? Or we can decide case by case.
> Can a Low Severity problem be acceptable?

We have to decide case by case.
It makes ABI checking impossible to automate.
That's why any help is welcome to check the git HEAD for ABI violation.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v3 0/7] support i40e QinQ stripping and insertion
  @ 2015-07-07 14:43  0%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-07 14:43 UTC (permalink / raw)
  To: Zhang, Helin; +Cc: dev

> > As i40e hardware can be reconfigured to support QinQ stripping and insertion,
> > this patch set is to enable that together with using the reserved 16 bits in
> > 'struct rte_mbuf' for the second vlan tag. Corresponding command is added
> > in testpmd for testing.
> > Note that no need to rework vPMD, as nothings used in it changed.
> > 
> > v2 changes:
> > * Added more commit logs of which commit it fix for.
> > * Fixed a typo.
> > * Kept the original RX/TX offload flags as they were, added new
> >   flags after with new bit masks, for ABI compatibility.
> > * Supported double vlan stripping/insertion in examples/ipv4_multicast.
> > 
> > v3 changes:
> > * update documentation (Testpmd Application User Guide).
> > 
> > Helin Zhang (7):
> >   ixgbe: remove a discarded source line
> >   mbuf: use the reserved 16 bits for double vlan
> >   i40e: support double vlan stripping and insertion
> >   i40evf: add supported offload capability flags
> >   app/testpmd: add test cases for qinq stripping and insertion
> >   examples/ipv4_multicast: support double vlan stripping and insertion
> >   doc: update testpmd command
> 
> Acked-by: Jingjing Wu <jingjing.wu@intel.com>

Applied, thanks

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
  2015-07-07 12:46  7%         ` Thomas Monjalon
  2015-07-07 13:11  4%           ` Neil Horman
@ 2015-07-07 13:44  8%           ` Neil Horman
  2015-07-10 16:07  4%             ` Mcnamara, John
  1 sibling, 1 reply; 200+ results
From: Neil Horman @ 2015-07-07 13:44 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Tue, Jul 07, 2015 at 05:46:08AM -0700, Thomas Monjalon wrote:
> Thanks Neil, we are making good progress.
> 
> 2015-07-07 07:14, Neil Horman:
> > On Mon, Jul 06, 2015 at 11:44:59PM +0200, Thomas Monjalon wrote:
> > > 2015-07-06 14:22, Neil Horman:
> > > > On Mon, Jul 06, 2015 at 03:49:50PM +0200, Thomas Monjalon wrote:
> > > > > 2015-07-06 09:35, Neil Horman:
> > > > > > On Mon, Jul 06, 2015 at 03:18:51PM +0200, Thomas Monjalon wrote:
> > > > > > > Any comment or ack?
> > > > > > > 
> > > > > > > 2015-07-03 00:05, Thomas Monjalon:
> > > > > > > > When a change makes really hard to keep ABI compatibility,
> > > > > > > > instead of waiting next release to break the ABI, it is smoother
> > > > > > > > to introduce the new code and enable it only for static libraries.
> > > > > > > > The flag RTE_NEXT_ABI may be used to "ifdef" the new code.
> > > > > > > > When the release is out, a dynamically linked application can use
> > > > > > > > the new shared libraries without rebuild while developpers can prepare
> > > > > > > > their application for the next ABI by reading the deprecation notice
> > > > > > > > and easily testing the new code.
> > > > > > > > When starting the next release cycle, the "ifdefs" will be removed
> > > > > > > > and the ABI break will be marked by incrementing LIBABIVER.
> > > > > > > > 
> > > > > > > > The new option CONFIG_RTE_NEXT_ABI is not defined in the configuration
> > > > > > > > templates because it is deduced from CONFIG_RTE_BUILD_SHARED_LIB.
> > > > > > > > It is automatically enabled for static libraries and disabled for
> > > > > > > > shared libraries.
> > > > > > > > It can be forced to another value by editing the generated .config file.
> > > > > > > > It shouldn't be enabled for shared libraries because it would break the
> > > > > > > > ABI without changing the version number LIBABIVER. That's why a warning
> > > > > > > > is printed in this case.
> > > > > > > > 
> > > > > > > > The guideline is also updated to integrate this new possibility.
> > > [...]
> > > > I'd be ok with it iff:
> > > > 
> > > > 1) It applies to static and shared ABI's together.  That is to say that setting
> > > > the NEXT_ABI config flag creates the same ABI changes regardless of other build
> > > > configuration.  It needs to be used in such a way that a consistent ABI is
> > > > presented when set, otherwise it won't be useful.
> > > 
> > > Yes the option trigger exactly the same ABI for static and shared libraries.
> > > But it's too complicated (at least for 2.1) to make LIBABIVER and version map
> > > dependant of this build-time option.
> > 
> > No, I think thats a bridge too far.  I'm not sure whats difficult about
> > overriding LIBABIVER in lib.rte.mk and bump all numbers 1 higher (or better
> > still just add a .1 to the end of it), by checking CONFIG_NEXT_ABI
> 
> Good idea, I will submit a v2 which adds .1.
> 
> > As for maintaining the version map, I don't see any problem with duplicating the
> > map files, to a -next variant, and changing the CPU_LDFLAGS in rte.lib.mk based
> > on the NEXT_ABI config option again.
> 
> OK
> 
> > In fact, if this is a thing that people want, that might be beneficial, as
> > something else occurs to me.  I think you're going to want this to be a mandated
> > piece of the update process.  That is to say, if someone wants to deprecate an
> > aspect of the ABI, or change it, I think you'll want to mandate that they submit
> > the change at the same time they submit the deprecation notice, and simply
> > protect it with this NEXT_ABI config option.  That provides several advantages:
> 
> For the release 2.1, we have some deprecation notices without code. It was
> the policy agreed in 2.0 release.
> Maybe we can force code to be submitted with deprecation notices, starting
> with release 2.2.
> It needs to be amended in v2 of this patch.
> 
> > 1) It ensures that the notice is submitted at the same time as the actual change
> > 2) It ensures that the NEXT_ABI provides a complete view of what the next ABI
> > version looks like, not just a partial view of it
> 
> Yes it would be probably useful.
> 
> > Adding a *-version-next.map file for each library makes adhering to the above
> > easier, and allows for an easy converstion, in that when its time to officially
> > update the ABI, fixing the version map is a matter of copying
> > <library>-version-next.map to <library>-version.map.
> 
> OK
> 
> [...]
> > > That's why, it should not be enabled to deploy shared libraries, though it can
> > > be used for tests and development.
> > > As static libraries are almost never packaged, they will be built and linked
> > > at the same time. That's why users of static libraries tend to prefer the
> > > newest ABI, which is the default in this case.
> > > 
> > > > 2) It only applies to the next ABI.  That is to say, it can't be a hodgepodge of
> > > > the next ABI and the one after that, and the one after that, or it won't provide
> > > > an appropriate preview for anyone.
> > > 
> > > If you mean the next ABI must be promoted as the standard ABI in the next release,
> > > yes: ifdefs will be cleaned when starting a new release.
> > > Thanks, I learnt the english word hodgepodge :)
> > > 
> > Je-mexcuse, une meli-melo? :)
> 
> Oui un meli-melo, un ramassis. un beau bordel en somme.
> 
> > I mean't what you indicate yes, and in addition to that, I just wanted to
> > clarify that this option could strictly _only_ apply to the very next ABI.  That
> > is to say, someone can't use this without also posting an ABI deprecation
> > notice, or we would find ourselves in a situation where something would only be
> > available in NEXT_ABI for more than one release, which would be unacceptable.
> > But I think we're saying the same thing.
> 
> Yes. I'll try to make it clear in v2.
> 
> Neil, in the meantime, could you please help to check ABI breakage in the HEAD?
> 
Took a look, the only ABI break I see that we need to worry about is the one
introduced in commit 8eecb3295aed0a979def52245564d03be172a83c. It adds a
bitfield called lro into the existing uint8_t there, but does so in the middle
of the set, which pushes the other bits around, breaking ABI.  It should have
been added to the end.

Neil

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
  2015-07-07 12:46  7%         ` Thomas Monjalon
@ 2015-07-07 13:11  4%           ` Neil Horman
  2015-07-07 13:44  8%           ` Neil Horman
  1 sibling, 0 replies; 200+ results
From: Neil Horman @ 2015-07-07 13:11 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Tue, Jul 07, 2015 at 05:46:08AM -0700, Thomas Monjalon wrote:
> Thanks Neil, we are making good progress.
> 
> 2015-07-07 07:14, Neil Horman:
> > On Mon, Jul 06, 2015 at 11:44:59PM +0200, Thomas Monjalon wrote:
> > > 2015-07-06 14:22, Neil Horman:
> > > > On Mon, Jul 06, 2015 at 03:49:50PM +0200, Thomas Monjalon wrote:
> > > > > 2015-07-06 09:35, Neil Horman:
> > > > > > On Mon, Jul 06, 2015 at 03:18:51PM +0200, Thomas Monjalon wrote:
> > > > > > > Any comment or ack?
> > > > > > > 
> > > > > > > 2015-07-03 00:05, Thomas Monjalon:
> > > > > > > > When a change makes really hard to keep ABI compatibility,
> > > > > > > > instead of waiting next release to break the ABI, it is smoother
> > > > > > > > to introduce the new code and enable it only for static libraries.
> > > > > > > > The flag RTE_NEXT_ABI may be used to "ifdef" the new code.
> > > > > > > > When the release is out, a dynamically linked application can use
> > > > > > > > the new shared libraries without rebuild while developpers can prepare
> > > > > > > > their application for the next ABI by reading the deprecation notice
> > > > > > > > and easily testing the new code.
> > > > > > > > When starting the next release cycle, the "ifdefs" will be removed
> > > > > > > > and the ABI break will be marked by incrementing LIBABIVER.
> > > > > > > > 
> > > > > > > > The new option CONFIG_RTE_NEXT_ABI is not defined in the configuration
> > > > > > > > templates because it is deduced from CONFIG_RTE_BUILD_SHARED_LIB.
> > > > > > > > It is automatically enabled for static libraries and disabled for
> > > > > > > > shared libraries.
> > > > > > > > It can be forced to another value by editing the generated .config file.
> > > > > > > > It shouldn't be enabled for shared libraries because it would break the
> > > > > > > > ABI without changing the version number LIBABIVER. That's why a warning
> > > > > > > > is printed in this case.
> > > > > > > > 
> > > > > > > > The guideline is also updated to integrate this new possibility.
> > > [...]
> > > > I'd be ok with it iff:
> > > > 
> > > > 1) It applies to static and shared ABI's together.  That is to say that setting
> > > > the NEXT_ABI config flag creates the same ABI changes regardless of other build
> > > > configuration.  It needs to be used in such a way that a consistent ABI is
> > > > presented when set, otherwise it won't be useful.
> > > 
> > > Yes the option trigger exactly the same ABI for static and shared libraries.
> > > But it's too complicated (at least for 2.1) to make LIBABIVER and version map
> > > dependant of this build-time option.
> > 
> > No, I think thats a bridge too far.  I'm not sure whats difficult about
> > overriding LIBABIVER in lib.rte.mk and bump all numbers 1 higher (or better
> > still just add a .1 to the end of it), by checking CONFIG_NEXT_ABI
> 
> Good idea, I will submit a v2 which adds .1.
> 
> > As for maintaining the version map, I don't see any problem with duplicating the
> > map files, to a -next variant, and changing the CPU_LDFLAGS in rte.lib.mk based
> > on the NEXT_ABI config option again.
> 
> OK
> 
> > In fact, if this is a thing that people want, that might be beneficial, as
> > something else occurs to me.  I think you're going to want this to be a mandated
> > piece of the update process.  That is to say, if someone wants to deprecate an
> > aspect of the ABI, or change it, I think you'll want to mandate that they submit
> > the change at the same time they submit the deprecation notice, and simply
> > protect it with this NEXT_ABI config option.  That provides several advantages:
> 
> For the release 2.1, we have some deprecation notices without code. It was
> the policy agreed in 2.0 release.
Right, we're stuck with that for the next release, but this idea of yours I
think will help make sure we get both together in the future.

> Maybe we can force code to be submitted with deprecation notices, starting
> with release 2.2.
> It needs to be amended in v2 of this patch.
Yes, I think that makes sense

> 
> > 1) It ensures that the notice is submitted at the same time as the actual change
> > 2) It ensures that the NEXT_ABI provides a complete view of what the next ABI
> > version looks like, not just a partial view of it
> 
> Yes it would be probably useful.
> 
> > Adding a *-version-next.map file for each library makes adhering to the above
> > easier, and allows for an easy converstion, in that when its time to officially
> > update the ABI, fixing the version map is a matter of copying
> > <library>-version-next.map to <library>-version.map.
> 
> OK
> 
> [...]
> > > That's why, it should not be enabled to deploy shared libraries, though it can
> > > be used for tests and development.
> > > As static libraries are almost never packaged, they will be built and linked
> > > at the same time. That's why users of static libraries tend to prefer the
> > > newest ABI, which is the default in this case.
> > > 
> > > > 2) It only applies to the next ABI.  That is to say, it can't be a hodgepodge of
> > > > the next ABI and the one after that, and the one after that, or it won't provide
> > > > an appropriate preview for anyone.
> > > 
> > > If you mean the next ABI must be promoted as the standard ABI in the next release,
> > > yes: ifdefs will be cleaned when starting a new release.
> > > Thanks, I learnt the english word hodgepodge :)
> > > 
> > Je-mexcuse, une meli-melo? :)
> 
> Oui un meli-melo, un ramassis. un beau bordel en somme.
> 
exactement! :)

> > I mean't what you indicate yes, and in addition to that, I just wanted to
> > clarify that this option could strictly _only_ apply to the very next ABI.  That
> > is to say, someone can't use this without also posting an ABI deprecation
> > notice, or we would find ourselves in a situation where something would only be
> > available in NEXT_ABI for more than one release, which would be unacceptable.
> > But I think we're saying the same thing.
> 
> Yes. I'll try to make it clear in v2.
> 
> Neil, in the meantime, could you please help to check ABI breakage in the HEAD?
I'll try to take a look today
Neil

> 

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
  2015-07-07 11:14  9%       ` Neil Horman
@ 2015-07-07 12:46  7%         ` Thomas Monjalon
  2015-07-07 13:11  4%           ` Neil Horman
  2015-07-07 13:44  8%           ` Neil Horman
  0 siblings, 2 replies; 200+ results
From: Thomas Monjalon @ 2015-07-07 12:46 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

Thanks Neil, we are making good progress.

2015-07-07 07:14, Neil Horman:
> On Mon, Jul 06, 2015 at 11:44:59PM +0200, Thomas Monjalon wrote:
> > 2015-07-06 14:22, Neil Horman:
> > > On Mon, Jul 06, 2015 at 03:49:50PM +0200, Thomas Monjalon wrote:
> > > > 2015-07-06 09:35, Neil Horman:
> > > > > On Mon, Jul 06, 2015 at 03:18:51PM +0200, Thomas Monjalon wrote:
> > > > > > Any comment or ack?
> > > > > > 
> > > > > > 2015-07-03 00:05, Thomas Monjalon:
> > > > > > > When a change makes really hard to keep ABI compatibility,
> > > > > > > instead of waiting next release to break the ABI, it is smoother
> > > > > > > to introduce the new code and enable it only for static libraries.
> > > > > > > The flag RTE_NEXT_ABI may be used to "ifdef" the new code.
> > > > > > > When the release is out, a dynamically linked application can use
> > > > > > > the new shared libraries without rebuild while developpers can prepare
> > > > > > > their application for the next ABI by reading the deprecation notice
> > > > > > > and easily testing the new code.
> > > > > > > When starting the next release cycle, the "ifdefs" will be removed
> > > > > > > and the ABI break will be marked by incrementing LIBABIVER.
> > > > > > > 
> > > > > > > The new option CONFIG_RTE_NEXT_ABI is not defined in the configuration
> > > > > > > templates because it is deduced from CONFIG_RTE_BUILD_SHARED_LIB.
> > > > > > > It is automatically enabled for static libraries and disabled for
> > > > > > > shared libraries.
> > > > > > > It can be forced to another value by editing the generated .config file.
> > > > > > > It shouldn't be enabled for shared libraries because it would break the
> > > > > > > ABI without changing the version number LIBABIVER. That's why a warning
> > > > > > > is printed in this case.
> > > > > > > 
> > > > > > > The guideline is also updated to integrate this new possibility.
> > [...]
> > > I'd be ok with it iff:
> > > 
> > > 1) It applies to static and shared ABI's together.  That is to say that setting
> > > the NEXT_ABI config flag creates the same ABI changes regardless of other build
> > > configuration.  It needs to be used in such a way that a consistent ABI is
> > > presented when set, otherwise it won't be useful.
> > 
> > Yes the option trigger exactly the same ABI for static and shared libraries.
> > But it's too complicated (at least for 2.1) to make LIBABIVER and version map
> > dependant of this build-time option.
> 
> No, I think thats a bridge too far.  I'm not sure whats difficult about
> overriding LIBABIVER in lib.rte.mk and bump all numbers 1 higher (or better
> still just add a .1 to the end of it), by checking CONFIG_NEXT_ABI

Good idea, I will submit a v2 which adds .1.

> As for maintaining the version map, I don't see any problem with duplicating the
> map files, to a -next variant, and changing the CPU_LDFLAGS in rte.lib.mk based
> on the NEXT_ABI config option again.

OK

> In fact, if this is a thing that people want, that might be beneficial, as
> something else occurs to me.  I think you're going to want this to be a mandated
> piece of the update process.  That is to say, if someone wants to deprecate an
> aspect of the ABI, or change it, I think you'll want to mandate that they submit
> the change at the same time they submit the deprecation notice, and simply
> protect it with this NEXT_ABI config option.  That provides several advantages:

For the release 2.1, we have some deprecation notices without code. It was
the policy agreed in 2.0 release.
Maybe we can force code to be submitted with deprecation notices, starting
with release 2.2.
It needs to be amended in v2 of this patch.

> 1) It ensures that the notice is submitted at the same time as the actual change
> 2) It ensures that the NEXT_ABI provides a complete view of what the next ABI
> version looks like, not just a partial view of it

Yes it would be probably useful.

> Adding a *-version-next.map file for each library makes adhering to the above
> easier, and allows for an easy converstion, in that when its time to officially
> update the ABI, fixing the version map is a matter of copying
> <library>-version-next.map to <library>-version.map.

OK

[...]
> > That's why, it should not be enabled to deploy shared libraries, though it can
> > be used for tests and development.
> > As static libraries are almost never packaged, they will be built and linked
> > at the same time. That's why users of static libraries tend to prefer the
> > newest ABI, which is the default in this case.
> > 
> > > 2) It only applies to the next ABI.  That is to say, it can't be a hodgepodge of
> > > the next ABI and the one after that, and the one after that, or it won't provide
> > > an appropriate preview for anyone.
> > 
> > If you mean the next ABI must be promoted as the standard ABI in the next release,
> > yes: ifdefs will be cleaned when starting a new release.
> > Thanks, I learnt the english word hodgepodge :)
> > 
> Je-mexcuse, une meli-melo? :)

Oui un meli-melo, un ramassis. un beau bordel en somme.

> I mean't what you indicate yes, and in addition to that, I just wanted to
> clarify that this option could strictly _only_ apply to the very next ABI.  That
> is to say, someone can't use this without also posting an ABI deprecation
> notice, or we would find ourselves in a situation where something would only be
> available in NEXT_ABI for more than one release, which would be unacceptable.
> But I think we're saying the same thing.

Yes. I'll try to make it clear in v2.

Neil, in the meantime, could you please help to check ABI breakage in the HEAD?

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
  @ 2015-07-07 11:14  9%       ` Neil Horman
  2015-07-07 12:46  7%         ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2015-07-07 11:14 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, Jul 06, 2015 at 11:44:59PM +0200, Thomas Monjalon wrote:
> 2015-07-06 14:22, Neil Horman:
> > On Mon, Jul 06, 2015 at 03:49:50PM +0200, Thomas Monjalon wrote:
> > > 2015-07-06 09:35, Neil Horman:
> > > > On Mon, Jul 06, 2015 at 03:18:51PM +0200, Thomas Monjalon wrote:
> > > > > Any comment or ack?
> > > > > 
> > > > > 2015-07-03 00:05, Thomas Monjalon:
> > > > > > When a change makes really hard to keep ABI compatibility,
> > > > > > instead of waiting next release to break the ABI, it is smoother
> > > > > > to introduce the new code and enable it only for static libraries.
> > > > > > The flag RTE_NEXT_ABI may be used to "ifdef" the new code.
> > > > > > When the release is out, a dynamically linked application can use
> > > > > > the new shared libraries without rebuild while developpers can prepare
> > > > > > their application for the next ABI by reading the deprecation notice
> > > > > > and easily testing the new code.
> > > > > > When starting the next release cycle, the "ifdefs" will be removed
> > > > > > and the ABI break will be marked by incrementing LIBABIVER.
> > > > > > 
> > > > > > The new option CONFIG_RTE_NEXT_ABI is not defined in the configuration
> > > > > > templates because it is deduced from CONFIG_RTE_BUILD_SHARED_LIB.
> > > > > > It is automatically enabled for static libraries and disabled for
> > > > > > shared libraries.
> > > > > > It can be forced to another value by editing the generated .config file.
> > > > > > It shouldn't be enabled for shared libraries because it would break the
> > > > > > ABI without changing the version number LIBABIVER. That's why a warning
> > > > > > is printed in this case.
> > > > > > 
> > > > > > The guideline is also updated to integrate this new possibility.
> [...]
> > I'd be ok with it iff:
> > 
> > 1) It applies to static and shared ABI's together.  That is to say that setting
> > the NEXT_ABI config flag creates the same ABI changes regardless of other build
> > configuration.  It needs to be used in such a way that a consistent ABI is
> > presented when set, otherwise it won't be useful.
> 
> Yes the option trigger exactly the same ABI for static and shared libraries.
> But it's too complicated (at least for 2.1) to make LIBABIVER and version map
> dependant of this build-time option.

No, I think thats a bridge too far.  I'm not sure whats difficult about
overriding LIBABIVER in lib.rte.mk and bump all numbers 1 higher (or better
still just add a .1 to the end of it), by checking CONFIG_NEXT_ABI

As for maintaining the version map, I don't see any problem with duplicating the
map files, to a -next variant, and changing the CPU_LDFLAGS in rte.lib.mk based
on the NEXT_ABI config option again.

In fact, if this is a thing that people want, that might be beneficial, as
something else occurs to me.  I think you're going to want this to be a mandated
piece of the update process.  That is to say, if someone wants to deprecate an
aspect of the ABI, or change it, I think you'll want to mandate that they submit
the change at the same time they submit the deprecation notice, and simply
protect it with this NEXT_ABI config option.  That provides several advantages:

1) It ensures that the notice is submitted at the same time as the actual change
2) It ensures that the NEXT_ABI provides a complete view of what the next ABI
version looks like, not just a partial view of it

Adding a *-version-next.map file for each library makes adhering to the above
easier, and allows for an easy converstion, in that when its time to officially
update the ABI, fixing the version map is a matter of copying
<library>-version-next.map to <library>-version.map.

The use case that I'm thinking of here is as such:

Consider two ABI modifying updates, A and B.

The author of A writes his changes, submits them with appropriate ifdefs for
CONFIG_NEXT_ABI, along with a deprecation notice.

The author of B writes his changes, but doesn't submit them, instead submitting
only a deprecation notice, with plans to post the actuall patches after the
deprecation notice is shipped in a release

After the release CONFIG_NEXT_ABI exposes the ABI changes made by A, but not by
B (because they don't yet exist in the code).  I think to give users a complete
view of the NEXT_ABI, changes to the ABI, should be done by submitting the patch
(gated on the NEXT_ABI config option), along with the deprecation notice, at the
same time.  That way the ABI view is complete and consistent.  And if you do the
above bits with the cloned version map and LIBABIVER bump, its also consistent
between shared and static libraries.

> That's why, it should not be enabled to deploy shared libraries, though it can
> be used for tests and development.
> As static libraries are almost never packaged, they will be built and linked
> at the same time. That's why users of static libraries tend to prefer the
> newest ABI, which is the default in this case.
> 
> > 2) It only applies to the next ABI.  That is to say, it can't be a hodgepodge of
> > the next ABI and the one after that, and the one after that, or it won't provide
> > an appropriate preview for anyone.
> 
> If you mean the next ABI must be promoted as the standard ABI in the next release,
> yes: ifdefs will be cleaned when starting a new release.
> Thanks, I learnt the english word hodgepodge :)
> 
Je-mexcuse, une meli-melo? :)

I mean't what you indicate yes, and in addition to that, I just wanted to
clarify that this option could strictly _only_ apply to the very next ABI.  That
is to say, someone can't use this without also posting an ABI deprecation
notice, or we would find ourselves in a situation where something would only be
available in NEXT_ABI for more than one release, which would be unacceptable.
But I think we're saying the same thing.

> 

^ permalink raw reply	[relevance 9%]

* Re: [dpdk-dev] [PATCH 0/3] fix the issue sctp flow cannot be matched in FVL FDIR
  2015-07-07  7:58  3% [dpdk-dev] [PATCH 0/3] fix the issue sctp flow cannot be matched in FVL FDIR Jingjing Wu
@ 2015-07-07  9:04  0% ` Liu, Yong
  2015-07-19 22:54  3%   ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Liu, Yong @ 2015-07-07  9:04 UTC (permalink / raw)
  To: Wu, Jingjing, dev

Hi,
> -----Original Message-----
> From: Wu, Jingjing
> Sent: Tuesday, July 07, 2015 3:58 PM
> To: dev@dpdk.org
> Cc: Wu, Jingjing; Liu, Yong; Xu, HuilongX
> Subject: [PATCH 0/3] fix the issue sctp flow cannot be matched in FVL FDIR
> 
> This patch set fixes the issue SCTP flow cannot be matched by FVL's flow
> director. The issue's root cause is that due to the NIC's firmware update,
> the input set of sctp flow are changed to source IP, destination IP,
> source port, destination port and Verification-Tag, which are source IP,
> destination IP and Verification-Tag previously.
> And because this fix will affect the struct rte_eth_fdir_flow, use
> RTE_NEXT_ABI to avoid ABI breaking.
> 
> Jingjing Wu (3):
>   ethdev: change the input set of sctp flow
>   i40e: make sport and dport of sctp flow involved in match
>   testpmd: add sport and dport configuration for sctp flow
> 
>  app/test-pmd/cmdline.c          | 12 ++++++++++++
>  drivers/net/i40e/i40e_fdir.c    | 18 ++++++++++++++++++
>  lib/librte_ether/rte_eth_ctrl.h |  8 ++++++++
>  3 files changed, 38 insertions(+)
> 
> --
> 1.9.3

Tested-by: Marvin Liu <yong.liu@intel.com>

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH 0/3] fix the issue sctp flow cannot be matched in FVL FDIR
@ 2015-07-07  7:58  3% Jingjing Wu
  2015-07-07  9:04  0% ` Liu, Yong
  0 siblings, 1 reply; 200+ results
From: Jingjing Wu @ 2015-07-07  7:58 UTC (permalink / raw)
  To: dev

This patch set fixes the issue SCTP flow cannot be matched by FVL's flow
director. The issue's root cause is that due to the NIC's firmware update,
the input set of sctp flow are changed to source IP, destination IP,
source port, destination port and Verification-Tag, which are source IP,
destination IP and Verification-Tag previously.
And because this fix will affect the struct rte_eth_fdir_flow, use
RTE_NEXT_ABI to avoid ABI breaking.

Jingjing Wu (3):
  ethdev: change the input set of sctp flow
  i40e: make sport and dport of sctp flow involved in match
  testpmd: add sport and dport configuration for sctp flow

 app/test-pmd/cmdline.c          | 12 ++++++++++++
 drivers/net/i40e/i40e_fdir.c    | 18 ++++++++++++++++++
 lib/librte_ether/rte_eth_ctrl.h |  8 ++++++++
 3 files changed, 38 insertions(+)

-- 
1.9.3

^ permalink raw reply	[relevance 3%]

Results 12801-13000 of ~18000   |  | reverse | sort options + mbox downloads above
-- links below jump to the message on this page --
2015-04-21 17:32     [dpdk-dev] [PATCH v4 0/7] Hyper-V Poll Mode driver Stephen Hemminger
2015-04-21 17:32     ` [dpdk-dev] [PATCH v4 3/7] hv: add basic vmbus support Stephen Hemminger
2015-07-08 23:51  0%   ` Thomas Monjalon
2015-05-11  3:46     [dpdk-dev] [PATCH 0/6] extend flow director to support L2_paylod type and VF filtering in i40e driver Jingjing Wu
2015-06-16  3:43     ` [dpdk-dev] [PATCH v2 0/4] extend flow director to support L2_paylod type Jingjing Wu
2015-06-26  3:14       ` Zhang, Helin
2015-07-07 21:24  0%     ` Thomas Monjalon
2015-06-02  3:16     [dpdk-dev] [PATCH v2 0/6] support i40e QinQ stripping and insertion Helin Zhang
2015-06-11  7:03     ` [dpdk-dev] [PATCH v3 0/7] " Helin Zhang
2015-06-11  7:25       ` Wu, Jingjing
2015-07-07 14:43  0%     ` Thomas Monjalon
2015-06-04  7:33     [dpdk-dev] [PATCH v2 0/6] query hash key size in byte Helin Zhang
2015-06-12  7:33     ` [dpdk-dev] [PATCH v3 " Helin Zhang
2015-06-12  9:31       ` Ananyev, Konstantin
2015-07-08 23:17  3%     ` Thomas Monjalon
2015-06-05  8:16     [dpdk-dev] [PATCH v3 0/4] enable mirror functionality in i40e driver Jingjing Wu
2015-06-10  6:24     ` [dpdk-dev] [PATCH v4 1/4] ethdev: rename rte_eth_vmdq_mirror_conf Jingjing Wu
2015-06-26  7:03       ` Wu, Jingjing
2015-07-07 14:51  4%     ` Thomas Monjalon
2015-07-08  0:58  0%       ` Wu, Jingjing
2015-07-08 10:31  5%       ` Mcnamara, John
2015-07-08 10:35  0%         ` Bruce Richardson
2015-06-08  5:28     [dpdk-dev] [PATCH v12 00/14] Interrupt mode PMD Cunming Liang
2015-06-19  4:00     ` [dpdk-dev] [PATCH v13 " Cunming Liang
2015-07-09 13:58  3%   ` David Marchand
2015-07-17  6:04  0%     ` Liang, Cunming
2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
2015-07-19 23:31  0%       ` Thomas Monjalon
2015-07-20  2:02  0%         ` Liang, Cunming
2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 06/13] eal/linux: standalone intr event fd create support Cunming Liang
2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 08/13] eal/bsd: dummy for new intr definition Cunming Liang
2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
2015-07-17  6:16  1%     ` [dpdk-dev] [PATCH v14 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 12/13] igb: enable rx queue interrupts for PF Cunming Liang
2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 06/13] eal/linux: standalone intr event fd create support Cunming Liang
2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 08/13] eal/bsd: dummy for new intr definition Cunming Liang
2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
2015-07-20  3:02  1%       ` [dpdk-dev] [PATCH v15 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 12/13] igb: enable rx queue interrupts for PF Cunming Liang
2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
2015-07-23 14:18  3%       ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Liang, Cunming
2015-06-25 22:05     [dpdk-dev] [PATCH v2 00/11] Cuckoo hash Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 " Pablo de Lara
2015-07-08 23:23  0%   ` Thomas Monjalon
2015-07-09  8:02  0%     ` Bruce Richardson
2015-07-10 17:24  4%   ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of " Pablo de Lara
2015-07-10 17:24 14%     ` [dpdk-dev] [PATCH v4 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-10 20:52  0%     ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash Bruce Richardson
2015-07-10 21:57  4%     ` [dpdk-dev] [PATCH v5 " Pablo de Lara
2015-07-10 21:57 14%       ` [dpdk-dev] [PATCH v5 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-10 23:30  4%       ` [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
2015-07-10 23:30 14%         ` [dpdk-dev] [PATCH v6 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-11  0:18  4%         ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
2015-07-11  0:18               ` [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation Pablo de Lara
2015-07-12 22:29  3%             ` Thomas Monjalon
2015-07-13 16:11  3%               ` Bruce Richardson
2015-07-13 16:14  0%                 ` Bruce Richardson
2015-07-13 16:20  0%                   ` Thomas Monjalon
2015-07-13 16:26  0%                     ` Bruce Richardson
2015-07-11  0:18 14%           ` [dpdk-dev] [PATCH v7 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-12 22:38  8%             ` Thomas Monjalon
2015-07-12 22:46  0%           ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Thomas Monjalon
2015-06-29 13:42     [dpdk-dev] [PATCH v2 0/7] ethdev: add support for ieee1588 timestamping John McNamara
2015-07-02 15:16     ` [dpdk-dev] [PATCH v3 " John McNamara
2015-07-02 15:16       ` [dpdk-dev] [PATCH v3 7/7] abi: announce mbuf addition for ieee1588 in DPDK 2.2 John McNamara
2015-07-06 13:16         ` Thomas Monjalon
2015-07-08 13:10  7%       ` Bruce Richardson
2015-07-09 15:51  7%         ` Thomas Monjalon
2015-07-09 16:01  4%           ` Bruce Richardson
2015-07-08  1:59  0%   ` [dpdk-dev] [PATCH v3 0/7] ethdev: add support for ieee1588 timestamping Fu, JingguoX
2015-07-02 13:50     [dpdk-dev] [PATCH 0/3] doc: added guidelines on dpdk documentation John McNamara
2015-07-10 15:45     ` [dpdk-dev] [PATCH v2 " John McNamara
2015-07-10 15:45  3%   ` [dpdk-dev] [PATCH v2 2/3] " John McNamara
2015-07-02 22:05     [dpdk-dev] [PATCH] mk: enable next abi in static libs Thomas Monjalon
2015-07-06 13:49     ` Thomas Monjalon
2015-07-06 18:22       ` Neil Horman
2015-07-06 21:44         ` Thomas Monjalon
2015-07-07 11:14  9%       ` Neil Horman
2015-07-07 12:46  7%         ` Thomas Monjalon
2015-07-07 13:11  4%           ` Neil Horman
2015-07-07 13:44  8%           ` Neil Horman
2015-07-10 16:07  4%             ` Mcnamara, John
2015-07-11 14:19  7%               ` Neil Horman
2015-07-13 10:14  8%                 ` Mcnamara, John
2015-07-08 14:55  8% ` [dpdk-dev] [PATCH v2 0/2] next abi option Thomas Monjalon
2015-07-08 14:55  4%   ` [dpdk-dev] [PATCH v2 1/2] mk: remove variables identical to config ones Thomas Monjalon
2015-07-08 14:55 34%   ` [dpdk-dev] [PATCH v2 2/2] mk: enable next abi preview Thomas Monjalon
2015-07-08 16:44 33%     ` [dpdk-dev] [PATCH v3] " Thomas Monjalon
2015-07-13  7:32  7%       ` Mcnamara, John
2015-07-13  8:48  7%         ` Thomas Monjalon
2015-07-13  9:02  8%           ` [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi Thomas Monjalon
2015-07-13  9:24  4%             ` Mcnamara, John
2015-07-13  9:32  7%               ` Thomas Monjalon
2015-07-08 16:50  4%   ` [dpdk-dev] [PATCH v2 0/2] next abi option Neil Horman
2015-07-08 22:58  4%     ` Thomas Monjalon
2015-07-03  8:32     [dpdk-dev] [PATCH v9 00/19] unified packet type Helin Zhang
2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf Helin Zhang
2015-07-13 15:53  0%     ` Thomas Monjalon
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types Helin Zhang
2015-07-15 10:19  0%     ` Olivier MATZ
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 03/19] e1000: replace bit mask based packet type with unified packet type Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 04/19] ixgbe: " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 05/19] i40e: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 06/19] enic: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 07/19] vmxnet3: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 08/19] fm10k: " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 09/19] cxgbe: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 10/19] app/test-pipeline: " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 11/19] app/testpmd: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 12/19] app/test: Remove useless code Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 13/19] examples/ip_fragmentation: replace bit mask based packet type with unified packet type Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 14/19] examples/ip_reassembly: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 15/19] examples/l3fwd-acl: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 16/19] examples/l3fwd-power: " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 17/19] examples/l3fwd: " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 18/19] examples/tep_termination: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 19/19] mbuf: remove old packet type bit masks Helin Zhang
2015-07-15 23:00  0%   ` [dpdk-dev] [PATCH v10 00/19] unified packet type Thomas Monjalon
2015-07-15 23:51  0%     ` Zhang, Helin
2015-07-03  8:49     [dpdk-dev] [PATCH v2] doc: announce ABI changes planned for " Helin Zhang
2015-07-07 17:45 21% ` [dpdk-dev] [PATCH v3] " Helin Zhang
2015-07-08 15:58  4%   ` Zhang, Helin
2015-07-09  0:56  4%   ` Wu, Jingjing
2015-07-15 23:37  4%     ` Thomas Monjalon
2015-07-03  9:55     [dpdk-dev] [PATCH v7 0/9] Dynamic memzones Sergio Gonzalez Monroy
2015-07-14  8:57  4% ` [dpdk-dev] [PATCH v8 " Sergio Gonzalez Monroy
2015-07-14  8:57  1%   ` [dpdk-dev] [PATCH v8 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
2015-07-14  8:57 19%   ` [dpdk-dev] [PATCH v8 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
2015-07-15  8:26  1%     ` [dpdk-dev] [PATCH v9 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
2015-07-15  8:26 19%     ` [dpdk-dev] [PATCH v9 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
2015-07-15 16:32  3%     ` [dpdk-dev] [PATCH v10 0/9] Dynamic memzones Sergio Gonzalez Monroy
2015-07-15 16:32  1%       ` [dpdk-dev] [PATCH v10 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
2015-07-15 16:32 19%       ` [dpdk-dev] [PATCH v10 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
2015-07-07  7:58  3% [dpdk-dev] [PATCH 0/3] fix the issue sctp flow cannot be matched in FVL FDIR Jingjing Wu
2015-07-07  9:04  0% ` Liu, Yong
2015-07-19 22:54  3%   ` Thomas Monjalon
2015-07-08  0:08     [dpdk-dev] [PATCH v3 0/4] bnx2x: poll mode driver Stephen Hemminger
2015-07-08  0:08     ` [dpdk-dev] [PATCH 1/4] eal: provide functions to access PCI config Stephen Hemminger
     [not found]       ` <d35efa226d95495e804181b2246063d2@HQ1WP-EXMB11.corp.brocade.com>
2015-07-08 16:11  3%     ` Stephen Hemminger
2015-07-08 16:34  0%       ` Thomas Monjalon
2015-07-08  1:24  3% [dpdk-dev] [PATCH] config: revert the CONFIG_RTE_MAX_QUEUES_PER_PORT to 256 Jijiang Liu
2015-07-10 21:58  0% ` Thomas Monjalon
2015-07-08  2:08 20% [dpdk-dev] [PATCH] doc:announce ABI changes planned for struct rte_eth_dev to support up to 1024 queues per port Jijiang Liu
2015-07-08 11:07  4% ` Neil Horman
2015-07-10 22:14  4%   ` Thomas Monjalon
2015-07-08 11:27     [dpdk-dev] [PATCH] Make rte_hash struct internal - Cuckoo hash part 1 Pablo de Lara
2015-07-08 11:27     ` [dpdk-dev] [PATCH] hash: move rte_hash structure to C file and make it internal Pablo de Lara
2015-07-08 13:21  3%   ` Bruce Richardson
2015-07-08 16:57  0%     ` Matthew Hall
2015-07-09  8:12  3%       ` Bruce Richardson
2015-07-09 20:42  3%         ` Matthew Hall
2015-07-10 10:27  0%     ` Thomas Monjalon
2015-07-09  2:47 21% [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks Wenzhuo Lu
2015-07-09  7:32  4% ` Thomas Monjalon
2015-07-09  8:39  4%   ` Lu, Wenzhuo
2015-07-10  2:24 21% ` [dpdk-dev] [PATCH v2] doc: announce ABI change of rte_eth_fdir_filter, rte_eth_fdir_masks Wenzhuo Lu
2015-07-09  4:58     [dpdk-dev] [PATCH v4 00/11] Introducing the TILE-Gx platform Zhigang Lu
2015-07-09  4:58  5% ` [dpdk-dev] [PATCH v4 04/11] config: remove RTE_LIBNAME definition Zhigang Lu
2015-07-09  8:25     [dpdk-dev] [PATCH v5 00/11] Introducing the TILE-Gx platform Zhigang Lu
2015-07-09  8:25  5% ` [dpdk-dev] [PATCH v5 04/11] config: remove RTE_LIBNAME definition Zhigang Lu
2015-07-09 13:30  3% [dpdk-dev] [PATCH v4 0/7] ethdev: add support for ieee1588 timestamping John McNamara
2015-07-10  0:43  0% ` Thomas Monjalon
2015-07-13 10:26  7% [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code John McNamara
2015-07-13 10:42  7% ` Neil Horman
2015-07-13 10:46  4%   ` Thomas Monjalon
2015-07-13 10:47  4%   ` Mcnamara, John
2015-07-13 13:59  4%     ` Neil Horman
2015-07-17 11:45  7%       ` Mcnamara, John
2015-07-17 12:25  4%         ` Neil Horman
2015-07-16 22:22  4% ` Vlad Zolotarov
2015-07-13 14:17  3% [dpdk-dev] [PATCH v5 0/9] Expose IXGBE extended stats to DPDK apps Maryam Tahhan
2015-07-13 14:17  9% ` [dpdk-dev] [PATCH v5 4/9] ethdev: remove HW specific stats in stats structs Maryam Tahhan
2015-07-13 16:25  3% [dpdk-dev] [PATCH] hash: rename unused field to "reserved" Bruce Richardson
2015-07-13 16:28  0% ` Bruce Richardson
2015-07-13 16:38  3% ` [dpdk-dev] [PATCH v2] " Bruce Richardson
2015-07-13 17:29  0%   ` Thomas Monjalon
2015-07-15  8:08  3%   ` Olga Shern
2015-07-15 13:11  3% [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Maryam Tahhan
2015-07-15 13:11  9% ` [dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in stats structs Maryam Tahhan
2015-07-16  7:54  0% ` [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Olivier MATZ
2015-07-16 11:36 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile Cristian Dumitrescu
2015-07-16 11:50  4% ` Gajdzica, MaciejX T
2015-07-16 12:28  4% ` Mrzyglod, DanielX T
2015-07-16 12:49  4% ` Singh, Jasvinder
2015-07-16 12:19 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_port Cristian Dumitrescu
2015-07-16 12:25  4% ` Gajdzica, MaciejX T
2015-07-16 12:25  4% ` Thomas Monjalon
2015-07-16 15:09  4%   ` Dumitrescu, Cristian
2015-07-16 12:30  4% ` Mrzyglod, DanielX T
2015-07-16 12:49  4% ` Singh, Jasvinder
2015-07-16 15:27 14% [dpdk-dev] [PATCH v2] " Cristian Dumitrescu
2015-07-16 15:51  4% ` Singh, Jasvinder
2015-07-17  7:56  4% ` Gajdzica, MaciejX T
2015-07-17  8:08  4% ` Mrzyglod, DanielX T
2015-07-16 16:59 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_table Cristian Dumitrescu
2015-07-17  7:54  4% ` Gajdzica, MaciejX T
2015-07-17  8:09  4% ` Mrzyglod, DanielX T
2015-07-17 12:02  4% ` Singh, Jasvinder
2015-07-16 17:07 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline Cristian Dumitrescu
2015-07-17  7:54  4% ` Gajdzica, MaciejX T
2015-07-17  8:10  4% ` Mrzyglod, DanielX T
2015-07-17 12:03  4% ` Singh, Jasvinder
2015-07-16 21:21 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched Stephen Hemminger
2015-07-16 21:25  4% ` Dumitrescu, Cristian
2015-07-16 21:28  4% ` Neil Horman
2015-07-23 10:18  4% ` Dumitrescu, Cristian
2015-07-16 21:34     [dpdk-dev] [PATCH v5 0/4] rte_sched: cleanup and deprecation Stephen Hemminger
2015-07-16 21:34  3% ` [dpdk-dev] [PATCH v5 4/4] rte_sched: hide structure of port hierarchy Stephen Hemminger
2015-07-19 10:52  4% [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
2015-07-19 10:52 36% ` [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation Thomas Monjalon
2015-07-21 13:20  7%   ` Dumitrescu, Cristian
2015-07-21 14:03  7%     ` Thomas Monjalon
2015-07-19 21:32  0% ` [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
2015-07-20 10:45  0% ` Neil Horman
2015-07-20  7:03 13% [dpdk-dev] [PATCH] doc: announce ABI change for rte_eth_fdir_filter Jingjing Wu
2015-07-20 12:19     [dpdk-dev] [PATCHv3 0/5] ethdev: add new API to retrieve RX/TX queue information Konstantin Ananyev
2015-07-20 12:19  2% ` [dpdk-dev] [PATCHv3 1/5] " Konstantin Ananyev
2015-07-22 16:50  0%   ` Zhang, Helin
2015-07-22 17:00  0%     ` Ananyev, Konstantin
2015-07-22 18:28  2%   ` [dpdk-dev] [PATCHv4 " Konstantin Ananyev
2015-07-22 19:48  0%     ` Stephen Hemminger
2015-07-23 10:52  0%       ` Ananyev, Konstantin
2015-07-21 14:10  3% [dpdk-dev] [PATCH] hash: move field hash_func_init_val in rte_hash struct Pablo de Lara
2015-07-22  9:08  0% ` Thomas Monjalon
2015-07-22 18:28     [dpdk-dev] [PATCHv4 0/5] ethdev: add new API to retrieve RX/TX queue information Konstantin Ananyev
2015-07-23 10:59  4% [dpdk-dev] [PATCH v2] announce ABI change for librte_table Cristian Dumitrescu
2015-07-23 11:05  4% ` Singh, Jasvinder
2015-07-23 11:07  4% ` Mrzyglod, DanielX T
2015-07-23 11:34  4% ` Gajdzica, MaciejX T

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).