DPDK patches and discussions
* Re: [dpdk-dev] [PATCH v4 3/7] hv: add basic vmbus support
  @ 2015-07-08 23:51  0%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-08 23:51 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, alexmay

2015-04-21 10:32, Stephen Hemminger:
> The hyper-v device driver forces the base EAL code to change
> to support multiple bus types. This is done by changing the pci_device
> in the ether driver to a generic union.
> 
> As much as possible this is done in a backwards source compatible
> way. It will break ABI for device drivers.

> --- a/lib/librte_eal/common/eal_common_options.c
> +++ b/lib/librte_eal/common/eal_common_options.c
> @@ -80,6 +80,7 @@ eal_long_options[] = {
>  	{OPT_NO_HPET,           0, NULL, OPT_NO_HPET_NUM          },
>  	{OPT_NO_HUGE,           0, NULL, OPT_NO_HUGE_NUM          },
>  	{OPT_NO_PCI,            0, NULL, OPT_NO_PCI_NUM           },
> +	{OPT_NO_VMBUS,		0, NULL, OPT_NO_VMBUS_NUM	  },

Alignment, please.

> @@ -66,6 +66,7 @@ struct internal_config {
>  	volatile unsigned no_hugetlbfs;   /**< true to disable hugetlbfs */
>  	volatile unsigned xen_dom0_support; /**< support app running on Xen Dom0*/
>  	volatile unsigned no_pci;         /**< true to disable PCI */
> +	volatile unsigned no_vmbus;	  /**< true to disable VMBUS */
>  	volatile unsigned no_hpet;        /**< true to disable HPET */
>  	volatile unsigned vmware_tsc_map; /**< true to use VMware TSC mapping

Alignment may be better.

> +#ifdef RTE_LIBRTE_HV_PMD
> +	case RTE_BUS_VMBUS:
> +		eth_drv->vmbus_drv.devinit = rte_vmbus_dev_init;
> +		eth_drv->vmbus_drv.devuninit = rte_vmbus_dev_uninit;
> +		rte_eal_vmbus_register(&eth_drv->vmbus_drv);
> +		break;
> +#endif

Why ifdef'ing this code?

> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1477,7 +1478,10 @@ struct rte_eth_dev {
>  	struct rte_eth_dev_data *data;  /**< Pointer to device data */
>  	const struct eth_driver *driver;/**< Driver for this device */
>  	const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
> -	struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
> +	union {
> +		struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
> +		struct rte_vmbus_device *vmbus_dev; /**< VMBUS info. supplied by probing */
> +	};
[...]
>  struct eth_driver {
> -	struct rte_pci_driver pci_drv;    /**< The PMD is also a PCI driver. */
> +	union {
> +		struct rte_pci_driver pci_drv;    /**< The PMD is also a PCI driver. */
> +		struct rte_vmbus_driver vmbus_drv;/**< The PMD is also a VMBUS drv. */
> +	};
> +	enum {
> +		RTE_BUS_PCI=0,
> +		RTE_BUS_VMBUS
> +	} bus_type;			  /**< Device bus type. */

A device may also be virtual.


* Re: [dpdk-dev] [PATCH v3] doc: announce ABI changes planned for unified packet type
  @ 2015-07-09  0:56  4%   ` Wu, Jingjing
  2015-07-15 23:37  4%     ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Wu, Jingjing @ 2015-07-09  0:56 UTC (permalink / raw)
  To: Zhang, Helin, dev

Acked-by: Jingjing Wu <jingjing.wu@intel.com>

> -----Original Message-----
> From: Zhang, Helin
> Sent: Wednesday, July 08, 2015 1:45 AM
> To: dev@dpdk.org
> Cc: Liu, Jijiang; Wu, Jingjing; nhorman@tuxdriver.com; Zhang, Helin
> Subject: [PATCH v3] doc: announce ABI changes planned for unified packet
> type
> 
> Significant ABI changes are planned for all shared libraries to support the
> unified packet type, which will take effect from release 2.2. This announces
> the ABI changes in detail.
> 
> Signed-off-by: Helin Zhang <helin.zhang@intel.com>
> ---


* [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks
@ 2015-07-09  2:47 21% Wenzhuo Lu
  2015-07-09  7:32  4% ` Thomas Monjalon
  2015-07-10  2:24 21% ` [dpdk-dev] [PATCH v2] doc: announce ABI change of rte_eth_fdir_filter, rte_eth_fdir_masks Wenzhuo Lu
  0 siblings, 2 replies; 200+ results
From: Wenzhuo Lu @ 2015-07-09  2:47 UTC (permalink / raw)
  To: dev

The x550 supports 2 new flow director modes, MAC VLAN and Cloud. There are several
new lookup fields for these 2 new modes, such as MAC, tunnel type, TNI, VNI. So, we
have to change the ABI to support these new lookup fields.

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..eca4d2b 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,4 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
+* The ABI changes are planned for struct rte_fdir_filter and rte_fdir_masks in order to support new flow director modes, MAC VLAN and Cloud on x550. The upcoming release 2.1 will not contain these ABI changes, but release 2.2 will, and no backwards compatibility is planned due to this change. Binaries using this library built prior to version 2.2 will require updating and recompilation.
-- 
1.9.3


* [dpdk-dev] [PATCH v4 04/11] config: remove RTE_LIBNAME definition.
  @ 2015-07-09  4:58  5% ` Zhigang Lu
  0 siblings, 0 replies; 200+ results
From: Zhigang Lu @ 2015-07-09  4:58 UTC (permalink / raw)
  To: dev; +Cc: Cyril Chemparathy

From: Cyril Chemparathy <cchemparathy@ezchip.com>

The library name is now being pinned to "dpdk" instead of intel_dpdk,
powerpc_dpdk, etc.  As a result, we no longer need this config item.
This patch removes it.

Signed-off-by: Zhigang Lu <zlu@ezchip.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 config/common_bsdapp                        | 1 -
 config/common_linuxapp                      | 1 -
 config/defconfig_ppc_64-power8-linuxapp-gcc | 2 --
 mk/rte.vars.mk                              | 5 +----
 4 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index dfa61a3..7112f1c 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -87,7 +87,6 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 # Combine to one single library
 #
 CONFIG_RTE_BUILD_COMBINE_LIBS=n
-CONFIG_RTE_LIBNAME=intel_dpdk
 
 #
 # Use newest code breaking previous ABI
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 1732b70..46297cd 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -87,7 +87,6 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 # Combine to one single library
 #
 CONFIG_RTE_BUILD_COMBINE_LIBS=n
-CONFIG_RTE_LIBNAME="intel_dpdk"
 
 #
 # Use newest code breaking previous ABI
diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
index d97a885..f1af518 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -39,8 +39,6 @@ CONFIG_RTE_ARCH_64=y
 CONFIG_RTE_TOOLCHAIN="gcc"
 CONFIG_RTE_TOOLCHAIN_GCC=y
 
-CONFIG_RTE_LIBNAME="powerpc_dpdk"
-
 # Note: Power doesn't have this support
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
 
diff --git a/mk/rte.vars.mk b/mk/rte.vars.mk
index 0469064..f87cf4b 100644
--- a/mk/rte.vars.mk
+++ b/mk/rte.vars.mk
@@ -65,10 +65,7 @@ ifneq ($(BUILDING_RTE_SDK),)
   RTE_SDK_BIN := $(RTE_OUTPUT)
 endif
 
-RTE_LIBNAME := $(CONFIG_RTE_LIBNAME:"%"=%)
-ifeq ($(RTE_LIBNAME),)
-RTE_LIBNAME := intel_dpdk
-endif
+RTE_LIBNAME := dpdk
 
 # RTE_TARGET is deducted from config when we are building the SDK.
 # Else, when building an external app, RTE_TARGET must be specified
-- 
2.1.2


* Re: [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks
  2015-07-09  2:47 21% [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks Wenzhuo Lu
@ 2015-07-09  7:32  4% ` Thomas Monjalon
  2015-07-09  8:39  4%   ` Lu, Wenzhuo
  2015-07-10  2:24 21% ` [dpdk-dev] [PATCH v2] doc: announce ABI change of rte_eth_fdir_filter, rte_eth_fdir_masks Wenzhuo Lu
  1 sibling, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-09  7:32 UTC (permalink / raw)
  To: Wenzhuo Lu; +Cc: dev

2015-07-09 10:47, Wenzhuo Lu:
> +* The ABI changes are planned for struct rte_fdir_filter and rte_fdir_masks in order to support new flow director modes, MAC VLAN and Cloud on x550. The upcoming release 2.1 will not contain these ABI changes, but release 2.2 will, and no backwards compatibility is planned due to this change. Binaries using this library built prior to version 2.2 will require updating and recompilation.

Please wrap it.
What does Cloud mean for flow director? If you are thinking of a tunnel type, please
name it.


* Re: [dpdk-dev] [PATCH v3 00/11] Cuckoo hash
  @ 2015-07-09  8:02  0%     ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2015-07-09  8:02 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Thu, Jul 09, 2015 at 01:23:54AM +0200, Thomas Monjalon wrote:
> Bruce, what is the status of this series?
>

The parts of the series that are independent of the cuckoo hash update
have already been sent out as updates, so changes to those can be resolved
and merged more quickly. Pablo should be sending out a V4 of the cuckoo hash
implementation very shortly.

/Bruce

> 2015-06-28 23:25, Pablo de Lara:
> > This patchset is to replace the existing hash library with
> > a more efficient and functional approach, using the Cuckoo hash
> > method to deal with collisions. This method is based on using
> > two different hash functions to have two possible locations
> > in the hash table where an entry can be.
> > So, if a bucket is full, a new entry can push one of the items
> > in that bucket to its alternative location, making space for itself.
> > 
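A minimal sketch of the cuckoo displacement idea described above, for illustration only; the fixed table size, key type and the two hash functions are placeholders, and this is not the librte_hash implementation.

#include <stdint.h>

#define SLOTS     8             /* toy table: 8 buckets, one entry each */
#define MAX_KICKS 4             /* give up after this many displacements */

struct entry { uint32_t key; int used; };
static struct entry tab[SLOTS];

static uint32_t h1(uint32_t k) { return (k * 2654435761u) % SLOTS; }
static uint32_t h2(uint32_t k) { return (k ^ (k >> 16)) % SLOTS; }

/* Insert a key; when the target bucket is full, displace its occupant and
 * move the victim to its alternative location on the next iteration. */
static int
cuckoo_insert(uint32_t key)
{
	uint32_t pos = h1(key);
	int kicks;

	for (kicks = 0; kicks < MAX_KICKS; kicks++) {
		if (!tab[pos].used) {
			tab[pos].key = key;
			tab[pos].used = 1;
			return 0;
		}
		uint32_t victim = tab[pos].key;	/* push out the occupant */
		tab[pos].key = key;
		key = victim;
		pos = (h1(key) == pos) ? h2(key) : h1(key);
	}
	return -1;	/* a real table would rehash or grow here */
}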
> > Advantages
> > ~~~~~~
> > - Offers the option to store more entries when the target bucket is full
> >   (unlike the previous implementation)
> > - Memory efficient: for storing those entries, it is not necessary to
> >   request new memory, as the entries will be stored in the same table
> > - Constant worst lookup time: in worst case scenario, it always takes
> >   the same time to look up an entry, as there are only two possible locations
> >   where an entry can be.
> > - Storing data: user can store data in the hash table, unlike the
> >   previous implementation, but he can still use the old API
> > 
> > This implementation typically offers over 90% utilization.
> > Notice that API has been extended, but old API remains. The main
> > change in ABI is that rte_hash structure is now private and the
> > deprecation of two macros.
> > 
> > Changes in v3:
> > 
> > - Now user can store variable size data, instead of 32 or 64-bit size data,
> >   using the new parameter "data_len" in rte_hash_parameters
> > - Add lookup_bulk_with_hash function in performance  unit tests
> > - Add new functions that handle data in performance unit tests
> > - Remove duplicates in performance unit tests
> > - Fix rte_hash_reset, which was not resetting the last entry
> [...]
> > Pablo de Lara (11):
> >   eal: add const in prefetch functions
> >   hash: move rte_hash structure to C file and make it internal
> >   test/hash: enhance hash unit tests
> >   test/hash: rename new hash perf unit test back to original name
> >   hash: replace existing hash library with cuckoo hash implementation
> >   hash: add new lookup_bulk_with_hash function
> >   hash: add new function rte_hash_reset
> >   hash: add new functionality to store data in hash table
> >   MAINTAINERS: claim responsibility for hash library
> >   doc: announce ABI change of librte_hash
> >   doc: update hash documentation
> 


* Re: [dpdk-dev] [PATCH] hash: move rte_hash structure to C file and make it internal
  @ 2015-07-09  8:12  3%       ` Bruce Richardson
  2015-07-09 20:42  3%         ` Matthew Hall
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2015-07-09  8:12 UTC (permalink / raw)
  To: Matthew Hall; +Cc: dev

On Wed, Jul 08, 2015 at 09:57:03AM -0700, Matthew Hall wrote:
> On Wed, Jul 08, 2015 at 02:21:42PM +0100, Bruce Richardson wrote:
> > Irrespective of whether or not we change the underlying hash table implementation
> > this looks a good change to me. The rte_hash structure should not be used directly
> > by any applications - the APIs all take pointers to the structure,
> > so there should be no ABI breakage from this, I think.
> > 
> > Therefore:
> > 
> > Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> 
> Hi guys,
> 
> There are places where this will be annoying on the app side.
> 
> A lot of rte_hash, rte_lpm*, rte_table, etc. don't provide methods to iterate 
> whole structures with a callback function that includes the current structure 
> node, and a user-data pointer.
> 
> This can make it real unpleasant when you want to walk through the structure 
> and free a bunch of items it points to and so forth.
> 
> So if you're going to obfuscate things by censoring the structure contents 
> then we'd really like to be sure they have a full set of CRUD operations and 
> iteration support so one could manage the nodes individually and in bulk.
> 
> Matthew.

Thanks for the feedback, Matthew. Can you suggest a function prototype for such
a walk operation that would make it useful for you? While we can keep the
hash structure public, I'd prefer if we could avoid it, as it makes future
changes hard due to ABI issues.

/Bruce
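One possible shape for the walk operation being discussed, purely as an illustration of the callback-plus-userdata iteration Matthew describes; the names and signatures below are hypothetical and are not part of the librte_hash API.

#include <stdlib.h>

struct rte_hash;        /* opaque handle, as proposed in this thread */

/* Hypothetical callback: called once per occupied entry with the key, the
 * stored data pointer and an opaque user context. */
typedef void (*rte_hash_walk_cb_t)(const void *key, void *data,
				   void *userdata);

/* Hypothetical walk API: visit every occupied entry and invoke 'cb' for
 * each; could return the number of entries visited, or negative on error. */
int rte_hash_walk(const struct rte_hash *h, rte_hash_walk_cb_t cb,
		  void *userdata);

/* Example callback: free every stored value before destroying the table. */
static void
free_value(const void *key, void *data, void *userdata)
{
	(void)key;
	(void)userdata;
	free(data);
}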


* [dpdk-dev] [PATCH v5 04/11] config: remove RTE_LIBNAME definition.
  @ 2015-07-09  8:25  5% ` Zhigang Lu
  0 siblings, 0 replies; 200+ results
From: Zhigang Lu @ 2015-07-09  8:25 UTC (permalink / raw)
  To: dev; +Cc: Cyril Chemparathy

From: Cyril Chemparathy <cchemparathy@ezchip.com>

The library name is now being pinned to "dpdk" instead of intel_dpdk,
powerpc_dpdk, etc.  As a result, we no longer need this config item.
This patch removes it.

Signed-off-by: Cyril Chemparathy <cchemparathy@ezchip.com>
Signed-off-by: Zhigang Lu <zlu@ezchip.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 config/common_bsdapp                        | 1 -
 config/common_linuxapp                      | 1 -
 config/defconfig_ppc_64-power8-linuxapp-gcc | 2 --
 mk/rte.vars.mk                              | 5 +----
 4 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index dfa61a3..7112f1c 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -87,7 +87,6 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 # Combine to one single library
 #
 CONFIG_RTE_BUILD_COMBINE_LIBS=n
-CONFIG_RTE_LIBNAME=intel_dpdk
 
 #
 # Use newest code breaking previous ABI
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 1732b70..46297cd 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -87,7 +87,6 @@ CONFIG_RTE_BUILD_SHARED_LIB=n
 # Combine to one single library
 #
 CONFIG_RTE_BUILD_COMBINE_LIBS=n
-CONFIG_RTE_LIBNAME="intel_dpdk"
 
 #
 # Use newest code breaking previous ABI
diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
index d97a885..f1af518 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -39,8 +39,6 @@ CONFIG_RTE_ARCH_64=y
 CONFIG_RTE_TOOLCHAIN="gcc"
 CONFIG_RTE_TOOLCHAIN_GCC=y
 
-CONFIG_RTE_LIBNAME="powerpc_dpdk"
-
 # Note: Power doesn't have this support
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
 
diff --git a/mk/rte.vars.mk b/mk/rte.vars.mk
index 0469064..f87cf4b 100644
--- a/mk/rte.vars.mk
+++ b/mk/rte.vars.mk
@@ -65,10 +65,7 @@ ifneq ($(BUILDING_RTE_SDK),)
   RTE_SDK_BIN := $(RTE_OUTPUT)
 endif
 
-RTE_LIBNAME := $(CONFIG_RTE_LIBNAME:"%"=%)
-ifeq ($(RTE_LIBNAME),)
-RTE_LIBNAME := intel_dpdk
-endif
+RTE_LIBNAME := dpdk
 
 # RTE_TARGET is deducted from config when we are building the SDK.
 # Else, when building an external app, RTE_TARGET must be specified
-- 
2.1.2


* Re: [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks
  2015-07-09  7:32  4% ` Thomas Monjalon
@ 2015-07-09  8:39  4%   ` Lu, Wenzhuo
  0 siblings, 0 replies; 200+ results
From: Lu, Wenzhuo @ 2015-07-09  8:39 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Thursday, July 9, 2015 3:33 PM
> To: Lu, Wenzhuo
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter,
> rte_fdir_masks
> 
> 2015-07-09 10:47, Wenzhuo Lu:
> > +* The ABI changes are planned for struct rte_fdir_filter and rte_fdir_masks in
> order to support new flow director modes, MAC VLAN and Cloud on x550. The
> upcoming release 2.1 will not contain these ABI changes, but release 2.2 will,
> and no backwards compatibility is planned due to this change. Binaries using this
> library built prior to version 2.2 will require updating and recompilation.
> 
> Please wrap it.
> What does Cloud mean for flow director? If you are thinking of a tunnel type, please
> name it.
Thanks for the comments, I'll send a V2.
Wenzhuo


* [dpdk-dev] [PATCH v4 0/7] ethdev: add support for ieee1588 timestamping
@ 2015-07-09 13:30  3% John McNamara
  2015-07-10  0:43  0% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: John McNamara @ 2015-07-09 13:30 UTC (permalink / raw)
  To: dev

This patchset adds ethdev API to enable and read IEEE1588/802.1AS PTP
timestamps from devices that support it. The following functions are added:

    rte_eth_timesync_enable()
    rte_eth_timesync_disable()
    rte_eth_timesync_read_rx_timestamp()
    rte_eth_timesync_read_tx_timestamp()

The "ieee1588" forwarding mode in testpmd is also refactored to demonstrate
the new API and to clean up the code.

Adds support for igb, ixgbe and i40e.
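A rough usage sketch of the calls listed above, assuming signatures of the form rte_eth_timesync_enable(port_id), rte_eth_timesync_read_rx_timestamp(port_id, &ts, flags) and rte_eth_timesync_disable(port_id); the flags argument selecting the timesync register index and the exact error conventions are assumptions here, not taken from the patches.

#include <stdio.h>
#include <time.h>
#include <rte_ethdev.h>

/* Illustration only: enable PTP timestamping on a port, poll the RX
 * timestamp latched by the NIC after an IEEE1588 frame was received
 * (PKT_RX_IEEE1588_PTP set in the mbuf ol_flags), then disable it. */
static void
ptp_rx_timestamp_example(uint8_t port_id)
{
	struct timespec ts;

	if (rte_eth_timesync_enable(port_id) < 0)
		return;

	if (rte_eth_timesync_read_rx_timestamp(port_id, &ts, 0) == 0)
		printf("RX timestamp: %ld.%09ld\n",
		       (long)ts.tv_sec, (long)ts.tv_nsec);

	rte_eth_timesync_disable(port_id);
}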

V4:
* Added timesync field to end of mbuf to pass IEEE1588 registers and flags.
  Removed previous ABI deprecation notice.

V3:
* Fixed issued with version.map.

V2:
* Added i40e support.

* Renamed ethdev functions from rte_eth_ieee15888_*() to rte_eth_timesync_*()
  since 802.1AS can be supported through the same interfaces.

V1:
* Initial version for igb and ixgbe.


John McNamara (7):
  ethdev: add support for ieee1588 timestamping
  mbuf: add field for ieee1588 timesync index
  e1000: add support for ieee1588 timestamping
  ixgbe: add support for ieee1588 timestamping
  i40e: add support for ieee1588 timestamping
  app/testpmd: refactor ieee1588 forwarding
  doc: document ieee1588 forwarding mode

 app/test-pmd/ieee1588fwd.c                  | 466 ++--------------------------
 doc/guides/testpmd_app_ug/run_app.rst       |   2 +-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |   2 +
 drivers/net/e1000/igb_ethdev.c              | 115 +++++++
 drivers/net/i40e/i40e_ethdev.c              | 143 +++++++++
 drivers/net/i40e/i40e_rxtx.c                |  40 ++-
 drivers/net/ixgbe/ixgbe_ethdev.c            | 122 ++++++++
 lib/librte_ether/rte_ethdev.c               |  70 ++++-
 lib/librte_ether/rte_ethdev.h               |  90 +++++-
 lib/librte_ether/rte_ether_version.map      |   4 +
 lib/librte_mbuf/rte_mbuf.h                  |   3 +
 11 files changed, 614 insertions(+), 443 deletions(-)

--
1.8.1.4


* Re: [dpdk-dev] [PATCH v13 00/14] Interrupt mode PMD
  @ 2015-07-09 13:58  3%   ` David Marchand
  2015-07-17  6:04  0%     ` Liang, Cunming
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
  1 sibling, 1 reply; 200+ results
From: David Marchand @ 2015-07-09 13:58 UTC (permalink / raw)
  To: Cunming Liang; +Cc: Stephen Hemminger, dev, liang-min.wang

On Fri, Jun 19, 2015 at 6:00 AM, Cunming Liang <cunming.liang@intel.com>
wrote:

> v13 changes
>  - version map cleanup for v2.1
>  - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility
>

Please, this patchset ends with a patch that deals with ABI compatibility
while it should do so on a per-patch basis.
Besides, some patches are introducing stuff that is reworked in other
patches without a clear reason.

Can you rework this to ease review and ensure patch atomicity?

Thanks.

-- 
David Marchand


* Re: [dpdk-dev] [PATCH v3 7/7] abi: announce mbuf addition for ieee1588 in DPDK 2.2
  @ 2015-07-09 15:51  7%     ` Thomas Monjalon
  2015-07-09 16:01  4%       ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-09 15:51 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

2015-07-08 14:10, Bruce Richardson:
> On Mon, Jul 06, 2015 at 03:16:01PM +0200, Thomas Monjalon wrote:
> > 2015-07-02 16:16, John McNamara:
> > > --- a/doc/guides/rel_notes/abi.rst
> > > +++ b/doc/guides/rel_notes/abi.rst
> > >  Deprecation Notices
> > >  -------------------
> > > +
> > > +* In DPDK 2.1 the IEEE1588/802.1AS support in the i40e driver makes use of the
> > > +  ``udata64`` field in the mbuf to pass the timesync register index to the
> > > +  user. In DPDK 2.2 this will be moved to a new field in the mbuf.
> > 
> > We need more acknowledgements for this decision, as stated here:
> > http://dpdk.org/browse/dpdk/tree/doc/guides/guidelines/versioning.rst#n51
> 
> Why can't this new field just be added at the end of cache line 1 (the second
> cache line) of the mbuf? That would avoid any ABI breakage and would mean we
> can just put the change in in this release, instead of waiting.

Are you sure that (because of __rte_cache_aligned) the size of the structure
is never increased with this new field?
Please confirm your opinion.

A comment to explain ABI compatibility in the commit message of the v4 is
also welcome.


* Re: [dpdk-dev] [PATCH v3 7/7] abi: announce mbuf addition for ieee1588 in DPDK 2.2
  2015-07-09 15:51  7%     ` Thomas Monjalon
@ 2015-07-09 16:01  4%       ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2015-07-09 16:01 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Thu, Jul 09, 2015 at 05:51:16PM +0200, Thomas Monjalon wrote:
> 2015-07-08 14:10, Bruce Richardson:
> > On Mon, Jul 06, 2015 at 03:16:01PM +0200, Thomas Monjalon wrote:
> > > 2015-07-02 16:16, John McNamara:
> > > > --- a/doc/guides/rel_notes/abi.rst
> > > > +++ b/doc/guides/rel_notes/abi.rst
> > > >  Deprecation Notices
> > > >  -------------------
> > > > +
> > > > +* In DPDK 2.1 the IEEE1588/802.1AS support in the i40e driver makes use of the
> > > > +  ``udata64`` field in the mbuf to pass the timesync register index to the
> > > > +  user. In DPDK 2.2 this will be moved to a new field in the mbuf.
> > > 
> > > We need more acknowledgements for this decision, as stated here:
> > > http://dpdk.org/browse/dpdk/tree/doc/guides/guidelines/versioning.rst#n51
> > 
> > Why can't this new field just be added at the end of cache line 1 (the second
> > cache line) of the mbuf? That would avoid any ABI breakage and would mean we
> > can just put the change in in this release, instead of waiting.
> 
> Are you sure that (because of __rte_cache_aligned) the size of the structure
> is never increased with this new field?
> Please confirm your opinion.

This is checked at compile time by the test app.

 static int
 test_mbuf(void)
 {
         RTE_BUILD_BUG_ON(sizeof(struct rte_mbuf) != RTE_CACHE_LINE_SIZE * 2);
 ....

So if a change does result in an increase in the mbuf size, by causing overflow in
either the first or the second cache line, we will get compiler errors in the
build because of it. Therefore, such changes are pretty easy to test by compiling
on our supported targets.

/Bruce
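A toy example of why a field appended into the existing tail padding does not change the structure size (not the real rte_mbuf; field sizes are chosen arbitrarily to fill two 64-byte cache lines):

#include <stdint.h>

/* Padded to a multiple of 64 bytes by the alignment attribute. A new
 * 32-bit field placed in the tail padding of the second cache line leaves
 * sizeof() unchanged, so the compile-time size check above still passes. */
struct toy_mbuf {
	uint64_t first_line[8];         /* fills cache line 0 */
	uint64_t second_line[7];        /* 56 of 64 bytes of cache line 1 */
	uint32_t timesync;              /* new field fits in the padding */
} __attribute__((aligned(64)));

_Static_assert(sizeof(struct toy_mbuf) == 128,
	       "still exactly two 64-byte cache lines");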


* [dpdk-dev] [PATCH v10 00/19] unified packet type
  @ 2015-07-09 16:31  4% ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf Helin Zhang
                     ` (19 more replies)
  0 siblings, 20 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

Currently only 6 bits stored in ol_flags are used to indicate the packet
types. This is not enough, as some NIC hardware can recognize quite a lot
of packet types, e.g. i40e hardware can recognize more than 150 packet
types. Hiding those packet types hides hardware offload capabilities which
could be quite useful for improving performance and for end users.
So a unified packet type is needed to support all possible PMDs. The 16-bit
packet_type field in the mbuf structure can be changed to 32 bits and used
for this purpose. In addition, all packet types stored in the ol_flags field
can be removed altogether, saving those 6 bits of ol_flags as the benefit.

Initially, 32 bits of packet_type can be divided into several sub fields to
indicate different packet type information of a packet. The initial design
is to divide those bits into fields for L2 types, L3 types, L4 types, tunnel
types, inner L2 types, inner L3 types and inner L4 types. All PMDs should
translate the offloaded packet types into these 7 fields of information, for
user applications.

To avoid breaking ABI compatibility, currently all the code changes for
unified packet type are disabled at compile time by default. Users can enable
it manually by defining the macro of RTE_NEXT_ABI. The code changes will be
valid by default in a future release, and the old version will be deleted
accordingly, after the ABI change process is done.
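As an illustration of how an application would consume the new field once RTE_NEXT_ABI is enabled, using the RTE_PTYPE_* masks and the RTE_ETH_IS_IPV4_HDR helper introduced later in this series (a sketch only, not code from the patches):

#include <rte_mbuf.h>

/* Classify a received mbuf using the unified 32-bit packet_type field. */
static inline int
is_ipv4_tcp(const struct rte_mbuf *m)
{
#ifdef RTE_NEXT_ABI
	return RTE_ETH_IS_IPV4_HDR(m->packet_type) &&
	       (m->packet_type & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP;
#else
	/* The old ol_flags packet type bits cannot express the L4 type,
	 * which is part of the motivation for this series. */
	return 0;
#endif
}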

Note that this patch set should be integrated after the patch set
'[PATCH v3 0/7] support i40e QinQ stripping and insertion', to cleanly
resolve the conflicts during integration, as both patch sets modify 'struct
rte_mbuf' and the final layout of 'struct rte_mbuf' is key to the vectorized
ixgbe PMD.

Its v8 version was acked by Konstantin Ananyev <konstantin.ananyev@intel.com>

v2 changes:
* Enlarged the packet_type field from 16 bits to 32 bits.
* Redefined the packet type sub-fields.
* Updated the 'struct rte_kni_mbuf' for KNI according to the mbuf changes.
* Used redefined packet types and enlarged packet_type field for all PMDs
  and corresponding applications.
* Removed changes in bond and its relevant application, as there is no need
  at all according to the recent bond changes.

v3 changes:
* Put the mbuf layout changes into a single patch.
* Put vector ixgbe changes right after mbuf changes.
* Disabled vector ixgbe PMD by default, as mbuf layout changed, and then
  re-enabled it after vector ixgbe PMD updated.
* Put the definitions of unified packet type into a single patch.
* Minor bug fixes and enhancements in l3fwd example.

v4 changes:
* Added detailed description of each packet types.
* Supported unified packet type of fm10k.
* Added printing logs of packet types of each received packet for rxonly
  mode in testpmd.
* Removed several useless code lines which block packet type unification from
  app/test/packet_burst_generator.c.

v5 changes:
* Added more detailed description for each packet type, together with examples.
* Rolled back the macro definitions of RX packet flags, for ABI compatibility.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.
* Integrated with patch set for '[PATCH v3 0/7] support i40e QinQ stripping
  and insertion', to clearly solve the conflicts during merging.

v8 changes:
* Moved the field of 'vlan_tci_outer' in 'struct rte_mbuf' to the end of the 1st
  cache line, to avoid breaking any vectorized PMD storing, as fields of
  'packet_type, pkt_len, data_len, vlan_tci, rss' should be in a contiguous 128
  bits.

v9 changes:
* Put the mbuf changes and vector PMD changes together, as they are
  tightly relevant.
* Renamed MAC to ETHER in packet type names.
* Corrected the packet type explanation of RTE_PTYPE_L2_ETHER.
* Reworked newly added cxgbe driver and tep_termination example application to
  support unified packet type, which is disabled by default.

v10 changes:
* Fixed a compile error in tep_termination, when RTE_NEXT_ABI is enabled.

Helin Zhang (19):
  mbuf: redefine packet_type in rte_mbuf
  mbuf: add definitions of unified packet types
  e1000: replace bit mask based packet type with unified packet type
  ixgbe: replace bit mask based packet type with unified packet type
  i40e: replace bit mask based packet type with unified packet type
  enic: replace bit mask based packet type with unified packet type
  vmxnet3: replace bit mask based packet type with unified packet type
  fm10k: replace bit mask based packet type with unified packet type
  cxgbe: replace bit mask based packet type with unified packet type
  app/test-pipeline: replace bit mask based packet type with unified
    packet type
  app/testpmd: replace bit mask based packet type with unified packet
    type
  app/test: Remove useless code
  examples/ip_fragmentation: replace bit mask based packet type with
    unified packet type
  examples/ip_reassembly: replace bit mask based packet type with
    unified packet type
  examples/l3fwd-acl: replace bit mask based packet type with unified
    packet type
  examples/l3fwd-power: replace bit mask based packet type with unified
    packet type
  examples/l3fwd: replace bit mask based packet type with unified packet
    type
  examples/tep_termination: replace bit mask based packet type with
    unified packet type
  mbuf: remove old packet type bit masks

 app/test-pipeline/pipeline_hash.c                  |  13 +
 app/test-pmd/csumonly.c                            |  14 +
 app/test-pmd/rxonly.c                              | 183 +++++++
 app/test/packet_burst_generator.c                  |   6 +-
 drivers/net/cxgbe/sge.c                            |   8 +
 drivers/net/e1000/igb_rxtx.c                       | 104 ++++
 drivers/net/enic/enic_main.c                       |  26 +
 drivers/net/fm10k/fm10k_rxtx.c                     |  27 +
 drivers/net/i40e/i40e_rxtx.c                       | 554 +++++++++++++++++++++
 drivers/net/ixgbe/ixgbe_rxtx.c                     | 163 ++++++
 drivers/net/ixgbe/ixgbe_rxtx_vec.c                 |  75 ++-
 drivers/net/vmxnet3/vmxnet3_rxtx.c                 |   8 +
 examples/ip_fragmentation/main.c                   |   9 +
 examples/ip_reassembly/main.c                      |   9 +
 examples/l3fwd-acl/main.c                          |  29 +-
 examples/l3fwd-power/main.c                        |   8 +
 examples/l3fwd/main.c                              | 123 ++++-
 examples/tep_termination/vxlan.c                   |   4 +
 .../linuxapp/eal/include/exec-env/rte_kni_common.h |   6 +
 lib/librte_mbuf/rte_mbuf.c                         |   4 +
 lib/librte_mbuf/rte_mbuf.h                         | 516 +++++++++++++++++++
 21 files changed, 1876 insertions(+), 13 deletions(-)

-- 
1.9.3


* [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-15 10:19  0%     ` Olivier MATZ
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 03/19] e1000: replace bit mask based packet type with unified packet type Helin Zhang
                     ` (17 subsequent siblings)
  19 siblings, 1 reply; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

There are only 6 bit flags in ol_flags for indicating packet
types, which is not enough to describe all the possible packet
types hardware can recognize. For example, i40e hardware can
recognize more than 150 packet types. The unified packet type is
composed of L2 type, L3 type, L4 type, tunnel type, inner L2 type,
inner L3 type and inner L4 type fields, and is stored in the
32-bit 'packet_type' field of 'struct rte_mbuf'.
To avoid breaking ABI compatibility, all the changes are
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 lib/librte_mbuf/rte_mbuf.h | 486 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 486 insertions(+)

v3 changes:
* Put the definitions of unified packet type into a single patch.

v4 changes:
* Added detailed description of each packet types.

v5 changes:
* Re-worded the commit logs.
* Added more detailed description for all packet types, together with examples.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.
* Corrected the packet type explanation of RTE_PTYPE_L2_ETHER.

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ac29da3..3a17d95 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -202,6 +202,492 @@ extern "C" {
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG       (1ULL << 63) /**< Mbuf contains control data */
 
+#ifdef RTE_NEXT_ABI
+/*
+ * 32 bits are divided into several fields to mark packet types. Note that
+ * each field is indexical.
+ * - Bit 3:0 is for L2 types.
+ * - Bit 7:4 is for L3 or outer L3 (for tunneling case) types.
+ * - Bit 11:8 is for L4 or outer L4 (for tunneling case) types.
+ * - Bit 15:12 is for tunnel types.
+ * - Bit 19:16 is for inner L2 types.
+ * - Bit 23:20 is for inner L3 types.
+ * - Bit 27:24 is for inner L4 types.
+ * - Bit 31:28 is reserved.
+ *
+ * To be compatible with Vector PMD, RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV4_EXT,
+ * RTE_PTYPE_L3_IPV6, RTE_PTYPE_L3_IPV6_EXT, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP
+ * and RTE_PTYPE_L4_SCTP should be kept as below in a contiguous 7 bits.
+ *
+ * Note that L3 types values are selected for checking IPV4/IPV6 header from
+ * performance point of view. Reading annotations of RTE_ETH_IS_IPV4_HDR and
+ * RTE_ETH_IS_IPV6_HDR is needed for any future changes of L3 type values.
+ *
+ * Note that the packet types of the same packet recognized by different
+ * hardware may be different, as different hardware may have different
+ * capability of packet type recognition.
+ *
+ * examples:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=0x29
+ * | 'version'=6, 'next header'=0x3A
+ * | 'ICMPv6 header'>
+ * will be recognized on i40e hardware as packet type combination of,
+ * RTE_PTYPE_L2_ETHER |
+ * RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+ * RTE_PTYPE_TUNNEL_IP |
+ * RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+ * RTE_PTYPE_INNER_L4_ICMP.
+ *
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=0x2F
+ * | 'GRE header'
+ * | 'version'=6, 'next header'=0x11
+ * | 'UDP header'>
+ * will be recognized on i40e hardware as packet type combination of,
+ * RTE_PTYPE_L2_ETHER |
+ * RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+ * RTE_PTYPE_TUNNEL_GRENAT |
+ * RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+ * RTE_PTYPE_INNER_L4_UDP.
+ */
+#define RTE_PTYPE_UNKNOWN                   0x00000000
+/**
+ * Ethernet packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * Packet format:
+ * <'ether type'=[0x0800|0x86DD]>
+ */
+#define RTE_PTYPE_L2_ETHER                  0x00000001
+/**
+ * Ethernet packet type for time sync.
+ *
+ * Packet format:
+ * <'ether type'=0x88F7>
+ */
+#define RTE_PTYPE_L2_ETHER_TIMESYNC         0x00000002
+/**
+ * ARP (Address Resolution Protocol) packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0806>
+ */
+#define RTE_PTYPE_L2_ETHER_ARP              0x00000003
+/**
+ * LLDP (Link Layer Discovery Protocol) packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x88CC>
+ */
+#define RTE_PTYPE_L2_ETHER_LLDP             0x00000004
+/**
+ * Mask of layer 2 packet types.
+ * It is used for outer packet for tunneling cases.
+ */
+#define RTE_PTYPE_L2_MASK                   0x0000000f
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for outer packet for tunneling cases, and does not contain any
+ * header option.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=5>
+ */
+#define RTE_PTYPE_L3_IPV4                   0x00000010
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for outer packet for tunneling cases, and contains header
+ * options.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=[6-15], 'options'>
+ */
+#define RTE_PTYPE_L3_IPV4_EXT               0x00000030
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for outer packet for tunneling cases, and does not contain any
+ * extension header.
+ *
+ * Packet format:
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=0x3B>
+ */
+#define RTE_PTYPE_L3_IPV6                   0x00000040
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for outer packet for tunneling cases, and may or maynot contain
+ * header options.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=[5-15], <'options'>>
+ */
+#define RTE_PTYPE_L3_IPV4_EXT_UNKNOWN       0x00000090
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for outer packet for tunneling cases, and contains extension
+ * headers.
+ *
+ * Packet format:
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=[0x0|0x2B|0x2C|0x32|0x33|0x3C|0x87],
+ *   'extension headers'>
+ */
+#define RTE_PTYPE_L3_IPV6_EXT               0x000000c0
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for outer packet for tunneling cases, and may or maynot contain
+ * extension headers.
+ *
+ * Packet format:
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=[0x3B|0x0|0x2B|0x2C|0x32|0x33|0x3C|0x87],
+ *   <'extension headers'>>
+ */
+#define RTE_PTYPE_L3_IPV6_EXT_UNKNOWN       0x000000e0
+/**
+ * Mask of layer 3 packet types.
+ * It is used for outer packet for tunneling cases.
+ */
+#define RTE_PTYPE_L3_MASK                   0x000000f0
+/**
+ * TCP (Transmission Control Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=6, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=6>
+ */
+#define RTE_PTYPE_L4_TCP                    0x00000100
+/**
+ * UDP (User Datagram Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=17, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=17>
+ */
+#define RTE_PTYPE_L4_UDP                    0x00000200
+/**
+ * Fragmented IP (Internet Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * It refers to those packets of any IP types, which can be recognized as
+ * fragmented. A fragmented packet cannot be recognized as any other L4 types
+ * (RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP, RTE_PTYPE_L4_SCTP, RTE_PTYPE_L4_ICMP,
+ * RTE_PTYPE_L4_NONFRAG).
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'MF'=1>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=44>
+ */
+#define RTE_PTYPE_L4_FRAG                   0x00000300
+/**
+ * SCTP (Stream Control Transmission Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=132, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=132>
+ */
+#define RTE_PTYPE_L4_SCTP                   0x00000400
+/**
+ * ICMP (Internet Control Message Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=1, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=1>
+ */
+#define RTE_PTYPE_L4_ICMP                   0x00000500
+/**
+ * Non-fragmented IP (Internet Protocol) packet type.
+ * It is used for outer packet for tunneling cases.
+ *
+ * It refers to those packets of any IP types, while cannot be recognized as
+ * any of above L4 types (RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP,
+ * RTE_PTYPE_L4_FRAG, RTE_PTYPE_L4_SCTP, RTE_PTYPE_L4_ICMP).
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'!=[6|17|132|1], 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'!=[6|17|44|132|1]>
+ */
+#define RTE_PTYPE_L4_NONFRAG                0x00000600
+/**
+ * Mask of layer 4 packet types.
+ * It is used for outer packet for tunneling cases.
+ */
+#define RTE_PTYPE_L4_MASK                   0x00000f00
+/**
+ * IP (Internet Protocol) in IP (Internet Protocol) tunneling packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=[4|41]>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=[4|41]>
+ */
+#define RTE_PTYPE_TUNNEL_IP                 0x00001000
+/**
+ * GRE (Generic Routing Encapsulation) tunneling packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=47>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=47>
+ */
+#define RTE_PTYPE_TUNNEL_GRE                0x00002000
+/**
+ * VXLAN (Virtual eXtensible Local Area Network) tunneling packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=17
+ * | 'destination port'=4798>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=17
+ * | 'destination port'=4798>
+ */
+#define RTE_PTYPE_TUNNEL_VXLAN              0x00003000
+/**
+ * NVGRE (Network Virtualization using Generic Routing Encapsulation) tunneling
+ * packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=47
+ * | 'protocol type'=0x6558>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=47
+ * | 'protocol type'=0x6558'>
+ */
+#define RTE_PTYPE_TUNNEL_NVGRE              0x00004000
+/**
+ * GENEVE (Generic Network Virtualization Encapsulation) tunneling packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=17
+ * | 'destination port'=6081>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=17
+ * | 'destination port'=6081>
+ */
+#define RTE_PTYPE_TUNNEL_GENEVE             0x00005000
+/**
+ * Tunneling packet type of Teredo, VXLAN (Virtual eXtensible Local Area
+ * Network) or GRE (Generic Routing Encapsulation) could be recognized as this
+ * packet type, if they can not be recognized independently as of hardware
+ * capability.
+ */
+#define RTE_PTYPE_TUNNEL_GRENAT             0x00006000
+/**
+ * Mask of tunneling packet types.
+ */
+#define RTE_PTYPE_TUNNEL_MASK               0x0000f000
+/**
+ * Ethernet packet type.
+ * It is used for inner packet type only.
+ *
+ * Packet format (inner only):
+ * <'ether type'=[0x800|0x86DD]>
+ */
+#define RTE_PTYPE_INNER_L2_ETHER            0x00010000
+/**
+ * Ethernet packet type with VLAN (Virtual Local Area Network) tag.
+ *
+ * Packet format (inner only):
+ * <'ether type'=[0x800|0x86DD], vlan=[1-4095]>
+ */
+#define RTE_PTYPE_INNER_L2_ETHER_VLAN       0x00020000
+/**
+ * Mask of inner layer 2 packet types.
+ */
+#define RTE_PTYPE_INNER_L2_MASK             0x000f0000
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for inner packet only, and does not contain any header option.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=5>
+ */
+#define RTE_PTYPE_INNER_L3_IPV4             0x00100000
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for inner packet only, and contains header options.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=[6-15], 'options'>
+ */
+#define RTE_PTYPE_INNER_L3_IPV4_EXT         0x00200000
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for inner packet only, and does not contain any extension header.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=0x3B>
+ */
+#define RTE_PTYPE_INNER_L3_IPV6             0x00300000
+/**
+ * IP (Internet Protocol) version 4 packet type.
+ * It is used for inner packet only, and may or maynot contain header options.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'ihl'=[5-15], <'options'>>
+ */
+#define RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN 0x00400000
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for inner packet only, and contains extension headers.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=[0x0|0x2B|0x2C|0x32|0x33|0x3C|0x87],
+ *   'extension headers'>
+ */
+#define RTE_PTYPE_INNER_L3_IPV6_EXT         0x00500000
+/**
+ * IP (Internet Protocol) version 6 packet type.
+ * It is used for inner packet only, and may or maynot contain extension
+ * headers.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=[0x3B|0x0|0x2B|0x2C|0x32|0x33|0x3C|0x87],
+ *   <'extension headers'>>
+ */
+#define RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN 0x00600000
+/**
+ * Mask of inner layer 3 packet types.
+ */
+#define RTE_PTYPE_INNER_INNER_L3_MASK       0x00f00000
+/**
+ * TCP (Transmission Control Protocol) packet type.
+ * It is used for inner packet only.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=6, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=6>
+ */
+#define RTE_PTYPE_INNER_L4_TCP              0x01000000
+/**
+ * UDP (User Datagram Protocol) packet type.
+ * It is used for inner packet only.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=17, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=17>
+ */
+#define RTE_PTYPE_INNER_L4_UDP              0x02000000
+/**
+ * Fragmented IP (Internet Protocol) packet type.
+ * It is used for inner packet only, and may or maynot have layer 4 packet.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'MF'=1>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=44>
+ */
+#define RTE_PTYPE_INNER_L4_FRAG             0x03000000
+/**
+ * SCTP (Stream Control Transmission Protocol) packet type.
+ * It is used for inner packet only.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=132, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=132>
+ */
+#define RTE_PTYPE_INNER_L4_SCTP             0x04000000
+/**
+ * ICMP (Internet Control Message Protocol) packet type.
+ * It is used for inner packet only.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=1, 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=1>
+ */
+#define RTE_PTYPE_INNER_L4_ICMP             0x05000000
+/**
+ * Non-fragmented IP (Internet Protocol) packet type.
+ * It is used for inner packet only, and may or maynot have other unknown layer
+ * 4 packet types.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'!=[6|17|132|1], 'MF'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'!=[6|17|44|132|1]>
+ */
+#define RTE_PTYPE_INNER_L4_NONFRAG          0x06000000
+/**
+ * Mask of inner layer 4 packet types.
+ */
+#define RTE_PTYPE_INNER_L4_MASK             0x0f000000
+
+/**
+ * Check if the (outer) L3 header is IPv4. To avoid comparing IPv4 types one by
+ * one, bit 4 is selected to be used for IPv4 only. Then checking bit 4 can
+ * determine if it is an IPv4 packet.
+ */
+#define  RTE_ETH_IS_IPV4_HDR(ptype) ((ptype) & RTE_PTYPE_L3_IPV4)
+
+/**
+ * Check if the (outer) L3 header is IPv6. To avoid comparing IPv6 types one by
+ * one, bit 6 is selected to be used for IPv6 only. Then checking bit 6 can
+ * determine if it is an IPv6 packet.
+ */
+#define  RTE_ETH_IS_IPV6_HDR(ptype) ((ptype) & RTE_PTYPE_L3_IPV6)
+
+/* Check if it is a tunneling packet */
+#define RTE_ETH_IS_TUNNEL_PKT(ptype) ((ptype) & RTE_PTYPE_TUNNEL_MASK)
+#endif /* RTE_NEXT_ABI */
+
 /**
  * Get the name of a RX offload flag
  *
-- 
1.9.3


* [dpdk-dev] [PATCH v10 03/19] e1000: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 04/19] ixgbe: " Helin Zhang
                     ` (16 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, the bit masks of packet type in
'ol_flags' are replaced by the unified packet type.
To avoid breaking ABI compatibility, all the changes are
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/e1000/igb_rxtx.c | 104 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 43d6703..165144c 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -590,6 +590,101 @@ eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
  *  RX functions
  *
  **********************************************************************/
+#ifdef RTE_NEXT_ABI
+#define IGB_PACKET_TYPE_IPV4              0X01
+#define IGB_PACKET_TYPE_IPV4_TCP          0X11
+#define IGB_PACKET_TYPE_IPV4_UDP          0X21
+#define IGB_PACKET_TYPE_IPV4_SCTP         0X41
+#define IGB_PACKET_TYPE_IPV4_EXT          0X03
+#define IGB_PACKET_TYPE_IPV4_EXT_SCTP     0X43
+#define IGB_PACKET_TYPE_IPV6              0X04
+#define IGB_PACKET_TYPE_IPV6_TCP          0X14
+#define IGB_PACKET_TYPE_IPV6_UDP          0X24
+#define IGB_PACKET_TYPE_IPV6_EXT          0X0C
+#define IGB_PACKET_TYPE_IPV6_EXT_TCP      0X1C
+#define IGB_PACKET_TYPE_IPV6_EXT_UDP      0X2C
+#define IGB_PACKET_TYPE_IPV4_IPV6         0X05
+#define IGB_PACKET_TYPE_IPV4_IPV6_TCP     0X15
+#define IGB_PACKET_TYPE_IPV4_IPV6_UDP     0X25
+#define IGB_PACKET_TYPE_IPV4_IPV6_EXT     0X0D
+#define IGB_PACKET_TYPE_IPV4_IPV6_EXT_TCP 0X1D
+#define IGB_PACKET_TYPE_IPV4_IPV6_EXT_UDP 0X2D
+#define IGB_PACKET_TYPE_MAX               0X80
+#define IGB_PACKET_TYPE_MASK              0X7F
+#define IGB_PACKET_TYPE_SHIFT             0X04
+static inline uint32_t
+igb_rxd_pkt_info_to_pkt_type(uint16_t pkt_info)
+{
+	static const uint32_t
+		ptype_table[IGB_PACKET_TYPE_MAX] __rte_cache_aligned = {
+		[IGB_PACKET_TYPE_IPV4] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4,
+		[IGB_PACKET_TYPE_IPV4_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4_EXT,
+		[IGB_PACKET_TYPE_IPV6] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6,
+		[IGB_PACKET_TYPE_IPV4_IPV6] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6,
+		[IGB_PACKET_TYPE_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT,
+		[IGB_PACKET_TYPE_IPV4_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT,
+		[IGB_PACKET_TYPE_IPV4_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP,
+		[IGB_PACKET_TYPE_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_TCP,
+		[IGB_PACKET_TYPE_IPV4_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_TCP,
+		[IGB_PACKET_TYPE_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_TCP,
+		[IGB_PACKET_TYPE_IPV4_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_TCP,
+		[IGB_PACKET_TYPE_IPV4_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_UDP,
+		[IGB_PACKET_TYPE_IPV6_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_UDP,
+		[IGB_PACKET_TYPE_IPV4_IPV6_UDP] =  RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_UDP,
+		[IGB_PACKET_TYPE_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_UDP,
+		[IGB_PACKET_TYPE_IPV4_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_UDP,
+		[IGB_PACKET_TYPE_IPV4_SCTP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_SCTP,
+		[IGB_PACKET_TYPE_IPV4_EXT_SCTP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4_EXT | RTE_PTYPE_L4_SCTP,
+	};
+	if (unlikely(pkt_info & E1000_RXDADV_PKTTYPE_ETQF))
+		return RTE_PTYPE_UNKNOWN;
+
+	pkt_info = (pkt_info >> IGB_PACKET_TYPE_SHIFT) & IGB_PACKET_TYPE_MASK;
+
+	return ptype_table[pkt_info];
+}
+
+static inline uint64_t
+rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
+{
+	uint64_t pkt_flags = ((hl_tp_rs & 0x0F) == 0) ?  0 : PKT_RX_RSS_HASH;
+
+#if defined(RTE_LIBRTE_IEEE1588)
+	static uint32_t ip_pkt_etqf_map[8] = {
+		0, 0, 0, PKT_RX_IEEE1588_PTP,
+		0, 0, 0, 0,
+	};
+
+	pkt_flags |= ip_pkt_etqf_map[(hl_tp_rs >> 4) & 0x07];
+#endif
+
+	return pkt_flags;
+}
+#else /* RTE_NEXT_ABI */
 static inline uint64_t
 rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
 {
@@ -617,6 +712,7 @@ rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
 #endif
 	return pkt_flags | (((hl_tp_rs & 0x0F) == 0) ?  0 : PKT_RX_RSS_HASH);
 }
+#endif /* RTE_NEXT_ABI */
 
 static inline uint64_t
 rx_desc_status_to_pkt_flags(uint32_t rx_status)
@@ -790,6 +886,10 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
 		pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
 		rxm->ol_flags = pkt_flags;
+#ifdef RTE_NEXT_ABI
+		rxm->packet_type = igb_rxd_pkt_info_to_pkt_type(rxd.wb.lower.
+						lo_dword.hs_rss.pkt_info);
+#endif
 
 		/*
 		 * Store the mbuf address into the next entry of the array
@@ -1024,6 +1124,10 @@ eth_igb_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
 		pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
 		first_seg->ol_flags = pkt_flags;
+#ifdef RTE_NEXT_ABI
+		first_seg->packet_type = igb_rxd_pkt_info_to_pkt_type(rxd.wb.
+					lower.lo_dword.hs_rss.pkt_info);
+#endif
 
 		/* Prefetch data of first segment, if configured to do so. */
 		rte_packet_prefetch((char *)first_seg->buf_addr +
-- 
1.9.3


* [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-13 15:53  0%     ` Thomas Monjalon
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types Helin Zhang
                     ` (18 subsequent siblings)
  19 siblings, 1 reply; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

In order to unify the packet type, the 'packet_type' field in 'struct rte_mbuf'
needs to be extended from 16 to 32 bits. Accordingly, some fields in 'struct rte_mbuf'
are re-organized to support this change for the vector PMD. As 'struct rte_kni_mbuf'
for KNI must map exactly to 'struct rte_mbuf', it is modified accordingly.
In the ixgbe PMD, corresponding changes are added for the mbuf changes; in particular,
the bit masks of packet type in 'ol_flags' are replaced by the unified packet type. In
addition, more packet types (UDP, TCP and SCTP) are supported in the vectorized ixgbe PMD.
To avoid breaking ABI compatibility, all the changes are enabled by RTE_NEXT_ABI,
which is disabled by default.
Note that a performance drop of around 2% (64B packets) was observed when doing 4-port
(1 port per 82599 card) IO forwarding on the same SNB core.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 drivers/net/ixgbe/ixgbe_rxtx_vec.c                 | 75 +++++++++++++++++++++-
 .../linuxapp/eal/include/exec-env/rte_kni_common.h |  6 ++
 lib/librte_mbuf/rte_mbuf.h                         | 26 ++++++++
 3 files changed, 105 insertions(+), 2 deletions(-)

v2 changes:
* Enlarged the packet_type field from 16 bits to 32 bits.
* Redefined the packet type sub-fields.
* Updated the 'struct rte_kni_mbuf' for KNI according to the mbuf changes.

v3 changes:
* Put the mbuf layout changes into a single patch.
* Disabled vector ixgbe PMD by default, as mbuf layout changed.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.
* Integrated with changes of QinQ stripping/insertion.

v8 changes:
* Moved the field of 'vlan_tci_outer' in 'struct rte_mbuf' to the end
  of the 1st cache line, to avoid breaking any vectorized PMD storing.

v9 changes:
* Put the mbuf changes and vector PMD changes together, as they are
  tightly relevant.

diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
index 912d3b4..d3ac74a 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
@@ -134,6 +134,12 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
  */
 #ifdef RTE_IXGBE_RX_OLFLAGS_ENABLE
 
+#ifdef RTE_NEXT_ABI
+#define OLFLAGS_MASK_V  (((uint64_t)PKT_RX_VLAN_PKT << 48) | \
+			((uint64_t)PKT_RX_VLAN_PKT << 32) | \
+			((uint64_t)PKT_RX_VLAN_PKT << 16) | \
+			((uint64_t)PKT_RX_VLAN_PKT))
+#else
 #define OLFLAGS_MASK     ((uint16_t)(PKT_RX_VLAN_PKT | PKT_RX_IPV4_HDR |\
 				     PKT_RX_IPV4_HDR_EXT | PKT_RX_IPV6_HDR |\
 				     PKT_RX_IPV6_HDR_EXT))
@@ -142,11 +148,26 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
 			  ((uint64_t)OLFLAGS_MASK << 16) | \
 			  ((uint64_t)OLFLAGS_MASK))
 #define PTYPE_SHIFT    (1)
+#endif /* RTE_NEXT_ABI */
+
 #define VTAG_SHIFT     (3)
 
 static inline void
 desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
 {
+#ifdef RTE_NEXT_ABI
+	__m128i vtag0, vtag1;
+	union {
+		uint16_t e[4];
+		uint64_t dword;
+	} vol;
+
+	vtag0 = _mm_unpackhi_epi16(descs[0], descs[1]);
+	vtag1 = _mm_unpackhi_epi16(descs[2], descs[3]);
+	vtag1 = _mm_unpacklo_epi32(vtag0, vtag1);
+	vtag1 = _mm_srli_epi16(vtag1, VTAG_SHIFT);
+	vol.dword = _mm_cvtsi128_si64(vtag1) & OLFLAGS_MASK_V;
+#else
 	__m128i ptype0, ptype1, vtag0, vtag1;
 	union {
 		uint16_t e[4];
@@ -166,6 +187,7 @@ desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
 
 	ptype1 = _mm_or_si128(ptype1, vtag1);
 	vol.dword = _mm_cvtsi128_si64(ptype1) & OLFLAGS_MASK_V;
+#endif /* RTE_NEXT_ABI */
 
 	rx_pkts[0]->ol_flags = vol.e[0];
 	rx_pkts[1]->ol_flags = vol.e[1];
@@ -196,6 +218,18 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 	int pos;
 	uint64_t var;
 	__m128i shuf_msk;
+#ifdef RTE_NEXT_ABI
+	__m128i crc_adjust = _mm_set_epi16(
+				0, 0, 0,    /* ignore non-length fields */
+				-rxq->crc_len, /* sub crc on data_len */
+				0,          /* ignore high-16bits of pkt_len */
+				-rxq->crc_len, /* sub crc on pkt_len */
+				0, 0            /* ignore pkt_type field */
+			);
+	__m128i dd_check, eop_check;
+	__m128i desc_mask = _mm_set_epi32(0xFFFFFFFF, 0xFFFFFFFF,
+					  0xFFFFFFFF, 0xFFFF07F0);
+#else
 	__m128i crc_adjust = _mm_set_epi16(
 				0, 0, 0, 0, /* ignore non-length fields */
 				0,          /* ignore high-16bits of pkt_len */
@@ -204,6 +238,7 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 				0            /* ignore pkt_type field */
 			);
 	__m128i dd_check, eop_check;
+#endif /* RTE_NEXT_ABI */
 
 	if (unlikely(nb_pkts < RTE_IXGBE_VPMD_RX_BURST))
 		return 0;
@@ -232,6 +267,18 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 	eop_check = _mm_set_epi64x(0x0000000200000002LL, 0x0000000200000002LL);
 
 	/* mask to shuffle from desc. to mbuf */
+#ifdef RTE_NEXT_ABI
+	shuf_msk = _mm_set_epi8(
+		7, 6, 5, 4,  /* octet 4~7, 32bits rss */
+		15, 14,      /* octet 14~15, low 16 bits vlan_macip */
+		13, 12,      /* octet 12~13, 16 bits data_len */
+		0xFF, 0xFF,  /* skip high 16 bits pkt_len, zero out */
+		13, 12,      /* octet 12~13, low 16 bits pkt_len */
+		0xFF, 0xFF,  /* skip high 16 bits pkt_type */
+		1,           /* octet 1, 8 bits pkt_type field */
+		0            /* octet 0, 4 bits offset 4 pkt_type field */
+		);
+#else
 	shuf_msk = _mm_set_epi8(
 		7, 6, 5, 4,  /* octet 4~7, 32bits rss */
 		0xFF, 0xFF,  /* skip high 16 bits vlan_macip, zero out */
@@ -241,18 +288,28 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		13, 12,      /* octet 12~13, 16 bits data_len */
 		0xFF, 0xFF   /* skip pkt_type field */
 		);
+#endif /* RTE_NEXT_ABI */
 
 	/* Cache is empty -> need to scan the buffer rings, but first move
 	 * the next 'n' mbufs into the cache */
 	sw_ring = &rxq->sw_ring[rxq->rx_tail];
 
-	/*
-	 * A. load 4 packet in one loop
+#ifdef RTE_NEXT_ABI
+	/* A. load 4 packet in one loop
+	 * [A*. mask out 4 unused dirty field in desc]
 	 * B. copy 4 mbuf point from swring to rx_pkts
 	 * C. calc the number of DD bits among the 4 packets
 	 * [C*. extract the end-of-packet bit, if requested]
 	 * D. fill info. from desc to mbuf
 	 */
+#else
+	/* A. load 4 packet in one loop
+	 * B. copy 4 mbuf point from swring to rx_pkts
+	 * C. calc the number of DD bits among the 4 packets
+	 * [C*. extract the end-of-packet bit, if requested]
+	 * D. fill info. from desc to mbuf
+	 */
+#endif /* RTE_NEXT_ABI */
 	for (pos = 0, nb_pkts_recd = 0; pos < RTE_IXGBE_VPMD_RX_BURST;
 			pos += RTE_IXGBE_DESCS_PER_LOOP,
 			rxdp += RTE_IXGBE_DESCS_PER_LOOP) {
@@ -289,6 +346,16 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		/* B.2 copy 2 mbuf point into rx_pkts  */
 		_mm_storeu_si128((__m128i *)&rx_pkts[pos+2], mbp2);
 
+#ifdef RTE_NEXT_ABI
+		/* A* mask out 0~3 bits RSS type */
+		descs[3] = _mm_and_si128(descs[3], desc_mask);
+		descs[2] = _mm_and_si128(descs[2], desc_mask);
+
+		/* A* mask out 0~3 bits RSS type */
+		descs[1] = _mm_and_si128(descs[1], desc_mask);
+		descs[0] = _mm_and_si128(descs[0], desc_mask);
+#endif /* RTE_NEXT_ABI */
+
 		/* avoid compiler reorder optimization */
 		rte_compiler_barrier();
 
@@ -301,7 +368,11 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		/* C.1 4=>2 filter staterr info only */
 		sterr_tmp1 = _mm_unpackhi_epi32(descs[1], descs[0]);
 
+#ifdef RTE_NEXT_ABI
+		/* set ol_flags with vlan packet type */
+#else
 		/* set ol_flags with packet type and vlan tag */
+#endif /* RTE_NEXT_ABI */
 		desc_to_olflags_v(descs, &rx_pkts[pos]);
 
 		/* D.2 pkt 3,4 set in_port/nb_seg and remove crc */
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index 1e55c2d..e9f38bd 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -117,9 +117,15 @@ struct rte_kni_mbuf {
 	uint16_t data_off;      /**< Start address of data in segment buffer. */
 	char pad1[4];
 	uint64_t ol_flags;      /**< Offload features. */
+#ifdef RTE_NEXT_ABI
+	char pad2[4];
+	uint32_t pkt_len;       /**< Total pkt len: sum of all segment data_len. */
+	uint16_t data_len;      /**< Amount of data in segment buffer. */
+#else
 	char pad2[2];
 	uint16_t data_len;      /**< Amount of data in segment buffer. */
 	uint32_t pkt_len;       /**< Total pkt len: sum of all segment data_len. */
+#endif
 
 	/* fields on second cache line */
 	char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 80419df..ac29da3 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -276,6 +276,28 @@ struct rte_mbuf {
 	/* remaining bytes are set on RX when pulling packet from descriptor */
 	MARKER rx_descriptor_fields1;
 
+#ifdef RTE_NEXT_ABI
+	/*
+	 * The packet type, which is the combination of outer/inner L2, L3, L4
+	 * and tunnel types.
+	 */
+	union {
+		uint32_t packet_type; /**< L2/L3/L4 and tunnel information. */
+		struct {
+			uint32_t l2_type:4; /**< (Outer) L2 type. */
+			uint32_t l3_type:4; /**< (Outer) L3 type. */
+			uint32_t l4_type:4; /**< (Outer) L4 type. */
+			uint32_t tun_type:4; /**< Tunnel type. */
+			uint32_t inner_l2_type:4; /**< Inner L2 type. */
+			uint32_t inner_l3_type:4; /**< Inner L3 type. */
+			uint32_t inner_l4_type:4; /**< Inner L4 type. */
+		};
+	};
+
+	uint32_t pkt_len;         /**< Total pkt len: sum of all segments. */
+	uint16_t data_len;        /**< Amount of data in segment buffer. */
+	uint16_t vlan_tci;        /**< VLAN Tag Control Identifier (CPU order) */
+#else /* RTE_NEXT_ABI */
 	/**
 	 * The packet type, which is used to indicate ordinary packet and also
 	 * tunneled packet format, i.e. each number is represented a type of
@@ -287,6 +309,7 @@ struct rte_mbuf {
 	uint32_t pkt_len;         /**< Total pkt len: sum of all segments. */
 	uint16_t vlan_tci;        /**< VLAN Tag Control Identifier (CPU order) */
 	uint16_t vlan_tci_outer;  /**< Outer VLAN Tag Control Identifier (CPU order) */
+#endif /* RTE_NEXT_ABI */
 	union {
 		uint32_t rss;     /**< RSS hash result if RSS enabled */
 		struct {
@@ -307,6 +330,9 @@ struct rte_mbuf {
 	} hash;                   /**< hash information */
 
 	uint32_t seqn; /**< Sequence number. See also rte_reorder_insert() */
+#ifdef RTE_NEXT_ABI
+	uint16_t vlan_tci_outer;  /**< Outer VLAN Tag Control Identifier (CPU order) */
+#endif /* RTE_NEXT_ABI */
 
 	/* second cache line - fields only used in slow path or on TX */
 	MARKER cacheline1 __rte_cache_aligned;
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 04/19] ixgbe: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (2 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 03/19] e1000: replace bit mask based packet type with unified packet type Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 05/19] i40e: " Helin Zhang
                     ` (15 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, the packet type bit masks in
'ol_flags' are replaced by the unified packet type.
To avoid breaking ABI compatibility, all the changes are enabled only
under RTE_NEXT_ABI, which is disabled by default.
Note that a performance drop of around 2.5% (64B packets) was observed
when doing 4-port (1 port per 82599 card) IO forwarding on the same SNB core.
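
For illustration, a minimal sketch of what this change means on the
application side, assuming RTE_NEXT_ABI is enabled and using the
RTE_ETH_IS_IPV4_HDR helper introduced earlier in this series (the wrapper
function itself is hypothetical):

#include <rte_mbuf.h>

/* Sketch only: old vs. new way for an application to test for IPv4 on RX. */
static inline int
rx_pkt_is_ipv4(const struct rte_mbuf *m)
{
#ifdef RTE_NEXT_ABI
	/* unified packet type: look at mbuf->packet_type */
	return RTE_ETH_IS_IPV4_HDR(m->packet_type);
#else
	/* legacy: packet type encoded as ol_flags bits */
	return (m->ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV4_HDR_EXT)) != 0;
#endif
}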

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 163 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index b1db57f..9e99e80 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -859,6 +859,110 @@ end_of_tx:
  *  RX functions
  *
  **********************************************************************/
+#ifdef RTE_NEXT_ABI
+#define IXGBE_PACKET_TYPE_IPV4              0X01
+#define IXGBE_PACKET_TYPE_IPV4_TCP          0X11
+#define IXGBE_PACKET_TYPE_IPV4_UDP          0X21
+#define IXGBE_PACKET_TYPE_IPV4_SCTP         0X41
+#define IXGBE_PACKET_TYPE_IPV4_EXT          0X03
+#define IXGBE_PACKET_TYPE_IPV4_EXT_SCTP     0X43
+#define IXGBE_PACKET_TYPE_IPV6              0X04
+#define IXGBE_PACKET_TYPE_IPV6_TCP          0X14
+#define IXGBE_PACKET_TYPE_IPV6_UDP          0X24
+#define IXGBE_PACKET_TYPE_IPV6_EXT          0X0C
+#define IXGBE_PACKET_TYPE_IPV6_EXT_TCP      0X1C
+#define IXGBE_PACKET_TYPE_IPV6_EXT_UDP      0X2C
+#define IXGBE_PACKET_TYPE_IPV4_IPV6         0X05
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_TCP     0X15
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_UDP     0X25
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_EXT     0X0D
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_TCP 0X1D
+#define IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_UDP 0X2D
+#define IXGBE_PACKET_TYPE_MAX               0X80
+#define IXGBE_PACKET_TYPE_MASK              0X7F
+#define IXGBE_PACKET_TYPE_SHIFT             0X04
+static inline uint32_t
+ixgbe_rxd_pkt_info_to_pkt_type(uint16_t pkt_info)
+{
+	static const uint32_t
+		ptype_table[IXGBE_PACKET_TYPE_MAX] __rte_cache_aligned = {
+		[IXGBE_PACKET_TYPE_IPV4] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4,
+		[IXGBE_PACKET_TYPE_IPV4_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4_EXT,
+		[IXGBE_PACKET_TYPE_IPV6] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6,
+		[IXGBE_PACKET_TYPE_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6_EXT] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT,
+		[IXGBE_PACKET_TYPE_IPV4_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP,
+		[IXGBE_PACKET_TYPE_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_TCP,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_TCP,
+		[IXGBE_PACKET_TYPE_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_TCP,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_TCP,
+		[IXGBE_PACKET_TYPE_IPV4_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_UDP,
+		[IXGBE_PACKET_TYPE_IPV6_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_UDP,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6 | RTE_PTYPE_INNER_L4_UDP,
+		[IXGBE_PACKET_TYPE_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT | RTE_PTYPE_L4_UDP,
+		[IXGBE_PACKET_TYPE_IPV4_IPV6_EXT_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT | RTE_PTYPE_INNER_L4_UDP,
+		[IXGBE_PACKET_TYPE_IPV4_SCTP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_SCTP,
+		[IXGBE_PACKET_TYPE_IPV4_EXT_SCTP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4_EXT | RTE_PTYPE_L4_SCTP,
+	};
+	if (unlikely(pkt_info & IXGBE_RXDADV_PKTTYPE_ETQF))
+		return RTE_PTYPE_UNKNOWN;
+
+	pkt_info = (pkt_info >> IXGBE_PACKET_TYPE_SHIFT) &
+				IXGBE_PACKET_TYPE_MASK;
+
+	return ptype_table[pkt_info];
+}
+
+static inline uint64_t
+ixgbe_rxd_pkt_info_to_pkt_flags(uint16_t pkt_info)
+{
+	static uint64_t ip_rss_types_map[16] __rte_cache_aligned = {
+		0, PKT_RX_RSS_HASH, PKT_RX_RSS_HASH, PKT_RX_RSS_HASH,
+		0, PKT_RX_RSS_HASH, 0, PKT_RX_RSS_HASH,
+		PKT_RX_RSS_HASH, 0, 0, 0,
+		0, 0, 0,  PKT_RX_FDIR,
+	};
+#ifdef RTE_LIBRTE_IEEE1588
+	static uint64_t ip_pkt_etqf_map[8] = {
+		0, 0, 0, PKT_RX_IEEE1588_PTP,
+		0, 0, 0, 0,
+	};
+
+	if (likely(pkt_info & IXGBE_RXDADV_PKTTYPE_ETQF))
+		return ip_pkt_etqf_map[(pkt_info >> 4) & 0X07] |
+				ip_rss_types_map[pkt_info & 0XF];
+	else
+		return ip_rss_types_map[pkt_info & 0XF];
+#else
+	return ip_rss_types_map[pkt_info & 0XF];
+#endif
+}
+#else /* RTE_NEXT_ABI */
 static inline uint64_t
 rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
 {
@@ -894,6 +998,7 @@ rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
 #endif
 	return pkt_flags | ip_rss_types_map[hl_tp_rs & 0xF];
 }
+#endif /* RTE_NEXT_ABI */
 
 static inline uint64_t
 rx_desc_status_to_pkt_flags(uint32_t rx_status)
@@ -949,7 +1054,13 @@ ixgbe_rx_scan_hw_ring(struct ixgbe_rx_queue *rxq)
 	struct rte_mbuf *mb;
 	uint16_t pkt_len;
 	uint64_t pkt_flags;
+#ifdef RTE_NEXT_ABI
+	int nb_dd;
+	uint32_t s[LOOK_AHEAD];
+	uint16_t pkt_info[LOOK_AHEAD];
+#else
 	int s[LOOK_AHEAD], nb_dd;
+#endif /* RTE_NEXT_ABI */
 	int i, j, nb_rx = 0;
 
 
@@ -972,6 +1083,12 @@ ixgbe_rx_scan_hw_ring(struct ixgbe_rx_queue *rxq)
 		for (j = LOOK_AHEAD-1; j >= 0; --j)
 			s[j] = rxdp[j].wb.upper.status_error;
 
+#ifdef RTE_NEXT_ABI
+		for (j = LOOK_AHEAD-1; j >= 0; --j)
+			pkt_info[j] = rxdp[j].wb.lower.lo_dword.
+						hs_rss.pkt_info;
+#endif /* RTE_NEXT_ABI */
+
 		/* Compute how many status bits were set */
 		nb_dd = 0;
 		for (j = 0; j < LOOK_AHEAD; ++j)
@@ -988,12 +1105,22 @@ ixgbe_rx_scan_hw_ring(struct ixgbe_rx_queue *rxq)
 			mb->vlan_tci = rte_le_to_cpu_16(rxdp[j].wb.upper.vlan);
 
 			/* convert descriptor fields to rte mbuf flags */
+#ifdef RTE_NEXT_ABI
+			pkt_flags = rx_desc_status_to_pkt_flags(s[j]);
+			pkt_flags |= rx_desc_error_to_pkt_flags(s[j]);
+			pkt_flags |=
+				ixgbe_rxd_pkt_info_to_pkt_flags(pkt_info[j]);
+			mb->ol_flags = pkt_flags;
+			mb->packet_type =
+				ixgbe_rxd_pkt_info_to_pkt_type(pkt_info[j]);
+#else /* RTE_NEXT_ABI */
 			pkt_flags  = rx_desc_hlen_type_rss_to_pkt_flags(
 					rxdp[j].wb.lower.lo_dword.data);
 			/* reuse status field from scan list */
 			pkt_flags |= rx_desc_status_to_pkt_flags(s[j]);
 			pkt_flags |= rx_desc_error_to_pkt_flags(s[j]);
 			mb->ol_flags = pkt_flags;
+#endif /* RTE_NEXT_ABI */
 
 			if (likely(pkt_flags & PKT_RX_RSS_HASH))
 				mb->hash.rss = rxdp[j].wb.lower.hi_dword.rss;
@@ -1210,7 +1337,11 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 	union ixgbe_adv_rx_desc rxd;
 	uint64_t dma_addr;
 	uint32_t staterr;
+#ifdef RTE_NEXT_ABI
+	uint32_t pkt_info;
+#else
 	uint32_t hlen_type_rss;
+#endif
 	uint16_t pkt_len;
 	uint16_t rx_id;
 	uint16_t nb_rx;
@@ -1328,6 +1459,19 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		rxm->data_len = pkt_len;
 		rxm->port = rxq->port_id;
 
+#ifdef RTE_NEXT_ABI
+		pkt_info = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.hs_rss.
+								pkt_info);
+		/* Only valid if PKT_RX_VLAN_PKT set in pkt_flags */
+		rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
+
+		pkt_flags = rx_desc_status_to_pkt_flags(staterr);
+		pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
+		pkt_flags = pkt_flags |
+			ixgbe_rxd_pkt_info_to_pkt_flags(pkt_info);
+		rxm->ol_flags = pkt_flags;
+		rxm->packet_type = ixgbe_rxd_pkt_info_to_pkt_type(pkt_info);
+#else /* RTE_NEXT_ABI */
 		hlen_type_rss = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.data);
 		/* Only valid if PKT_RX_VLAN_PKT set in pkt_flags */
 		rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
@@ -1336,6 +1480,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
 		pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
 		rxm->ol_flags = pkt_flags;
+#endif /* RTE_NEXT_ABI */
 
 		if (likely(pkt_flags & PKT_RX_RSS_HASH))
 			rxm->hash.rss = rxd.wb.lower.hi_dword.rss;
@@ -1409,6 +1554,23 @@ ixgbe_fill_cluster_head_buf(
 	uint8_t port_id,
 	uint32_t staterr)
 {
+#ifdef RTE_NEXT_ABI
+	uint16_t pkt_info;
+	uint64_t pkt_flags;
+
+	head->port = port_id;
+
+	/* The vlan_tci field is only valid when PKT_RX_VLAN_PKT is
+	 * set in the pkt_flags field.
+	 */
+	head->vlan_tci = rte_le_to_cpu_16(desc->wb.upper.vlan);
+	pkt_info = rte_le_to_cpu_32(desc->wb.lower.lo_dword.hs_rss.pkt_info);
+	pkt_flags = rx_desc_status_to_pkt_flags(staterr);
+	pkt_flags |= rx_desc_error_to_pkt_flags(staterr);
+	pkt_flags |= ixgbe_rxd_pkt_info_to_pkt_flags(pkt_info);
+	head->ol_flags = pkt_flags;
+	head->packet_type = ixgbe_rxd_pkt_info_to_pkt_type(pkt_info);
+#else /* RTE_NEXT_ABI */
 	uint32_t hlen_type_rss;
 	uint64_t pkt_flags;
 
@@ -1424,6 +1586,7 @@ ixgbe_fill_cluster_head_buf(
 	pkt_flags |= rx_desc_status_to_pkt_flags(staterr);
 	pkt_flags |= rx_desc_error_to_pkt_flags(staterr);
 	head->ol_flags = pkt_flags;
+#endif /* RTE_NEXT_ABI */
 
 	if (likely(pkt_flags & PKT_RX_RSS_HASH))
 		head->hash.rss = rte_le_to_cpu_32(desc->wb.lower.hi_dword.rss);
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 05/19] i40e: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (3 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 04/19] ixgbe: " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 06/19] enic: " Helin Zhang
                     ` (14 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, the packet type bit masks in
'ol_flags' are replaced by the unified packet type.
To avoid breaking ABI compatibility, all the changes are enabled only
under RTE_NEXT_ABI, which is disabled by default.
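
For illustration, a minimal sketch of how an application could pick apart the
tunnelled packet types produced by this mapping, assuming RTE_NEXT_ABI and the
RTE_PTYPE_* definitions from this series (the helper name is hypothetical):

#include <rte_mbuf.h>

/* Sketch only: classify the inner L4 protocol of a tunnelled packet. */
static const char *
inner_l4_name(uint32_t ptype)
{
	if (!RTE_ETH_IS_TUNNEL_PKT(ptype))
		return "not tunnelled";

	switch (ptype & RTE_PTYPE_INNER_L4_MASK) {
	case RTE_PTYPE_INNER_L4_TCP:
		return "inner TCP";
	case RTE_PTYPE_INNER_L4_UDP:
		return "inner UDP";
	case RTE_PTYPE_INNER_L4_SCTP:
		return "inner SCTP";
	default:
		return "other/unknown inner L4";
	}
}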

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/i40e/i40e_rxtx.c | 554 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 554 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 88b015d..c667bbc 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -176,6 +176,540 @@ i40e_rxd_error_to_pkt_flags(uint64_t qword)
 	return flags;
 }
 
+#ifdef RTE_NEXT_ABI
+/* For each value it means, datasheet of hardware can tell more details */
+static inline uint32_t
+i40e_rxd_pkt_type_mapping(uint8_t ptype)
+{
+	static const uint32_t ptype_table[UINT8_MAX] __rte_cache_aligned = {
+		/* L2 types */
+		/* [0] reserved */
+		[1] = RTE_PTYPE_L2_ETHER,
+		[2] = RTE_PTYPE_L2_ETHER_TIMESYNC,
+		/* [3] - [5] reserved */
+		[6] = RTE_PTYPE_L2_ETHER_LLDP,
+		/* [7] - [10] reserved */
+		[11] = RTE_PTYPE_L2_ETHER_ARP,
+		/* [12] - [21] reserved */
+
+		/* Non tunneled IPv4 */
+		[22] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_FRAG,
+		[23] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_NONFRAG,
+		[24] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_UDP,
+		/* [25] reserved */
+		[26] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_TCP,
+		[27] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_SCTP,
+		[28] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_L4_ICMP,
+
+		/* IPv4 --> IPv4 */
+		[29] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[30] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[31] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [32] reserved */
+		[33] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[34] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[35] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> IPv6 */
+		[36] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[37] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[38] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [39] reserved */
+		[40] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[41] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[42] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN */
+		[43] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> IPv4 */
+		[44] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[45] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[46] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [47] reserved */
+		[48] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[49] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[50] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> IPv6 */
+		[51] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[52] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[53] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [54] reserved */
+		[55] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[56] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[57] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC */
+		[58] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC --> IPv4 */
+		[59] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[60] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[61] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [62] reserved */
+		[63] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[64] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[65] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC --> IPv6 */
+		[66] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[67] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[68] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [69] reserved */
+		[70] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[71] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[72] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC/VLAN */
+		[73] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC/VLAN --> IPv4 */
+		[74] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[75] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[76] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [77] reserved */
+		[78] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[79] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[80] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv4 --> GRE/Teredo/VXLAN --> MAC/VLAN --> IPv6 */
+		[81] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[82] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[83] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [84] reserved */
+		[85] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[86] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[87] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* Non tunneled IPv6 */
+		[88] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_FRAG,
+		[89] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_NONFRAG,
+		[90] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_UDP,
+		/* [91] reserved */
+		[92] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_TCP,
+		[93] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_SCTP,
+		[94] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_L4_ICMP,
+
+		/* IPv6 --> IPv4 */
+		[95] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[96] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[97] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [98] reserved */
+		[99] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[100] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[101] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> IPv6 */
+		[102] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[103] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[104] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [105] reserved */
+		[106] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[107] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[108] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_IP |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN */
+		[109] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> IPv4 */
+		[110] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[111] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[112] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [113] reserved */
+		[114] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[115] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[116] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> IPv6 */
+		[117] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[118] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[119] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [120] reserved */
+		[121] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[122] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[123] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC */
+		[124] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC --> IPv4 */
+		[125] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[126] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[127] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [128] reserved */
+		[129] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[130] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[131] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC --> IPv6 */
+		[132] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[133] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[134] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [135] reserved */
+		[136] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[137] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[138] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC/VLAN */
+		[139] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC/VLAN --> IPv4 */
+		[140] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[141] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[142] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [143] reserved */
+		[144] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[145] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[146] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* IPv6 --> GRE/Teredo/VXLAN --> MAC/VLAN --> IPv6 */
+		[147] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_FRAG,
+		[148] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_NONFRAG,
+		[149] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_UDP,
+		/* [150] reserved */
+		[151] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_TCP,
+		[152] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_SCTP,
+		[153] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_TUNNEL_GRENAT |
+			RTE_PTYPE_INNER_L2_ETHER_VLAN |
+			RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN |
+			RTE_PTYPE_INNER_L4_ICMP,
+
+		/* All others reserved */
+	};
+
+	return ptype_table[ptype];
+}
+#else /* RTE_NEXT_ABI */
 /* Translate pkt types to pkt flags */
 static inline uint64_t
 i40e_rxd_ptype_to_pkt_flags(uint64_t qword)
@@ -443,6 +977,7 @@ i40e_rxd_ptype_to_pkt_flags(uint64_t qword)
 
 	return ip_ptype_map[ptype];
 }
+#endif /* RTE_NEXT_ABI */
 
 #define I40E_RX_DESC_EXT_STATUS_FLEXBH_MASK   0x03
 #define I40E_RX_DESC_EXT_STATUS_FLEXBH_FD_ID  0x01
@@ -730,11 +1265,18 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
 			i40e_rxd_to_vlan_tci(mb, &rxdp[j]);
 			pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
 			pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
+#ifdef RTE_NEXT_ABI
+			mb->packet_type =
+				i40e_rxd_pkt_type_mapping((uint8_t)((qword1 &
+						I40E_RXD_QW1_PTYPE_MASK) >>
+						I40E_RXD_QW1_PTYPE_SHIFT));
+#else
 			pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
 
 			mb->packet_type = (uint16_t)((qword1 &
 					I40E_RXD_QW1_PTYPE_MASK) >>
 					I40E_RXD_QW1_PTYPE_SHIFT);
+#endif /* RTE_NEXT_ABI */
 			if (pkt_flags & PKT_RX_RSS_HASH)
 				mb->hash.rss = rte_le_to_cpu_32(\
 					rxdp[j].wb.qword0.hi_dword.rss);
@@ -971,9 +1513,15 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 		i40e_rxd_to_vlan_tci(rxm, &rxd);
 		pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
 		pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
+#ifdef RTE_NEXT_ABI
+		rxm->packet_type =
+			i40e_rxd_pkt_type_mapping((uint8_t)((qword1 &
+			I40E_RXD_QW1_PTYPE_MASK) >> I40E_RXD_QW1_PTYPE_SHIFT));
+#else
 		pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
 		rxm->packet_type = (uint16_t)((qword1 & I40E_RXD_QW1_PTYPE_MASK) >>
 				I40E_RXD_QW1_PTYPE_SHIFT);
+#endif /* RTE_NEXT_ABI */
 		if (pkt_flags & PKT_RX_RSS_HASH)
 			rxm->hash.rss =
 				rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss);
@@ -1129,10 +1677,16 @@ i40e_recv_scattered_pkts(void *rx_queue,
 		i40e_rxd_to_vlan_tci(first_seg, &rxd);
 		pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
 		pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
+#ifdef RTE_NEXT_ABI
+		first_seg->packet_type =
+			i40e_rxd_pkt_type_mapping((uint8_t)((qword1 &
+			I40E_RXD_QW1_PTYPE_MASK) >> I40E_RXD_QW1_PTYPE_SHIFT));
+#else
 		pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
 		first_seg->packet_type = (uint16_t)((qword1 &
 					I40E_RXD_QW1_PTYPE_MASK) >>
 					I40E_RXD_QW1_PTYPE_SHIFT);
+#endif /* RTE_NEXT_ABI */
 		if (pkt_flags & PKT_RX_RSS_HASH)
 			rxm->hash.rss =
 				rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss);
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 09/19] cxgbe: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (7 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 08/19] fm10k: " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 10/19] app/test-pipeline: " Helin Zhang
                     ` (10 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, the packet type bit masks in
'ol_flags' are replaced by the unified packet type.
To avoid breaking ABI compatibility, all the changes are enabled only
under RTE_NEXT_ABI, which is disabled by default.
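
Note that this driver fills in only the L3 part of packet_type. The following
minimal sketch (helper name hypothetical, assuming the RTE_PTYPE_* masks from
this series) shows why consumers should mask before comparing rather than test
the full 32-bit value:

#include <rte_mbuf.h>

/* Sketch only: cxgbe sets just RTE_PTYPE_L3_IPV4/IPV6, so compare the
 * masked L3 field instead of the whole packet_type value. */
static inline int
rx_pkt_is_ipv6(const struct rte_mbuf *m)
{
	return (m->packet_type & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV6;
}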

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/cxgbe/sge.c | 8 ++++++++
 1 file changed, 8 insertions(+)

v9 changes:
* Added unified packet type support in newly added cxgbe driver.

diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 359296e..fdae0b4 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1326,14 +1326,22 @@ int t4_ethrx_handler(struct sge_rspq *q, const __be64 *rsp,
 
 	mbuf->port = pkt->iff;
 	if (pkt->l2info & htonl(F_RXF_IP)) {
+#ifdef RTE_NEXT_ABI
+		mbuf->packet_type = RTE_PTYPE_L3_IPV4;
+#else
 		mbuf->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
 		if (unlikely(!csum_ok))
 			mbuf->ol_flags |= PKT_RX_IP_CKSUM_BAD;
 
 		if ((pkt->l2info & htonl(F_RXF_UDP | F_RXF_TCP)) && !csum_ok)
 			mbuf->ol_flags |= PKT_RX_L4_CKSUM_BAD;
 	} else if (pkt->l2info & htonl(F_RXF_IP6)) {
+#ifdef RTE_NEXT_ABI
+		mbuf->packet_type = RTE_PTYPE_L3_IPV6;
+#else
 		mbuf->ol_flags |= PKT_RX_IPV6_HDR;
+#endif
 	}
 
 	mbuf->port = pkt->iff;
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 06/19] enic: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (4 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 05/19] i40e: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 07/19] vmxnet3: " Helin Zhang
                     ` (13 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, the packet type bit masks in
'ol_flags' are replaced by the unified packet type.
To avoid breaking ABI compatibility, all the changes are enabled only
under RTE_NEXT_ABI, which is disabled by default.
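
For illustration, a minimal sketch showing that, with the unified scheme,
protocol information lives in packet_type while checksum status remains in
ol_flags, so the two are tested separately (helper name hypothetical,
assuming RTE_NEXT_ABI):

#include <rte_mbuf.h>

/* Sketch only: protocol from packet_type, checksum status from ol_flags. */
static inline int
rx_ipv4_csum_ok(const struct rte_mbuf *m)
{
	if ((m->packet_type & RTE_PTYPE_L3_MASK) != RTE_PTYPE_L3_IPV4)
		return 0;
	return (m->ol_flags & PKT_RX_IP_CKSUM_BAD) == 0;
}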

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/enic/enic_main.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 15313c2..f47e96c 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -423,7 +423,11 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
 		rx_pkt->pkt_len = bytes_written;
 
 		if (ipv4) {
+#ifdef RTE_NEXT_ABI
+			rx_pkt->packet_type = RTE_PTYPE_L3_IPV4;
+#else
 			rx_pkt->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
 			if (!csum_not_calc) {
 				if (unlikely(!ipv4_csum_ok))
 					rx_pkt->ol_flags |= PKT_RX_IP_CKSUM_BAD;
@@ -432,7 +436,11 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
 					rx_pkt->ol_flags |= PKT_RX_L4_CKSUM_BAD;
 			}
 		} else if (ipv6)
+#ifdef RTE_NEXT_ABI
+			rx_pkt->packet_type = RTE_PTYPE_L3_IPV6;
+#else
 			rx_pkt->ol_flags |= PKT_RX_IPV6_HDR;
+#endif
 	} else {
 		/* Header split */
 		if (sop && !eop) {
@@ -445,7 +453,11 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
 				*rx_pkt_bucket = rx_pkt;
 				rx_pkt->pkt_len = bytes_written;
 				if (ipv4) {
+#ifdef RTE_NEXT_ABI
+					rx_pkt->packet_type = RTE_PTYPE_L3_IPV4;
+#else
 					rx_pkt->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
 					if (!csum_not_calc) {
 						if (unlikely(!ipv4_csum_ok))
 							rx_pkt->ol_flags |=
@@ -457,13 +469,22 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
 							    PKT_RX_L4_CKSUM_BAD;
 					}
 				} else if (ipv6)
+#ifdef RTE_NEXT_ABI
+					rx_pkt->packet_type = RTE_PTYPE_L3_IPV6;
+#else
 					rx_pkt->ol_flags |= PKT_RX_IPV6_HDR;
+#endif
 			} else {
 				/* Payload */
 				hdr_rx_pkt = *rx_pkt_bucket;
 				hdr_rx_pkt->pkt_len += bytes_written;
 				if (ipv4) {
+#ifdef RTE_NEXT_ABI
+					hdr_rx_pkt->packet_type =
+						RTE_PTYPE_L3_IPV4;
+#else
 					hdr_rx_pkt->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
 					if (!csum_not_calc) {
 						if (unlikely(!ipv4_csum_ok))
 							hdr_rx_pkt->ol_flags |=
@@ -475,7 +496,12 @@ static int enic_rq_indicate_buf(struct vnic_rq *rq,
 							    PKT_RX_L4_CKSUM_BAD;
 					}
 				} else if (ipv6)
+#ifdef RTE_NEXT_ABI
+					hdr_rx_pkt->packet_type =
+						RTE_PTYPE_L3_IPV6;
+#else
 					hdr_rx_pkt->ol_flags |= PKT_RX_IPV6_HDR;
+#endif
 
 			}
 		}
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 11/19] app/testpmd: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (9 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 10/19] app/test-pipeline: " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 12/19] app/test: Remove useless code Helin Zhang
                     ` (8 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, the packet type bit masks in
'ol_flags' are replaced by the unified packet type.
To avoid breaking ABI compatibility, all the changes are enabled only
under RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
 app/test-pmd/csumonly.c |  14 ++++
 app/test-pmd/rxonly.c   | 183 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 197 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v4 changes:
* Added printing of the packet type of each received packet in rxonly mode.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 4287940..1bf3485 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -202,8 +202,14 @@ parse_ethernet(struct ether_hdr *eth_hdr, struct testpmd_offload_info *info)
 
 /* Parse a vxlan header */
 static void
+#ifdef RTE_NEXT_ABI
+parse_vxlan(struct udp_hdr *udp_hdr,
+	    struct testpmd_offload_info *info,
+	    uint32_t pkt_type)
+#else
 parse_vxlan(struct udp_hdr *udp_hdr, struct testpmd_offload_info *info,
 	uint64_t mbuf_olflags)
+#endif
 {
 	struct ether_hdr *eth_hdr;
 
@@ -211,8 +217,12 @@ parse_vxlan(struct udp_hdr *udp_hdr, struct testpmd_offload_info *info,
 	 * (rfc7348) or that the rx offload flag is set (i40e only
 	 * currently) */
 	if (udp_hdr->dst_port != _htons(4789) &&
+#ifdef RTE_NEXT_ABI
+		RTE_ETH_IS_TUNNEL_PKT(pkt_type) == 0)
+#else
 		(mbuf_olflags & (PKT_RX_TUNNEL_IPV4_HDR |
 			PKT_RX_TUNNEL_IPV6_HDR)) == 0)
+#endif
 		return;
 
 	info->is_tunnel = 1;
@@ -549,7 +559,11 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 				struct udp_hdr *udp_hdr;
 				udp_hdr = (struct udp_hdr *)((char *)l3_hdr +
 					info.l3_len);
+#ifdef RTE_NEXT_ABI
+				parse_vxlan(udp_hdr, &info, m->packet_type);
+#else
 				parse_vxlan(udp_hdr, &info, m->ol_flags);
+#endif
 			} else if (info.l4_proto == IPPROTO_GRE) {
 				struct simple_gre_hdr *gre_hdr;
 				gre_hdr = (struct simple_gre_hdr *)
diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index 4a9f86e..632056d 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -91,7 +91,11 @@ pkt_burst_receive(struct fwd_stream *fs)
 	uint64_t ol_flags;
 	uint16_t nb_rx;
 	uint16_t i, packet_type;
+#ifdef RTE_NEXT_ABI
+	uint16_t is_encapsulation;
+#else
 	uint64_t is_encapsulation;
+#endif
 
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
 	uint64_t start_tsc;
@@ -135,8 +139,12 @@ pkt_burst_receive(struct fwd_stream *fs)
 		ol_flags = mb->ol_flags;
 		packet_type = mb->packet_type;
 
+#ifdef RTE_NEXT_ABI
+		is_encapsulation = RTE_ETH_IS_TUNNEL_PKT(packet_type);
+#else
 		is_encapsulation = ol_flags & (PKT_RX_TUNNEL_IPV4_HDR |
 				PKT_RX_TUNNEL_IPV6_HDR);
+#endif
 
 		print_ether_addr("  src=", &eth_hdr->s_addr);
 		print_ether_addr(" - dst=", &eth_hdr->d_addr);
@@ -163,6 +171,177 @@ pkt_burst_receive(struct fwd_stream *fs)
 		if (ol_flags & PKT_RX_QINQ_PKT)
 			printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x",
 					mb->vlan_tci, mb->vlan_tci_outer);
+#ifdef RTE_NEXT_ABI
+		if (mb->packet_type) {
+			uint32_t ptype;
+
+			/* (outer) L2 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_L2_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_L2_ETHER:
+				printf(" - (outer) L2 type: ETHER");
+				break;
+			case RTE_PTYPE_L2_ETHER_TIMESYNC:
+				printf(" - (outer) L2 type: ETHER_Timesync");
+				break;
+			case RTE_PTYPE_L2_ETHER_ARP:
+				printf(" - (outer) L2 type: ETHER_ARP");
+				break;
+			case RTE_PTYPE_L2_ETHER_LLDP:
+				printf(" - (outer) L2 type: ETHER_LLDP");
+				break;
+			default:
+				printf(" - (outer) L2 type: Unknown");
+				break;
+			}
+
+			/* (outer) L3 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_L3_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_L3_IPV4:
+				printf(" - (outer) L3 type: IPV4");
+				break;
+			case RTE_PTYPE_L3_IPV4_EXT:
+				printf(" - (outer) L3 type: IPV4_EXT");
+				break;
+			case RTE_PTYPE_L3_IPV6:
+				printf(" - (outer) L3 type: IPV6");
+				break;
+			case RTE_PTYPE_L3_IPV4_EXT_UNKNOWN:
+				printf(" - (outer) L3 type: IPV4_EXT_UNKNOWN");
+				break;
+			case RTE_PTYPE_L3_IPV6_EXT:
+				printf(" - (outer) L3 type: IPV6_EXT");
+				break;
+			case RTE_PTYPE_L3_IPV6_EXT_UNKNOWN:
+				printf(" - (outer) L3 type: IPV6_EXT_UNKNOWN");
+				break;
+			default:
+				printf(" - (outer) L3 type: Unknown");
+				break;
+			}
+
+			/* (outer) L4 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_L4_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_L4_TCP:
+				printf(" - (outer) L4 type: TCP");
+				break;
+			case RTE_PTYPE_L4_UDP:
+				printf(" - (outer) L4 type: UDP");
+				break;
+			case RTE_PTYPE_L4_FRAG:
+				printf(" - (outer) L4 type: L4_FRAG");
+				break;
+			case RTE_PTYPE_L4_SCTP:
+				printf(" - (outer) L4 type: SCTP");
+				break;
+			case RTE_PTYPE_L4_ICMP:
+				printf(" - (outer) L4 type: ICMP");
+				break;
+			case RTE_PTYPE_L4_NONFRAG:
+				printf(" - (outer) L4 type: L4_NONFRAG");
+				break;
+			default:
+				printf(" - (outer) L4 type: Unknown");
+				break;
+			}
+
+			/* packet tunnel type */
+			ptype = mb->packet_type & RTE_PTYPE_TUNNEL_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_TUNNEL_IP:
+				printf(" - Tunnel type: IP");
+				break;
+			case RTE_PTYPE_TUNNEL_GRE:
+				printf(" - Tunnel type: GRE");
+				break;
+			case RTE_PTYPE_TUNNEL_VXLAN:
+				printf(" - Tunnel type: VXLAN");
+				break;
+			case RTE_PTYPE_TUNNEL_NVGRE:
+				printf(" - Tunnel type: NVGRE");
+				break;
+			case RTE_PTYPE_TUNNEL_GENEVE:
+				printf(" - Tunnel type: GENEVE");
+				break;
+			case RTE_PTYPE_TUNNEL_GRENAT:
+				printf(" - Tunnel type: GRENAT");
+				break;
+			default:
+				printf(" - Tunnel type: Unknown");
+				break;
+			}
+
+			/* inner L2 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_INNER_L2_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_INNER_L2_ETHER:
+				printf(" - Inner L2 type: ETHER");
+				break;
+			case RTE_PTYPE_INNER_L2_ETHER_VLAN:
+				printf(" - Inner L2 type: ETHER_VLAN");
+				break;
+			default:
+				printf(" - Inner L2 type: Unknown");
+				break;
+			}
+
+			/* inner L3 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_INNER_INNER_L3_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_INNER_L3_IPV4:
+				printf(" - Inner L3 type: IPV4");
+				break;
+			case RTE_PTYPE_INNER_L3_IPV4_EXT:
+				printf(" - Inner L3 type: IPV4_EXT");
+				break;
+			case RTE_PTYPE_INNER_L3_IPV6:
+				printf(" - Inner L3 type: IPV6");
+				break;
+			case RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN:
+				printf(" - Inner L3 type: IPV4_EXT_UNKNOWN");
+				break;
+			case RTE_PTYPE_INNER_L3_IPV6_EXT:
+				printf(" - Inner L3 type: IPV6_EXT");
+				break;
+			case RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN:
+				printf(" - Inner L3 type: IPV6_EXT_UNKOWN");
+				break;
+			default:
+				printf(" - Inner L3 type: Unknown");
+				break;
+			}
+
+			/* inner L4 packet type */
+			ptype = mb->packet_type & RTE_PTYPE_INNER_L4_MASK;
+			switch (ptype) {
+			case RTE_PTYPE_INNER_L4_TCP:
+				printf(" - Inner L4 type: TCP");
+				break;
+			case RTE_PTYPE_INNER_L4_UDP:
+				printf(" - Inner L4 type: UDP");
+				break;
+			case RTE_PTYPE_INNER_L4_FRAG:
+				printf(" - Inner L4 type: L4_FRAG");
+				break;
+			case RTE_PTYPE_INNER_L4_SCTP:
+				printf(" - Inner L4 type: SCTP");
+				break;
+			case RTE_PTYPE_INNER_L4_ICMP:
+				printf(" - Inner L4 type: ICMP");
+				break;
+			case RTE_PTYPE_INNER_L4_NONFRAG:
+				printf(" - Inner L4 type: L4_NONFRAG");
+				break;
+			default:
+				printf(" - Inner L4 type: Unknown");
+				break;
+			}
+			printf("\n");
+		} else
+			printf("Unknown packet type\n");
+#endif /* RTE_NEXT_ABI */
 		if (is_encapsulation) {
 			struct ipv4_hdr *ipv4_hdr;
 			struct ipv6_hdr *ipv6_hdr;
@@ -176,7 +355,11 @@ pkt_burst_receive(struct fwd_stream *fs)
 			l2_len  = sizeof(struct ether_hdr);
 
 			 /* Do not support ipv4 option field */
+#ifdef RTE_NEXT_ABI
+			if (RTE_ETH_IS_IPV4_HDR(packet_type)) {
+#else
 			if (ol_flags & PKT_RX_TUNNEL_IPV4_HDR) {
+#endif
 				l3_len = sizeof(struct ipv4_hdr);
 				ipv4_hdr = rte_pktmbuf_mtod_offset(mb,
 								   struct ipv4_hdr *,
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 10/19] app/test-pipeline: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (8 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 09/19] cxgbe: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 11/19] app/testpmd: " Helin Zhang
                     ` (9 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 app/test-pipeline/pipeline_hash.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/app/test-pipeline/pipeline_hash.c b/app/test-pipeline/pipeline_hash.c
index 4598ad4..aa3f9e5 100644
--- a/app/test-pipeline/pipeline_hash.c
+++ b/app/test-pipeline/pipeline_hash.c
@@ -459,20 +459,33 @@ app_main_loop_rx_metadata(void) {
 			signature = RTE_MBUF_METADATA_UINT32_PTR(m, 0);
 			key = RTE_MBUF_METADATA_UINT8_PTR(m, 32);
 
+#ifdef RTE_NEXT_ABI
+			if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
 			if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
 				ip_hdr = (struct ipv4_hdr *)
 					&m_data[sizeof(struct ether_hdr)];
 				ip_dst = ip_hdr->dst_addr;
 
 				k32 = (uint32_t *) key;
 				k32[0] = ip_dst & 0xFFFFFF00;
+#ifdef RTE_NEXT_ABI
+			} else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+#else
 			} else {
+#endif
 				ipv6_hdr = (struct ipv6_hdr *)
 					&m_data[sizeof(struct ether_hdr)];
 				ipv6_dst = ipv6_hdr->dst_addr;
 
 				memcpy(key, ipv6_dst, 16);
+#ifdef RTE_NEXT_ABI
+			} else
+				continue;
+#else
 			}
+#endif
 
 			*signature = test_hash(key, 0, 0);
 		}
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 12/19] app/test: Remove useless code
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (10 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 11/19] app/testpmd: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 13/19] examples/ip_fragmentation: replace bit mask based packet type with unified packet type Helin Zhang
                     ` (7 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

Several useless code lines were added accidentally, which blocks packet
type unification. They should all be deleted.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 app/test/packet_burst_generator.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

v4 changes:
* Removed several useless code lines which block packet type unification.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/app/test/packet_burst_generator.c b/app/test/packet_burst_generator.c
index 28d9e25..d9d808b 100644
--- a/app/test/packet_burst_generator.c
+++ b/app/test/packet_burst_generator.c
@@ -273,19 +273,21 @@ nomore_mbuf:
 		if (ipv4) {
 			pkt->vlan_tci  = ETHER_TYPE_IPv4;
 			pkt->l3_len = sizeof(struct ipv4_hdr);
-
+#ifndef RTE_NEXT_ABI
 			if (vlan_enabled)
 				pkt->ol_flags = PKT_RX_IPV4_HDR | PKT_RX_VLAN_PKT;
 			else
 				pkt->ol_flags = PKT_RX_IPV4_HDR;
+#endif
 		} else {
 			pkt->vlan_tci  = ETHER_TYPE_IPv6;
 			pkt->l3_len = sizeof(struct ipv6_hdr);
-
+#ifndef RTE_NEXT_ABI
 			if (vlan_enabled)
 				pkt->ol_flags = PKT_RX_IPV6_HDR | PKT_RX_VLAN_PKT;
 			else
 				pkt->ol_flags = PKT_RX_IPV6_HDR;
+#endif
 		}
 
 		pkts_burst[nb_pkt] = pkt;
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 08/19] fm10k: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (6 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 07/19] vmxnet3: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 09/19] cxgbe: " Helin Zhang
                     ` (11 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/fm10k/fm10k_rxtx.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

v4 changes:
* Supported unified packet type of fm10k from v4.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

v9 changes:
* Renamed MAC to ETHER in packet type names.

diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
index 7d5e32c..b5fa2e6 100644
--- a/drivers/net/fm10k/fm10k_rxtx.c
+++ b/drivers/net/fm10k/fm10k_rxtx.c
@@ -68,12 +68,37 @@ static inline void dump_rxd(union fm10k_rx_desc *rxd)
 static inline void
 rx_desc_to_ol_flags(struct rte_mbuf *m, const union fm10k_rx_desc *d)
 {
+#ifdef RTE_NEXT_ABI
+	static const uint32_t
+		ptype_table[FM10K_RXD_PKTTYPE_MASK >> FM10K_RXD_PKTTYPE_SHIFT]
+			__rte_cache_aligned = {
+		[FM10K_PKTTYPE_OTHER] = RTE_PTYPE_L2_ETHER,
+		[FM10K_PKTTYPE_IPV4] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4,
+		[FM10K_PKTTYPE_IPV4_EX] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4_EXT,
+		[FM10K_PKTTYPE_IPV6] = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6,
+		[FM10K_PKTTYPE_IPV6_EX] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6_EXT,
+		[FM10K_PKTTYPE_IPV4 | FM10K_PKTTYPE_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP,
+		[FM10K_PKTTYPE_IPV6 | FM10K_PKTTYPE_TCP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_TCP,
+		[FM10K_PKTTYPE_IPV4 | FM10K_PKTTYPE_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_UDP,
+		[FM10K_PKTTYPE_IPV6 | FM10K_PKTTYPE_UDP] = RTE_PTYPE_L2_ETHER |
+			RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_UDP,
+	};
+
+	m->packet_type = ptype_table[(d->w.pkt_info & FM10K_RXD_PKTTYPE_MASK)
+						>> FM10K_RXD_PKTTYPE_SHIFT];
+#else /* RTE_NEXT_ABI */
 	uint16_t ptype;
 	static const uint16_t pt_lut[] = { 0,
 		PKT_RX_IPV4_HDR, PKT_RX_IPV4_HDR_EXT,
 		PKT_RX_IPV6_HDR, PKT_RX_IPV6_HDR_EXT,
 		0, 0, 0
 	};
+#endif /* RTE_NEXT_ABI */
 
 	if (d->w.pkt_info & FM10K_RXD_RSSTYPE_MASK)
 		m->ol_flags |= PKT_RX_RSS_HASH;
@@ -97,9 +122,11 @@ rx_desc_to_ol_flags(struct rte_mbuf *m, const union fm10k_rx_desc *d)
 	if (unlikely(d->d.staterr & FM10K_RXD_STATUS_RXE))
 		m->ol_flags |= PKT_RX_RECIP_ERR;
 
+#ifndef RTE_NEXT_ABI
 	ptype = (d->d.data & FM10K_RXD_PKTTYPE_MASK_L3) >>
 						FM10K_RXD_PKTTYPE_SHIFT;
 	m->ol_flags |= pt_lut[(uint8_t)ptype];
+#endif
 }
 
 uint16_t
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 14/19] examples/ip_reassembly: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (12 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 13/19] examples/ip_fragmentation: replace bit mask based packet type with unified packet type Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 15/19] examples/l3fwd-acl: " Helin Zhang
                     ` (5 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/ip_reassembly/main.c | 9 +++++++++
 1 file changed, 9 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
index 9ecb6f9..f1c47ad 100644
--- a/examples/ip_reassembly/main.c
+++ b/examples/ip_reassembly/main.c
@@ -356,7 +356,11 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 	dst_port = portid;
 
 	/* if packet is IPv4 */
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
 	if (m->ol_flags & (PKT_RX_IPV4_HDR)) {
+#endif
 		struct ipv4_hdr *ip_hdr;
 		uint32_t ip_dst;
 
@@ -396,9 +400,14 @@ reassemble(struct rte_mbuf *m, uint8_t portid, uint32_t queue,
 		}
 
 		eth_hdr->ether_type = rte_be_to_cpu_16(ETHER_TYPE_IPv4);
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+		/* if packet is IPv6 */
+#else
 	}
 	/* if packet is IPv6 */
 	else if (m->ol_flags & (PKT_RX_IPV6_HDR | PKT_RX_IPV6_HDR_EXT)) {
+#endif
 		struct ipv6_extension_fragment *frag_hdr;
 		struct ipv6_hdr *ip_hdr;
 
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 15/19] examples/l3fwd-acl: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (13 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 14/19] examples/ip_reassembly: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 16/19] examples/l3fwd-power: " Helin Zhang
                     ` (4 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/l3fwd-acl/main.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/l3fwd-acl/main.c b/examples/l3fwd-acl/main.c
index 29cb25e..b2bdf2f 100644
--- a/examples/l3fwd-acl/main.c
+++ b/examples/l3fwd-acl/main.c
@@ -645,10 +645,13 @@ prepare_one_packet(struct rte_mbuf **pkts_in, struct acl_search_t *acl,
 	struct ipv4_hdr *ipv4_hdr;
 	struct rte_mbuf *pkt = pkts_in[index];
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
+#else
 	int type = pkt->ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV6_HDR);
 
 	if (type == PKT_RX_IPV4_HDR) {
-
+#endif
 		ipv4_hdr = rte_pktmbuf_mtod_offset(pkt, struct ipv4_hdr *,
 						   sizeof(struct ether_hdr));
 
@@ -667,9 +670,11 @@ prepare_one_packet(struct rte_mbuf **pkts_in, struct acl_search_t *acl,
 			/* Not a valid IPv4 packet */
 			rte_pktmbuf_free(pkt);
 		}
-
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
+#else
 	} else if (type == PKT_RX_IPV6_HDR) {
-
+#endif
 		/* Fill acl structure */
 		acl->data_ipv6[acl->num_ipv6] = MBUF_IPV6_2PROTO(pkt);
 		acl->m_ipv6[(acl->num_ipv6)++] = pkt;
@@ -687,17 +692,22 @@ prepare_one_packet(struct rte_mbuf **pkts_in, struct acl_search_t *acl,
 {
 	struct rte_mbuf *pkt = pkts_in[index];
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
+#else
 	int type = pkt->ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV6_HDR);
 
 	if (type == PKT_RX_IPV4_HDR) {
-
+#endif
 		/* Fill acl structure */
 		acl->data_ipv4[acl->num_ipv4] = MBUF_IPV4_2PROTO(pkt);
 		acl->m_ipv4[(acl->num_ipv4)++] = pkt;
 
-
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
+#else
 	} else if (type == PKT_RX_IPV6_HDR) {
-
+#endif
 		/* Fill acl structure */
 		acl->data_ipv6[acl->num_ipv6] = MBUF_IPV6_2PROTO(pkt);
 		acl->m_ipv6[(acl->num_ipv6)++] = pkt;
@@ -745,10 +755,17 @@ send_one_packet(struct rte_mbuf *m, uint32_t res)
 		/* in the ACL list, drop it */
 #ifdef L3FWDACL_DEBUG
 		if ((res & ACL_DENY_SIGNATURE) != 0) {
+#ifdef RTE_NEXT_ABI
+			if (RTE_ETH_IS_IPV4_HDR(m->packet_type))
+				dump_acl4_rule(m, res);
+			else if (RTE_ETH_IS_IPV6_HDR(m->packet_type))
+				dump_acl6_rule(m, res);
+#else
 			if (m->ol_flags & PKT_RX_IPV4_HDR)
 				dump_acl4_rule(m, res);
 			else
 				dump_acl6_rule(m, res);
+#endif /* RTE_NEXT_ABI */
 		}
 #endif
 		rte_pktmbuf_free(m);
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 13/19] examples/ip_fragmentation: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (11 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 12/19] app/test: Remove useless code Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 14/19] examples/ip_reassembly: " Helin Zhang
                     ` (6 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/ip_fragmentation/main.c | 9 +++++++++
 1 file changed, 9 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/ip_fragmentation/main.c b/examples/ip_fragmentation/main.c
index 0922ba6..b71d05f 100644
--- a/examples/ip_fragmentation/main.c
+++ b/examples/ip_fragmentation/main.c
@@ -283,7 +283,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 	len = qconf->tx_mbufs[port_out].len;
 
 	/* if this is an IPv4 packet */
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
 	if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
 		struct ipv4_hdr *ip_hdr;
 		uint32_t ip_dst;
 		/* Read the lookup key (i.e. ip_dst) from the input packet */
@@ -317,9 +321,14 @@ l3fwd_simple_forward(struct rte_mbuf *m, struct lcore_queue_conf *qconf,
 			if (unlikely (len2 < 0))
 				return;
 		}
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+		/* if this is an IPv6 packet */
+#else
 	}
 	/* if this is an IPv6 packet */
 	else if (m->ol_flags & PKT_RX_IPV6_HDR) {
+#endif
 		struct ipv6_hdr *ip_hdr;
 
 		ipv6 = 1;
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 18/19] examples/tep_termination: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (16 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 17/19] examples/l3fwd: " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 19/19] mbuf: remove old packet type bit masks Helin Zhang
  2015-07-15 23:00  0%   ` [dpdk-dev] [PATCH v10 00/19] unified packet type Thomas Monjalon
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be enabled
by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/tep_termination/vxlan.c | 4 ++++
 1 file changed, 4 insertions(+)

v9 changes:
* Used unified packet type to check if it is a VXLAN packet, included in
  RTE_NEXT_ABI which is disabled by default.

v10 changes:
* Fixed a compile error.

diff --git a/examples/tep_termination/vxlan.c b/examples/tep_termination/vxlan.c
index b2a2f53..e98a29f 100644
--- a/examples/tep_termination/vxlan.c
+++ b/examples/tep_termination/vxlan.c
@@ -180,8 +180,12 @@ decapsulation(struct rte_mbuf *pkt)
 	 * (rfc7348) or that the rx offload flag is set (i40e only
 	 * currently)*/
 	if (udp_hdr->dst_port != rte_cpu_to_be_16(DEFAULT_VXLAN_PORT) &&
+#ifdef RTE_NEXT_ABI
+		(pkt->packet_type & RTE_PTYPE_TUNNEL_MASK) == 0)
+#else
 			(pkt->ol_flags & (PKT_RX_TUNNEL_IPV4_HDR |
 				PKT_RX_TUNNEL_IPV6_HDR)) == 0)
+#endif
 		return -1;
 	outer_header_len = info.outer_l2_len + info.outer_l3_len
 		+ sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr);
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 16/19] examples/l3fwd-power: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (14 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 15/19] examples/l3fwd-acl: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 17/19] examples/l3fwd: " Helin Zhang
                     ` (3 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/l3fwd-power/main.c | 8 ++++++++
 1 file changed, 8 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index d4eba1a..dbbebdd 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -635,7 +635,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,
 
 	eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
 	if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
 		/* Handle IPv4 headers.*/
 		ipv4_hdr =
 			rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
@@ -670,8 +674,12 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,
 		ether_addr_copy(&ports_eth_addr[dst_port], &eth_hdr->s_addr);
 
 		send_single_packet(m, dst_port);
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+#else
 	}
 	else {
+#endif
 		/* Handle IPv6 headers.*/
 #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
 		struct ipv6_hdr *ipv6_hdr;
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 17/19] examples/l3fwd: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (15 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 16/19] examples/l3fwd-power: " Helin Zhang
@ 2015-07-09 16:31  3%   ` Helin Zhang
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 18/19] examples/tep_termination: " Helin Zhang
                     ` (2 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 examples/l3fwd/main.c | 123 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 120 insertions(+), 3 deletions(-)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v3 changes:
* Minor bug fixes and enhancements.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 5c22ed1..b1bcb35 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -939,7 +939,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, struct lcore_conf *qcon
 
 	eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) {
+#else
 	if (m->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
 		/* Handle IPv4 headers.*/
 		ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
 						   sizeof(struct ether_hdr));
@@ -970,8 +974,11 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, struct lcore_conf *qcon
 		ether_addr_copy(&ports_eth_addr[dst_port], &eth_hdr->s_addr);
 
 		send_single_packet(m, dst_port);
-
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) {
+#else
 	} else {
+#endif
 		/* Handle IPv6 headers.*/
 		struct ipv6_hdr *ipv6_hdr;
 
@@ -990,8 +997,13 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, struct lcore_conf *qcon
 		ether_addr_copy(&ports_eth_addr[dst_port], &eth_hdr->s_addr);
 
 		send_single_packet(m, dst_port);
+#ifdef RTE_NEXT_ABI
+	} else
+		/* Free the mbuf that contains non-IPV4/IPV6 packet */
+		rte_pktmbuf_free(m);
+#else
 	}
-
+#endif
 }
 
 #ifdef DO_RFC_1812_CHECKS
@@ -1015,12 +1027,19 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid, struct lcore_conf *qcon
  * to BAD_PORT value.
  */
 static inline __attribute__((always_inline)) void
+#ifdef RTE_NEXT_ABI
+rfc1812_process(struct ipv4_hdr *ipv4_hdr, uint16_t *dp, uint32_t ptype)
+#else
 rfc1812_process(struct ipv4_hdr *ipv4_hdr, uint16_t *dp, uint32_t flags)
+#endif
 {
 	uint8_t ihl;
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(ptype)) {
+#else
 	if ((flags & PKT_RX_IPV4_HDR) != 0) {
-
+#endif
 		ihl = ipv4_hdr->version_ihl - IPV4_MIN_VER_IHL;
 
 		ipv4_hdr->time_to_live--;
@@ -1050,11 +1069,19 @@ get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
 	struct ipv6_hdr *ipv6_hdr;
 	struct ether_hdr *eth_hdr;
 
+#ifdef RTE_NEXT_ABI
+	if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
+#else
 	if (pkt->ol_flags & PKT_RX_IPV4_HDR) {
+#endif
 		if (rte_lpm_lookup(qconf->ipv4_lookup_struct, dst_ipv4,
 				&next_hop) != 0)
 			next_hop = portid;
+#ifdef RTE_NEXT_ABI
+	} else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
+#else
 	} else if (pkt->ol_flags & PKT_RX_IPV6_HDR) {
+#endif
 		eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
 		ipv6_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
 		if (rte_lpm6_lookup(qconf->ipv6_lookup_struct,
@@ -1088,12 +1115,52 @@ process_packet(struct lcore_conf *qconf, struct rte_mbuf *pkt,
 	ve = val_eth[dp];
 
 	dst_port[0] = dp;
+#ifdef RTE_NEXT_ABI
+	rfc1812_process(ipv4_hdr, dst_port, pkt->packet_type);
+#else
 	rfc1812_process(ipv4_hdr, dst_port, pkt->ol_flags);
+#endif
 
 	te =  _mm_blend_epi16(te, ve, MASK_ETH);
 	_mm_store_si128((__m128i *)eth_hdr, te);
 }
 
+#ifdef RTE_NEXT_ABI
+/*
+ * Read packet_type and destination IPV4 addresses from 4 mbufs.
+ */
+static inline void
+processx4_step1(struct rte_mbuf *pkt[FWDSTEP],
+		__m128i *dip,
+		uint32_t *ipv4_flag)
+{
+	struct ipv4_hdr *ipv4_hdr;
+	struct ether_hdr *eth_hdr;
+	uint32_t x0, x1, x2, x3;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt[0], struct ether_hdr *);
+	ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+	x0 = ipv4_hdr->dst_addr;
+	ipv4_flag[0] = pkt[0]->packet_type & RTE_PTYPE_L3_IPV4;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt[1], struct ether_hdr *);
+	ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+	x1 = ipv4_hdr->dst_addr;
+	ipv4_flag[0] &= pkt[1]->packet_type;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt[2], struct ether_hdr *);
+	ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+	x2 = ipv4_hdr->dst_addr;
+	ipv4_flag[0] &= pkt[2]->packet_type;
+
+	eth_hdr = rte_pktmbuf_mtod(pkt[3], struct ether_hdr *);
+	ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+	x3 = ipv4_hdr->dst_addr;
+	ipv4_flag[0] &= pkt[3]->packet_type;
+
+	dip[0] = _mm_set_epi32(x3, x2, x1, x0);
+}
+#else /* RTE_NEXT_ABI */
 /*
  * Read ol_flags and destination IPV4 addresses from 4 mbufs.
  */
@@ -1126,14 +1193,24 @@ processx4_step1(struct rte_mbuf *pkt[FWDSTEP], __m128i *dip, uint32_t *flag)
 
 	dip[0] = _mm_set_epi32(x3, x2, x1, x0);
 }
+#endif /* RTE_NEXT_ABI */
 
 /*
  * Lookup into LPM for destination port.
  * If lookup fails, use incoming port (portid) as destination port.
  */
 static inline void
+#ifdef RTE_NEXT_ABI
+processx4_step2(const struct lcore_conf *qconf,
+		__m128i dip,
+		uint32_t ipv4_flag,
+		uint8_t portid,
+		struct rte_mbuf *pkt[FWDSTEP],
+		uint16_t dprt[FWDSTEP])
+#else
 processx4_step2(const struct lcore_conf *qconf, __m128i dip, uint32_t flag,
 	uint8_t portid, struct rte_mbuf *pkt[FWDSTEP], uint16_t dprt[FWDSTEP])
+#endif /* RTE_NEXT_ABI */
 {
 	rte_xmm_t dst;
 	const  __m128i bswap_mask = _mm_set_epi8(12, 13, 14, 15, 8, 9, 10, 11,
@@ -1143,7 +1220,11 @@ processx4_step2(const struct lcore_conf *qconf, __m128i dip, uint32_t flag,
 	dip = _mm_shuffle_epi8(dip, bswap_mask);
 
 	/* if all 4 packets are IPV4. */
+#ifdef RTE_NEXT_ABI
+	if (likely(ipv4_flag)) {
+#else
 	if (likely(flag != 0)) {
+#endif
 		rte_lpm_lookupx4(qconf->ipv4_lookup_struct, dip, dprt, portid);
 	} else {
 		dst.x = dip;
@@ -1193,6 +1274,16 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 	_mm_store_si128(p[2], te[2]);
 	_mm_store_si128(p[3], te[3]);
 
+#ifdef RTE_NEXT_ABI
+	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[0] + 1),
+		&dst_port[0], pkt[0]->packet_type);
+	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[1] + 1),
+		&dst_port[1], pkt[1]->packet_type);
+	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[2] + 1),
+		&dst_port[2], pkt[2]->packet_type);
+	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[3] + 1),
+		&dst_port[3], pkt[3]->packet_type);
+#else /* RTE_NEXT_ABI */
 	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[0] + 1),
 		&dst_port[0], pkt[0]->ol_flags);
 	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[1] + 1),
@@ -1201,6 +1292,7 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 		&dst_port[2], pkt[2]->ol_flags);
 	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[3] + 1),
 		&dst_port[3], pkt[3]->ol_flags);
+#endif /* RTE_NEXT_ABI */
 }
 
 /*
@@ -1387,7 +1479,11 @@ main_loop(__attribute__((unused)) void *dummy)
 	uint16_t *lp;
 	uint16_t dst_port[MAX_PKT_BURST];
 	__m128i dip[MAX_PKT_BURST / FWDSTEP];
+#ifdef RTE_NEXT_ABI
+	uint32_t ipv4_flag[MAX_PKT_BURST / FWDSTEP];
+#else
 	uint32_t flag[MAX_PKT_BURST / FWDSTEP];
+#endif
 	uint16_t pnum[MAX_PKT_BURST + 1];
 #endif
 
@@ -1457,6 +1553,18 @@ main_loop(__attribute__((unused)) void *dummy)
 				 */
 				int32_t n = RTE_ALIGN_FLOOR(nb_rx, 4);
 				for (j = 0; j < n ; j+=4) {
+#ifdef RTE_NEXT_ABI
+					uint32_t pkt_type =
+						pkts_burst[j]->packet_type &
+						pkts_burst[j+1]->packet_type &
+						pkts_burst[j+2]->packet_type &
+						pkts_burst[j+3]->packet_type;
+					if (pkt_type & RTE_PTYPE_L3_IPV4) {
+						simple_ipv4_fwd_4pkts(
+						&pkts_burst[j], portid, qconf);
+					} else if (pkt_type &
+						RTE_PTYPE_L3_IPV6) {
+#else /* RTE_NEXT_ABI */
 					uint32_t ol_flag = pkts_burst[j]->ol_flags
 							& pkts_burst[j+1]->ol_flags
 							& pkts_burst[j+2]->ol_flags
@@ -1465,6 +1573,7 @@ main_loop(__attribute__((unused)) void *dummy)
 						simple_ipv4_fwd_4pkts(&pkts_burst[j],
 									portid, qconf);
 					} else if (ol_flag & PKT_RX_IPV6_HDR) {
+#endif /* RTE_NEXT_ABI */
 						simple_ipv6_fwd_4pkts(&pkts_burst[j],
 									portid, qconf);
 					} else {
@@ -1489,13 +1598,21 @@ main_loop(__attribute__((unused)) void *dummy)
 			for (j = 0; j != k; j += FWDSTEP) {
 				processx4_step1(&pkts_burst[j],
 					&dip[j / FWDSTEP],
+#ifdef RTE_NEXT_ABI
+					&ipv4_flag[j / FWDSTEP]);
+#else
 					&flag[j / FWDSTEP]);
+#endif
 			}
 
 			k = RTE_ALIGN_FLOOR(nb_rx, FWDSTEP);
 			for (j = 0; j != k; j += FWDSTEP) {
 				processx4_step2(qconf, dip[j / FWDSTEP],
+#ifdef RTE_NEXT_ABI
+					ipv4_flag[j / FWDSTEP], portid,
+#else
 					flag[j / FWDSTEP], portid,
+#endif
 					&pkts_burst[j], &dst_port[j]);
 			}
 
-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 19/19] mbuf: remove old packet type bit masks
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (17 preceding siblings ...)
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 18/19] examples/tep_termination: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-15 23:00  0%   ` [dpdk-dev] [PATCH v10 00/19] unified packet type Thomas Monjalon
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

As unified packet types are used instead, those old bit masks and
the relevant macros for packet type indication need to be removed.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 lib/librte_mbuf/rte_mbuf.c | 4 ++++
 lib/librte_mbuf/rte_mbuf.h | 4 ++++
 2 files changed, 8 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.
* Redefined the bit masks for packet RX offload flags.

v5 changes:
* Rolled back the bit masks of RX flags, for ABI compatibility.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index f506517..4320dd4 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -251,14 +251,18 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
 	/* case PKT_RX_HBUF_OVERFLOW: return "PKT_RX_HBUF_OVERFLOW"; */
 	/* case PKT_RX_RECIP_ERR: return "PKT_RX_RECIP_ERR"; */
 	/* case PKT_RX_MAC_ERR: return "PKT_RX_MAC_ERR"; */
+#ifndef RTE_NEXT_ABI
 	case PKT_RX_IPV4_HDR: return "PKT_RX_IPV4_HDR";
 	case PKT_RX_IPV4_HDR_EXT: return "PKT_RX_IPV4_HDR_EXT";
 	case PKT_RX_IPV6_HDR: return "PKT_RX_IPV6_HDR";
 	case PKT_RX_IPV6_HDR_EXT: return "PKT_RX_IPV6_HDR_EXT";
+#endif /* RTE_NEXT_ABI */
 	case PKT_RX_IEEE1588_PTP: return "PKT_RX_IEEE1588_PTP";
 	case PKT_RX_IEEE1588_TMST: return "PKT_RX_IEEE1588_TMST";
+#ifndef RTE_NEXT_ABI
 	case PKT_RX_TUNNEL_IPV4_HDR: return "PKT_RX_TUNNEL_IPV4_HDR";
 	case PKT_RX_TUNNEL_IPV6_HDR: return "PKT_RX_TUNNEL_IPV6_HDR";
+#endif /* RTE_NEXT_ABI */
 	default: return NULL;
 	}
 }
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 3a17d95..b90c73f 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -92,14 +92,18 @@ extern "C" {
 #define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer overflow. */
 #define PKT_RX_RECIP_ERR     (0ULL << 0)  /**< Hardware processing error. */
 #define PKT_RX_MAC_ERR       (0ULL << 0)  /**< MAC error. */
+#ifndef RTE_NEXT_ABI
 #define PKT_RX_IPV4_HDR      (1ULL << 5)  /**< RX packet with IPv4 header. */
 #define PKT_RX_IPV4_HDR_EXT  (1ULL << 6)  /**< RX packet with extended IPv4 header. */
 #define PKT_RX_IPV6_HDR      (1ULL << 7)  /**< RX packet with IPv6 header. */
 #define PKT_RX_IPV6_HDR_EXT  (1ULL << 8)  /**< RX packet with extended IPv6 header. */
+#endif /* RTE_NEXT_ABI */
 #define PKT_RX_IEEE1588_PTP  (1ULL << 9)  /**< RX IEEE1588 L2 Ethernet PT Packet. */
 #define PKT_RX_IEEE1588_TMST (1ULL << 10) /**< RX IEEE1588 L2/L4 timestamped packet.*/
+#ifndef RTE_NEXT_ABI
 #define PKT_RX_TUNNEL_IPV4_HDR (1ULL << 11) /**< RX tunnel packet with IPv4 header.*/
 #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with IPv6 header. */
+#endif /* RTE_NEXT_ABI */
 #define PKT_RX_FDIR_ID       (1ULL << 13) /**< FD id reported if FDIR match. */
 #define PKT_RX_FDIR_FLX      (1ULL << 14) /**< Flexible bytes reported if FDIR match. */
 #define PKT_RX_QINQ_PKT      (1ULL << 15)  /**< RX packet with double VLAN stripped. */
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v10 07/19] vmxnet3: replace bit mask based packet type with unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (5 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 06/19] enic: " Helin Zhang
@ 2015-07-09 16:31  4%   ` Helin Zhang
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 08/19] fm10k: " Helin Zhang
                     ` (12 subsequent siblings)
  19 siblings, 0 replies; 200+ results
From: Helin Zhang @ 2015-07-09 16:31 UTC (permalink / raw)
  To: dev

To unify packet types among all PMDs, bit masks of packet type for
'ol_flags' are replaced by unified packet type.
To avoid breaking ABI compatibility, all the changes would be
enabled by RTE_NEXT_ABI, which is disabled by default.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
---
 drivers/net/vmxnet3/vmxnet3_rxtx.c | 8 ++++++++
 1 file changed, 8 insertions(+)

v2 changes:
* Used redefined packet types and enlarged packet_type field in mbuf.

v5 changes:
* Re-worded the commit logs.

v6 changes:
* Disabled the code changes for unified packet type by default, to
  avoid breaking ABI compatibility.

v7 changes:
* Renamed RTE_UNIFIED_PKT_TYPE to RTE_NEXT_ABI.

diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index a1eac45..25ae2f6 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -649,9 +649,17 @@ vmxnet3_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 			struct ipv4_hdr *ip = (struct ipv4_hdr *)(eth + 1);
 
 			if (((ip->version_ihl & 0xf) << 2) > (int)sizeof(struct ipv4_hdr))
+#ifdef RTE_NEXT_ABI
+				rxm->packet_type = RTE_PTYPE_L3_IPV4_EXT;
+#else
 				rxm->ol_flags |= PKT_RX_IPV4_HDR_EXT;
+#endif
 			else
+#ifdef RTE_NEXT_ABI
+				rxm->packet_type = RTE_PTYPE_L3_IPV4;
+#else
 				rxm->ol_flags |= PKT_RX_IPV4_HDR;
+#endif
 
 			if (!rcd->cnc) {
 				if (!rcd->ipc)
-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] hash: move rte_hash structure to C file and make it internal
  2015-07-09  8:12  3%       ` Bruce Richardson
@ 2015-07-09 20:42  3%         ` Matthew Hall
  0 siblings, 0 replies; 200+ results
From: Matthew Hall @ 2015-07-09 20:42 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Thu, Jul 09, 2015 at 09:12:23AM +0100, Bruce Richardson wrote:
> Thanks for the feedback Matthew. Can you suggest a function prototype for such
> a walk operation that would make it useful for you. While we can keep the
> hash structure public, I'd prefer if we could avoid it, as it makes making changes
> hard due to ABI issues.
> 
> /Bruce

Hi Bruce,

I understand about the ABI issues. Hence my suggestion of an iterator if the 
structs are opaque. The names could be something like these:

rte_hash_iterate(_safe)
rte_hash_foreach(_safe)

If required due to the implementation, the safe version would be similar to 
what's seen in BSD queue.h, where you can do a slower iteration that allows 
removing a current entry without corrupting the table or iterator.

Then the function would look something like this:

rte_hash_iterate(rte_hash_t* h, rte_hash_callback_t callback, void* data)

rte_hash_callback_t would be a typedef of a function pointer for a callback 
function, something like this on the function depending how it works inside 
the hash:

int application_hash_callback(void* key, void* value, void* data)
int application_hash_callback(void* key, rte_hash_entry_t* entry, void* data)
int application_hash_callback(void* key, void* key, void* value, void* data)

The data pointer will contain the same pointer passed to the iterator. If the 
iteration function returns non-zero, iteration could be discontinued, as the 
client code found what it wanted already.
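
To make the sketch concrete, here is roughly what the application side could
look like. All of the names involved (rte_hash_t, rte_hash_iterate and the
flow_key/count_http_flows helpers) are hypothetical and only follow the
prototypes proposed above; none of them exist in the current rte_hash API:

#include <stdint.h>

struct flow_key {
	uint32_t ip;
	uint16_t port;
};

/* Callback run once per entry; returning non-zero stops the walk early. */
static int
count_http_flows_cb(void *key, void *value, void *data)
{
	const struct flow_key *fk = key;
	unsigned int *count = data;

	(void)value;	/* per-entry application state, unused here */

	if (fk->port == 80)
		(*count)++;

	return 0;	/* 0 == keep iterating */
}

/* Caller side; as noted below, locking stays the application's job. */
static unsigned int
count_http_flows(rte_hash_t *h)
{
	unsigned int count = 0;

	rte_hash_iterate(h, count_http_flows_cb, &count);
	return count;
}

With that shape the table internals can stay opaque, and both a full dump and
an early-exit scan fall out of the same entry point.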

Threading synchronization responsibility will fall on the client as before. 
The iterator should say if it's thread-safe for read-only, read-write, or 
unsafe for anything, etc.

Matthew.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v4 0/7] ethdev: add support for ieee1588 timestamping
  2015-07-09 13:30  3% [dpdk-dev] [PATCH v4 0/7] ethdev: add support for ieee1588 timestamping John McNamara
@ 2015-07-10  0:43  0% ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-10  0:43 UTC (permalink / raw)
  To: John McNamara; +Cc: dev

2015-07-09 14:30, John McNamara:
> This patchset adds ethdev API to enable and read IEEE1588/802.1AS PTP
> timestamps from devices that support it. The following functions are added:
> 
>     rte_eth_timesync_enable()
>     rte_eth_timesync_disable()
>     rte_eth_timesync_read_rx_timestamp()
>     rte_eth_timesync_read_tx_timestamp()
> 
> The "ieee1588" forwarding mode in testpmd is also refactored to demonstrate
> the new API and to clean up the code.
> 
> Adds support for igb, ixgbe and i40e.
> 
> V4:
> * Added timesync field to end of mbuf to pass IEEE1588 registers and flags.
>   Removed previous ABI deprecation notice.
> 
> V3:
> * Fixed issued with version.map.
> 
> V2:
> * Added i40e support.
> 
> * Renamed ethdev functions from rte_eth_ieee15888_*() to rte_eth_timesync_*()
>   since 802.1AS can be supported through the same interfaces.
> 
> V1:
> * Initial version for igb and ixgbe.
> 
> 
> John McNamara (7):
>   ethdev: add support for ieee1588 timestamping
>   mbuf: add field for ieee1588 timesync index
>   e1000: add support for ieee1588 timestamping
>   ixgbe: add support for ieee1588 timestamping
>   i40e: add support for ieee1588 timestamping
>   app/testpmd: refactor ieee1588 forwarding
>   doc: document ieee1588 forwarding mode

Was previously acked by Wenzhuo except the new mbuf field.
Applied, thanks

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v2] doc: announce ABI change of rte_eth_fdir_filter, rte_eth_fdir_masks
  2015-07-09  2:47 21% [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks Wenzhuo Lu
  2015-07-09  7:32  4% ` Thomas Monjalon
@ 2015-07-10  2:24 21% ` Wenzhuo Lu
  1 sibling, 0 replies; 200+ results
From: Wenzhuo Lu @ 2015-07-10  2:24 UTC (permalink / raw)
  To: dev

The x550 supports 2 new flow director modes, MAC VLAN and Cloud. The MAC VLAN
mode means the MAC and VLAN are monitored. The Cloud mode is for VxLAN and
NVGRE, and the tunnel type, TNI/VNI, inner MAC and inner VLAN are monitored. So,
there are a few new lookup fields for these 2 new modes, such as MAC, tunnel
type and TNI/VNI. We have to change the ABI to support these new lookup fields.

v2 changes:
* Correct the names of the structures.
* Wrap the words.
* Add explanation for the new modes.

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
---
 doc/guides/rel_notes/abi.rst | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..37bcad9 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,13 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
+* The ABI changes are planned for struct rte_eth_fdir_filter and
+  rte_eth_fdir_masks in order to support new flow director modes,
+  MAC VLAN and Cloud, on x550. The MAC VLAN mode means the MAC and
+  VLAN are monitored. The Cloud mode is for VxLAN and NVGRE, and
+  the tunnel type, TNI/VNI, inner MAC and inner VLAN are monitored.
+  The upcoming release 2.1 will not contain these ABI changes, but
+  release 2.2 will, and no backwards compatibility is planned due to
+  this change.
+  Binaries using this library built prior to version 2.2 will require
+  updating and recompilation.
-- 
1.9.3

^ permalink raw reply	[relevance 21%]

* Re: [dpdk-dev] [PATCH] hash: move rte_hash structure to C file and make it internal
    @ 2015-07-10 10:27  0%     ` Thomas Monjalon
  1 sibling, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-10 10:27 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

2015-07-08 14:21, Bruce Richardson:
> On Wed, Jul 08, 2015 at 12:27:34PM +0100, Pablo de Lara wrote:
> > rte_hash structure should not be a public structure,
> > and therefore it should be moved to the C file and be declared
> > as internal. rte_hash_hash implementation is also moved
> > to the C file, as it uses the structure.
> > 
> > This patch also removes part of a unit test that was checking
> > a field of the structure.
> > 
> > Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> 
> Irrespective of whether or not we change the underlying hash table implementation
> this looks a good change to me. The rte_hash structure should not be used directly
> by any applications - the APIs all take pointers to the structure,
> so there should be no ABI breakage from this, I think.
> 
> Therefore:
> 
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Applied, thanks

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v2 2/3] doc: added guidelines on dpdk documentation
  @ 2015-07-10 15:45  3%   ` John McNamara
  0 siblings, 0 replies; 200+ results
From: John McNamara @ 2015-07-10 15:45 UTC (permalink / raw)
  To: dev

Added guidelines on the purpose and structure of the DPDK
documentation, how to build it, and the conventions for writing it.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/guidelines/documentation.rst | 579 ++++++++++++++++++++++++++++++++
 doc/guides/guidelines/index.rst         |   2 +
 2 files changed, 581 insertions(+)
 create mode 100644 doc/guides/guidelines/documentation.rst

diff --git a/doc/guides/guidelines/documentation.rst b/doc/guides/guidelines/documentation.rst
new file mode 100644
index 0000000..b47f028
--- /dev/null
+++ b/doc/guides/guidelines/documentation.rst
@@ -0,0 +1,579 @@
+.. doc_guidelines:
+
+DPDK Documentation Guidelines
+=============================
+
+This document outlines the guidelines for writing the DPDK Guides and API documentation in RST and Doxygen format.
+
+It also explains the structure of the DPDK documentation and shows how to build the Html and PDF versions of the documents.
+
+
+Structure of the Documentation
+------------------------------
+
+The DPDK source code repository contains input files to build the API documentation and User Guides.
+
+The main directories that contain files related to documentation are shown below::
+
+   lib
+   |-- librte_acl
+   |-- librte_cfgfile
+   |-- librte_cmdline
+   |-- librte_compat
+   |-- librte_eal
+   |   |-- ...
+   ...
+   doc
+   |-- api
+   +-- guides
+       |-- freebsd_gsg
+       |-- linux_gsg
+       |-- prog_guide
+       |-- sample_app_ug
+       |-- guidelines
+       |-- testpmd_app_ug
+       |-- rel_notes
+       |-- nics
+       |-- xen
+       |-- ...
+
+
+The API documentation is built from `Doxygen <http://www.stack.nl/~dimitri/doxygen/>`_ comments in the header files.
+These files are mainly in the ``lib/librte_*`` directories although some of the Poll Mode Drivers in ``drivers/net``
+are also documented with Doxygen.
+
+The configuration files that are used to control the Doxygen output are in the ``doc/api`` directory.
+
+The user guides such as *The Programmers Guide* and the *FreeBSD* and *Linux Getting Started* Guides are generated
+from RST markup text files using the `Sphinx <http://sphinx-doc.org/index.html>`_ Documentation Generator.
+
+These files are included in the ``doc/guides/`` directory.
+The output is controlled by the ``doc/guides/conf.py`` file.
+
+
+Role of the Documentation
+-------------------------
+
+The following items outline the roles of the different parts of the documentation and when they need to be updated or
+added to by the developer.
+
+* **Release Notes**
+
+  The Release Notes document which features have been added in the current and previous releases of DPDK and highlight
+  any known issues.
+  The Release Notes also contain notifications of features that will change ABI compatibility in the next major release.
+
+  Developers should update the Release Notes to add a short description of new or updated features.
+  Developers should also update the Release Notes to add ABI announcements if necessary
+  (see :doc:`/guidelines/versioning` for details).
+
+* **API documentation**
+
+  The API documentation explains how to use the public DPDK functions.
+  The `API index page <http://dpdk.org/doc/api/>`_ shows the generated API documentation with related groups of functions.
+
+  The API documentation should be updated via Doxygen comments when new functions are added.
+
+* **Getting Started Guides**
+
+  The Getting Started Guides show how to install and configure DPDK and how to run DPDK based applications on different OSes.
+
+  A Getting Started Guide should be added when DPDK is ported to a new OS.
+
+* **The Programmers Guide**
+
+  The Programmers Guide explains how the API components of DPDK such as the EAL, Memzone, Rings and the Hash Library work.
+  It also explains how some higher level functionality such as Packet Distributor, Packet Framework and KNI work.
+  It also shows the build system and explains how to add applications.
+
+  The Programmers Guide should be expanded when new functionality is added to DPDK.
+
+* **App Guides**
+
+  The app guides document the DPDK applications in the ``app`` directory such as ``testpmd``.
+
+  The app guides should be updated if functionality is changed or added.
+
+* **Sample App Guides**
+
+  The sample app guides document the DPDK example applications in the examples directory.
+  Generally they demonstrate a major feature such as L2 or L3 Forwarding, Multi Process or Power Management.
+  They explain the purpose of the sample application, how to run it and step through some of the code to explain the
+  major functionality.
+
+  A new sample application should be accompanied by a new sample app guide.
+  The guide for the Skeleton Forwarding app is a good starting reference.
+
+* **Network Interface Controller Drivers**
+
+  The NIC Drivers document explains the features of the individual Poll Mode Drivers, such as software requirements,
+  configuration and initialization.
+
+  New documentation should be added for new Poll Mode Drivers.
+
+* **Guidelines**
+
+  The guideline documents record community process, expectations and design directions.
+
+  They can be extended, amended or discussed by submitting a patch and getting community approval.
+
+
+Building the Documentation
+--------------------------
+
+Dependencies
+~~~~~~~~~~~~
+
+
+The following dependencies must be installed to build the documentation:
+
+* Doxygen.
+
+* Sphinx (also called python-sphinx).
+
+* TexLive (at least TexLive-core, extra Latex support and extra fonts).
+
+* Inkscape.
+
+`Doxygen`_ generates documentation from commented source code.
+It can be installed as follows:
+
+.. code-block:: console
+
+   # Ubuntu/Debian.
+   sudo apt-get -y install doxygen
+
+   # Red Hat/Fedora.
+   sudo yum     -y install doxygen
+
+`Sphinx`_ is a Python documentation tool for converting RST files to Html or to PDF (via LaTeX).
+It can be installed as follows:
+
+.. code-block:: console
+
+   # Ubuntu/Debian.
+   sudo apt-get -y install python-sphinx
+
+   # Red Hat/Fedora.
+   sudo yum     -y install python-sphinx
+
+   # Or, on any system with Python installed.
+   sudo easy_install -U sphinx
+
+For further information on getting started with Sphinx see the `Sphinx Tutorial <http://sphinx-doc.org/tutorial.html>`_.
+
+.. Note::
+
+   To get full support for Figure and Table numbering it is best to install Sphinx 1.3.1 or later.
+
+
+`Inkscape`_ is a vector based graphics program which is used to create SVG images and also to convert SVG images to PDF images.
+It can be installed as follows:
+
+.. code-block:: console
+
+   # Ubuntu/Debian.
+   sudo apt-get -y install inkscape
+
+   # Red Hat/Fedora.
+   sudo yum     -y install inkscape
+
+`TexLive <http://www.tug.org/texlive/>`_ is an installation package for Tex/LaTeX.
+It is used to generate the PDF versions of the documentation.
+The main required packages can be installed as follows:
+
+.. code-block:: console
+
+   # Ubuntu/Debian.
+   sudo apt-get -y install texlive-latex-extra texlive-fonts-extra \
+                           texlive-fonts-recommended
+
+
+   # Red Hat/Fedora, selective install.
+   sudo yum     -y install texlive-collection-latexextra \
+                           texlive-collection-fontsextra
+
+
+Build commands
+~~~~~~~~~~~~~~
+
+The documentation is built using the standard DPDK build system.
+Some examples are shown below:
+
+* Generate all the documentation targets::
+
+     make doc
+
+* Generate the Doxygen API documentation in Html::
+
+     make doc-api-html
+
+* Generate the guides documentation in Html::
+
+     make doc-guides-html
+
+* Generate the guides documentation in Pdf::
+
+     make doc-guides-pdf
+
+The output of these commands is generated in the ``build`` directory::
+
+   build/doc
+         |-- html
+         |   |-- api
+         |   +-- guides
+         |
+         +-- pdf
+             +-- guides
+
+
+.. Note::
+
+   Make sure to fix any Sphinx or Doxygen warnings when adding or updating documentation.
+
+The documentation output files can be removed as follows::
+
+   make doc-clean
+
+
+Document Guidelines
+-------------------
+
+Here are some guidelines in relation to the style of the documentation:
+
+* Document the obvious as well as the obscure since it won't always be obvious to the reader.
+  For example an instruction like "Set up 64 2MB Hugepages" is better when followed by a sample commandline or a link to
+  the appropriate section of the documentation.
+
+* Use American English spellings throughout.
+  This can be checked using the ``aspell`` utility::
+
+       aspell --lang=en_US --check doc/guides/sample_app_ug/mydoc.rst
+
+
+RST Guidelines
+--------------
+
+The RST (reStructuredText) format is a plain text markup format that can be converted to Html, PDF or other formats.
+It is most closely associated with Python but it can be used to document any language.
+It is used in DPDK to document everything apart from the API.
+
+The Sphinx documentation contains a very useful `RST Primer <http://sphinx-doc.org/rest.html#rst-primer>`_ which is a
+good place to learn the minimal set of syntax required to format a document.
+
+The official `reStructuredText <http://docutils.sourceforge.net/rst.html>`_ website contains the specification for the
+RST format and also examples of how to use it.
+However, for most developers the RST Primer is a better resource.
+
+The most common guidelines for writing RST text are detailed in the
+`Documenting Python <https://docs.python.org/devguide/documenting.html>`_ guidelines.
+The additional guidelines below reiterate or expand upon those guidelines.
+
+
+Line Length
+~~~~~~~~~~~
+
+* The recommended style for the DPDK documentation is to put sentences on separate lines.
+  This allows for easier reviewing of patches.
+  Multiple sentences which are not separated by a blank line are joined automatically into paragraphs, for example::
+
+     Here is an example sentence.
+     Long sentences over the limit shown below can be wrapped onto
+     a new line.
+     These three sentences will be joined into the same paragraph.
+
+     This is a new paragraph, since it is separated from the
+     previous paragraph by a blank line.
+
+  This would be rendered as follows:
+
+     *Here is an example sentence.
+     Long sentences over the limit shown below can be wrapped onto
+     a new line.
+     These three sentences will be joined into the same paragraph.*
+
+     *This is a new paragraph, since it is separated from the
+     previous paragraph by a blank line.*
+
+
+* Long sentences should be wrapped at 120 characters +/- 10 characters. They should be wrapped at word boundaries.
+
+* Lines in literal blocks must be less than 80 characters since they aren't wrapped by the document formatters
+  and can exceed the page width in PDF documents.
+
+
+Whitespace
+~~~~~~~~~~
+
+* Standard RST indentation is 3 spaces.
+  Code can be indented 4 spaces, especially if it is copied from source files.
+
+* No tabs.
+  Convert tabs in embedded code to 4 or 8 spaces.
+
+* No trailing whitespace.
+
+* Add 2 blank lines before each section header.
+
+* Add 1 blank line after each section header.
+
+* Add 1 blank line between each line of a list.
+
+
+Section Headers
+~~~~~~~~~~~~~~~
+
+* Section headers should use the following underline formats::
+
+   Level 1 Heading
+   ===============
+
+
+   Level 2 Heading
+   ---------------
+
+
+   Level 3 Heading
+   ~~~~~~~~~~~~~~~
+
+
+   Level 4 Heading
+   ^^^^^^^^^^^^^^^
+
+
+* Level 4 headings should be used sparingly.
+
+* The underlines should match the length of the text.
+
+* In general, the heading should be less than 80 characters, for conciseness.
+
+* As noted above:
+
+   * Add 2 blank lines before each section header.
+
+   * Add 1 blank line after each section header.
+
+
+Lists
+~~~~~
+
+* Bullet lists should be formatted with a leading ``*`` as follows::
+
+     * Item one.
+
+     * Item two is a long line that is wrapped and then indented to match
+       the start of the previous line.
+
+     * One space character between the bullet and the text is preferred.
+
+* Numbered lists can be formatted with a leading number but the preference is to use ``#.`` which will give automatic numbering.
+  This is more convenient when adding or removing items::
+
+     #. Item one.
+
+     #. Item two is a long line that is wrapped and then indented
+        to match the start of the first line.
+
+     #. Item two is a long line that is wrapped and then indented to match
+        the start of the previous line.
+
+* Definition lists can be written with or without a bullet::
+
+     * Item one.
+
+       Some text about item one.
+
+     * Item two.
+
+       Some text about item two.
+
+* All lists, and sub-lists, must be separated from the preceding text by a blank line.
+  This is a syntax requirement.
+
+* All list items should be separated by a blank line for readability.
+
+
+Code and Literal block sections
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+* Inline text that is required to be rendered with a fixed width font should be enclosed in backquotes like this:
+  \`\`text\`\`, so that it appears like this: ``text``.
+
+* Fixed width, literal blocks of text should be indented at least 3 spaces and prefixed with ``::`` like this::
+
+     Here is some fixed width text::
+
+        0x0001 0x0001 0x00FF 0x00FF
+
+* It is also possible to specify an encoding for a literal block using the ``.. code-block::`` directive so that syntax
+  highlighting can be applied.
+  Examples of supported highlighting are::
+
+     .. code-block:: console
+     .. code-block:: c
+     .. code-block:: python
+     .. code-block:: diff
+     .. code-block:: none
+
+  That can be applied as follows::
+
+      .. code-block:: c
+
+         #include<stdio.h>
+
+         int main() {
+
+            printf("Hello World\n");
+
+            return 0;
+         }
+
+  Which would be rendered as:
+
+  .. code-block:: c
+
+      #include<stdio.h>
+
+      int main() {
+
+         printf("Hello World\n");
+
+         return 0;
+      }
+
+
+* The default encoding for a literal block using the simplified ``::``
+  directive is ``none``.
+
+* Lines in literal blocks must be less than 80 characters since they can exceed the page width when converted to PDF documentation.
+  For long literal lines that exceed that limit try to wrap the text at sensible locations.
+  For example a long command line could be documented like this and still work if copied directly from the docs::
+
+     build/app/testpmd -c7 -n3 --vdev=eth_pcap0,iface=eth0     \
+                               --vdev=eth_pcap1,iface=eth1     \
+                               -- -i --nb-cores=2 --nb-ports=2 \
+                                  --total-num-mbufs=2048
+
+* Long lines that cannot be wrapped, such as application output, should be truncated to be less than 80 characters.
+
+
+Images
+~~~~~~
+
+* All images should be in SVG scalable vector graphics format.
+  They should be true SVG XML files and should not include binary formats embedded in a SVG wrapper.
+
+* The DPDK documentation contains some legacy images in PNG format.
+  These will be converted to SVG in time.
+
+* `Inkscape <inkscape.org>`_ is the recommended graphics editor for creating the images.
+  Use some of the older images in ``doc/guides/prog_guide/img/`` as a template, for example ``mbuf1.svg``
+  or ``ring-enqueue.svg``.
+
+* The SVG images should include a copyright notice, as an XML comment.
+
+* Images in the documentation should be formatted as follows:
+
+   * The image should be preceded by a label in the format ``.. _figure_XXXX:`` with a leading underscore and
+     where ``XXXX`` is a unique descriptive name.
+
+   * Images should be included using the ``.. figure::`` directive and the file type should be set to ``*`` (not ``.svg``).
+     This allows the format of the image to be changed if required, without updating the documentation.
+
+   * Images must have a caption as part of the ``.. figure::`` directive.
+
+* Here is an example of the previous three guidelines::
+
+     .. _figure_mempool:
+
+     .. figure:: img/mempool.*
+
+        A mempool in memory with its associated ring.
+
+.. _mock_label:
+
+* Images can then be linked to using the ``:numref:`` directive::
+
+     The mempool layout is shown in :numref:`figure_mempool`.
+
+  This would be rendered as: *The mempool layout is shown in* :ref:`Fig 6.3 <mock_label>`.
+
+  **Note**: The ``:numref:`` directive requires Sphinx 1.3.1 or later.
+  With earlier versions it will still be rendered as a link but won't have an automatically generated number.
+
+* The caption of the image can be generated, with a link, using the ``:ref:`` directive::
+
+     :ref:`figure_mempool`
+
+  This would be rendered as: *A mempool in memory with its associated ring.*
+
+Tables
+~~~~~~
+
+* RST tables should be used sparingly.
+  They are hard to format and to edit, they are often rendered incorrectly in PDF format, and the same information
+  can usually be shown just as clearly with a definition or bullet list.
+
+* Tables in the documentation should be formatted as follows:
+
+   * The table should be preceded by a label in the format ``.. _table_XXXX:`` with a leading underscore and where
+     ``XXXX`` is a unique descriptive name.
+
+   * Tables should be included using the ``.. table::`` directive and must have a caption.
+
+* Here is an example of the previous two guidelines::
+
+     .. _table_qos_pipes:
+
+     .. table:: Sample configuration for QOS pipes.
+
+        +----------+----------+----------+
+        | Header 1 | Header 2 | Header 3 |
+        |          |          |          |
+        +==========+==========+==========+
+        | Text     | Text     | Text     |
+        +----------+----------+----------+
+        | ...      | ...      | ...      |
+        +----------+----------+----------+
+
+* Tables can be linked to using the ``:numref:`` and ``:ref:`` directives, as shown in the previous section for images.
+  For example::
+
+     The QOS configuration is shown in :numref:`table_qos_pipes`.
+
+* Tables should not include merged cells since they are not supported by the PDF renderer.
+
+
+.. _links:
+
+Hyperlinks
+~~~~~~~~~~
+
+* Links to external websites can be plain URLs.
+  The following is rendered as http://dpdk.org::
+
+     http://dpdk.org
+
+* They can contain alternative text.
+  The following is rendered as `Check out DPDK <http://dpdk.org>`_::
+
+     `Check out DPDK <http://dpdk.org>`_
+
+* An internal link can be generated by placing labels in the document with the format ``.. _label_name``.
+
+* The following links to the top of this section: :ref:`links`::
+
+     .. _links:
+
+     Hyperlinks
+     ~~~~~~~~~~
+
+     * The following links to the top of this section: :ref:`links`:
+
+.. Note::
+
+   The label must have a leading underscore but the reference to it must omit it.
+   This is a frequent cause of errors and warnings.
+
+* The use of a label is preferred since it works across files and will still work if the header text changes.
+
diff --git a/doc/guides/guidelines/index.rst b/doc/guides/guidelines/index.rst
index bfb9fa3..1050d99 100644
--- a/doc/guides/guidelines/index.rst
+++ b/doc/guides/guidelines/index.rst
@@ -8,3 +8,5 @@ Guidelines
     coding_style
     design
     versioning
+    documentation
+
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
  @ 2015-07-10 16:07  4%         ` Mcnamara, John
  2015-07-11 14:19  7%           ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-10 16:07 UTC (permalink / raw)
  To: Neil Horman, Thomas Monjalon; +Cc: dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> Sent: Tuesday, July 7, 2015 2:44 PM
> To: Thomas Monjalon
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
> 
> On Tue, Jul 07, 2015 at 05:46:08AM -0700, Thomas Monjalon wrote:
> > Neil, in the meantime, could you please help to check ABI breakage in
> the HEAD?
> >
> Took a look, the only ABI break I see that we need to worry about is the
> one introduced in commit 8eecb3295aed0a979def52245564d03be172a83c. It adds
> a bitfield called lro into the existing uint8_t there, but does so in the
> middle of the set, which pushes the other bits around, breaking ABI.  It
> should have been added to the end.

Hi,

Is it okay to submit a patch to move it to the end?
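
For reference, here is a minimal standalone illustration of the breakage
(simplified structs for illustration only, not the real rte_eth_dev_data;
it assumes GCC's usual LSB-first bit-field allocation):

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    /* old layout: dev_started lives in bit 3 */
    struct flags_v1 {
        uint8_t promiscuous : 1, scattered_rx : 1,
                all_multicast : 1, dev_started : 1;
    };

    /* lro inserted in the middle: all_multicast and dev_started shift by one bit */
    struct flags_v2 {
        uint8_t promiscuous : 1, scattered_rx : 1, lro : 1,
                all_multicast : 1, dev_started : 1;
    };

    int main(void)
    {
        struct flags_v1 old_abi = { .dev_started = 1 };
        struct flags_v2 new_abi;

        /* the same byte, reinterpreted with the new layout */
        memcpy(&new_abi, &old_abi, sizeof(old_abi));
        printf("dev_started=%d all_multicast=%d\n",
               new_abi.dev_started, new_abi.all_multicast);
        /* prints "dev_started=0 all_multicast=1": the started flag is misread */
        return 0;
    }

Appending the new bit after dev_started keeps the existing bit positions
unchanged, so moving lro to the end should restore the old layout.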

John.
-- 

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash
    @ 2015-07-10 17:24  4%   ` Pablo de Lara
  2015-07-10 17:24 14%     ` [dpdk-dev] [PATCH v4 6/7] doc: announce ABI change of librte_hash Pablo de Lara
                       ` (2 more replies)
  1 sibling, 3 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 17:24 UTC (permalink / raw)
  To: dev

This patchset is to replace the existing hash library with
a more efficient and functional approach, using the Cuckoo hash
method to deal with collisions. This method is based on using
two different hash functions to have two possible locations
in the hash table where an entry can be.
So, if a bucket is full, a new entry can push one of the items
in that bucket to its alternative location, making space for itself.

Advantages
~~~~~
- Offers the option to store more entries when the target bucket is full
  (unlike the previous implementation)
- Memory efficient: for storing those entries, it is not necessary to
  request new memory, as the entries will be stored in the same table
- Constant worst lookup time: in worst case scenario, it always takes
  the same time to look up an entry, as there are only two possible locations
  where an entry can be.
- Storing data: user can store data in the hash table, unlike the
  previous implementation, but he can still use the old API

This implementation typically offers over 90% utilization.
Notice that API has been extended, but old API remains.
Check documentation included to know more about this new implementation
(including how entries are distributed as table utilization increases).
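
To make the push/relocate idea concrete, below is a rough sketch of a
cuckoo-style insert. It is an illustration only, not the DPDK code:
single-entry slots, made-up hash functions, and no duplicate-key or
lookup handling.

    #include <stdint.h>

    #define SLOTS      256   /* power of two, one entry per slot in this sketch */
    #define MAX_PUSHES 100   /* give up if the displacement chain loops */

    struct entry { uint32_t key; int used; };
    static struct entry table[SLOTS];

    static uint32_t h1(uint32_t k) { return (k * 2654435761u) & (SLOTS - 1); }
    static uint32_t h2(uint32_t k) { return ((k ^ 0x5bd1e995u) * 40503u) & (SLOTS - 1); }

    /* Each key has two candidate slots; if both are taken, evict the occupant
     * of the second slot and re-insert it at its own alternative location. */
    static int cuckoo_insert(uint32_t key)
    {
        for (int n = 0; n < MAX_PUSHES; n++) {
            uint32_t a = h1(key), b = h2(key);

            if (!table[a].used) { table[a].key = key; table[a].used = 1; return 0; }
            if (!table[b].used) { table[b].key = key; table[b].used = 1; return 0; }

            uint32_t victim = table[b].key;   /* push the current occupant out */
            table[b].key = key;
            key = victim;                     /* and retry with the displaced key */
        }
        return -1;   /* no space found (-ENOSPC in the real library) */
    }

    int main(void)
    {
        for (uint32_t k = 1; k <= 200; k++)
            if (cuckoo_insert(k) != 0)
                return 1;
        return 0;
    }

The real library keeps several entries per bucket and stores both signatures
alongside them (see the v2 changelog below), which is what bounds a lookup to
the two candidate buckets.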

Changes in v4:
- Unit tests enhancements are not part of this patchset anymore.
- rte_hash structure has been made internal in another patch,
  so it is not part of this patchset anymore.
- Add function to iterate through the hash table, as rte_hash
  structure has been made private.
- Added extra_flag parameter in rte_hash_parameter to be able
  to add new parameters in the future without breaking the ABI
- Remove proposed lookup_bulk_with_hash function, as it is
  not of much use with the existing hash functions
  (there are no vector hash functions).
- User can store 8-byte integer or pointer as data, instead
  of variable size data, as discussed in the mailing list.

Changes in v3:

- Now user can store variable size data, instead of 32 or 64-bit size data,
  using the new parameter "data_len" in rte_hash_parameters
- Add lookup_bulk_with_hash function in performance  unit tests
- Add new functions that handle data in performance unit tests
- Remove duplicates in performance unit tests
- Fix rte_hash_reset, which was not resetting the last entry

Changes in v2:

- Fixed issue where table could not store maximum number of entries
- Fixed issue where lookup burst could not be more than 32 (instead of 64)
- Remove unnecessary macros and add other useful ones
- Added missing library dependencies
- Used directly rte_hash_secondary instead of rte_hash_alt
- Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
- Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
- Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
  make them private
- Corrected copyright dates
- Added an optimized function to compare keys that are multiple of 16 bytes
- Improved the way to use primary/secondary signatures. Now both are stored in
  the bucket, so there is no need to calculate them if required.
  Also, there is no need to use the MSB of a signature to differentiate between
  an empty entry and signature 0, since we are storing both signatures,
  which cannot be both 0.
- Removed rte_hash_rehash, as it was a very expensive operation.
  Therefore, the add function returns now -ENOSPC if key cannot be added
  because of a loop.
- Prefetched new slot for new key in add function to improve performance.
- Made doxygen comments more clear.
- Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
  as we can use the lookup functions if we want to get the data before deleting.
- Removed some unnecessary includes in rte_hash.h
- Removed some unnecessary variables in rte_cuckoo_hash.c
- Removed some unnecessary checks before creating a new hash table 
- Added documentation (in release notes and programmers guide)
- Added new unit tests and replaced the performance one for hash tables

Pablo de Lara (7):
  hash: replace existing hash library with cuckoo hash implementation
  hash: add new function rte_hash_reset
  hash: add new functionality to store data in hash table
  hash: add iterate function
  MAINTAINERS: claim responsability for hash library
  doc: announce ABI change of librte_hash
  doc: update hash documentation

 MAINTAINERS                          |    1 +
 app/test/test_hash.c                 |  189 +++---
 app/test/test_hash_perf.c            |  305 ++++++---
 doc/guides/prog_guide/hash_lib.rst   |  138 +++-
 doc/guides/prog_guide/index.rst      |    4 +
 doc/guides/rel_notes/abi.rst         |    1 +
 lib/librte_hash/Makefile             |    8 +-
 lib/librte_hash/rte_cuckoo_hash.c    | 1194 ++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_hash.c           |  499 --------------
 lib/librte_hash/rte_hash.h           |  198 +++++-
 lib/librte_hash/rte_hash_version.map |   15 +
 11 files changed, 1812 insertions(+), 740 deletions(-)
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.c
 delete mode 100644 lib/librte_hash/rte_hash.c

-- 
2.4.2

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v4 6/7] doc: announce ABI change of librte_hash
  2015-07-10 17:24  4%   ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of " Pablo de Lara
@ 2015-07-10 17:24 14%     ` Pablo de Lara
  2015-07-10 20:52  0%     ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash Bruce Richardson
  2015-07-10 21:57  4%     ` [dpdk-dev] [PATCH v5 " Pablo de Lara
  2 siblings, 0 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 17:24 UTC (permalink / raw)
  To: dev

Two of the macros in rte_hash.h are now deprecated, so this patch
adds notice that they will be removed in 2.2.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..312348e 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,4 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
+* The Macros #RTE_HASH_BUCKET_ENTRIES_MAX and #RTE_HASH_KEY_LENGTH_MAX are deprecated and will be removed with version 2.2.
-- 
2.4.2

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash
  2015-07-10 17:24  4%   ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of " Pablo de Lara
  2015-07-10 17:24 14%     ` [dpdk-dev] [PATCH v4 6/7] doc: announce ABI change of librte_hash Pablo de Lara
@ 2015-07-10 20:52  0%     ` Bruce Richardson
  2015-07-10 21:57  4%     ` [dpdk-dev] [PATCH v5 " Pablo de Lara
  2 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2015-07-10 20:52 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

On Fri, Jul 10, 2015 at 06:24:17PM +0100, Pablo de Lara wrote:
> This patchset is to replace the existing hash library with
> a more efficient and functional approach, using the Cuckoo hash
> method to deal with collisions. This method is based on using
> two different hash functions to have two possible locations
> in the hash table where an entry can be.
> So, if a bucket is full, a new entry can push one of the items
> in that bucket to its alternative location, making space for itself.
> 
> Advantages
> ~~~~~
> - Offers the option to store more entries when the target bucket is full
>   (unlike the previous implementation)
> - Memory efficient: for storing those entries, it is not necessary to
>   request new memory, as the entries will be stored in the same table
> - Constant worst lookup time: in worst case scenario, it always takes
>   the same time to look up an entry, as there are only two possible locations
>   where an entry can be.
> - Storing data: user can store data in the hash table, unlike the
>   previous implementation, but he can still use the old API
> 
> This implementation typically offers over 90% utilization.
> Notice that API has been extended, but old API remains.
> Check documentation included to know more about this new implementation
> (including how entries are distributed as table utilization increases).
> 
> Changes in v4:
> - Unit tests enhancements are not part of this patchset anymore.
> - rte_hash structure has been made internal in another patch,
>   so it is not part of this patchset anymore.
> - Add function to iterate through the hash table, as rte_hash
>   structure has been made private.
> - Added extra_flag parameter in rte_hash_parameter to be able
>   to add new parameters in the future without breaking the ABI
> - Remove proposed lookup_bulk_with_hash function, as it is
>   not of much use with the existing hash functions
>   (there are no vector hash functions).
> - User can store 8-byte integer or pointer as data, instead
>   of variable size data, as discussed in the mailing list.
>

Hi Pablo,

I'm getting some compile errors with this code, perhaps you could recheck e.g
32-bit and with clang.
On the plus side, I like the docs included with this set.

Regards,
/Bruce

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v5 0/7] Cuckoo hash - part 3 of Cuckoo hash
  2015-07-10 17:24  4%   ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of " Pablo de Lara
  2015-07-10 17:24 14%     ` [dpdk-dev] [PATCH v4 6/7] doc: announce ABI change of librte_hash Pablo de Lara
  2015-07-10 20:52  0%     ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash Bruce Richardson
@ 2015-07-10 21:57  4%     ` Pablo de Lara
  2015-07-10 21:57 14%       ` [dpdk-dev] [PATCH v5 6/7] doc: announce ABI change of librte_hash Pablo de Lara
  2015-07-10 23:30  4%       ` [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  2 siblings, 2 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 21:57 UTC (permalink / raw)
  To: dev

This patchset is to replace the existing hash library with
a more efficient and functional approach, using the Cuckoo hash
method to deal with collisions. This method is based on using
two different hash functions to have two possible locations
in the hash table where an entry can be.
So, if a bucket is full, a new entry can push one of the items
in that bucket to its alternative location, making space for itself.

Advantages
~~~~
- Offers the option to store more entries when the target bucket is full
  (unlike the previous implementation)
- Memory efficient: for storing those entries, it is not necessary to
  request new memory, as the entries will be stored in the same table
- Constant worst lookup time: in worst case scenario, it always takes
  the same time to look up an entry, as there are only two possible locations
  where an entry can be.
- Storing data: user can store data in the hash table, unlike the
  previous implementation, but he can still use the old API

This implementation typically offers over 90% utilization.
Notice that API has been extended, but old API remains.
Check documentation included to know more about this new implementation
(including how entries are distributed as table utilization increases).

Changes in v5:
- Fix documentation
- Fix 32-bit compilation issues

Changes in v4:
- Unit tests enhancements are not part of this patchset anymore.
- rte_hash structure has been made internal in another patch,
  so it is not part of this patchset anymore.
- Add function to iterate through the hash table, as rte_hash
  structure has been made private.
- Added extra_flag parameter in rte_hash_parameter to be able
  to add new parameters in the future without breaking the ABI
- Remove proposed lookup_bulk_with_hash function, as it is
  not of much use with the existing hash functions
  (there are no vector hash functions).
- User can store 8-byte integer or pointer as data, instead
  of variable size data, as discussed in the mailing list.

Changes in v3:

- Now user can store variable size data, instead of 32 or 64-bit size data,
  using the new parameter "data_len" in rte_hash_parameters
- Add lookup_bulk_with_hash function in performance  unit tests
- Add new functions that handle data in performance unit tests
- Remove duplicates in performance unit tests
- Fix rte_hash_reset, which was not resetting the last entry

Changes in v2:

- Fixed issue where table could not store maximum number of entries
- Fixed issue where lookup burst could not be more than 32 (instead of 64)
- Remove unnecessary macros and add other useful ones
- Added missing library dependencies
- Used directly rte_hash_secondary instead of rte_hash_alt
- Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
- Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
- Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
  make them private
- Corrected copyright dates
- Added an optimized function to compare keys that are multiple of 16 bytes
- Improved the way to use primary/secondary signatures. Now both are stored in
  the bucket, so there is no need to calculate them if required.
  Also, there is no need to use the MSB of a signature to differentiate between
  an empty entry and signature 0, since we are storing both signatures,
  which cannot be both 0.
- Removed rte_hash_rehash, as it was a very expensive operation.
  Therefore, the add function returns now -ENOSPC if key cannot be added
  because of a loop.
- Prefetched new slot for new key in add function to improve performance.
- Made doxygen comments more clear.
- Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
  as we can use the lookup functions if we want to get the data before deleting.
- Removed some unnecessary includes in rte_hash.h
- Removed some unnecessary variables in rte_cuckoo_hash.c
- Removed some unnecessary checks before creating a new hash table
- Added documentation (in release notes and programmers guide)
- Added new unit tests and replaced the performance one for hash tables

Pablo de Lara (7):
  hash: replace existing hash library with cuckoo hash implementation
  hash: add new function rte_hash_reset
  hash: add new functionality to store data in hash table
  hash: add iterate function
  MAINTAINERS: claim responsability for hash library
  doc: announce ABI change of librte_hash
  doc: update hash documentation

 MAINTAINERS                          |    1 +
 app/test/test_hash.c                 |  189 +++---
 app/test/test_hash_perf.c            |  305 ++++++---
 doc/guides/prog_guide/hash_lib.rst   |  138 +++-
 doc/guides/prog_guide/index.rst      |    4 +
 doc/guides/rel_notes/abi.rst         |    1 +
 lib/librte_hash/Makefile             |    8 +-
 lib/librte_hash/rte_cuckoo_hash.c    | 1194 ++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_hash.c           |  499 --------------
 lib/librte_hash/rte_hash.h           |  198 +++++-
 lib/librte_hash/rte_hash_version.map |   15 +
 11 files changed, 1812 insertions(+), 740 deletions(-)
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.c
 delete mode 100644 lib/librte_hash/rte_hash.c

-- 
2.4.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v5 6/7] doc: announce ABI change of librte_hash
  2015-07-10 21:57  4%     ` [dpdk-dev] [PATCH v5 " Pablo de Lara
@ 2015-07-10 21:57 14%       ` Pablo de Lara
  2015-07-10 23:30  4%       ` [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  1 sibling, 0 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 21:57 UTC (permalink / raw)
  To: dev

Two of the macros in rte_hash.h are now deprecated, so this patch
adds notice that they will be removed in 2.2.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 110c486..312348e 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -12,3 +12,4 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
+* The Macros #RTE_HASH_BUCKET_ENTRIES_MAX and #RTE_HASH_KEY_LENGTH_MAX are deprecated and will be removed with version 2.2.
-- 
2.4.3

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH] config: revert the CONFIG_RTE_MAX_QUEUES_PER_PORT to 256
  @ 2015-07-10 21:58  0% ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-10 21:58 UTC (permalink / raw)
  To: Jijiang Liu; +Cc: dev

2015-07-08 09:24, Jijiang Liu:
> Revert the CONFIG_RTE_MAX_QUEUES_PER_PORT to 256.
> 
> The previous commit changed the size and the offsets of struct rte_eth_dev,
> so it is an ABI breakage. I revert it, and will send a deprecation notice for this.
> 
> Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>

Applied, thanks

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc:announce ABI changes planned for struct rte_eth_dev to support up to 1024 queues per port
  @ 2015-07-10 22:14  4%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-10 22:14 UTC (permalink / raw)
  To: Jijiang Liu; +Cc: dev

> > The significant ABI change of all shared libraries is planned for struct rte_eth_dev to support up to 1024 queues per port which will be taken effect from release 2.2.
> > 
> > Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
[...]
> >  Deprecation Notices
> >  -------------------
> > +* Significant ABI changes are planned for struct rte_eth_dev to support up to 1024 queues per port. This change will be taken effect for shared libraries from release 2.2. There is no backward compatibility planned from release 2.2. All binaries will need to be rebuilt from release 2.2.

not only for shared libraries

> Acked-by: Neil Horman <nhorman@tuxdriver.com>

Wrapped and applied with above fix, thanks

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash
  2015-07-10 21:57  4%     ` [dpdk-dev] [PATCH v5 " Pablo de Lara
  2015-07-10 21:57 14%       ` [dpdk-dev] [PATCH v5 6/7] doc: announce ABI change of librte_hash Pablo de Lara
@ 2015-07-10 23:30  4%       ` Pablo de Lara
  2015-07-10 23:30 14%         ` [dpdk-dev] [PATCH v6 6/7] doc: announce ABI change of librte_hash Pablo de Lara
  2015-07-11  0:18  4%         ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  1 sibling, 2 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 23:30 UTC (permalink / raw)
  To: dev

This patchset is to replace the existing hash library with
a more efficient and functional approach, using the Cuckoo hash
method to deal with collisions. This method is based on using
two different hash functions to have two possible locations
in the hash table where an entry can be.
So, if a bucket is full, a new entry can push one of the items
in that bucket to its alternative location, making space for itself.

Advantages
~~~
- Offers the option to store more entries when the target bucket is full
  (unlike the previous implementation)
- Memory efficient: for storing those entries, it is not necessary to
  request new memory, as the entries will be stored in the same table
- Constant worst lookup time: in worst case scenario, it always takes
  the same time to look up an entry, as there are only two possible locations
  where an entry can be.
- Storing data: user can store data in the hash table, unlike the
  previous implementation, but he can still use the old API

This implementation typically offers over 90% utilization.
Notice that API has been extended, but old API remains.
Check documentation included to know more about this new implementation
(including how entries are distributed as table utilization increases).

Changes in v6:
- Replace datatype for functions from uintptr_t to void *

Changes in v5:
- Fix documentation
- Fix 32-bit compilation issues

Changes in v4:
- Unit tests enhancements are not part of this patchset anymore.
- rte_hash structure has been made internal in another patch,
  so it is not part of this patchset anymore.
- Add function to iterate through the hash table, as rte_hash
  structure has been made private.
- Added extra_flag parameter in rte_hash_parameter to be able
  to add new parameters in the future without breaking the ABI
- Remove proposed lookup_bulk_with_hash function, as it is
  not of much use with the existing hash functions
  (there are no vector hash functions).
- User can store 8-byte integer or pointer as data, instead
  of variable size data, as discussed in the mailing list.

Changes in v3:

- Now user can store variable size data, instead of 32 or 64-bit size data,
  using the new parameter "data_len" in rte_hash_parameters
- Add lookup_bulk_with_hash function in performance  unit tests
- Add new functions that handle data in performance unit tests
- Remove duplicates in performance unit tests
- Fix rte_hash_reset, which was not resetting the last entry

Changes in v2:

- Fixed issue where table could not store maximum number of entries
- Fixed issue where lookup burst could not be more than 32 (instead of 64)
- Remove unnecessary macros and add other useful ones
- Added missing library dependencies
- Used directly rte_hash_secondary instead of rte_hash_alt
- Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
- Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
- Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
  make them private
- Corrected copyright dates
- Added an optimized function to compare keys that are multiple of 16 bytes
- Improved the way to use primary/secondary signatures. Now both are stored in
  the bucket, so there is no need to calculate them if required.
  Also, there is no need to use the MSB of a signature to differentiate between
  an empty entry and signature 0, since we are storing both signatures,
  which cannot be both 0.
- Removed rte_hash_rehash, as it was a very expensive operation.
  Therefore, the add function returns now -ENOSPC if key cannot be added
  because of a loop.
- Prefetched new slot for new key in add function to improve performance.
- Made doxygen comments more clear.
- Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
  as we can use the lookup functions if we want to get the data before deleting.
- Removed some unnecessary includes in rte_hash.h
- Removed some unnecessary variables in rte_cuckoo_hash.c
- Removed some unnecessary checks before creating a new hash table
- Added documentation (in release notes and programmers guide)
- Added new unit tests and replaced the performance one for hash tables

Series Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Pablo de Lara (7):
  hash: replace existing hash library with cuckoo hash implementation
  hash: add new function rte_hash_reset
  hash: add new functionality to store data in hash table
  hash: add iterate function
  MAINTAINERS: claim responsability for hash library
  doc: announce ABI change of librte_hash
  doc: update hash documentation

 MAINTAINERS                          |    1 +
 app/test/test_hash.c                 |  189 +++---
 app/test/test_hash_perf.c            |  303 ++++++---
 doc/guides/prog_guide/hash_lib.rst   |  138 +++-
 doc/guides/prog_guide/index.rst      |    4 +
 doc/guides/rel_notes/abi.rst         |    1 +
 lib/librte_hash/Makefile             |    8 +-
 lib/librte_hash/rte_cuckoo_hash.c    | 1194 ++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_hash.c           |  499 --------------
 lib/librte_hash/rte_hash.h           |  198 +++++-
 lib/librte_hash/rte_hash_version.map |   15 +
 11 files changed, 1810 insertions(+), 740 deletions(-)
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.c
 delete mode 100644 lib/librte_hash/rte_hash.c

-- 
2.4.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v6 6/7] doc: announce ABI change of librte_hash
  2015-07-10 23:30  4%       ` [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
@ 2015-07-10 23:30 14%         ` Pablo de Lara
  2015-07-11  0:18  4%         ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  1 sibling, 0 replies; 200+ results
From: Pablo de Lara @ 2015-07-10 23:30 UTC (permalink / raw)
  To: dev

Two of the macros in rte_hash.h are now deprecated, so this patch
adds notice that they will be removed in 2.2.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9d60a74..b45017c 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -21,3 +21,4 @@ Deprecation Notices
   1024 queues per port. This change will be in release 2.2.
   There is no backward compatibility planned from release 2.2.
   All binaries will need to be rebuilt from release 2.2.
+* The Macros #RTE_HASH_BUCKET_ENTRIES_MAX and #RTE_HASH_KEY_LENGTH_MAX are deprecated and will be removed with version 2.2.
-- 
2.4.3

^ permalink raw reply	[relevance 14%]

* [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash
  2015-07-10 23:30  4%       ` [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  2015-07-10 23:30 14%         ` [dpdk-dev] [PATCH v6 6/7] doc: announce ABI change of librte_hash Pablo de Lara
@ 2015-07-11  0:18  4%         ` Pablo de Lara
                               ` (2 more replies)
  1 sibling, 3 replies; 200+ results
From: Pablo de Lara @ 2015-07-11  0:18 UTC (permalink / raw)
  To: dev

This patchset is to replace the existing hash library with
a more efficient and functional approach, using the Cuckoo hash
method to deal with collisions. This method is based on using
two different hash functions to have two possible locations
in the hash table where an entry can be.
So, if a bucket is full, a new entry can push one of the items
in that bucket to its alternative location, making space for itself.

Advantages
~~
- Offers the option to store more entries when the target bucket is full
  (unlike the previous implementation)
- Memory efficient: for storing those entries, it is not necessary to
  request new memory, as the entries will be stored in the same table
- Constant worst lookup time: in worst case scenario, it always takes
  the same time to look up an entry, as there are only two possible locations
  where an entry can be.
- Storing data: user can store data in the hash table, unlike the
  previous implementation, but he can still use the old API

This implementation typically offers over 90% utilization.
Notice that API has been extended, but old API remains.
Check documentation included to know more about this new implementation
(including how entries are distributed as table utilization increases).

Changes in v7:
- Fix inaccurate documentation

Changes in v6:
- Replace datatype for functions from uintptr_t to void *

Changes in v5:
- Fix 32-bit compilation issues

Changes in v4:
- Unit tests enhancements are not part of this patchset anymore.
- rte_hash structure has been made internal in another patch,
  so it is not part of this patchset anymore.
- Add function to iterate through the hash table, as rte_hash
  structure has been made private.
- Added extra_flag parameter in rte_hash_parameter to be able
  to add new parameters in the future without breaking the ABI
- Remove proposed lookup_bulk_with_hash function, as it is
  not of much use with the existing hash functions
  (there are no vector hash functions).
- User can store 8-byte integer or pointer as data, instead
  of variable size data, as discussed in the mailing list.

Changes in v3:

- Now user can store variable size data, instead of 32 or 64-bit size data,
  using the new parameter "data_len" in rte_hash_parameters
- Add lookup_bulk_with_hash function in performance  unit tests
- Add new functions that handle data in performance unit tests
- Remove duplicates in performance unit tests
- Fix rte_hash_reset, which was not resetting the last entry

Changes in v2:

- Fixed issue where table could not store maximum number of entries
- Fixed issue where lookup burst could not be more than 32 (instead of 64)
- Remove unnecessary macros and add other useful ones
- Added missing library dependencies
- Used directly rte_hash_secondary instead of rte_hash_alt
- Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
- Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
- Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
  make them private
- Corrected copyright dates
- Added an optimized function to compare keys that are multiple of 16 bytes
- Improved the way to use primary/secondary signatures. Now both are stored in
  the bucket, so there is no need to calculate them if required.
  Also, there is no need to use the MSB of a signature to differentiate between
  an empty entry and signature 0, since we are storing both signatures,
  which cannot be both 0.
- Removed rte_hash_rehash, as it was a very expensive operation.
  Therefore, the add function returns now -ENOSPC if key cannot be added
  because of a loop.
- Prefetched new slot for new key in add function to improve performance.
- Made doxygen comments more clear.
- Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
  as we can use the lookup functions if we want to get the data before deleting.
- Removed some unnecessary includes in rte_hash.h
- Removed some unnecessary variables in rte_cuckoo_hash.c
- Removed some unnecessary checks before creating a new hash table
- Added documentation (in release notes and programmers guide)
- Added new unit tests and replaced the performance one for hash tables

Series Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Pablo de Lara (7):
  hash: replace existing hash library with cuckoo hash implementation
  hash: add new function rte_hash_reset
  hash: add new functionality to store data in hash table
  hash: add iterate function
  MAINTAINERS: claim responsability for hash library
  doc: announce ABI change of librte_hash
  doc: update hash documentation

 MAINTAINERS                          |    1 +
 app/test/test_hash.c                 |  189 +++---
 app/test/test_hash_perf.c            |  303 ++++++---
 doc/guides/prog_guide/hash_lib.rst   |  138 +++-
 doc/guides/prog_guide/index.rst      |    4 +
 doc/guides/rel_notes/abi.rst         |    1 +
 lib/librte_hash/Makefile             |    8 +-
 lib/librte_hash/rte_cuckoo_hash.c    | 1194 ++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_hash.c           |  499 --------------
 lib/librte_hash/rte_hash.h           |  198 +++++-
 lib/librte_hash/rte_hash_version.map |   15 +
 11 files changed, 1810 insertions(+), 740 deletions(-)
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.c
 delete mode 100644 lib/librte_hash/rte_hash.c

-- 
2.4.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v7 6/7] doc: announce ABI change of librte_hash
  2015-07-11  0:18  4%         ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
  @ 2015-07-11  0:18 14%           ` Pablo de Lara
  2015-07-12 22:38  8%             ` Thomas Monjalon
  2015-07-12 22:46  0%           ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Thomas Monjalon
  2 siblings, 1 reply; 200+ results
From: Pablo de Lara @ 2015-07-11  0:18 UTC (permalink / raw)
  To: dev

Two of the macros in rte_hash.h are now deprecated, so this patch
adds notice that they will be removed in 2.2.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 doc/guides/rel_notes/abi.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9d60a74..b45017c 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -21,3 +21,4 @@ Deprecation Notices
   1024 queues per port. This change will be in release 2.2.
   There is no backward compatibility planned from release 2.2.
   All binaries will need to be rebuilt from release 2.2.
+* The Macros #RTE_HASH_BUCKET_ENTRIES_MAX and #RTE_HASH_KEY_LENGTH_MAX are deprecated and will be removed with version 2.2.
-- 
2.4.3

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
  2015-07-10 16:07  4%         ` Mcnamara, John
@ 2015-07-11 14:19  7%           ` Neil Horman
  2015-07-13 10:14  8%             ` Mcnamara, John
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2015-07-11 14:19 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

On Fri, Jul 10, 2015 at 04:07:53PM +0000, Mcnamara, John wrote:
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> > Sent: Tuesday, July 7, 2015 2:44 PM
> > To: Thomas Monjalon
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
> > 
> > On Tue, Jul 07, 2015 at 05:46:08AM -0700, Thomas Monjalon wrote:
> > > Neil, in the meantime, could you please help to check ABI breakage in
> > the HEAD?
> > >
> > Took a look, the only ABI break I see that we need to worry about is the
> > one introduced in commit 8eecb3295aed0a979def52245564d03be172a83c. It adds
> > a bitfield called lro into the existing uint8_t there, but does so in the
> > middle of the set, which pushes the other bits around, breaking ABI.  It
> > should have been added to the end.
> 
> Hi,
> 
> Is it okay to submit a patch to move it to the end?
> 
Assuming that fixes the problem, I think thats the only thing you can do right
now.  I expect that will work, but I would run it through the ABI checker to be
certain
Neil


> John.
> -- 
> 
> 
> 
> 

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation
  @ 2015-07-12 22:29  3%             ` Thomas Monjalon
  2015-07-13 16:11  3%               ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-12 22:29 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

2015-07-11 01:18, Pablo de Lara:
> The main change when creating a new table is that the number of entries
> per bucket is fixed now, so its parameter is ignored now
> (still there to maintain the same parameters structure).

Why not rename the "bucket_entries" field to "reserved"?
The API of this field has changed (now ignored) so it should be reflected
without changing the ABI.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v7 6/7] doc: announce ABI change of librte_hash
  2015-07-11  0:18 14%           ` [dpdk-dev] [PATCH v7 6/7] doc: announce ABI change of librte_hash Pablo de Lara
@ 2015-07-12 22:38  8%             ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-12 22:38 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

2015-07-11 01:18, Pablo de Lara:
> Two of the macros in rte_hash.h are now deprecated, so this patch
> adds notice that they will be removed in 2.2.
[...]
> +* The Macros #RTE_HASH_BUCKET_ENTRIES_MAX and #RTE_HASH_KEY_LENGTH_MAX are deprecated and will be removed with version 2.2.

These macros have no impact on the ABI.
I suggest to rename doc/guides/rel_notes/abi.rst to deprecation.rst
and add a chapter about API in doc/guides/guidelines/versioning.rst.

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash
  2015-07-11  0:18  4%         ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
    2015-07-11  0:18 14%           ` [dpdk-dev] [PATCH v7 6/7] doc: announce ABI change of librte_hash Pablo de Lara
@ 2015-07-12 22:46  0%           ` Thomas Monjalon
  2 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-12 22:46 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

2015-07-11 01:18, Pablo de Lara:
> This patchset is to replace the existing hash library with
> a more efficient and functional approach, using the Cuckoo hash
> method to deal with collisions. This method is based on using
> two different hash functions to have two possible locations
> in the hash table where an entry can be.
> So, if a bucket is full, a new entry can push one of the items
> in that bucket to its alternative location, making space for itself.
> 
> Advantages
> ~~
> - Offers the option to store more entries when the target bucket is full
>   (unlike the previous implementation)
> - Memory efficient: for storing those entries, it is not necessary to
>   request new memory, as the entries will be stored in the same table
> - Constant worst lookup time: in worst case scenario, it always takes
>   the same time to look up an entry, as there are only two possible locations
>   where an entry can be.
> - Storing data: user can store data in the hash table, unlike the
>   previous implementation, but he can still use the old API
> 
> This implementation typically offers over 90% utilization.
> Notice that API has been extended, but old API remains.
> Check documentation included to know more about this new implementation
> (including how entries are distributed as table utilization increases).
> 
> Changes in v7:
> - Fix inaccurate documentation
> 
> Changes in v6:
> - Replace datatype for functions from uintptr_t to void *
> 
> Changes in v5:
> - Fix 32-bit compilation issues
> 
> Changes in v4:
> - Unit tests enhancements are not part of this patchset anymore.
> - rte_hash structure has been made internal in another patch,
>   so it is not part of this patchset anymore.
> - Add function to iterate through the hash table, as rte_hash
>   structure has been made private.
> - Added extra_flag parameter in rte_hash_parameter to be able
>   to add new parameters in the future without breaking the ABI
> - Remove proposed lookup_bulk_with_hash function, as it is
>   not of much use with the existing hash functions
>   (there are no vector hash functions).
> - User can store 8-byte integer or pointer as data, instead
>   of variable size data, as discussed in the mailing list.
> 
> Changes in v3:
> 
> - Now user can store variable size data, instead of 32 or 64-bit size data,
>   using the new parameter "data_len" in rte_hash_parameters
> - Add lookup_bulk_with_hash function in performance  unit tests
> - Add new functions that handle data in performance unit tests
> - Remove duplicates in performance unit tests
> - Fix rte_hash_reset, which was not resetting the last entry
> 
> Changes in v2:
> 
> - Fixed issue where table could not store maximum number of entries
> - Fixed issue where lookup burst could not be more than 32 (instead of 64)
> - Remove unnecessary macros and add other useful ones
> - Added missing library dependencies
> - Used directly rte_hash_secondary instead of rte_hash_alt
> - Renamed rte_hash.c to rte_cuckoo_hash.c to ease the view of the new library
> - Renamed test_hash_perf.c temporarily to ease the view of the improved unit test
> - Moved rte_hash, rte_bucket and rte_hash_key structures to rte_cuckoo_hash.c to
>   make them private
> - Corrected copyright dates
> - Added an optimized function to compare keys that are multiple of 16 bytes
> - Improved the way to use primary/secondary signatures. Now both are stored in
>   the bucket, so there is no need to calculate them if required.
>   Also, there is no need to use the MSB of a signature to differenciate between
>   an empty entry and signature 0, since we are storing both signatures,
>   which cannot be both 0.
> - Removed rte_hash_rehash, as it was a very expensive operation.
>   Therefore, the add function returns now -ENOSPC if key cannot be added
>   because of a loop.
> - Prefetched new slot for new key in add function to improve performance.
> - Made doxygen comments more clear.
> - Removed unnecessary rte_hash_del_key_data and rte_hash_del_key_with_data,
>   as we can use the lookup functions if we want to get the data before deleting.
> - Removed some unnecessary includes in rte_hash.h
> - Removed some unnecessary variables in rte_cuckoo_hash.c
> - Removed some unnecessary checks before creating a new hash table
> - Added documentation (in release notes and programmers guide)
> - Added new unit tests and replaced the performance one for hash tables
> 
> Series Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Applied, thanks

Some bugs/formatting were fixed on the fly.
Some remaining comments may be addressed in further patches.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v3] mk: enable next abi preview
  @ 2015-07-13  7:32  7%   ` Mcnamara, John
  2015-07-13  8:48  7%     ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-13  7:32 UTC (permalink / raw)
  To: Thomas Monjalon, nhorman; +Cc: dev


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Wednesday, July 8, 2015 5:44 PM
> To: nhorman@tuxdriver.com
> Cc: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3] mk: enable next abi preview
> 
> --- a/scripts/validate-abi.sh
> +++ b/scripts/validate-abi.sh
> @@ -157,6 +157,7 @@ git checkout $TAG1
>  # Make sure we configure SHARED libraries  # Also turn off IGB and KNI as
> those require kernel headers to build  sed -i -e"$
> a\CONFIG_RTE_BUILD_SHARED_LIB=y" config/defconfig_$TARGET
> +sed -i -e"$ a\CONFIG_RTE_NEXT_ABI=n" config/defconfig_$TARGET
>  sed -i -e"$ a\CONFIG_RTE_EAL_IGB_UIO=n" config/defconfig_$TARGET  sed -i
> -e"$ a\CONFIG_RTE_LIBRTE_KNI=n" config/defconfig_$TARGET
> 

Hi,

This change to enable CONFIG_RTE_NEXT_ABI=n breaks validate-abi.sh because master won't compile with CONFIG_RTE_BUILD_SHARED_LIB=y and CONFIG_RTE_NEXT_ABI=n:

     make uninstall && \
     make T=x86_64-native-linuxapp-gcc install -j \
          CONFIG_RTE_BUILD_SHARED_LIB=y CONFIG_RTE_NEXT_ABI=n

Change was made in: 506f51cc0da7e057ac31e15048ba3b8015112226

John.
-- 

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH v3] mk: enable next abi preview
  2015-07-13  7:32  7%   ` Mcnamara, John
@ 2015-07-13  8:48  7%     ` Thomas Monjalon
  2015-07-13  9:02  8%       ` [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-13  8:48 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

2015-07-13 07:32, Mcnamara, John:
> This change to enable CONFIG_RTE_NEXT_ABI=n breaks validate-abi.sh
> because master won't compile with CONFIG_RTE_BUILD_SHARED_LIB=y and
> CONFIG_RTE_NEXT_ABI=n:

My bad. I thought I was testing both cases (next ABI and stable one) but
it appears only the "next one" was tested.

The error is trivial:
-       $(Q)ln -s -f $< $(RTE_OUTPUT)/lib/$(LIBSONAME)
+       $(Q)ln -s -f $< $(basename $(basename $@))

The double basename should apply to NEXT_ABI case only.

^ permalink raw reply	[relevance 7%]

* [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi
  2015-07-13  8:48  7%     ` Thomas Monjalon
@ 2015-07-13  9:02  8%       ` Thomas Monjalon
  2015-07-13  9:24  4%         ` Mcnamara, John
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-13  9:02 UTC (permalink / raw)
  To: John McNamara; +Cc: dev

When next ABI is enabled, the shared lib extension is .so.x.1.
That's why a double basename was introduced.
But the "ifeq NEXT_ABI" was forgotten, removing the .so
extension when NEXT_ABI is disabled.
It was preventing the linker from finding the .so libraries.

Fixes: 506f51cc0da7 ("mk: enable next abi preview")

Reported-by: John McNamara <john.mcnamara@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
 mk/rte.lib.mk | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index f15de9b..9ff5cce 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -173,7 +173,11 @@ $(RTE_OUTPUT)/lib/$(LIB): $(LIB)
 	@[ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib
 	$(Q)cp -f $(LIB) $(RTE_OUTPUT)/lib
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+ifeq ($(CONFIG_RTE_NEXT_ABI),y)
 	$(Q)ln -s -f $< $(basename $(basename $@))
+else
+	$(Q)ln -s -f $< $(basename $@)
+endif
 endif
 
 #
-- 
2.4.2

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi
  2015-07-13  9:02  8%       ` [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi Thomas Monjalon
@ 2015-07-13  9:24  4%         ` Mcnamara, John
  2015-07-13  9:32  7%           ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-13  9:24 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, July 13, 2015 10:02 AM
> To: Mcnamara, John
> Cc: dev@dpdk.org
> Subject: [PATCH] mk: fix shared lib build with stable abi
> 
> When next ABI is enabled, the shared lib extension is .so.x.1.
> That's why a double basename was introduced.
> But the "ifeq NEXT_ABI" was forgotten, removing the .so extension when
> NEXT_ABI is disabled.
> It was preventing the linker from finding the .so libraries.
> 
> Fixes: 506f51cc0da7 ("mk: enable next abi preview")
> 
> Reported-by: John McNamara <john.mcnamara@intel.com>
> Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>

That fixes it. Thanks.

Acked-by: John McNamara <john.mcnamara@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi
  2015-07-13  9:24  4%         ` Mcnamara, John
@ 2015-07-13  9:32  7%           ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-13  9:32 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

> > When next ABI is enabled, the shared lib extension is .so.x.1.
> > That's why a double basename was introduced.
> > But the "ifeq NEXT_ABI" was forgotten, removing the .so extension when
> > NEXT_ABI is disabled.
> > It was preventing the linker from finding the .so libraries.
> > 
> > Fixes: 506f51cc0da7 ("mk: enable next abi preview")
> > 
> > Reported-by: John McNamara <john.mcnamara@intel.com>
> > Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> 
> That fixes it. Thanks.
> 
> Acked-by: John McNamara <john.mcnamara@intel.com>

Applied, thanks.
Now that next ABI can be disabled, patches using it may be applied :)

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
  2015-07-11 14:19  7%           ` Neil Horman
@ 2015-07-13 10:14  8%             ` Mcnamara, John
  0 siblings, 0 replies; 200+ results
From: Mcnamara, John @ 2015-07-13 10:14 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Saturday, July 11, 2015 3:20 PM
> To: Mcnamara, John
> Cc: Thomas Monjalon; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
> 
> On Fri, Jul 10, 2015 at 04:07:53PM +0000, Mcnamara, John wrote:
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Neil Horman
> > > Sent: Tuesday, July 7, 2015 2:44 PM
> > > To: Thomas Monjalon
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH] mk: enable next abi in static libs
> > >
> ...
>
> > Is it okay to submit a patch to move it to the end?
> >
> Assuming that fixes the problem, I think that's the only thing you can do
> right now.  I expect that will work, but I would run it through the ABI
> checker to be certain.

Hi Neil,

I ran the change through the abi checker before and after and the lro field no longer generates a Medium Severity ABI warning. I'll submit a patch to move the field.

John.
-- 

^ permalink raw reply	[relevance 8%]

* [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
@ 2015-07-13 10:26  7% John McNamara
  2015-07-13 10:42  7% ` Neil Horman
  2015-07-16 22:22  4% ` Vlad Zolotarov
  0 siblings, 2 replies; 200+ results
From: John McNamara @ 2015-07-13 10:26 UTC (permalink / raw)
  To: dev, vladz

Fix for ABI breakage introduced in LRO addition. Moves
lro bitfield to the end of the struct/member.

Fixes: 8eecb3295aed (ixgbe: add LRO support)

Signed-off-by: John McNamara <john.mcnamara@intel.com>
---
 lib/librte_ether/rte_ethdev.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 79bde89..1c3ace1 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
 	uint8_t port_id;           /**< Device [external] port identifier. */
 	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
 		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
-		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
 		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
-		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
+		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
+		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
 };
 
 /**
-- 
1.8.1.4

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 10:26  7% [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code John McNamara
@ 2015-07-13 10:42  7% ` Neil Horman
  2015-07-13 10:46  4%   ` Thomas Monjalon
  2015-07-13 10:47  4%   ` Mcnamara, John
  2015-07-16 22:22  4% ` Vlad Zolotarov
  1 sibling, 2 replies; 200+ results
From: Neil Horman @ 2015-07-13 10:42 UTC (permalink / raw)
  To: John McNamara; +Cc: dev

On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
> Fix for ABI breakage introduced in LRO addition. Moves
> lro bitfield to the end of the struct/member.
> 
> Fixes: 8eecb3295aed (ixgbe: add LRO support)
> 
> Signed-off-by: John McNamara <john.mcnamara@intel.com>
> ---
>  lib/librte_ether/rte_ethdev.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 79bde89..1c3ace1 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
>  	uint8_t port_id;           /**< Device [external] port identifier. */
>  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
>  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
> -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
>  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
> -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
>  };
>  
>  /**
> -- 
> 1.8.1.4
> 
> 
I presume the ABI checker stopped complaining about this with the patch, yes?

Also, it would be great if someone could check this on ppc or a ppc cross
compile, as I recall bitfields follow endianness order.

Neil
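
For reference, bit-field allocation order is an ABI property rather than a
language guarantee; a rough sketch of the concern, with the bit numbers in
the comments being assumptions for a typical little-endian x86-64 ABI versus
a big-endian POWER ABI:

#include <stdint.h>

/* Illustration only: appending a field leaves the bits already assigned to
 * the earlier fields untouched on a given ABI, while inserting one in the
 * middle shifts every field declared after it.  Which end of the byte the
 * first field lands on still differs between little- and big-endian ABIs. */
struct dev_flags {
	uint8_t promiscuous   : 1, /* e.g. bit 0 on x86-64, bit 7 on BE POWER */
		scattered_rx  : 1,
		all_multicast : 1,
		dev_started   : 1,
		lro           : 1; /* appended last, earlier bits keep their place */
};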

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 10:42  7% ` Neil Horman
  2015-07-13 10:46  4%   ` Thomas Monjalon
@ 2015-07-13 10:47  4%   ` Mcnamara, John
  2015-07-13 13:59  4%     ` Neil Horman
  1 sibling, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-13 10:47 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Monday, July 13, 2015 11:42 AM
> To: Mcnamara, John
> Cc: dev@dpdk.org; vladz@cloudius-systems.com
> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> 
> On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
> > Fix for ABI breakage introduced in LRO addition. Moves lro bitfield to
> > the end of the struct/member.
> >
> > Fixes: 8eecb3295aed (ixgbe: add LRO support)
> >
> > Signed-off-by: John McNamara <john.mcnamara@intel.com>
> > ---
> >  lib/librte_ether/rte_ethdev.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.h
> > b/lib/librte_ether/rte_ethdev.h index 79bde89..1c3ace1 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
> >  	uint8_t port_id;           /**< Device [external] port identifier.
> */
> >  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0).
> */
> >  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) /
> OFF(0) */
> > -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
> >  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0).
> */
> > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0).
> */
> > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0).
> */
> > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> >  };
> >
> >  /**
> > --
> > 1.8.1.4
> >
> >
> I presume the ABI checker stopped complaining about this with the patch,
> yes?

Hi Neil,

Yes, I replied about that in the previous thread.

John.
-- 

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 10:42  7% ` Neil Horman
@ 2015-07-13 10:46  4%   ` Thomas Monjalon
  2015-07-13 10:47  4%   ` Mcnamara, John
  1 sibling, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-13 10:46 UTC (permalink / raw)
  To: Chao Zhu; +Cc: dev

2015-07-13 06:42, Neil Horman:
> On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
> > Fix for ABI breakage introduced in LRO addition. Moves
> > lro bitfield to the end of the struct/member.
> > 
> > Fixes: 8eecb3295aed (ixgbe: add LRO support)
> > 
> > Signed-off-by: John McNamara <john.mcnamara@intel.com>
> > ---
> >  lib/librte_ether/rte_ethdev.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > index 79bde89..1c3ace1 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
> >  	uint8_t port_id;           /**< Device [external] port identifier. */
> >  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
> >  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
> > -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
> >  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
> > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> >  };
> >  
> >  /**
> I presume the ABI checker stopped complaining about this with the patch, yes?
> 
> Also, it would be great if someone could check this on ppc or a ppc cross
> compile, as I recall bitfields follow endianness order.

+ Chao, IBM POWER maintainer.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 10:47  4%   ` Mcnamara, John
@ 2015-07-13 13:59  4%     ` Neil Horman
  2015-07-17 11:45  7%       ` Mcnamara, John
  2015-07-31  9:03  7%       ` Mcnamara, John
  0 siblings, 2 replies; 200+ results
From: Neil Horman @ 2015-07-13 13:59 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

On Mon, Jul 13, 2015 at 10:47:03AM +0000, Mcnamara, John wrote:
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Monday, July 13, 2015 11:42 AM
> > To: Mcnamara, John
> > Cc: dev@dpdk.org; vladz@cloudius-systems.com
> > Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> > 
> > On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
> > > Fix for ABI breakage introduced in LRO addition. Moves lro bitfield to
> > > the end of the struct/member.
> > >
> > > Fixes: 8eecb3295aed (ixgbe: add LRO support)
> > >
> > > Signed-off-by: John McNamara <john.mcnamara@intel.com>
> > > ---
> > >  lib/librte_ether/rte_ethdev.h | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.h
> > > b/lib/librte_ether/rte_ethdev.h index 79bde89..1c3ace1 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
> > >  	uint8_t port_id;           /**< Device [external] port identifier.
> > */
> > >  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0).
> > */
> > >  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) /
> > OFF(0) */
> > > -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
> > >  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0).
> > */
> > > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0).
> > */
> > > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0).
> > */
> > > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> > >  };
> > >
> > >  /**
> > > --
> > > 1.8.1.4
> > >
> > >
> > I presume the ABI checker stopped complaining about this with the patch,
> > yes?
> 
> Hi Neil,
> 
> Yes, I replied about that in the previous thread.
> 
Thank you, I'll ack as soon as Chao confirms it's not a problem on ppc
Neil

> John.
> -- 
> 
> 

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v5 0/9] Expose IXGBE extended stats to DPDK apps
@ 2015-07-13 14:17  3% Maryam Tahhan
  2015-07-13 14:17  9% ` [dpdk-dev] [PATCH v5 4/9] ethdev: remove HW specific stats in stats structs Maryam Tahhan
  0 siblings, 1 reply; 200+ results
From: Maryam Tahhan @ 2015-07-13 14:17 UTC (permalink / raw)
  To: dev

This patch set implements xstats_get() and xstats_reset() in dev_ops for
ixgbe to expose detailed error statistics to DPDK applications. The
dump_cfg application was extended to demonstrate the usage of
retrieving statistics for DPDK interfaces and renamed to proc_info
in order to reflect this new functionality. This patch set also removes
non-generic statistics from the statistics strings at the ethdev level and
marks the relevant fields as deprecated in struct rte_eth_stats.
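
As a rough usage sketch (the entry layout with a name string plus a 64-bit
value and the call signatures are assumed from the 2.1-era rte_ethdev.h),
an application could print the extended counters like this:

#include <stdio.h>
#include <inttypes.h>
#include <rte_ethdev.h>

/* Hedged sketch: dump the extended statistics of one port. */
static void
print_port_xstats(uint8_t port_id)
{
	struct rte_eth_xstats xstats[256]; /* arbitrary bound for the sketch */
	int i, n;

	n = rte_eth_xstats_get(port_id, xstats, 256);
	for (i = 0; i < n && i < 256; i++)
		printf("%s: %" PRIu64 "\n", xstats[i].name, xstats[i].value);

	rte_eth_xstats_reset(port_id); /* start counting from zero again */
}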

v2:
 - Fixed patch dependencies.
 - Broke down patches into smaller logical changes.

v3:
 - Removes non-generic stats fields in rte_stats_strings and deprecates
   the fields related to them in struct rte_eth_stats.
 - Modifies rte_eth_xstats_get() to return generic stats and extended stats.
 
v4:
 - Replace count use in the loop in ixgbe_dev_xstats_get() function definition with i.
 - Break down "ixgbe: add NIC specific stats removed from ethdev" into two patches, one
   that adds the stats and another that extends ierrors to include more error stats.
 - Remove second call to ixgbe_dev_xstats_get() from rte_eth_xstats_get().

v5:
 - Added documentation for proc_info.
 - Fixed proc_info copyright year.
 - Display queue stats for all devices in proc_info.

Maryam Tahhan (9):
  ixgbe: move stats register reads to a new function
  ixgbe: add functions to get and reset xstats
  ethdev: expose extended error stats
  ethdev: remove HW specific stats in stats structs
  ixgbe: add NIC specific stats removed from ethdev
  ixgbe: return more errors in ierrors
  app: remove dump_cfg
  app: add a new app proc_info
  doc: Add documentation for proc_info

 MAINTAINERS                            |   4 +
 app/Makefile                           |   2 +-
 app/dump_cfg/Makefile                  |  45 -----
 app/dump_cfg/main.c                    |  92 ---------
 app/proc_info/Makefile                 |  45 +++++
 app/proc_info/main.c                   | 354 +++++++++++++++++++++++++++++++++
 doc/guides/rel_notes/abi.rst           |  12 ++
 doc/guides/sample_app_ug/index.rst     |   1 +
 doc/guides/sample_app_ug/proc_info.rst |  71 +++++++
 drivers/net/ixgbe/ixgbe_ethdev.c       | 194 ++++++++++++++----
 lib/librte_ether/rte_ethdev.c          |  34 ++--
 lib/librte_ether/rte_ethdev.h          |  30 ++-
 mk/rte.sdktest.mk                      |   4 +-
 13 files changed, 682 insertions(+), 206 deletions(-)
 delete mode 100644 app/dump_cfg/Makefile
 delete mode 100644 app/dump_cfg/main.c
 create mode 100644 app/proc_info/Makefile
 create mode 100644 app/proc_info/main.c
 create mode 100644 doc/guides/sample_app_ug/proc_info.rst

-- 
2.4.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v5 4/9] ethdev: remove HW specific stats in stats structs
  2015-07-13 14:17  3% [dpdk-dev] [PATCH v5 0/9] Expose IXGBE extended stats to DPDK apps Maryam Tahhan
@ 2015-07-13 14:17  9% ` Maryam Tahhan
  0 siblings, 0 replies; 200+ results
From: Maryam Tahhan @ 2015-07-13 14:17 UTC (permalink / raw)
  To: dev

Remove non generic stats in rte_stats_strings and mark the relevant
fields in struct rte_eth_stats as deprecated.

Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
---
 doc/guides/rel_notes/abi.rst  | 12 ++++++++++++
 lib/librte_ether/rte_ethdev.c |  9 ---------
 lib/librte_ether/rte_ethdev.h | 30 ++++++++++++++++++++----------
 3 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 931e785..d5bf625 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -24,3 +24,15 @@ Deprecation Notices
 
 * The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
   deprecated and will be removed with version 2.2.
+
+* The following fields have been deprecated in rte_eth_stats:
+  * uint64_t imissed
+  * uint64_t ibadcrc
+  * uint64_t ibadlen
+  * uint64_t imcasts
+  * uint64_t fdirmatch
+  * uint64_t fdirmiss
+  * uint64_t tx_pause_xon
+  * uint64_t rx_pause_xon
+  * uint64_t tx_pause_xoff
+  * uint64_t rx_pause_xoff
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 2e62f43..976ce5f 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -142,17 +142,8 @@ static const struct rte_eth_xstats_name_off rte_stats_strings[] = {
 	{"rx_bytes", offsetof(struct rte_eth_stats, ibytes)},
 	{"tx_bytes", offsetof(struct rte_eth_stats, obytes)},
 	{"tx_errors", offsetof(struct rte_eth_stats, oerrors)},
-	{"rx_missed_errors", offsetof(struct rte_eth_stats, imissed)},
-	{"rx_crc_errors", offsetof(struct rte_eth_stats, ibadcrc)},
-	{"rx_bad_length_errors", offsetof(struct rte_eth_stats, ibadlen)},
 	{"rx_errors", offsetof(struct rte_eth_stats, ierrors)},
 	{"alloc_rx_buff_failed", offsetof(struct rte_eth_stats, rx_nombuf)},
-	{"fdir_match", offsetof(struct rte_eth_stats, fdirmatch)},
-	{"fdir_miss", offsetof(struct rte_eth_stats, fdirmiss)},
-	{"tx_flow_control_xon", offsetof(struct rte_eth_stats, tx_pause_xon)},
-	{"rx_flow_control_xon", offsetof(struct rte_eth_stats, rx_pause_xon)},
-	{"tx_flow_control_xoff", offsetof(struct rte_eth_stats, tx_pause_xoff)},
-	{"rx_flow_control_xoff", offsetof(struct rte_eth_stats, rx_pause_xoff)},
 };
 #define RTE_NB_STATS (sizeof(rte_stats_strings) / sizeof(rte_stats_strings[0]))
 
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index d76bbb3..a862027 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -193,19 +193,29 @@ struct rte_eth_stats {
 	uint64_t opackets;  /**< Total number of successfully transmitted packets.*/
 	uint64_t ibytes;    /**< Total number of successfully received bytes. */
 	uint64_t obytes;    /**< Total number of successfully transmitted bytes. */
-	uint64_t imissed;   /**< Total of RX missed packets (e.g full FIFO). */
-	uint64_t ibadcrc;   /**< Total of RX packets with CRC error. */
-	uint64_t ibadlen;   /**< Total of RX packets with bad length. */
+	/**< Deprecated; Total of RX missed packets (e.g full FIFO). */
+	uint64_t imissed;
+	/**< Deprecated; Total of RX packets with CRC error. */
+	uint64_t ibadcrc;
+	/**< Deprecated; Total of RX packets with bad length. */
+	uint64_t ibadlen;
 	uint64_t ierrors;   /**< Total number of erroneous received packets. */
 	uint64_t oerrors;   /**< Total number of failed transmitted packets. */
-	uint64_t imcasts;   /**< Total number of multicast received packets. */
+	uint64_t imcasts;
+	/**< Deprecated; Total number of multicast received packets. */
 	uint64_t rx_nombuf; /**< Total number of RX mbuf allocation failures. */
-	uint64_t fdirmatch; /**< Total number of RX packets matching a filter. */
-	uint64_t fdirmiss;  /**< Total number of RX packets not matching any filter. */
-	uint64_t tx_pause_xon;  /**< Total nb. of XON pause frame sent. */
-	uint64_t rx_pause_xon;  /**< Total nb. of XON pause frame received. */
-	uint64_t tx_pause_xoff; /**< Total nb. of XOFF pause frame sent. */
-	uint64_t rx_pause_xoff; /**< Total nb. of XOFF pause frame received. */
+	uint64_t fdirmatch;
+	/**< Deprecated; Total number of RX packets matching a filter. */
+	uint64_t fdirmiss;
+	/**< Deprecated; Total number of RX packets not matching any filter. */
+	uint64_t tx_pause_xon;
+	 /**< Deprecated; Total nb. of XON pause frame sent. */
+	uint64_t rx_pause_xon;
+	/**< Deprecated; Total nb. of XON pause frame received. */
+	uint64_t tx_pause_xoff;
+	/**< Deprecated; Total nb. of XOFF pause frame sent. */
+	uint64_t rx_pause_xoff;
+	/**< Deprecated; Total nb. of XOFF pause frame received. */
 	uint64_t q_ipackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
 	/**< Total number of queue RX packets. */
 	uint64_t q_opackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
-- 
2.4.3
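
As a hedged aside on usage after this change: the generic counters remain
available through the basic stats call, while anything removed here is
expected to come back through xstats. A minimal sketch, assuming the
existing rte_eth_stats_get() API:

#include <stdio.h>
#include <inttypes.h>
#include <rte_ethdev.h>

/* Sketch only: read the generic counters; HW-specific ones (fdir matches,
 * pause frames, ...) should be fetched with rte_eth_xstats_get() instead. */
static void
print_basic_stats(uint8_t port_id)
{
	struct rte_eth_stats stats;

	rte_eth_stats_get(port_id, &stats);
	printf("rx=%" PRIu64 " tx=%" PRIu64 " rx_errors=%" PRIu64 "\n",
	       stats.ipackets, stats.opackets, stats.ierrors);
}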

^ permalink raw reply	[relevance 9%]

* Re: [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf Helin Zhang
@ 2015-07-13 15:53  0%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-13 15:53 UTC (permalink / raw)
  To: Helin Zhang; +Cc: dev

2015-07-10 00:31, Helin Zhang:
> To avoid breaking ABI compatibility, all the changes would be enabled by RTE_NEXT_ABI,
> which is disabled by default.

It is enabled by default.
This comment will be removed from all patches of the series.
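
For readers not following the NEXT_ABI mechanism, the pattern under
discussion is roughly the compile-time guard below; the structure is
invented for illustration and is not the actual mbuf change:

#include <stdint.h>

/* Sketch: the new layout is only compiled in when the ABI preview is
 * enabled (CONFIG_RTE_NEXT_ABI), otherwise the old, stable layout stays. */
struct example_meta {
#ifdef RTE_NEXT_ABI
	uint32_t packet_type; /* extended field, next-ABI preview */
#else
	uint16_t packet_type; /* current stable ABI */
	uint16_t reserved;
#endif
};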

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation
  2015-07-12 22:29  3%             ` Thomas Monjalon
@ 2015-07-13 16:11  3%               ` Bruce Richardson
  2015-07-13 16:14  0%                 ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2015-07-13 16:11 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, Jul 13, 2015 at 12:29:53AM +0200, Thomas Monjalon wrote:
> 2015-07-11 01:18, Pablo de Lara:
> > The main change when creating a new table is that the number of entries
> > per bucket is fixed now, so its parameter is ignored now
> > (still there to maintain the same parameters structure).
> 
> Why not rename the "bucket_entries" field to "reserved"?
> The API of this field has changed (now ignored) so it should be reflected
> without changing the ABI.

Since the hash_create function is itself already versioned to take account of the
new struct parameter, there is no reason to keep the field at all, as far as I can see.
We can just drop it, and let the ABI versioning handle the change.

/Bruce
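
For context, the versioning referred to is the rte_compat.h mechanism; very
roughly, and with the names and version tags below being an assumption for
illustration rather than a quote of the hash code, it looks like:

#include <rte_compat.h>

/* Heavily hedged sketch of per-symbol versioning: existing binaries keep
 * resolving the symbol to the old implementation while new builds bind to
 * the new one by default. */
struct rte_hash *rte_hash_create_v20(const struct rte_hash_parameters *p);
struct rte_hash *rte_hash_create_v21(const struct rte_hash_parameters *p);

VERSION_SYMBOL(rte_hash_create, _v20, 2.0);
BIND_DEFAULT_SYMBOL(rte_hash_create, _v21, 2.1);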

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation
  2015-07-13 16:11  3%               ` Bruce Richardson
@ 2015-07-13 16:14  0%                 ` Bruce Richardson
  2015-07-13 16:20  0%                   ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2015-07-13 16:14 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, Jul 13, 2015 at 05:11:54PM +0100, Bruce Richardson wrote:
> On Mon, Jul 13, 2015 at 12:29:53AM +0200, Thomas Monjalon wrote:
> > 2015-07-11 01:18, Pablo de Lara:
> > > The main change when creating a new table is that the number of entries
> > > per bucket is fixed now, so its parameter is ignored now
> > > (still there to maintain the same parameters structure).
> > 
> > Why not rename the "bucket_entries" field to "reserved"?
> > The API of this field has changed (now ignored) so it should be reflected
> > without changing the ABI.
> 
> Since the hash_create function is itself already versioned to take account of the
> new struct parameter, there is no reason to keep the field at all, as far as I can see.
> We can just drop it, and let the ABI versioning handle the change.
> 
> /Bruce

Sorry, my mistake. It's no longer versioned in the patchset that was merged, so
the field does need to be kept. :-(

/Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation
  2015-07-13 16:14  0%                 ` Bruce Richardson
@ 2015-07-13 16:20  0%                   ` Thomas Monjalon
  2015-07-13 16:26  0%                     ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-13 16:20 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

2015-07-13 17:14, Bruce Richardson:
> On Mon, Jul 13, 2015 at 05:11:54PM +0100, Bruce Richardson wrote:
> > On Mon, Jul 13, 2015 at 12:29:53AM +0200, Thomas Monjalon wrote:
> > > 2015-07-11 01:18, Pablo de Lara:
> > > > The main change when creating a new table is that the number of entries
> > > > per bucket is fixed now, so its parameter is ignored now
> > > > (still there to maintain the same parameters structure).
> > > 
> > > Why not rename the "bucket_entries" field to "reserved"?
> > > The API of this field has changed (now ignored) so it should be reflected
> > > without changing the ABI.
> > 
> > Since the hash_create function is itself already versioned to take account of the
> > new struct parameter, there is no reason to keep the field at all, as far as I can see.
> > We can just drop it, and let the ABI versioning handle the change.
> > 
> > /Bruce
> 
> Sorry, my mistake. It's no longer versioned in the patchset that was merged, so
> the field does need to be kept. :-(

So do you agree to submit a patch which renames the unused field?

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] hash: rename unused field to "reserved"
@ 2015-07-13 16:25  3% Bruce Richardson
  2015-07-13 16:28  0% ` Bruce Richardson
  2015-07-13 16:38  3% ` [dpdk-dev] [PATCH v2] " Bruce Richardson
  0 siblings, 2 replies; 200+ results
From: Bruce Richardson @ 2015-07-13 16:25 UTC (permalink / raw)
  To: dev

The cuckoo hash has a fixed number of entries per bucket, so the
configuration parameter for this is unused. We change this field in the
parameters struct to "reserved" to indicate that there is now no such
parameter value, while at the same time keeping ABI consistency.

Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")

Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_hash/rte_hash.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index 68109d5..1cddc07 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -75,7 +75,7 @@ typedef uint32_t (*rte_hash_function)(const void *key, uint32_t key_len,
 struct rte_hash_parameters {
 	const char *name;		/**< Name of the hash. */
 	uint32_t entries;		/**< Total hash table entries. */
-	uint32_t bucket_entries;        /**< Bucket entries. */
+	uint32_t reserved;		/**< Unused field. Should be set to 0 */
 	uint32_t key_len;		/**< Length of hash key. */
 	rte_hash_function hash_func;	/**< Primary Hash function used to calculate hash. */
 	uint32_t hash_func_init_val;	/**< Init value used by hash_func. */
-- 
2.4.3

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation
  2015-07-13 16:20  0%                   ` Thomas Monjalon
@ 2015-07-13 16:26  0%                     ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2015-07-13 16:26 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, Jul 13, 2015 at 06:20:08PM +0200, Thomas Monjalon wrote:
> 2015-07-13 17:14, Bruce Richardson:
> > On Mon, Jul 13, 2015 at 05:11:54PM +0100, Bruce Richardson wrote:
> > > On Mon, Jul 13, 2015 at 12:29:53AM +0200, Thomas Monjalon wrote:
> > > > 2015-07-11 01:18, Pablo de Lara:
> > > > > The main change when creating a new table is that the number of entries
> > > > > per bucket is fixed now, so its parameter is ignored now
> > > > > (still there to maintain the same parameters structure).
> > > > 
> > > > Why not rename the "bucket_entries" field to "reserved"?
> > > > The API of this field has changed (now ignored) so it should be reflected
> > > > without changing the ABI.
> > > 
> > > Since the hash_create function is itself already versioned to take account of the
> > > new struct parameter, there is no reason to keep the field at all, as far as I can see.
> > > We can just drop it, and let the ABI versioning handle the change.
> > > 
> > > /Bruce
> > 
> > Sorry, my mistake. It's no longer versioned in the patchset that was merged, so
> > the field does need to be kept. :-(
> 
> So do you agree to submit a patch which renames the unused field?

Yes. It should be in your inbox now... :-)

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] hash: rename unused field to "reserved"
  2015-07-13 16:25  3% [dpdk-dev] [PATCH] hash: rename unused field to "reserved" Bruce Richardson
@ 2015-07-13 16:28  0% ` Bruce Richardson
  2015-07-13 16:38  3% ` [dpdk-dev] [PATCH v2] " Bruce Richardson
  1 sibling, 0 replies; 200+ results
From: Bruce Richardson @ 2015-07-13 16:28 UTC (permalink / raw)
  To: dev

On Mon, Jul 13, 2015 at 05:25:51PM +0100, Bruce Richardson wrote:
> The cuckoo hash has a fixed number of entries per bucket, so the
> configuration parameter for this is unused. We change this field in the
> parameters struct to "reserved" to indicate that there is now no such
> parameter value, while at the same time keeping ABI consistency.
> 
> Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")
> 
> Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Self-NAK. 

Missed some extra code dependencies using this field. V2 to follow.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v2] hash: rename unused field to "reserved"
  2015-07-13 16:25  3% [dpdk-dev] [PATCH] hash: rename unused field to "reserved" Bruce Richardson
  2015-07-13 16:28  0% ` Bruce Richardson
@ 2015-07-13 16:38  3% ` Bruce Richardson
  2015-07-13 17:29  0%   ` Thomas Monjalon
  2015-07-15  8:08  3%   ` Olga Shern
  1 sibling, 2 replies; 200+ results
From: Bruce Richardson @ 2015-07-13 16:38 UTC (permalink / raw)
  To: dev

The cuckoo hash has a fixed number of entries per bucket, so the
configuration parameter for this is unused. We change this field in the
parameters struct to "reserved" to indicate that there is now no such
parameter value, while at the same time keeping ABI consistency.

Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")

Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/test_func_reentrancy.c | 1 -
 app/test/test_hash_perf.c       | 1 -
 app/test/test_hash_scaling.c    | 1 -
 drivers/net/enic/enic_clsf.c    | 2 --
 examples/l3fwd-power/main.c     | 2 --
 examples/l3fwd-vf/main.c        | 1 -
 examples/l3fwd/main.c           | 2 --
 lib/librte_hash/rte_hash.h      | 2 +-
 8 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/app/test/test_func_reentrancy.c b/app/test/test_func_reentrancy.c
index 85504c0..be61773 100644
--- a/app/test/test_func_reentrancy.c
+++ b/app/test/test_func_reentrancy.c
@@ -226,7 +226,6 @@ hash_create_free(__attribute__((unused)) void *arg)
 	struct rte_hash_parameters hash_params = {
 		.name = NULL,
 		.entries = 16,
-		.bucket_entries = 4,
 		.key_len = 4,
 		.hash_func = (rte_hash_function)rte_jhash_32b,
 		.hash_func_init_val = 0,
diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
index e9a522b..a87fc80 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -100,7 +100,6 @@ int32_t positions[KEYS_TO_ADD];
 /* Parameters used for hash table in unit test functions. */
 static struct rte_hash_parameters ut_params = {
 	.entries = MAX_ENTRIES,
-	.bucket_entries = BUCKET_SIZE,
 	.hash_func = rte_jhash,
 	.hash_func_init_val = 0,
 };
diff --git a/app/test/test_hash_scaling.c b/app/test/test_hash_scaling.c
index 682ae94..39602cb 100644
--- a/app/test/test_hash_scaling.c
+++ b/app/test/test_hash_scaling.c
@@ -129,7 +129,6 @@ test_hash_scaling(int locking_mode)
 	uint64_t i, key;
 	struct rte_hash_parameters hash_params = {
 		.entries = num_iterations*2,
-		.bucket_entries = 16,
 		.key_len = sizeof(key),
 		.hash_func = rte_hash_crc,
 		.hash_func_init_val = 0,
diff --git a/drivers/net/enic/enic_clsf.c b/drivers/net/enic/enic_clsf.c
index ca12d2d..9c2abfb 100644
--- a/drivers/net/enic/enic_clsf.c
+++ b/drivers/net/enic/enic_clsf.c
@@ -63,7 +63,6 @@
 
 #define SOCKET_0                0
 #define ENICPMD_CLSF_HASH_ENTRIES       ENICPMD_FDIR_MAX
-#define ENICPMD_CLSF_BUCKET_ENTRIES     4
 
 void enic_fdir_stats_get(struct enic *enic, struct rte_eth_fdir_stats *stats)
 {
@@ -245,7 +244,6 @@ int enic_clsf_init(struct enic *enic)
 	struct rte_hash_parameters hash_params = {
 		.name = "enicpmd_clsf_hash",
 		.entries = ENICPMD_CLSF_HASH_ENTRIES,
-		.bucket_entries = ENICPMD_CLSF_BUCKET_ENTRIES,
 		.key_len = RTE_HASH_KEY_LENGTH_MAX,
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index d4eba1a..6eb459d 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -1247,7 +1247,6 @@ setup_hash(int socketid)
 	struct rte_hash_parameters ipv4_l3fwd_hash_params = {
 		.name = NULL,
 		.entries = L3FWD_HASH_ENTRIES,
-		.bucket_entries = 4,
 		.key_len = sizeof(struct ipv4_5tuple),
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
@@ -1256,7 +1255,6 @@ setup_hash(int socketid)
 	struct rte_hash_parameters ipv6_l3fwd_hash_params = {
 		.name = NULL,
 		.entries = L3FWD_HASH_ENTRIES,
-		.bucket_entries = 4,
 		.key_len = sizeof(struct ipv6_5tuple),
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
diff --git a/examples/l3fwd-vf/main.c b/examples/l3fwd-vf/main.c
index ccbb02f..01f610e 100644
--- a/examples/l3fwd-vf/main.c
+++ b/examples/l3fwd-vf/main.c
@@ -251,7 +251,6 @@ static lookup_struct_t *l3fwd_lookup_struct[NB_SOCKETS];
 struct rte_hash_parameters l3fwd_hash_params = {
 	.name = "l3fwd_hash_0",
 	.entries = L3FWD_HASH_ENTRIES,
-	.bucket_entries = 4,
 	.key_len = sizeof(struct ipv4_5tuple),
 	.hash_func = DEFAULT_HASH_FUNC,
 	.hash_func_init_val = 0,
diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 5c22ed1..def9594 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -2162,7 +2162,6 @@ setup_hash(int socketid)
     struct rte_hash_parameters ipv4_l3fwd_hash_params = {
         .name = NULL,
         .entries = L3FWD_HASH_ENTRIES,
-        .bucket_entries = 4,
         .key_len = sizeof(union ipv4_5tuple_host),
         .hash_func = ipv4_hash_crc,
         .hash_func_init_val = 0,
@@ -2171,7 +2170,6 @@ setup_hash(int socketid)
     struct rte_hash_parameters ipv6_l3fwd_hash_params = {
         .name = NULL,
         .entries = L3FWD_HASH_ENTRIES,
-        .bucket_entries = 4,
         .key_len = sizeof(union ipv6_5tuple_host),
         .hash_func = ipv6_hash_crc,
         .hash_func_init_val = 0,
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index 68109d5..1cddc07 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -75,7 +75,7 @@ typedef uint32_t (*rte_hash_function)(const void *key, uint32_t key_len,
 struct rte_hash_parameters {
 	const char *name;		/**< Name of the hash. */
 	uint32_t entries;		/**< Total hash table entries. */
-	uint32_t bucket_entries;        /**< Bucket entries. */
+	uint32_t reserved;		/**< Unused field. Should be set to 0 */
 	uint32_t key_len;		/**< Length of hash key. */
 	rte_hash_function hash_func;	/**< Primary Hash function used to calculate hash. */
 	uint32_t hash_func_init_val;	/**< Init value used by hash_func. */
-- 
2.4.3
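
A quick, hedged example of what table creation looks like with the renamed
field (values and hash function are arbitrary; the reserved slot is simply
left at zero by the designated initializer):

#include <stdint.h>
#include <rte_hash.h>
#include <rte_jhash.h>

/* Sketch: bucket_entries is gone; the slot it occupied is now the ignored
 * "reserved" field and stays zero. */
static struct rte_hash *
create_example_table(void)
{
	struct rte_hash_parameters params = {
		.name = "example_hash",
		.entries = 1024,
		.key_len = sizeof(uint32_t),
		.hash_func = rte_jhash,
		.hash_func_init_val = 0,
		.socket_id = 0,
	};

	return rte_hash_create(&params);
}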

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2] hash: rename unused field to "reserved"
  2015-07-13 16:38  3% ` [dpdk-dev] [PATCH v2] " Bruce Richardson
@ 2015-07-13 17:29  0%   ` Thomas Monjalon
  2015-07-15  8:08  3%   ` Olga Shern
  1 sibling, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-13 17:29 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

2015-07-13 17:38, Bruce Richardson:
> The cuckoo hash has a fixed number of entries per bucket, so the
> configuration parameter for this is unused. We change this field in the
> parameters struct to "reserved" to indicate that there is now no such
> parameter value, while at the same time keeping ABI consistency.
> 
> Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")
> 
> Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

Applied, thanks

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v8 0/9] Dynamic memzones
  @ 2015-07-14  8:57  4% ` Sergio Gonzalez Monroy
  2015-07-14  8:57  1%   ` [dpdk-dev] [PATCH v8 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
                     ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-14  8:57 UTC (permalink / raw)
  To: dev

Current implementation allows reserving/creating memzones but not the opposite
(unreserve/free). This affects mempools and other memzone based objects.

From my point of view, implementing free functionality for memzones would look
like malloc over memsegs.
Thus, this approach moves malloc inside eal (which in turn removes a circular
dependency), where malloc heaps are composed of memsegs.
We keep both malloc and memzone APIs as they are, but memzones allocate its
memory by calling malloc_heap_alloc.
Some extra functionality is required in malloc to allow for boundary constrained
memory requests.
In summary, currently malloc is based on memzones, and with this approach
memzones are based on malloc.
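
A hedged sketch of what this enables from an application's point of view
(error handling trimmed; rte_memzone_free() is the call added by this
series):

#include <rte_memory.h>
#include <rte_memzone.h>

/* Sketch: reserve a zone, use it, then hand the memory back to the malloc
 * heap through the new rte_memzone_free(). */
static int
memzone_roundtrip(void)
{
	const struct rte_memzone *mz;

	mz = rte_memzone_reserve("scratch_zone", 1 << 20, SOCKET_ID_ANY, 0);
	if (mz == NULL)
		return -1;

	/* ... use mz->addr (virtual) and mz->phys_addr ... */

	return rte_memzone_free(mz);
}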

v8:
 - Rebase against current HEAD to factor for changes made by new Tile-Gx arch

v7:
 - Create a separate maintainer section for memory allocation

v6:
 - Fix bad patch for rte_memzone_free

v5:
 - Fix rte_memzone_free
 - Improve rte_memzone_free unit test

v4:
 - Rebase and fix couple of merge issues

v3:
 - Create dummy librte_malloc
 - Add deprecation notice
 - Rework some of the code
 - Doc update
 - checkpatch

v2:
 - New rte_memzone_free
 - Support memzone len = 0
 - Add all available memsegs to malloc heap at init
 - Update memzone/malloc unit tests


v6 Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>


Sergio Gonzalez Monroy (9):
  eal: move librte_malloc to eal/common
  eal: memzone allocated by malloc
  app/test: update malloc/memzone unit tests
  config: remove CONFIG_RTE_MALLOC_MEMZONE_SIZE
  eal: remove free_memseg and references to it
  eal: new rte_memzone_free
  app/test: rte_memzone_free unit test
  doc: announce ABI change of librte_malloc
  doc: update malloc documentation

 MAINTAINERS                                       |  22 +-
 app/test/test_malloc.c                            |  86 ----
 app/test/test_memzone.c                           | 456 ++++------------------
 config/common_bsdapp                              |   8 +-
 config/common_linuxapp                            |   8 +-
 doc/guides/prog_guide/env_abstraction_layer.rst   | 220 ++++++++++-
 doc/guides/prog_guide/img/malloc_heap.png         | Bin 81329 -> 80952 bytes
 doc/guides/prog_guide/index.rst                   |   1 -
 doc/guides/prog_guide/malloc_lib.rst              | 233 -----------
 doc/guides/prog_guide/overview.rst                |  11 +-
 doc/guides/rel_notes/abi.rst                      |   6 +-
 drivers/net/af_packet/Makefile                    |   1 -
 drivers/net/bonding/Makefile                      |   1 -
 drivers/net/e1000/Makefile                        |   2 +-
 drivers/net/enic/Makefile                         |   2 +-
 drivers/net/fm10k/Makefile                        |   2 +-
 drivers/net/i40e/Makefile                         |   2 +-
 drivers/net/ixgbe/Makefile                        |   2 +-
 drivers/net/mlx4/Makefile                         |   1 -
 drivers/net/null/Makefile                         |   1 -
 drivers/net/pcap/Makefile                         |   1 -
 drivers/net/virtio/Makefile                       |   2 +-
 drivers/net/vmxnet3/Makefile                      |   2 +-
 drivers/net/xenvirt/Makefile                      |   2 +-
 lib/Makefile                                      |   2 +-
 lib/librte_acl/Makefile                           |   2 +-
 lib/librte_eal/bsdapp/eal/Makefile                |   4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map     |  19 +
 lib/librte_eal/common/Makefile                    |   1 +
 lib/librte_eal/common/eal_common_memzone.c        | 353 +++++++----------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   5 +-
 lib/librte_eal/common/include/rte_malloc.h        | 342 ++++++++++++++++
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/include/rte_memzone.h       |  11 +
 lib/librte_eal/common/malloc_elem.c               | 344 ++++++++++++++++
 lib/librte_eal/common/malloc_elem.h               | 192 +++++++++
 lib/librte_eal/common/malloc_heap.c               | 227 +++++++++++
 lib/librte_eal/common/malloc_heap.h               |  70 ++++
 lib/librte_eal/common/rte_malloc.c                | 259 ++++++++++++
 lib/librte_eal/linuxapp/eal/Makefile              |   4 +-
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c         |  17 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  19 +
 lib/librte_hash/Makefile                          |   2 +-
 lib/librte_lpm/Makefile                           |   2 +-
 lib/librte_malloc/Makefile                        |   6 +-
 lib/librte_malloc/malloc_elem.c                   | 320 ---------------
 lib/librte_malloc/malloc_elem.h                   | 190 ---------
 lib/librte_malloc/malloc_heap.c                   | 208 ----------
 lib/librte_malloc/malloc_heap.h                   |  70 ----
 lib/librte_malloc/rte_malloc.c                    | 228 +----------
 lib/librte_malloc/rte_malloc.h                    | 342 ----------------
 lib/librte_malloc/rte_malloc_version.map          |  16 -
 lib/librte_mempool/Makefile                       |   2 -
 lib/librte_port/Makefile                          |   1 -
 lib/librte_ring/Makefile                          |   3 +-
 lib/librte_table/Makefile                         |   1 -
 56 files changed, 1965 insertions(+), 2372 deletions(-)
 delete mode 100644 doc/guides/prog_guide/malloc_lib.rst
 create mode 100644 lib/librte_eal/common/include/rte_malloc.h
 create mode 100644 lib/librte_eal/common/malloc_elem.c
 create mode 100644 lib/librte_eal/common/malloc_elem.h
 create mode 100644 lib/librte_eal/common/malloc_heap.c
 create mode 100644 lib/librte_eal/common/malloc_heap.h
 create mode 100644 lib/librte_eal/common/rte_malloc.c
 delete mode 100644 lib/librte_malloc/malloc_elem.c
 delete mode 100644 lib/librte_malloc/malloc_elem.h
 delete mode 100644 lib/librte_malloc/malloc_heap.c
 delete mode 100644 lib/librte_malloc/malloc_heap.h
 delete mode 100644 lib/librte_malloc/rte_malloc.h

-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v8 8/9] doc: announce ABI change of librte_malloc
  2015-07-14  8:57  4% ` [dpdk-dev] [PATCH v8 " Sergio Gonzalez Monroy
  2015-07-14  8:57  1%   ` [dpdk-dev] [PATCH v8 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
@ 2015-07-14  8:57 19%   ` Sergio Gonzalez Monroy
  2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2 siblings, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-14  8:57 UTC (permalink / raw)
  To: dev

Announce the creation of dummy malloc library for 2.1 and removal of
such library, now integrated in librte_eal, for 2.2 release.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 doc/guides/rel_notes/abi.rst | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 931e785..76e0ae2 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -16,7 +16,6 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
-
 * Significant ABI changes are planned for struct rte_eth_dev to support up to
   1024 queues per port. This change will be in release 2.2.
   There is no backward compatibility planned from release 2.2.
@@ -24,3 +23,8 @@ Deprecation Notices
 
 * The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
   deprecated and will be removed with version 2.2.
+
+* librte_malloc library has been integrated into librte_eal. The 2.1 release
+  creates a dummy/empty malloc library to fulfill binaries with dynamic linking
+  dependencies on librte_malloc.so. Such dummy library will not be created from
+  release 2.2 so binaries will need to be rebuilt.
-- 
1.9.3

^ permalink raw reply	[relevance 19%]

* [dpdk-dev] [PATCH v8 2/9] eal: memzone allocated by malloc
  2015-07-14  8:57  4% ` [dpdk-dev] [PATCH v8 " Sergio Gonzalez Monroy
@ 2015-07-14  8:57  1%   ` Sergio Gonzalez Monroy
  2015-07-14  8:57 19%   ` [dpdk-dev] [PATCH v8 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
  2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2 siblings, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-14  8:57 UTC (permalink / raw)
  To: dev

In the current memory hierarchy, memsegs are groups of physically
contiguous hugepages, memzones are slices of memsegs and malloc further
slices memzones into smaller memory chunks.

This patch modifies malloc so it partitions memsegs instead of memzones.
Thus memzones would call malloc internally for memory allocation while
maintaining its ABI.

It would be possible to free memzones and therefore any other structure
based on memzones, ie. mempools

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c        | 289 +++++-----------------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   2 +-
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/malloc_elem.c               |  68 +++--
 lib/librte_eal/common/malloc_elem.h               |  14 +-
 lib/librte_eal/common/malloc_heap.c               | 161 ++++++------
 lib/librte_eal/common/malloc_heap.h               |   6 +-
 lib/librte_eal/common/rte_malloc.c                |   7 +-
 8 files changed, 220 insertions(+), 330 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 9c1da71..fd7e73f 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -50,15 +50,15 @@
 #include <rte_string_fns.h>
 #include <rte_common.h>
 
+#include "malloc_heap.h"
+#include "malloc_elem.h"
 #include "eal_private.h"
 
-/* internal copy of free memory segments */
-static struct rte_memseg *free_memseg = NULL;
-
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
 	const struct rte_mem_config *mcfg;
+	const struct rte_memzone *mz;
 	unsigned i = 0;
 
 	/* get pointer to global configuration */
@@ -68,62 +68,62 @@ memzone_lookup_thread_unsafe(const char *name)
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE && mcfg->memzone[i].addr != NULL; i++) {
-		if (!strncmp(name, mcfg->memzone[i].name, RTE_MEMZONE_NAMESIZE))
+	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
+		mz = &mcfg->memzone[i];
+		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
 			return &mcfg->memzone[i];
 	}
 
 	return NULL;
 }
 
-/*
- * Helper function for memzone_reserve_aligned_thread_unsafe().
- * Calculate address offset from the start of the segment.
- * Align offset in that way that it satisfy istart alignmnet and
- * buffer of the  requested length would not cross specified boundary.
- */
-static inline phys_addr_t
-align_phys_boundary(const struct rte_memseg *ms, size_t len, size_t align,
-	size_t bound)
+/* Find the heap with the greatest free block size */
+static void
+find_heap_max_free_elem(int *s, size_t *len, unsigned align)
 {
-	phys_addr_t addr_offset, bmask, end, start;
-	size_t step;
+	struct rte_mem_config *mcfg;
+	struct rte_malloc_socket_stats stats;
+	unsigned i;
 
-	step = RTE_MAX(align, bound);
-	bmask = ~((phys_addr_t)bound - 1);
+	/* get pointer to global configuration */
+	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* calculate offset to closest alignment */
-	start = RTE_ALIGN_CEIL(ms->phys_addr, align);
-	addr_offset = start - ms->phys_addr;
+	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+		malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+		if (stats.greatest_free_size > *len) {
+			*len = stats.greatest_free_size;
+			*s = i;
+		}
+	}
+	*len -= (MALLOC_ELEM_OVERHEAD + align);
+}
 
-	while (addr_offset + len < ms->len) {
+/* Find a heap that can allocate the requested size */
+static void
+find_heap_suitable(int *s, size_t len, unsigned align)
+{
+	struct rte_mem_config *mcfg;
+	struct rte_malloc_socket_stats stats;
+	unsigned i;
 
-		/* check, do we meet boundary condition */
-		end = start + len - (len != 0);
-		if ((start & bmask) == (end & bmask))
-			break;
+	/* get pointer to global configuration */
+	mcfg = rte_eal_get_configuration()->mem_config;
 
-		/* calculate next offset */
-		start = RTE_ALIGN_CEIL(start + 1, step);
-		addr_offset = start - ms->phys_addr;
+	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+		malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+		if (stats.greatest_free_size >= len + MALLOC_ELEM_OVERHEAD + align) {
+			*s = i;
+			break;
+		}
 	}
-
-	return addr_offset;
 }
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, uint64_t size_mask, unsigned align,
-		unsigned bound)
+		int socket_id, unsigned flags, unsigned align, unsigned bound)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-	int memseg_idx = -1;
-	uint64_t addr_offset, seg_offset = 0;
 	size_t requested_len;
-	size_t memseg_len = 0;
-	phys_addr_t memseg_physaddr;
-	void *memseg_addr;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -155,7 +155,6 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	if (align < RTE_CACHE_LINE_SIZE)
 		align = RTE_CACHE_LINE_SIZE;
 
-
 	/* align length on cache boundary. Check for overflow before doing so */
 	if (len > SIZE_MAX - RTE_CACHE_LINE_MASK) {
 		rte_errno = EINVAL; /* requested size too big */
@@ -169,108 +168,50 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	requested_len = RTE_MAX((size_t)RTE_CACHE_LINE_SIZE,  len);
 
 	/* check that boundary condition is valid */
-	if (bound != 0 &&
-			(requested_len > bound || !rte_is_power_of_2(bound))) {
+	if (bound != 0 && (requested_len > bound || !rte_is_power_of_2(bound))) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
 
-	/* find the smallest segment matching requirements */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		/* last segment */
-		if (free_memseg[i].addr == NULL)
-			break;
+	if (len == 0) {
+		if (bound != 0)
+			requested_len = bound;
+		else
+			requested_len = 0;
+	}
 
-		/* empty segment, skip it */
-		if (free_memseg[i].len == 0)
-			continue;
-
-		/* bad socket ID */
-		if (socket_id != SOCKET_ID_ANY &&
-		    free_memseg[i].socket_id != SOCKET_ID_ANY &&
-		    socket_id != free_memseg[i].socket_id)
-			continue;
-
-		/*
-		 * calculate offset to closest alignment that
-		 * meets boundary conditions.
-		 */
-		addr_offset = align_phys_boundary(free_memseg + i,
-			requested_len, align, bound);
-
-		/* check len */
-		if ((requested_len + addr_offset) > free_memseg[i].len)
-			continue;
-
-		if ((size_mask & free_memseg[i].hugepage_sz) == 0)
-			continue;
-
-		/* this segment is the best until now */
-		if (memseg_idx == -1) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
-		}
-		/* find the biggest contiguous zone */
-		else if (len == 0) {
-			if (free_memseg[i].len > memseg_len) {
-				memseg_idx = i;
-				memseg_len = free_memseg[i].len;
-				seg_offset = addr_offset;
-			}
-		}
-		/*
-		 * find the smallest (we already checked that current
-		 * zone length is > len
-		 */
-		else if (free_memseg[i].len + align < memseg_len ||
-				(free_memseg[i].len <= memseg_len + align &&
-				addr_offset < seg_offset)) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
+	if (socket_id == SOCKET_ID_ANY) {
+		if (requested_len == 0)
+			find_heap_max_free_elem(&socket_id, &requested_len, align);
+		else
+			find_heap_suitable(&socket_id, requested_len, align);
+
+		if (socket_id == SOCKET_ID_ANY) {
+			rte_errno = ENOMEM;
+			return NULL;
 		}
 	}
 
-	/* no segment found */
-	if (memseg_idx == -1) {
+	/* allocate memory on heap */
+	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket_id], NULL,
+			requested_len, flags, align, bound);
+	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
 		return NULL;
 	}
 
-	/* save aligned physical and virtual addresses */
-	memseg_physaddr = free_memseg[memseg_idx].phys_addr + seg_offset;
-	memseg_addr = RTE_PTR_ADD(free_memseg[memseg_idx].addr,
-			(uintptr_t) seg_offset);
-
-	/* if we are looking for a biggest memzone */
-	if (len == 0) {
-		if (bound == 0)
-			requested_len = memseg_len - seg_offset;
-		else
-			requested_len = RTE_ALIGN_CEIL(memseg_physaddr + 1,
-				bound) - memseg_physaddr;
-	}
-
-	/* set length to correct value */
-	len = (size_t)seg_offset + requested_len;
-
-	/* update our internal state */
-	free_memseg[memseg_idx].len -= len;
-	free_memseg[memseg_idx].phys_addr += len;
-	free_memseg[memseg_idx].addr =
-		(char *)free_memseg[memseg_idx].addr + len;
+	const struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
 	struct rte_memzone *mz = &mcfg->memzone[mcfg->memzone_idx++];
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
-	mz->phys_addr = memseg_physaddr;
-	mz->addr = memseg_addr;
-	mz->len = requested_len;
-	mz->hugepage_sz = free_memseg[memseg_idx].hugepage_sz;
-	mz->socket_id = free_memseg[memseg_idx].socket_id;
+	mz->phys_addr = rte_malloc_virt2phy(mz_addr);
+	mz->addr = mz_addr;
+	mz->len = (requested_len == 0 ? elem->size : requested_len);
+	mz->hugepage_sz = elem->ms->hugepage_sz;
+	mz->socket_id = elem->ms->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = memseg_idx;
+	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -282,26 +223,6 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
-	uint64_t size_mask = 0;
-
-	if (flags & RTE_MEMZONE_256KB)
-		size_mask |= RTE_PGSIZE_256K;
-	if (flags & RTE_MEMZONE_2MB)
-		size_mask |= RTE_PGSIZE_2M;
-	if (flags & RTE_MEMZONE_16MB)
-		size_mask |= RTE_PGSIZE_16M;
-	if (flags & RTE_MEMZONE_256MB)
-		size_mask |= RTE_PGSIZE_256M;
-	if (flags & RTE_MEMZONE_512MB)
-		size_mask |= RTE_PGSIZE_512M;
-	if (flags & RTE_MEMZONE_1GB)
-		size_mask |= RTE_PGSIZE_1G;
-	if (flags & RTE_MEMZONE_4GB)
-		size_mask |= RTE_PGSIZE_4G;
-	if (flags & RTE_MEMZONE_16GB)
-		size_mask |= RTE_PGSIZE_16G;
-	if (!size_mask)
-		size_mask = UINT64_MAX;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -309,18 +230,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, size_mask, align, bound);
-
-	/*
-	 * If we failed to allocate the requested page size, and the
-	 * RTE_MEMZONE_SIZE_HINT_ONLY flag is specified, try allocating
-	 * again.
-	 */
-	if (!mz && rte_errno == ENOMEM && size_mask != UINT64_MAX &&
-	    flags & RTE_MEMZONE_SIZE_HINT_ONLY) {
-		mz = memzone_reserve_aligned_thread_unsafe(
-			name, len, socket_id, UINT64_MAX, align, bound);
-	}
+		name, len, socket_id, flags, align, bound);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -412,45 +322,6 @@ rte_memzone_dump(FILE *f)
 }
 
 /*
- * called by init: modify the free memseg list to have cache-aligned
- * addresses and cache-aligned lengths
- */
-static int
-memseg_sanitize(struct rte_memseg *memseg)
-{
-	unsigned phys_align;
-	unsigned virt_align;
-	unsigned off;
-
-	phys_align = memseg->phys_addr & RTE_CACHE_LINE_MASK;
-	virt_align = (unsigned long)memseg->addr & RTE_CACHE_LINE_MASK;
-
-	/*
-	 * sanity check: phys_addr and addr must have the same
-	 * alignment
-	 */
-	if (phys_align != virt_align)
-		return -1;
-
-	/* memseg is really too small, don't bother with it */
-	if (memseg->len < (2 * RTE_CACHE_LINE_SIZE)) {
-		memseg->len = 0;
-		return 0;
-	}
-
-	/* align start address */
-	off = (RTE_CACHE_LINE_SIZE - phys_align) & RTE_CACHE_LINE_MASK;
-	memseg->phys_addr += off;
-	memseg->addr = (char *)memseg->addr + off;
-	memseg->len -= off;
-
-	/* align end address */
-	memseg->len &= ~((uint64_t)RTE_CACHE_LINE_MASK);
-
-	return 0;
-}
-
-/*
  * Init the memzone subsystem
  */
 int
@@ -458,14 +329,10 @@ rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memseg *memseg;
-	unsigned i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* mirror the runtime memsegs from config */
-	free_memseg = mcfg->free_memseg;
-
 	/* secondary processes don't need to initialise anything */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
@@ -478,33 +345,13 @@ rte_eal_memzone_init(void)
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* fill in uninitialized free_memsegs */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (memseg[i].addr == NULL)
-			break;
-		if (free_memseg[i].addr != NULL)
-			continue;
-		memcpy(&free_memseg[i], &memseg[i], sizeof(struct rte_memseg));
-	}
-
-	/* make all zones cache-aligned */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (free_memseg[i].addr == NULL)
-			break;
-		if (memseg_sanitize(&free_memseg[i]) < 0) {
-			RTE_LOG(ERR, EAL, "%s(): Sanity check failed\n", __func__);
-			rte_rwlock_write_unlock(&mcfg->mlock);
-			return -1;
-		}
-	}
-
 	/* delete all zones */
 	mcfg->memzone_idx = 0;
 	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return 0;
+	return rte_eal_malloc_heap_init();
 }
 
 /* Walk all reserved memory zones */
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 34f5abc..055212a 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -73,7 +73,7 @@ struct rte_mem_config {
 	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
-	/* Runtime Physmem descriptors. */
+	/* Runtime Physmem descriptors - NOT USED */
 	struct rte_memseg free_memseg[RTE_MAX_MEMSEG];
 
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index 716216f..b270356 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -40,7 +40,7 @@
 #include <rte_memory.h>
 
 /* Number of free lists per heap, grouped by size. */
-#define RTE_HEAP_NUM_FREELISTS  5
+#define RTE_HEAP_NUM_FREELISTS  13
 
 /**
  * Structure to hold malloc heap
@@ -48,7 +48,6 @@
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
-	unsigned mz_count;
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index a5e1248..b54ee33 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -37,7 +37,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_launch.h>
 #include <rte_per_lcore.h>
@@ -56,10 +55,10 @@
  */
 void
 malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memzone *mz, size_t size)
+		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
 {
 	elem->heap = heap;
-	elem->mz = mz;
+	elem->ms = ms;
 	elem->prev = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
@@ -70,12 +69,12 @@ malloc_elem_init(struct malloc_elem *elem,
 }
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
 {
-	malloc_elem_init(elem, prev->heap, prev->mz, 0);
+	malloc_elem_init(elem, prev->heap, prev->ms, 0);
 	elem->prev = prev;
 	elem->state = ELEM_BUSY; /* mark busy so its never merged */
 }
@@ -86,12 +85,24 @@ malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
  * fit, return NULL.
  */
 static void *
-elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
+elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	const uintptr_t end_pt = (uintptr_t)elem +
+	const size_t bmask = ~(bound - 1);
+	uintptr_t end_pt = (uintptr_t)elem +
 			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	const uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	const uintptr_t new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+	uintptr_t new_elem_start;
+
+	/* check boundary */
+	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+			return NULL;
+	}
+
+	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
 
 	/* if the new start point is before the exist start, it won't fit */
 	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
@@ -102,9 +113,10 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
  * alignment request from the current element
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
+		size_t bound)
 {
-	return elem_start_pt(elem, size, align) != NULL;
+	return elem_start_pt(elem, size, align, bound) != NULL;
 }
 
 /*
@@ -115,10 +127,10 @@ static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
 	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
-	const unsigned old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
-	const unsigned new_elem_size = elem->size - old_elem_size;
+	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
+	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->mz, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
 	next_elem->prev = split_pt;
 	elem->size = old_elem_size;
@@ -168,8 +180,9 @@ malloc_elem_free_list_index(size_t size)
 void
 malloc_elem_free_list_insert(struct malloc_elem *elem)
 {
-	size_t idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
+	size_t idx;
 
+	idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
 	elem->state = ELEM_FREE;
 	LIST_INSERT_HEAD(&elem->heap->free_head[idx], elem, free_list);
 }
@@ -190,12 +203,26 @@ elem_free_list_remove(struct malloc_elem *elem)
  * is not done here, as it's done there previously.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align);
-	const unsigned old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	const size_t trailer_size = elem->size - old_elem_size - size -
+		MALLOC_ELEM_OVERHEAD;
+
+	elem_free_list_remove(elem);
 
-	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE){
+	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split it, too much free space after elem */
+		struct malloc_elem *new_free_elem =
+				RTE_PTR_ADD(new_elem, size + MALLOC_ELEM_OVERHEAD);
+
+		split_elem(elem, new_free_elem);
+		malloc_elem_free_list_insert(new_free_elem);
+	}
+
+	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* don't split it, pad the element instead */
 		elem->state = ELEM_BUSY;
 		elem->pad = old_elem_size;
@@ -208,8 +235,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 			new_elem->size = elem->size - elem->pad;
 			set_header(new_elem);
 		}
-		/* remove element from free list */
-		elem_free_list_remove(elem);
 
 		return new_elem;
 	}
@@ -219,7 +244,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 	 * Re-insert original element, in case its new size makes it
 	 * belong on a different list.
 	 */
-	elem_free_list_remove(elem);
 	split_elem(elem, new_elem);
 	new_elem->state = ELEM_BUSY;
 	malloc_elem_free_list_insert(elem);
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9790b1a..e05d2ea 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -47,9 +47,9 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memzone */
+	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
-	const struct rte_memzone *mz;
+	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -136,11 +136,11 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memzone *mz,
+		const struct rte_memseg *ms,
 		size_t size);
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem,
@@ -151,14 +151,16 @@ malloc_elem_mkend(struct malloc_elem *elem,
  * of the requested size and with the requested alignment
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
  * is much larger than the data block requested, we split the element in two.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_alloc(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 8861d27..2496b77 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -39,7 +39,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -54,123 +53,125 @@
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
-/* since the memzone size starts with a digit, it will appear unquoted in
- * rte_config.h, so quote it so it can be passed to rte_str_to_size */
-#define MALLOC_MEMZONE_SIZE RTE_STR(RTE_MALLOC_MEMZONE_SIZE)
-
-/*
- * returns the configuration setting for the memzone size as a size_t value
- */
-static inline size_t
-get_malloc_memzone_size(void)
+static unsigned
+check_hugepage_sz(unsigned flags, size_t hugepage_sz)
 {
-	return rte_str_to_size(MALLOC_MEMZONE_SIZE);
+	unsigned check_flag = 0;
+
+	if (!(flags & ~RTE_MEMZONE_SIZE_HINT_ONLY))
+		return 1;
+
+	switch (hugepage_sz) {
+	case RTE_PGSIZE_256K:
+		check_flag = RTE_MEMZONE_256KB;
+		break;
+	case RTE_PGSIZE_2M:
+		check_flag = RTE_MEMZONE_2MB;
+		break;
+	case RTE_PGSIZE_16M:
+		check_flag = RTE_MEMZONE_16MB;
+		break;
+	case RTE_PGSIZE_256M:
+		check_flag = RTE_MEMZONE_256MB;
+		break;
+	case RTE_PGSIZE_512M:
+		check_flag = RTE_MEMZONE_512MB;
+		break;
+	case RTE_PGSIZE_1G:
+		check_flag = RTE_MEMZONE_1GB;
+		break;
+	case RTE_PGSIZE_4G:
+		check_flag = RTE_MEMZONE_4GB;
+		break;
+	case RTE_PGSIZE_16G:
+		check_flag = RTE_MEMZONE_16GB;
+	}
+
+	return (check_flag & flags);
 }
 
 /*
- * reserve an extra memory zone and make it available for use by a particular
- * heap. This reserves the zone and sets a dummy malloc_elem header at the end
+ * Expand the heap with a memseg.
+ * This reserves the zone and sets a dummy malloc_elem header at the end
  * to prevent overflow. The rest of the zone is added to free list as a single
  * large free block
  */
-static int
-malloc_heap_add_memzone(struct malloc_heap *heap, size_t size, unsigned align)
+static void
+malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	const unsigned mz_flags = 0;
-	const size_t block_size = get_malloc_memzone_size();
-	/* ensure the data we want to allocate will fit in the memzone */
-	const size_t min_size = size + align + MALLOC_ELEM_OVERHEAD * 2;
-	const struct rte_memzone *mz = NULL;
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned numa_socket = heap - mcfg->malloc_heaps;
-
-	size_t mz_size = min_size;
-	if (mz_size < block_size)
-		mz_size = block_size;
-
-	char mz_name[RTE_MEMZONE_NAMESIZE];
-	snprintf(mz_name, sizeof(mz_name), "MALLOC_S%u_HEAP_%u",
-		     numa_socket, heap->mz_count++);
-
-	/* try getting a block. if we fail and we don't need as big a block
-	 * as given in the config, we can shrink our request and try again
-	 */
-	do {
-		mz = rte_memzone_reserve(mz_name, mz_size, numa_socket,
-					 mz_flags);
-		if (mz == NULL)
-			mz_size /= 2;
-	} while (mz == NULL && mz_size > min_size);
-	if (mz == NULL)
-		return -1;
-
 	/* allocate the memory block headers, one at end, one at start */
-	struct malloc_elem *start_elem = (struct malloc_elem *)mz->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(mz->addr,
-			mz_size - MALLOC_ELEM_OVERHEAD);
+	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
+	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
+			ms->len - MALLOC_ELEM_OVERHEAD);
 	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
+	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
 
-	const unsigned elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
-	malloc_elem_init(start_elem, heap, mz, elem_size);
+	malloc_elem_init(start_elem, heap, ms, elem_size);
 	malloc_elem_mkend(end_elem, start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
-	/* increase heap total size by size of new memzone */
-	heap->total_size+=mz_size - MALLOC_ELEM_OVERHEAD;
-	return 0;
+	heap->total_size += elem_size;
 }
 
 /*
  * Iterates through the freelist for a heap to find a free element
  * which can store data of the required size and with the requested alignment.
+ * If size is 0, find the biggest available elem.
  * Returns null on failure, or pointer to element on success.
  */
 static struct malloc_elem *
-find_suitable_element(struct malloc_heap *heap, size_t size, unsigned align)
+find_suitable_element(struct malloc_heap *heap, size_t size,
+		unsigned flags, size_t align, size_t bound)
 {
 	size_t idx;
-	struct malloc_elem *elem;
+	struct malloc_elem *elem, *alt_elem = NULL;
 
 	for (idx = malloc_elem_free_list_index(size);
-		idx < RTE_HEAP_NUM_FREELISTS; idx++)
-	{
+			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
-			!!elem; elem = LIST_NEXT(elem, free_list))
-		{
-			if (malloc_elem_can_hold(elem, size, align))
-				return elem;
+				!!elem; elem = LIST_NEXT(elem, free_list)) {
+			if (malloc_elem_can_hold(elem, size, align, bound)) {
+				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+					return elem;
+				if (alt_elem == NULL)
+					alt_elem = elem;
+			}
 		}
 	}
+
+	if ((alt_elem != NULL) && (flags & RTE_MEMZONE_SIZE_HINT_ONLY))
+		return alt_elem;
+
 	return NULL;
 }
 
 /*
- * Main function called by malloc to allocate a block of memory from the
- * heap. It locks the free list, scans it, and adds a new memzone if the
- * scan fails. Once the new memzone is added, it re-scans and should return
+ * Main function to allocate a block of memory from the heap.
+ * It locks the free list, scans it, and adds a new memseg if the
+ * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
 void *
 malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned align)
+		const char *type __attribute__((unused)), size_t size, unsigned flags,
+		size_t align, size_t bound)
 {
+	struct malloc_elem *elem;
+
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
+
 	rte_spinlock_lock(&heap->lock);
-	struct malloc_elem *elem = find_suitable_element(heap, size, align);
-	if (elem == NULL){
-		if ((malloc_heap_add_memzone(heap, size, align)) == 0)
-			elem = find_suitable_element(heap, size, align);
-	}
 
-	if (elem != NULL){
-		elem = malloc_elem_alloc(elem, size, align);
+	elem = find_suitable_element(heap, size, flags, align, bound);
+	if (elem != NULL) {
+		elem = malloc_elem_alloc(elem, size, align, bound);
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
 	rte_spinlock_unlock(&heap->lock);
-	return elem == NULL ? NULL : (void *)(&elem[1]);
 
+	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
 /*
@@ -206,3 +207,21 @@ malloc_heap_get_stats(const struct malloc_heap *heap,
 	socket_stats->alloc_count = heap->alloc_count;
 	return 0;
 }
+
+int
+rte_eal_malloc_heap_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned ms_cnt;
+	struct rte_memseg *ms;
+
+	if (mcfg == NULL)
+		return -1;
+
+	for (ms = &mcfg->memseg[0], ms_cnt = 0;
+			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
+			ms_cnt++, ms++)
+		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
+
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index a47136d..3ccbef0 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -53,15 +53,15 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap, const char *type,
-		size_t size, unsigned align);
+malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
+		unsigned flags, size_t align, size_t bound);
 
 int
 malloc_heap_get_stats(const struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
 int
-rte_eal_heap_memzone_init(void);
+rte_eal_malloc_heap_init(void);
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index c313a57..54c2bd8 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -39,7 +39,6 @@
 
 #include <rte_memcpy.h>
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_branch_prediction.h>
@@ -87,7 +86,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 		return NULL;
 
 	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, align == 0 ? 1 : align);
+				size, 0, align == 0 ? 1 : align, 0);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -98,7 +97,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 			continue;
 
 		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, align == 0 ? 1 : align);
+					size, 0, align == 0 ? 1 : align, 0);
 		if (ret != NULL)
 			return ret;
 	}
@@ -256,5 +255,5 @@ rte_malloc_virt2phy(const void *addr)
 	const struct malloc_elem *elem = malloc_elem_from_data(addr);
 	if (elem == NULL)
 		return 0;
-	return elem->mz->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->mz->addr);
+	return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
 }
-- 
1.9.3

^ permalink raw reply	[relevance 1%]

* Re: [dpdk-dev] [PATCH v2] hash: rename unused field to "reserved"
  2015-07-13 16:38  3% ` [dpdk-dev] [PATCH v2] " Bruce Richardson
  2015-07-13 17:29  0%   ` Thomas Monjalon
@ 2015-07-15  8:08  3%   ` Olga Shern
  1 sibling, 0 replies; 200+ results
From: Olga Shern @ 2015-07-15  8:08 UTC (permalink / raw)
  To: Bruce Richardson, dev

Hi, 

I see the following compilation error when compiling on RH6.5:
dpdk/lib/librte_hash/rte_cuckoo_hash.c:145: error: flexible array member in otherwise empty struct
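For reference, this diagnostic comes from older GCC releases (for example the
4.4 series shipped with RHEL 6.x), which reject a struct whose only named
member is a flexible array member. A minimal illustration of the rule (not the
actual rte_cuckoo_hash.c code at that line, just the general pattern the
compiler rejects):

    #include <stdint.h>

    /* Rejected by old GCC with "flexible array member in otherwise empty
     * struct": C99 requires the flexible array to follow at least one
     * other named member. */
    struct bad_key {
        char key[];                 /* nothing precedes the flexible array */
    };

    /* Accepted: an ordinary named member comes first. */
    struct ok_key {
        uint32_t len;               /* hypothetical length field */
        char key[];
    };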

Best Regards,
Olga

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
Sent: Monday, July 13, 2015 7:39 PM
To: dev@dpdk.org
Subject: [dpdk-dev] [PATCH v2] hash: rename unused field to "reserved"

The cuckoo hash has a fixed number of entries per bucket, so the configuration parameter for this is unused. We change this field in the parameters struct to "reserved" to indicate that there is now no such parameter value, while at the same time keeping ABI consistency.
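To spell out the ABI point: only the member's name changes, not its type, size
or offset, so the layout of struct rte_hash_parameters seen by existing
binaries is untouched. A rough sketch of the idea (hypothetical struct names,
field list abridged from the diff below):

    #include <stdint.h>

    struct params_before {          /* layout prior to this patch */
        const char *name;
        uint32_t entries;
        uint32_t bucket_entries;    /* ignored by the cuckoo hash */
        uint32_t key_len;
        /* ... remaining members unchanged ... */
    };

    struct params_after {           /* layout after the rename */
        const char *name;
        uint32_t entries;
        uint32_t reserved;          /* same slot, now documented as unused */
        uint32_t key_len;
        /* ... remaining members unchanged ... */
    };

    /* Same size and member offsets in both layouts, hence binary
     * compatibility is preserved. */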

Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")

Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/test_func_reentrancy.c | 1 -
 app/test/test_hash_perf.c       | 1 -
 app/test/test_hash_scaling.c    | 1 -
 drivers/net/enic/enic_clsf.c    | 2 --
 examples/l3fwd-power/main.c     | 2 --
 examples/l3fwd-vf/main.c        | 1 -
 examples/l3fwd/main.c           | 2 --
 lib/librte_hash/rte_hash.h      | 2 +-
 8 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/app/test/test_func_reentrancy.c b/app/test/test_func_reentrancy.c
index 85504c0..be61773 100644
--- a/app/test/test_func_reentrancy.c
+++ b/app/test/test_func_reentrancy.c
@@ -226,7 +226,6 @@ hash_create_free(__attribute__((unused)) void *arg)
 	struct rte_hash_parameters hash_params = {
 		.name = NULL,
 		.entries = 16,
-		.bucket_entries = 4,
 		.key_len = 4,
 		.hash_func = (rte_hash_function)rte_jhash_32b,
 		.hash_func_init_val = 0,
diff --git a/app/test/test_hash_perf.c b/app/test/test_hash_perf.c
index e9a522b..a87fc80 100644
--- a/app/test/test_hash_perf.c
+++ b/app/test/test_hash_perf.c
@@ -100,7 +100,6 @@ int32_t positions[KEYS_TO_ADD];
 /* Parameters used for hash table in unit test functions. */
 static struct rte_hash_parameters ut_params = {
 	.entries = MAX_ENTRIES,
-	.bucket_entries = BUCKET_SIZE,
 	.hash_func = rte_jhash,
 	.hash_func_init_val = 0,
 };
diff --git a/app/test/test_hash_scaling.c b/app/test/test_hash_scaling.c
index 682ae94..39602cb 100644
--- a/app/test/test_hash_scaling.c
+++ b/app/test/test_hash_scaling.c
@@ -129,7 +129,6 @@ test_hash_scaling(int locking_mode)
 	uint64_t i, key;
 	struct rte_hash_parameters hash_params = {
 		.entries = num_iterations*2,
-		.bucket_entries = 16,
 		.key_len = sizeof(key),
 		.hash_func = rte_hash_crc,
 		.hash_func_init_val = 0,
diff --git a/drivers/net/enic/enic_clsf.c b/drivers/net/enic/enic_clsf.c
index ca12d2d..9c2abfb 100644
--- a/drivers/net/enic/enic_clsf.c
+++ b/drivers/net/enic/enic_clsf.c
@@ -63,7 +63,6 @@
 
 #define SOCKET_0                0
 #define ENICPMD_CLSF_HASH_ENTRIES       ENICPMD_FDIR_MAX
-#define ENICPMD_CLSF_BUCKET_ENTRIES     4
 
 void enic_fdir_stats_get(struct enic *enic, struct rte_eth_fdir_stats *stats)
 {
@@ -245,7 +244,6 @@ int enic_clsf_init(struct enic *enic)
 	struct rte_hash_parameters hash_params = {
 		.name = "enicpmd_clsf_hash",
 		.entries = ENICPMD_CLSF_HASH_ENTRIES,
-		.bucket_entries = ENICPMD_CLSF_BUCKET_ENTRIES,
 		.key_len = RTE_HASH_KEY_LENGTH_MAX,
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index d4eba1a..6eb459d 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -1247,7 +1247,6 @@ setup_hash(int socketid)
 	struct rte_hash_parameters ipv4_l3fwd_hash_params = {
 		.name = NULL,
 		.entries = L3FWD_HASH_ENTRIES,
-		.bucket_entries = 4,
 		.key_len = sizeof(struct ipv4_5tuple),
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
@@ -1256,7 +1255,6 @@ setup_hash(int socketid)
 	struct rte_hash_parameters ipv6_l3fwd_hash_params = {
 		.name = NULL,
 		.entries = L3FWD_HASH_ENTRIES,
-		.bucket_entries = 4,
 		.key_len = sizeof(struct ipv6_5tuple),
 		.hash_func = DEFAULT_HASH_FUNC,
 		.hash_func_init_val = 0,
diff --git a/examples/l3fwd-vf/main.c b/examples/l3fwd-vf/main.c
index ccbb02f..01f610e 100644
--- a/examples/l3fwd-vf/main.c
+++ b/examples/l3fwd-vf/main.c
@@ -251,7 +251,6 @@ static lookup_struct_t *l3fwd_lookup_struct[NB_SOCKETS];
 struct rte_hash_parameters l3fwd_hash_params = {
 	.name = "l3fwd_hash_0",
 	.entries = L3FWD_HASH_ENTRIES,
-	.bucket_entries = 4,
 	.key_len = sizeof(struct ipv4_5tuple),
 	.hash_func = DEFAULT_HASH_FUNC,
 	.hash_func_init_val = 0,
diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 5c22ed1..def9594 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -2162,7 +2162,6 @@ setup_hash(int socketid)
     struct rte_hash_parameters ipv4_l3fwd_hash_params = {
         .name = NULL,
         .entries = L3FWD_HASH_ENTRIES,
-        .bucket_entries = 4,
         .key_len = sizeof(union ipv4_5tuple_host),
         .hash_func = ipv4_hash_crc,
         .hash_func_init_val = 0,
@@ -2171,7 +2170,6 @@ setup_hash(int socketid)
     struct rte_hash_parameters ipv6_l3fwd_hash_params = {
         .name = NULL,
         .entries = L3FWD_HASH_ENTRIES,
-        .bucket_entries = 4,
         .key_len = sizeof(union ipv6_5tuple_host),
         .hash_func = ipv6_hash_crc,
         .hash_func_init_val = 0,
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index 68109d5..1cddc07 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -75,7 +75,7 @@ typedef uint32_t (*rte_hash_function)(const void *key, uint32_t key_len,
 struct rte_hash_parameters {
 	const char *name;		/**< Name of the hash. */
 	uint32_t entries;		/**< Total hash table entries. */
-	uint32_t bucket_entries;        /**< Bucket entries. */
+	uint32_t reserved;		/**< Unused field. Should be set to 0 */
 	uint32_t key_len;		/**< Length of hash key. */
 	rte_hash_function hash_func;	/**< Primary Hash function used to calculate hash. */
 	uint32_t hash_func_init_val;	/**< Init value used by hash_func. */
--
2.4.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v9 0/9] Dynamic memzones
  2015-07-14  8:57  4% ` [dpdk-dev] [PATCH v8 " Sergio Gonzalez Monroy
  2015-07-14  8:57  1%   ` [dpdk-dev] [PATCH v8 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
  2015-07-14  8:57 19%   ` [dpdk-dev] [PATCH v8 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
@ 2015-07-15  8:26  4%   ` Sergio Gonzalez Monroy
  2015-07-15  8:26  1%     ` [dpdk-dev] [PATCH v9 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
                       ` (2 more replies)
  2 siblings, 3 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15  8:26 UTC (permalink / raw)
  To: dev

Current implementation allows reserving/creating memzones but not the opposite
(unreserve/free). This affects mempools and other memzone based objects.

From my point of view, implementing free functionality for memzones would look
like malloc over memsegs.
Thus, this approach moves malloc inside eal (which in turn removes a circular
dependency), where malloc heaps are composed of memsegs.
We keep both malloc and memzone APIs as they are, but memzones allocate their
memory by calling malloc_heap_alloc.
Some extra functionality is required in malloc to allow for boundary-constrained
memory requests.
In summary, currently malloc is based on memzones, and with this approach
memzones are based on malloc.

v9:
 - Fix incorrect size_t type that results in 32bits compilation error.

v8:
 - Rebase against current HEAD to account for changes made by the new Tile-Gx arch

v7:
 - Create a separated maintainer section for memory allocation

v6:
 - Fix bad patch for rte_memzone_free

v5:
 - Fix rte_memzone_free
 - Improve rte_memzone_free unit test

v4:
 - Rebase and fix couple of merge issues

v3:
 - Create dummy librte_malloc
 - Add deprecation notice
 - Rework some of the code
 - Doc update
 - checkpatch

v2:
 - New rte_memzone_free
 - Support memzone len = 0
 - Add all available memsegs to malloc heap at init
 - Update memzone/malloc unit tests


v6 Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>


Sergio Gonzalez Monroy (9):
  eal: move librte_malloc to eal/common
  eal: memzone allocated by malloc
  app/test: update malloc/memzone unit tests
  config: remove CONFIG_RTE_MALLOC_MEMZONE_SIZE
  eal: remove free_memseg and references to it
  eal: new rte_memzone_free
  app/test: rte_memzone_free unit test
  doc: announce ABI change of librte_malloc
  doc: update malloc documentation

 MAINTAINERS                                       |  22 +-
 app/test/test_malloc.c                            |  86 ----
 app/test/test_memzone.c                           | 456 ++++------------------
 config/common_bsdapp                              |   8 +-
 config/common_linuxapp                            |   8 +-
 doc/guides/prog_guide/env_abstraction_layer.rst   | 220 ++++++++++-
 doc/guides/prog_guide/img/malloc_heap.png         | Bin 81329 -> 80952 bytes
 doc/guides/prog_guide/index.rst                   |   1 -
 doc/guides/prog_guide/malloc_lib.rst              | 233 -----------
 doc/guides/prog_guide/overview.rst                |  11 +-
 doc/guides/rel_notes/abi.rst                      |   6 +-
 drivers/net/af_packet/Makefile                    |   1 -
 drivers/net/bonding/Makefile                      |   1 -
 drivers/net/e1000/Makefile                        |   2 +-
 drivers/net/enic/Makefile                         |   2 +-
 drivers/net/fm10k/Makefile                        |   2 +-
 drivers/net/i40e/Makefile                         |   2 +-
 drivers/net/ixgbe/Makefile                        |   2 +-
 drivers/net/mlx4/Makefile                         |   1 -
 drivers/net/null/Makefile                         |   1 -
 drivers/net/pcap/Makefile                         |   1 -
 drivers/net/virtio/Makefile                       |   2 +-
 drivers/net/vmxnet3/Makefile                      |   2 +-
 drivers/net/xenvirt/Makefile                      |   2 +-
 lib/Makefile                                      |   2 +-
 lib/librte_acl/Makefile                           |   2 +-
 lib/librte_eal/bsdapp/eal/Makefile                |   4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map     |  19 +
 lib/librte_eal/common/Makefile                    |   1 +
 lib/librte_eal/common/eal_common_memzone.c        | 353 +++++++----------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   5 +-
 lib/librte_eal/common/include/rte_malloc.h        | 342 ++++++++++++++++
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/include/rte_memzone.h       |  11 +
 lib/librte_eal/common/malloc_elem.c               | 344 ++++++++++++++++
 lib/librte_eal/common/malloc_elem.h               | 192 +++++++++
 lib/librte_eal/common/malloc_heap.c               | 227 +++++++++++
 lib/librte_eal/common/malloc_heap.h               |  70 ++++
 lib/librte_eal/common/rte_malloc.c                | 259 ++++++++++++
 lib/librte_eal/linuxapp/eal/Makefile              |   4 +-
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c         |  17 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  19 +
 lib/librte_hash/Makefile                          |   2 +-
 lib/librte_lpm/Makefile                           |   2 +-
 lib/librte_malloc/Makefile                        |   6 +-
 lib/librte_malloc/malloc_elem.c                   | 320 ---------------
 lib/librte_malloc/malloc_elem.h                   | 190 ---------
 lib/librte_malloc/malloc_heap.c                   | 208 ----------
 lib/librte_malloc/malloc_heap.h                   |  70 ----
 lib/librte_malloc/rte_malloc.c                    | 228 +----------
 lib/librte_malloc/rte_malloc.h                    | 342 ----------------
 lib/librte_malloc/rte_malloc_version.map          |  16 -
 lib/librte_mempool/Makefile                       |   2 -
 lib/librte_port/Makefile                          |   1 -
 lib/librte_ring/Makefile                          |   3 +-
 lib/librte_table/Makefile                         |   1 -
 56 files changed, 1965 insertions(+), 2372 deletions(-)
 delete mode 100644 doc/guides/prog_guide/malloc_lib.rst
 create mode 100644 lib/librte_eal/common/include/rte_malloc.h
 create mode 100644 lib/librte_eal/common/malloc_elem.c
 create mode 100644 lib/librte_eal/common/malloc_elem.h
 create mode 100644 lib/librte_eal/common/malloc_heap.c
 create mode 100644 lib/librte_eal/common/malloc_heap.h
 create mode 100644 lib/librte_eal/common/rte_malloc.c
 delete mode 100644 lib/librte_malloc/malloc_elem.c
 delete mode 100644 lib/librte_malloc/malloc_elem.h
 delete mode 100644 lib/librte_malloc/malloc_heap.c
 delete mode 100644 lib/librte_malloc/malloc_heap.h
 delete mode 100644 lib/librte_malloc/rte_malloc.h

-- 
1.9.3

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v9 2/9] eal: memzone allocated by malloc
  2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
@ 2015-07-15  8:26  1%     ` Sergio Gonzalez Monroy
  2015-07-15  8:26 19%     ` [dpdk-dev] [PATCH v9 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
  2015-07-15 16:32  3%     ` [dpdk-dev] [PATCH v10 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2 siblings, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15  8:26 UTC (permalink / raw)
  To: dev

In the current memory hierarchy, memsegs are groups of physically
contiguous hugepages, memzones are slices of memsegs and malloc further
slices memzones into smaller memory chunks.

This patch modifies malloc so it partitions memsegs instead of memzones.
Thus memzones call malloc internally for memory allocation while
maintaining their ABI.

It becomes possible to free memzones, and therefore any other structure
based on memzones, e.g. mempools.
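
Put schematically, the change inverts the layering (a conceptual sketch only,
not code taken from the patch):

    /*
     * Before this patch:
     *   memseg (physically contiguous hugepages)
     *     -> memzone (slice of a memseg)
     *          -> malloc element (malloc slices memzones further)
     *
     * After this patch:
     *   memseg (physically contiguous hugepages)
     *     -> malloc element (heaps are populated directly from memsegs)
     *          -> memzone (a named allocation made via malloc_heap_alloc())
     */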

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c        | 289 +++++-----------------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   2 +-
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/malloc_elem.c               |  68 +++--
 lib/librte_eal/common/malloc_elem.h               |  14 +-
 lib/librte_eal/common/malloc_heap.c               | 161 ++++++------
 lib/librte_eal/common/malloc_heap.h               |   6 +-
 lib/librte_eal/common/rte_malloc.c                |   7 +-
 8 files changed, 220 insertions(+), 330 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 9c1da71..fd7e73f 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -50,15 +50,15 @@
 #include <rte_string_fns.h>
 #include <rte_common.h>
 
+#include "malloc_heap.h"
+#include "malloc_elem.h"
 #include "eal_private.h"
 
-/* internal copy of free memory segments */
-static struct rte_memseg *free_memseg = NULL;
-
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
 	const struct rte_mem_config *mcfg;
+	const struct rte_memzone *mz;
 	unsigned i = 0;
 
 	/* get pointer to global configuration */
@@ -68,62 +68,62 @@ memzone_lookup_thread_unsafe(const char *name)
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE && mcfg->memzone[i].addr != NULL; i++) {
-		if (!strncmp(name, mcfg->memzone[i].name, RTE_MEMZONE_NAMESIZE))
+	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
+		mz = &mcfg->memzone[i];
+		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
 			return &mcfg->memzone[i];
 	}
 
 	return NULL;
 }
 
-/*
- * Helper function for memzone_reserve_aligned_thread_unsafe().
- * Calculate address offset from the start of the segment.
- * Align offset in that way that it satisfy istart alignmnet and
- * buffer of the  requested length would not cross specified boundary.
- */
-static inline phys_addr_t
-align_phys_boundary(const struct rte_memseg *ms, size_t len, size_t align,
-	size_t bound)
+/* Find the heap with the greatest free block size */
+static void
+find_heap_max_free_elem(int *s, size_t *len, unsigned align)
 {
-	phys_addr_t addr_offset, bmask, end, start;
-	size_t step;
+	struct rte_mem_config *mcfg;
+	struct rte_malloc_socket_stats stats;
+	unsigned i;
 
-	step = RTE_MAX(align, bound);
-	bmask = ~((phys_addr_t)bound - 1);
+	/* get pointer to global configuration */
+	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* calculate offset to closest alignment */
-	start = RTE_ALIGN_CEIL(ms->phys_addr, align);
-	addr_offset = start - ms->phys_addr;
+	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+		malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+		if (stats.greatest_free_size > *len) {
+			*len = stats.greatest_free_size;
+			*s = i;
+		}
+	}
+	*len -= (MALLOC_ELEM_OVERHEAD + align);
+}
 
-	while (addr_offset + len < ms->len) {
+/* Find a heap that can allocate the requested size */
+static void
+find_heap_suitable(int *s, size_t len, unsigned align)
+{
+	struct rte_mem_config *mcfg;
+	struct rte_malloc_socket_stats stats;
+	unsigned i;
 
-		/* check, do we meet boundary condition */
-		end = start + len - (len != 0);
-		if ((start & bmask) == (end & bmask))
-			break;
+	/* get pointer to global configuration */
+	mcfg = rte_eal_get_configuration()->mem_config;
 
-		/* calculate next offset */
-		start = RTE_ALIGN_CEIL(start + 1, step);
-		addr_offset = start - ms->phys_addr;
+	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+		malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+		if (stats.greatest_free_size >= len + MALLOC_ELEM_OVERHEAD + align) {
+			*s = i;
+			break;
+		}
 	}
-
-	return addr_offset;
 }
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, uint64_t size_mask, unsigned align,
-		unsigned bound)
+		int socket_id, unsigned flags, unsigned align, unsigned bound)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-	int memseg_idx = -1;
-	uint64_t addr_offset, seg_offset = 0;
 	size_t requested_len;
-	size_t memseg_len = 0;
-	phys_addr_t memseg_physaddr;
-	void *memseg_addr;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -155,7 +155,6 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	if (align < RTE_CACHE_LINE_SIZE)
 		align = RTE_CACHE_LINE_SIZE;
 
-
 	/* align length on cache boundary. Check for overflow before doing so */
 	if (len > SIZE_MAX - RTE_CACHE_LINE_MASK) {
 		rte_errno = EINVAL; /* requested size too big */
@@ -169,108 +168,50 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	requested_len = RTE_MAX((size_t)RTE_CACHE_LINE_SIZE,  len);
 
 	/* check that boundary condition is valid */
-	if (bound != 0 &&
-			(requested_len > bound || !rte_is_power_of_2(bound))) {
+	if (bound != 0 && (requested_len > bound || !rte_is_power_of_2(bound))) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
 
-	/* find the smallest segment matching requirements */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		/* last segment */
-		if (free_memseg[i].addr == NULL)
-			break;
+	if (len == 0) {
+		if (bound != 0)
+			requested_len = bound;
+		else
+			requested_len = 0;
+	}
 
-		/* empty segment, skip it */
-		if (free_memseg[i].len == 0)
-			continue;
-
-		/* bad socket ID */
-		if (socket_id != SOCKET_ID_ANY &&
-		    free_memseg[i].socket_id != SOCKET_ID_ANY &&
-		    socket_id != free_memseg[i].socket_id)
-			continue;
-
-		/*
-		 * calculate offset to closest alignment that
-		 * meets boundary conditions.
-		 */
-		addr_offset = align_phys_boundary(free_memseg + i,
-			requested_len, align, bound);
-
-		/* check len */
-		if ((requested_len + addr_offset) > free_memseg[i].len)
-			continue;
-
-		if ((size_mask & free_memseg[i].hugepage_sz) == 0)
-			continue;
-
-		/* this segment is the best until now */
-		if (memseg_idx == -1) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
-		}
-		/* find the biggest contiguous zone */
-		else if (len == 0) {
-			if (free_memseg[i].len > memseg_len) {
-				memseg_idx = i;
-				memseg_len = free_memseg[i].len;
-				seg_offset = addr_offset;
-			}
-		}
-		/*
-		 * find the smallest (we already checked that current
-		 * zone length is > len
-		 */
-		else if (free_memseg[i].len + align < memseg_len ||
-				(free_memseg[i].len <= memseg_len + align &&
-				addr_offset < seg_offset)) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
+	if (socket_id == SOCKET_ID_ANY) {
+		if (requested_len == 0)
+			find_heap_max_free_elem(&socket_id, &requested_len, align);
+		else
+			find_heap_suitable(&socket_id, requested_len, align);
+
+		if (socket_id == SOCKET_ID_ANY) {
+			rte_errno = ENOMEM;
+			return NULL;
 		}
 	}
 
-	/* no segment found */
-	if (memseg_idx == -1) {
+	/* allocate memory on heap */
+	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket_id], NULL,
+			requested_len, flags, align, bound);
+	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
 		return NULL;
 	}
 
-	/* save aligned physical and virtual addresses */
-	memseg_physaddr = free_memseg[memseg_idx].phys_addr + seg_offset;
-	memseg_addr = RTE_PTR_ADD(free_memseg[memseg_idx].addr,
-			(uintptr_t) seg_offset);
-
-	/* if we are looking for a biggest memzone */
-	if (len == 0) {
-		if (bound == 0)
-			requested_len = memseg_len - seg_offset;
-		else
-			requested_len = RTE_ALIGN_CEIL(memseg_physaddr + 1,
-				bound) - memseg_physaddr;
-	}
-
-	/* set length to correct value */
-	len = (size_t)seg_offset + requested_len;
-
-	/* update our internal state */
-	free_memseg[memseg_idx].len -= len;
-	free_memseg[memseg_idx].phys_addr += len;
-	free_memseg[memseg_idx].addr =
-		(char *)free_memseg[memseg_idx].addr + len;
+	const struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
 	struct rte_memzone *mz = &mcfg->memzone[mcfg->memzone_idx++];
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
-	mz->phys_addr = memseg_physaddr;
-	mz->addr = memseg_addr;
-	mz->len = requested_len;
-	mz->hugepage_sz = free_memseg[memseg_idx].hugepage_sz;
-	mz->socket_id = free_memseg[memseg_idx].socket_id;
+	mz->phys_addr = rte_malloc_virt2phy(mz_addr);
+	mz->addr = mz_addr;
+	mz->len = (requested_len == 0 ? elem->size : requested_len);
+	mz->hugepage_sz = elem->ms->hugepage_sz;
+	mz->socket_id = elem->ms->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = memseg_idx;
+	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -282,26 +223,6 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
-	uint64_t size_mask = 0;
-
-	if (flags & RTE_MEMZONE_256KB)
-		size_mask |= RTE_PGSIZE_256K;
-	if (flags & RTE_MEMZONE_2MB)
-		size_mask |= RTE_PGSIZE_2M;
-	if (flags & RTE_MEMZONE_16MB)
-		size_mask |= RTE_PGSIZE_16M;
-	if (flags & RTE_MEMZONE_256MB)
-		size_mask |= RTE_PGSIZE_256M;
-	if (flags & RTE_MEMZONE_512MB)
-		size_mask |= RTE_PGSIZE_512M;
-	if (flags & RTE_MEMZONE_1GB)
-		size_mask |= RTE_PGSIZE_1G;
-	if (flags & RTE_MEMZONE_4GB)
-		size_mask |= RTE_PGSIZE_4G;
-	if (flags & RTE_MEMZONE_16GB)
-		size_mask |= RTE_PGSIZE_16G;
-	if (!size_mask)
-		size_mask = UINT64_MAX;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -309,18 +230,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, size_mask, align, bound);
-
-	/*
-	 * If we failed to allocate the requested page size, and the
-	 * RTE_MEMZONE_SIZE_HINT_ONLY flag is specified, try allocating
-	 * again.
-	 */
-	if (!mz && rte_errno == ENOMEM && size_mask != UINT64_MAX &&
-	    flags & RTE_MEMZONE_SIZE_HINT_ONLY) {
-		mz = memzone_reserve_aligned_thread_unsafe(
-			name, len, socket_id, UINT64_MAX, align, bound);
-	}
+		name, len, socket_id, flags, align, bound);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -412,45 +322,6 @@ rte_memzone_dump(FILE *f)
 }
 
 /*
- * called by init: modify the free memseg list to have cache-aligned
- * addresses and cache-aligned lengths
- */
-static int
-memseg_sanitize(struct rte_memseg *memseg)
-{
-	unsigned phys_align;
-	unsigned virt_align;
-	unsigned off;
-
-	phys_align = memseg->phys_addr & RTE_CACHE_LINE_MASK;
-	virt_align = (unsigned long)memseg->addr & RTE_CACHE_LINE_MASK;
-
-	/*
-	 * sanity check: phys_addr and addr must have the same
-	 * alignment
-	 */
-	if (phys_align != virt_align)
-		return -1;
-
-	/* memseg is really too small, don't bother with it */
-	if (memseg->len < (2 * RTE_CACHE_LINE_SIZE)) {
-		memseg->len = 0;
-		return 0;
-	}
-
-	/* align start address */
-	off = (RTE_CACHE_LINE_SIZE - phys_align) & RTE_CACHE_LINE_MASK;
-	memseg->phys_addr += off;
-	memseg->addr = (char *)memseg->addr + off;
-	memseg->len -= off;
-
-	/* align end address */
-	memseg->len &= ~((uint64_t)RTE_CACHE_LINE_MASK);
-
-	return 0;
-}
-
-/*
  * Init the memzone subsystem
  */
 int
@@ -458,14 +329,10 @@ rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memseg *memseg;
-	unsigned i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* mirror the runtime memsegs from config */
-	free_memseg = mcfg->free_memseg;
-
 	/* secondary processes don't need to initialise anything */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
@@ -478,33 +345,13 @@ rte_eal_memzone_init(void)
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* fill in uninitialized free_memsegs */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (memseg[i].addr == NULL)
-			break;
-		if (free_memseg[i].addr != NULL)
-			continue;
-		memcpy(&free_memseg[i], &memseg[i], sizeof(struct rte_memseg));
-	}
-
-	/* make all zones cache-aligned */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (free_memseg[i].addr == NULL)
-			break;
-		if (memseg_sanitize(&free_memseg[i]) < 0) {
-			RTE_LOG(ERR, EAL, "%s(): Sanity check failed\n", __func__);
-			rte_rwlock_write_unlock(&mcfg->mlock);
-			return -1;
-		}
-	}
-
 	/* delete all zones */
 	mcfg->memzone_idx = 0;
 	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return 0;
+	return rte_eal_malloc_heap_init();
 }
 
 /* Walk all reserved memory zones */
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 34f5abc..055212a 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -73,7 +73,7 @@ struct rte_mem_config {
 	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
-	/* Runtime Physmem descriptors. */
+	/* Runtime Physmem descriptors - NOT USED */
 	struct rte_memseg free_memseg[RTE_MAX_MEMSEG];
 
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index 716216f..b270356 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -40,7 +40,7 @@
 #include <rte_memory.h>
 
 /* Number of free lists per heap, grouped by size. */
-#define RTE_HEAP_NUM_FREELISTS  5
+#define RTE_HEAP_NUM_FREELISTS  13
 
 /**
  * Structure to hold malloc heap
@@ -48,7 +48,6 @@
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
-	unsigned mz_count;
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index a5e1248..b54ee33 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -37,7 +37,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_launch.h>
 #include <rte_per_lcore.h>
@@ -56,10 +55,10 @@
  */
 void
 malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memzone *mz, size_t size)
+		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
 {
 	elem->heap = heap;
-	elem->mz = mz;
+	elem->ms = ms;
 	elem->prev = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
@@ -70,12 +69,12 @@ malloc_elem_init(struct malloc_elem *elem,
 }
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
 {
-	malloc_elem_init(elem, prev->heap, prev->mz, 0);
+	malloc_elem_init(elem, prev->heap, prev->ms, 0);
 	elem->prev = prev;
 	elem->state = ELEM_BUSY; /* mark busy so its never merged */
 }
@@ -86,12 +85,24 @@ malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
  * fit, return NULL.
  */
 static void *
-elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
+elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	const uintptr_t end_pt = (uintptr_t)elem +
+	const size_t bmask = ~(bound - 1);
+	uintptr_t end_pt = (uintptr_t)elem +
 			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	const uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	const uintptr_t new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+	uintptr_t new_elem_start;
+
+	/* check boundary */
+	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+			return NULL;
+	}
+
+	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
 
 	/* if the new start point is before the exist start, it won't fit */
 	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
@@ -102,9 +113,10 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
  * alignment request from the current element
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
+		size_t bound)
 {
-	return elem_start_pt(elem, size, align) != NULL;
+	return elem_start_pt(elem, size, align, bound) != NULL;
 }
 
 /*
@@ -115,10 +127,10 @@ static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
 	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
-	const unsigned old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
-	const unsigned new_elem_size = elem->size - old_elem_size;
+	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
+	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->mz, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
 	next_elem->prev = split_pt;
 	elem->size = old_elem_size;
@@ -168,8 +180,9 @@ malloc_elem_free_list_index(size_t size)
 void
 malloc_elem_free_list_insert(struct malloc_elem *elem)
 {
-	size_t idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
+	size_t idx;
 
+	idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
 	elem->state = ELEM_FREE;
 	LIST_INSERT_HEAD(&elem->heap->free_head[idx], elem, free_list);
 }
@@ -190,12 +203,26 @@ elem_free_list_remove(struct malloc_elem *elem)
  * is not done here, as it's done there previously.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align);
-	const unsigned old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	const size_t trailer_size = elem->size - old_elem_size - size -
+		MALLOC_ELEM_OVERHEAD;
+
+	elem_free_list_remove(elem);
 
-	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE){
+	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split it, too much free space after elem */
+		struct malloc_elem *new_free_elem =
+				RTE_PTR_ADD(new_elem, size + MALLOC_ELEM_OVERHEAD);
+
+		split_elem(elem, new_free_elem);
+		malloc_elem_free_list_insert(new_free_elem);
+	}
+
+	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* don't split it, pad the element instead */
 		elem->state = ELEM_BUSY;
 		elem->pad = old_elem_size;
@@ -208,8 +235,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 			new_elem->size = elem->size - elem->pad;
 			set_header(new_elem);
 		}
-		/* remove element from free list */
-		elem_free_list_remove(elem);
 
 		return new_elem;
 	}
@@ -219,7 +244,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 	 * Re-insert original element, in case its new size makes it
 	 * belong on a different list.
 	 */
-	elem_free_list_remove(elem);
 	split_elem(elem, new_elem);
 	new_elem->state = ELEM_BUSY;
 	malloc_elem_free_list_insert(elem);
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9790b1a..e05d2ea 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -47,9 +47,9 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memzone */
+	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
-	const struct rte_memzone *mz;
+	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -136,11 +136,11 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memzone *mz,
+		const struct rte_memseg *ms,
 		size_t size);
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem,
@@ -151,14 +151,16 @@ malloc_elem_mkend(struct malloc_elem *elem,
  * of the requested size and with the requested alignment
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
  * is much larger than the data block requested, we split the element in two.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_alloc(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 8861d27..21d8914 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -39,7 +39,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -54,123 +53,125 @@
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
-/* since the memzone size starts with a digit, it will appear unquoted in
- * rte_config.h, so quote it so it can be passed to rte_str_to_size */
-#define MALLOC_MEMZONE_SIZE RTE_STR(RTE_MALLOC_MEMZONE_SIZE)
-
-/*
- * returns the configuration setting for the memzone size as a size_t value
- */
-static inline size_t
-get_malloc_memzone_size(void)
+static unsigned
+check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 {
-	return rte_str_to_size(MALLOC_MEMZONE_SIZE);
+	unsigned check_flag = 0;
+
+	if (!(flags & ~RTE_MEMZONE_SIZE_HINT_ONLY))
+		return 1;
+
+	switch (hugepage_sz) {
+	case RTE_PGSIZE_256K:
+		check_flag = RTE_MEMZONE_256KB;
+		break;
+	case RTE_PGSIZE_2M:
+		check_flag = RTE_MEMZONE_2MB;
+		break;
+	case RTE_PGSIZE_16M:
+		check_flag = RTE_MEMZONE_16MB;
+		break;
+	case RTE_PGSIZE_256M:
+		check_flag = RTE_MEMZONE_256MB;
+		break;
+	case RTE_PGSIZE_512M:
+		check_flag = RTE_MEMZONE_512MB;
+		break;
+	case RTE_PGSIZE_1G:
+		check_flag = RTE_MEMZONE_1GB;
+		break;
+	case RTE_PGSIZE_4G:
+		check_flag = RTE_MEMZONE_4GB;
+		break;
+	case RTE_PGSIZE_16G:
+		check_flag = RTE_MEMZONE_16GB;
+	}
+
+	return (check_flag & flags);
 }
 
 /*
- * reserve an extra memory zone and make it available for use by a particular
- * heap. This reserves the zone and sets a dummy malloc_elem header at the end
+ * Expand the heap with a memseg.
+ * This reserves the zone and sets a dummy malloc_elem header at the end
  * to prevent overflow. The rest of the zone is added to free list as a single
  * large free block
  */
-static int
-malloc_heap_add_memzone(struct malloc_heap *heap, size_t size, unsigned align)
+static void
+malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	const unsigned mz_flags = 0;
-	const size_t block_size = get_malloc_memzone_size();
-	/* ensure the data we want to allocate will fit in the memzone */
-	const size_t min_size = size + align + MALLOC_ELEM_OVERHEAD * 2;
-	const struct rte_memzone *mz = NULL;
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned numa_socket = heap - mcfg->malloc_heaps;
-
-	size_t mz_size = min_size;
-	if (mz_size < block_size)
-		mz_size = block_size;
-
-	char mz_name[RTE_MEMZONE_NAMESIZE];
-	snprintf(mz_name, sizeof(mz_name), "MALLOC_S%u_HEAP_%u",
-		     numa_socket, heap->mz_count++);
-
-	/* try getting a block. if we fail and we don't need as big a block
-	 * as given in the config, we can shrink our request and try again
-	 */
-	do {
-		mz = rte_memzone_reserve(mz_name, mz_size, numa_socket,
-					 mz_flags);
-		if (mz == NULL)
-			mz_size /= 2;
-	} while (mz == NULL && mz_size > min_size);
-	if (mz == NULL)
-		return -1;
-
 	/* allocate the memory block headers, one at end, one at start */
-	struct malloc_elem *start_elem = (struct malloc_elem *)mz->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(mz->addr,
-			mz_size - MALLOC_ELEM_OVERHEAD);
+	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
+	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
+			ms->len - MALLOC_ELEM_OVERHEAD);
 	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
+	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
 
-	const unsigned elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
-	malloc_elem_init(start_elem, heap, mz, elem_size);
+	malloc_elem_init(start_elem, heap, ms, elem_size);
 	malloc_elem_mkend(end_elem, start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
-	/* increase heap total size by size of new memzone */
-	heap->total_size+=mz_size - MALLOC_ELEM_OVERHEAD;
-	return 0;
+	heap->total_size += elem_size;
 }
 
 /*
  * Iterates through the freelist for a heap to find a free element
  * which can store data of the required size and with the requested alignment.
+ * If size is 0, find the biggest available elem.
  * Returns null on failure, or pointer to element on success.
  */
 static struct malloc_elem *
-find_suitable_element(struct malloc_heap *heap, size_t size, unsigned align)
+find_suitable_element(struct malloc_heap *heap, size_t size,
+		unsigned flags, size_t align, size_t bound)
 {
 	size_t idx;
-	struct malloc_elem *elem;
+	struct malloc_elem *elem, *alt_elem = NULL;
 
 	for (idx = malloc_elem_free_list_index(size);
-		idx < RTE_HEAP_NUM_FREELISTS; idx++)
-	{
+			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
-			!!elem; elem = LIST_NEXT(elem, free_list))
-		{
-			if (malloc_elem_can_hold(elem, size, align))
-				return elem;
+				!!elem; elem = LIST_NEXT(elem, free_list)) {
+			if (malloc_elem_can_hold(elem, size, align, bound)) {
+				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+					return elem;
+				if (alt_elem == NULL)
+					alt_elem = elem;
+			}
 		}
 	}
+
+	if ((alt_elem != NULL) && (flags & RTE_MEMZONE_SIZE_HINT_ONLY))
+		return alt_elem;
+
 	return NULL;
 }
 
 /*
- * Main function called by malloc to allocate a block of memory from the
- * heap. It locks the free list, scans it, and adds a new memzone if the
- * scan fails. Once the new memzone is added, it re-scans and should return
+ * Main function to allocate a block of memory from the heap.
+ * It locks the free list, scans it, and adds a new memseg if the
+ * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
 void *
 malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned align)
+		const char *type __attribute__((unused)), size_t size, unsigned flags,
+		size_t align, size_t bound)
 {
+	struct malloc_elem *elem;
+
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
+
 	rte_spinlock_lock(&heap->lock);
-	struct malloc_elem *elem = find_suitable_element(heap, size, align);
-	if (elem == NULL){
-		if ((malloc_heap_add_memzone(heap, size, align)) == 0)
-			elem = find_suitable_element(heap, size, align);
-	}
 
-	if (elem != NULL){
-		elem = malloc_elem_alloc(elem, size, align);
+	elem = find_suitable_element(heap, size, flags, align, bound);
+	if (elem != NULL) {
+		elem = malloc_elem_alloc(elem, size, align, bound);
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
 	rte_spinlock_unlock(&heap->lock);
-	return elem == NULL ? NULL : (void *)(&elem[1]);
 
+	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
 /*
@@ -206,3 +207,21 @@ malloc_heap_get_stats(const struct malloc_heap *heap,
 	socket_stats->alloc_count = heap->alloc_count;
 	return 0;
 }
+
+int
+rte_eal_malloc_heap_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned ms_cnt;
+	struct rte_memseg *ms;
+
+	if (mcfg == NULL)
+		return -1;
+
+	for (ms = &mcfg->memseg[0], ms_cnt = 0;
+			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
+			ms_cnt++, ms++)
+		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
+
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index a47136d..3ccbef0 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -53,15 +53,15 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap, const char *type,
-		size_t size, unsigned align);
+malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
+		unsigned flags, size_t align, size_t bound);
 
 int
 malloc_heap_get_stats(const struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
 int
-rte_eal_heap_memzone_init(void);
+rte_eal_malloc_heap_init(void);
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index c313a57..54c2bd8 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -39,7 +39,6 @@
 
 #include <rte_memcpy.h>
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_branch_prediction.h>
@@ -87,7 +86,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 		return NULL;
 
 	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, align == 0 ? 1 : align);
+				size, 0, align == 0 ? 1 : align, 0);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -98,7 +97,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 			continue;
 
 		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, align == 0 ? 1 : align);
+					size, 0, align == 0 ? 1 : align, 0);
 		if (ret != NULL)
 			return ret;
 	}
@@ -256,5 +255,5 @@ rte_malloc_virt2phy(const void *addr)
 	const struct malloc_elem *elem = malloc_elem_from_data(addr);
 	if (elem == NULL)
 		return 0;
-	return elem->mz->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->mz->addr);
+	return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
 }
-- 
1.9.3

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v9 8/9] doc: announce ABI change of librte_malloc
  2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2015-07-15  8:26  1%     ` [dpdk-dev] [PATCH v9 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
@ 2015-07-15  8:26 19%     ` Sergio Gonzalez Monroy
  2015-07-15 16:32  3%     ` [dpdk-dev] [PATCH v10 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2 siblings, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15  8:26 UTC (permalink / raw)
  To: dev

Announce the creation of a dummy malloc library for release 2.1 and the
removal of that library, now integrated into librte_eal, in release 2.2.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 doc/guides/rel_notes/abi.rst | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 931e785..76e0ae2 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -16,7 +16,6 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
-
 * Significant ABI changes are planned for struct rte_eth_dev to support up to
   1024 queues per port. This change will be in release 2.2.
   There is no backward compatibility planned from release 2.2.
@@ -24,3 +23,8 @@ Deprecation Notices
 
 * The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
   deprecated and will be removed with version 2.2.
+
+* librte_malloc library has been integrated into librte_eal. The 2.1 release
+  creates a dummy/empty malloc library to fulfill binaries with dynamic linking
+  dependencies on librte_malloc.so. Such dummy library will not be created from
+  release 2.2 so binaries will need to be rebuilt.
-- 
1.9.3

^ permalink raw reply	[relevance 19%]

* Re: [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types
  2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types Helin Zhang
@ 2015-07-15 10:19  0%     ` Olivier MATZ
  0 siblings, 0 replies; 200+ results
From: Olivier MATZ @ 2015-07-15 10:19 UTC (permalink / raw)
  To: Helin Zhang, dev

On 07/09/2015 06:31 PM, Helin Zhang wrote:
> As there are only 6 bit flags in ol_flags for indicating packet
> types, which is not enough to describe all the possible packet
> types hardware can recognize. For example, i40e hardware can
> recognize more than 150 packet types. Unified packet type is
> composed of L2 type, L3 type, L4 type, tunnel type, inner L2 type,
> inner L3 type and inner L4 type fields, and can be stored in
> 'struct rte_mbuf' of 32 bits field 'packet_type'.
> To avoid breaking ABI compatibility, all the changes would be
> enabled by RTE_NEXT_ABI, which is disabled by default.
>
> Signed-off-by: Helin Zhang <helin.zhang@intel.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>
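
For reference, a minimal illustrative sketch of how an application might consume
the unified packet type described in the quoted message. It is not part of the
patch or of this review; it assumes the 32-bit mbuf packet_type field and the
RTE_PTYPE_* masks defined by the series (built with RTE_NEXT_ABI enabled):

#include <rte_mbuf.h>

static void
classify(const struct rte_mbuf *m)
{
	uint32_t ptype = m->packet_type;

	if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV4 &&
	    (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP) {
		/* IPv4/TCP recognized by the NIC, no software parsing needed */
	} else if ((ptype & RTE_PTYPE_TUNNEL_MASK) != 0) {
		/* tunnelled packet: the inner headers are described by the
		 * RTE_PTYPE_INNER_* bits of the same 32-bit field */
	}
}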

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps
@ 2015-07-15 13:11  3% Maryam Tahhan
  2015-07-15 13:11  9% ` [dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in stats structs Maryam Tahhan
  2015-07-16  7:54  0% ` [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Olivier MATZ
  0 siblings, 2 replies; 200+ results
From: Maryam Tahhan @ 2015-07-15 13:11 UTC (permalink / raw)
  To: dev

This patch set implements xstats_get() and xstats_reset() in dev_ops for
ixgbe to expose detailed error statistics to DPDK applications. The
dump_cfg application was extended to demonstrate how to retrieve
statistics for DPDK interfaces and renamed to proc_info in order to
reflect this new functionality. This patch set also removes non-generic
statistics from the statistics strings at the ethdev level and marks the
relevant fields as deprecated in struct rte_eth_stats.
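
As a usage sketch (not part of the patch set itself), an application could pull
the extended statistics through the ethdev API extended here. The snippet assumes
that rte_eth_xstats_get() returns the number of available entries when the
supplied array is too small or NULL, as implemented in this series:

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <rte_ethdev.h>

static void
print_xstats(uint8_t port_id)
{
	struct rte_eth_xstats *xstats;
	int i, len;

	/* first call with no buffer to learn how many entries exist */
	len = rte_eth_xstats_get(port_id, NULL, 0);
	if (len <= 0)
		return;

	xstats = calloc(len, sizeof(*xstats));
	if (xstats == NULL)
		return;

	/* second call fills name/value pairs, generic stats included */
	len = rte_eth_xstats_get(port_id, xstats, len);
	for (i = 0; i < len; i++)
		printf("%s: %" PRIu64 "\n", xstats[i].name, xstats[i].value);

	free(xstats);
}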

v2:
 - Fixed patch dependencies.
 - Broke down patches into smaller logical changes.

v3:
 - Removes non-generic stats fields in rte_stats_strings and deprecates
   the fields related to them in struct rte_eth_stats.
 - Modifies rte_eth_xstats_get() to return generic stats and extended
   stats.
 
v4:
 - Replace count use in the loop in ixgbe_dev_xstats_get() function
   definition with i.
 - Breakdown "ixgbe: add NIC specific stats removed from ethdev" into
   two patches, one that adds the stats and another that extends
   ierrors to include more error stats.
 - Remove second call to ixgbe_dev_xstats_get() from
   rte_eth_xstats_get().

v5:
 - Added documentation for proc_info.
 - Fixed proc_info copyright year.
 - Display queue stats for all devices in proc_info.

v6:
 - Modified the driver implementation of ixgbe_dev_xstats_get() so that
   it doesn't worry about the generic stats written by the generic layer.

Maryam Tahhan (9):
  ixgbe: move stats register reads to a new function
  ixgbe: add functions to get and reset xstats
  ethdev: expose extended error stats
  ethdev: remove HW specific stats in stats structs
  ixgbe: add NIC specific stats removed from ethdev
  ixgbe: return more errors in ierrors
  app: remove dump_cfg
  app: add a new app proc_info
  doc: Add documentation for proc_info

 MAINTAINERS                            |   4 +
 app/Makefile                           |   2 +-
 app/dump_cfg/Makefile                  |  45 -----
 app/dump_cfg/main.c                    |  92 ---------
 app/proc_info/Makefile                 |  45 +++++
 app/proc_info/main.c                   | 354 +++++++++++++++++++++++++++++++++
 doc/guides/rel_notes/abi.rst           |  12 ++
 doc/guides/sample_app_ug/index.rst     |   1 +
 doc/guides/sample_app_ug/proc_info.rst |  71 +++++++
 drivers/net/ixgbe/ixgbe_ethdev.c       | 193 ++++++++++++++----
 lib/librte_ether/rte_ethdev.c          |  40 ++--
 lib/librte_ether/rte_ethdev.h          |  30 ++-
 mk/rte.sdktest.mk                      |   4 +-
 13 files changed, 685 insertions(+), 208 deletions(-)
 delete mode 100644 app/dump_cfg/Makefile
 delete mode 100644 app/dump_cfg/main.c
 create mode 100644 app/proc_info/Makefile
 create mode 100644 app/proc_info/main.c
 create mode 100644 doc/guides/sample_app_ug/proc_info.rst
 mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c

-- 
2.4.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in stats structs
  2015-07-15 13:11  3% [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Maryam Tahhan
@ 2015-07-15 13:11  9% ` Maryam Tahhan
  2015-07-16  7:54  0% ` [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Olivier MATZ
  1 sibling, 0 replies; 200+ results
From: Maryam Tahhan @ 2015-07-15 13:11 UTC (permalink / raw)
  To: dev

Remove the non-generic stats from rte_stats_strings and mark the relevant
fields in struct rte_eth_stats as deprecated.

Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
---
 doc/guides/rel_notes/abi.rst  | 12 ++++++++++++
 lib/librte_ether/rte_ethdev.c |  9 ---------
 lib/librte_ether/rte_ethdev.h | 30 ++++++++++++++++++++----------
 3 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 931e785..d5bf625 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -24,3 +24,15 @@ Deprecation Notices
 
 * The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
   deprecated and will be removed with version 2.2.
+
+* The following fields have been deprecated in rte_eth_stats:
+  * uint64_t imissed
+  * uint64_t ibadcrc
+  * uint64_t ibadlen
+  * uint64_t imcasts
+  * uint64_t fdirmatch
+  * uint64_t fdirmiss
+  * uint64_t tx_pause_xon
+  * uint64_t rx_pause_xon
+  * uint64_t tx_pause_xoff
+  * uint64_t rx_pause_xoff
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7689328..c8f0e9a 100755
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -142,17 +142,8 @@ static const struct rte_eth_xstats_name_off rte_stats_strings[] = {
 	{"rx_bytes", offsetof(struct rte_eth_stats, ibytes)},
 	{"tx_bytes", offsetof(struct rte_eth_stats, obytes)},
 	{"tx_errors", offsetof(struct rte_eth_stats, oerrors)},
-	{"rx_missed_errors", offsetof(struct rte_eth_stats, imissed)},
-	{"rx_crc_errors", offsetof(struct rte_eth_stats, ibadcrc)},
-	{"rx_bad_length_errors", offsetof(struct rte_eth_stats, ibadlen)},
 	{"rx_errors", offsetof(struct rte_eth_stats, ierrors)},
 	{"alloc_rx_buff_failed", offsetof(struct rte_eth_stats, rx_nombuf)},
-	{"fdir_match", offsetof(struct rte_eth_stats, fdirmatch)},
-	{"fdir_miss", offsetof(struct rte_eth_stats, fdirmiss)},
-	{"tx_flow_control_xon", offsetof(struct rte_eth_stats, tx_pause_xon)},
-	{"rx_flow_control_xon", offsetof(struct rte_eth_stats, rx_pause_xon)},
-	{"tx_flow_control_xoff", offsetof(struct rte_eth_stats, tx_pause_xoff)},
-	{"rx_flow_control_xoff", offsetof(struct rte_eth_stats, rx_pause_xoff)},
 };
 #define RTE_NB_STATS (sizeof(rte_stats_strings) / sizeof(rte_stats_strings[0]))
 
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index d76bbb3..a862027 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -193,19 +193,29 @@ struct rte_eth_stats {
 	uint64_t opackets;  /**< Total number of successfully transmitted packets.*/
 	uint64_t ibytes;    /**< Total number of successfully received bytes. */
 	uint64_t obytes;    /**< Total number of successfully transmitted bytes. */
-	uint64_t imissed;   /**< Total of RX missed packets (e.g full FIFO). */
-	uint64_t ibadcrc;   /**< Total of RX packets with CRC error. */
-	uint64_t ibadlen;   /**< Total of RX packets with bad length. */
+	/**< Deprecated; Total of RX missed packets (e.g full FIFO). */
+	uint64_t imissed;
+	/**< Deprecated; Total of RX packets with CRC error. */
+	uint64_t ibadcrc;
+	/**< Deprecated; Total of RX packets with bad length. */
+	uint64_t ibadlen;
 	uint64_t ierrors;   /**< Total number of erroneous received packets. */
 	uint64_t oerrors;   /**< Total number of failed transmitted packets. */
-	uint64_t imcasts;   /**< Total number of multicast received packets. */
+	uint64_t imcasts;
+	/**< Deprecated; Total number of multicast received packets. */
 	uint64_t rx_nombuf; /**< Total number of RX mbuf allocation failures. */
-	uint64_t fdirmatch; /**< Total number of RX packets matching a filter. */
-	uint64_t fdirmiss;  /**< Total number of RX packets not matching any filter. */
-	uint64_t tx_pause_xon;  /**< Total nb. of XON pause frame sent. */
-	uint64_t rx_pause_xon;  /**< Total nb. of XON pause frame received. */
-	uint64_t tx_pause_xoff; /**< Total nb. of XOFF pause frame sent. */
-	uint64_t rx_pause_xoff; /**< Total nb. of XOFF pause frame received. */
+	uint64_t fdirmatch;
+	/**< Deprecated; Total number of RX packets matching a filter. */
+	uint64_t fdirmiss;
+	/**< Deprecated; Total number of RX packets not matching any filter. */
+	uint64_t tx_pause_xon;
+	 /**< Deprecated; Total nb. of XON pause frame sent. */
+	uint64_t rx_pause_xon;
+	/**< Deprecated; Total nb. of XON pause frame received. */
+	uint64_t tx_pause_xoff;
+	/**< Deprecated; Total nb. of XOFF pause frame sent. */
+	uint64_t rx_pause_xoff;
+	/**< Deprecated; Total nb. of XOFF pause frame received. */
 	uint64_t q_ipackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
 	/**< Total number of queue RX packets. */
 	uint64_t q_opackets[RTE_ETHDEV_QUEUE_STAT_CNTRS];
-- 
2.4.3

^ permalink raw reply	[relevance 9%]

* [dpdk-dev] [PATCH v10 8/9] doc: announce ABI change of librte_malloc
  2015-07-15 16:32  3%     ` [dpdk-dev] [PATCH v10 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2015-07-15 16:32  1%       ` [dpdk-dev] [PATCH v10 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
@ 2015-07-15 16:32 19%       ` Sergio Gonzalez Monroy
  1 sibling, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15 16:32 UTC (permalink / raw)
  To: dev

Announce the creation of a dummy malloc library for release 2.1 and the
removal of that library, now integrated into librte_eal, in release 2.2.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 doc/guides/rel_notes/abi.rst | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 931e785..76e0ae2 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -16,7 +16,6 @@ Examples of Deprecation Notices
 
 Deprecation Notices
 -------------------
-
 * Significant ABI changes are planned for struct rte_eth_dev to support up to
   1024 queues per port. This change will be in release 2.2.
   There is no backward compatibility planned from release 2.2.
@@ -24,3 +23,8 @@ Deprecation Notices
 
 * The Macros RTE_HASH_BUCKET_ENTRIES_MAX and RTE_HASH_KEY_LENGTH_MAX are
   deprecated and will be removed with version 2.2.
+
+* librte_malloc library has been integrated into librte_eal. The 2.1 release
+  creates a dummy/empty malloc library to fulfill binaries with dynamic linking
+  dependencies on librte_malloc.so. Such dummy library will not be created from
+  release 2.2 so binaries will need to be rebuilt.
-- 
1.9.3

^ permalink raw reply	[relevance 19%]

* [dpdk-dev] [PATCH v10 0/9] Dynamic memzones
  2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
  2015-07-15  8:26  1%     ` [dpdk-dev] [PATCH v9 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
  2015-07-15  8:26 19%     ` [dpdk-dev] [PATCH v9 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
@ 2015-07-15 16:32  3%     ` Sergio Gonzalez Monroy
  2015-07-15 16:32  1%       ` [dpdk-dev] [PATCH v10 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
  2015-07-15 16:32 19%       ` [dpdk-dev] [PATCH v10 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
  2 siblings, 2 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15 16:32 UTC (permalink / raw)
  To: dev

The current implementation allows reserving/creating memzones but not the opposite
(unreserving/freeing them). This affects mempools and other memzone-based objects.

From my point of view, implementing free functionality for memzones would look
like malloc over memsegs.
Thus, this approach moves malloc inside eal (which in turn removes a circular
dependency), where malloc heaps are composed of memsegs.
We keep both the malloc and memzone APIs as they are, but memzones allocate their
memory by calling malloc_heap_alloc.
Some extra functionality is required in malloc to allow for boundary-constrained
memory requests.
In summary, currently malloc is based on memzones, and with this approach
memzones are based on malloc.
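
A usage-level illustration of what the series enables; rte_memzone_free() is
added by a later patch in this set, and the snippet is a hedged sketch rather
than code taken from the patches:

#include <rte_memzone.h>
#include <rte_lcore.h>

static int
memzone_roundtrip(void)
{
	/* reserve a zone as before; it is now carved out of a malloc heap */
	const struct rte_memzone *mz = rte_memzone_reserve("demo_mz",
			1 << 20, rte_socket_id(), 0);
	if (mz == NULL)
		return -1;

	/* ... use mz->addr / mz->phys_addr as usual ... */

	/* new with this series: the zone can be returned to the heap */
	return rte_memzone_free(mz);
}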

v10:
 - Convert PNG to SVG
 - Fix issue with --no-huge by forcing SOCKET_ID_ANY
 - Rework some parts of the code

v9:
 - Fix an incorrect size_t type that caused a 32-bit compilation error.

v8:
 - Rebase against current HEAD to account for changes made by the new Tile-Gx arch

v7:
 - Create a separate maintainers section for memory allocation

v6:
 - Fix bad patch for rte_memzone_free

v5:
 - Fix rte_memzone_free
 - Improve rte_memzone_free unit test

v4:
 - Rebase and fix a couple of merge issues

v3:
 - Create dummy librte_malloc
 - Add deprecation notice
 - Rework some of the code
 - Doc update
 - checkpatch

v2:
 - New rte_memzone_free
 - Support memzone len = 0
 - Add all available memsegs to malloc heap at init
 - Update memzone/malloc unit tests


v6 Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Sergio Gonzalez Monroy (9):
  eal: move librte_malloc to eal/common
  eal: memzone allocated by malloc
  app/test: update malloc/memzone unit tests
  config: remove CONFIG_RTE_MALLOC_MEMZONE_SIZE
  eal: remove free_memseg and references to it
  eal: new rte_memzone_free
  app/test: rte_memzone_free unit test
  doc: announce ABI change of librte_malloc
  doc: update malloc documentation

 MAINTAINERS                                       |   22 +-
 app/test/test_malloc.c                            |   86 --
 app/test/test_memzone.c                           |  456 ++-------
 config/common_bsdapp                              |    8 +-
 config/common_linuxapp                            |    8 +-
 doc/guides/prog_guide/env_abstraction_layer.rst   |  220 ++++-
 doc/guides/prog_guide/img/malloc_heap.png         |  Bin 81329 -> 0 bytes
 doc/guides/prog_guide/img/malloc_heap.svg         | 1018 +++++++++++++++++++++
 doc/guides/prog_guide/index.rst                   |    1 -
 doc/guides/prog_guide/malloc_lib.rst              |  233 -----
 doc/guides/prog_guide/overview.rst                |   11 +-
 doc/guides/rel_notes/abi.rst                      |    6 +-
 drivers/net/af_packet/Makefile                    |    1 -
 drivers/net/bonding/Makefile                      |    1 -
 drivers/net/e1000/Makefile                        |    2 +-
 drivers/net/enic/Makefile                         |    2 +-
 drivers/net/fm10k/Makefile                        |    2 +-
 drivers/net/i40e/Makefile                         |    2 +-
 drivers/net/ixgbe/Makefile                        |    2 +-
 drivers/net/mlx4/Makefile                         |    1 -
 drivers/net/null/Makefile                         |    1 -
 drivers/net/pcap/Makefile                         |    1 -
 drivers/net/virtio/Makefile                       |    2 +-
 drivers/net/vmxnet3/Makefile                      |    2 +-
 drivers/net/xenvirt/Makefile                      |    2 +-
 lib/Makefile                                      |    2 +-
 lib/librte_acl/Makefile                           |    2 +-
 lib/librte_eal/bsdapp/eal/Makefile                |    4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map     |   19 +
 lib/librte_eal/common/Makefile                    |    1 +
 lib/librte_eal/common/eal_common_memzone.c        |  352 +++----
 lib/librte_eal/common/include/rte_eal_memconfig.h |    5 +-
 lib/librte_eal/common/include/rte_malloc.h        |  342 +++++++
 lib/librte_eal/common/include/rte_malloc_heap.h   |    3 +-
 lib/librte_eal/common/include/rte_memzone.h       |   11 +
 lib/librte_eal/common/malloc_elem.c               |  344 +++++++
 lib/librte_eal/common/malloc_elem.h               |  192 ++++
 lib/librte_eal/common/malloc_heap.c               |  227 +++++
 lib/librte_eal/common/malloc_heap.h               |   70 ++
 lib/librte_eal/common/rte_malloc.c                |  262 ++++++
 lib/librte_eal/linuxapp/eal/Makefile              |    4 +-
 lib/librte_eal/linuxapp/eal/eal_ivshmem.c         |   17 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c          |    2 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |   19 +
 lib/librte_hash/Makefile                          |    2 +-
 lib/librte_lpm/Makefile                           |    2 +-
 lib/librte_malloc/Makefile                        |    6 +-
 lib/librte_malloc/malloc_elem.c                   |  320 -------
 lib/librte_malloc/malloc_elem.h                   |  190 ----
 lib/librte_malloc/malloc_heap.c                   |  208 -----
 lib/librte_malloc/malloc_heap.h                   |   70 --
 lib/librte_malloc/rte_malloc.c                    |  228 +----
 lib/librte_malloc/rte_malloc.h                    |  342 -------
 lib/librte_malloc/rte_malloc_version.map          |   16 -
 lib/librte_mempool/Makefile                       |    2 -
 lib/librte_port/Makefile                          |    1 -
 lib/librte_ring/Makefile                          |    3 +-
 lib/librte_table/Makefile                         |    1 -
 58 files changed, 2988 insertions(+), 2371 deletions(-)
 delete mode 100644 doc/guides/prog_guide/img/malloc_heap.png
 create mode 100755 doc/guides/prog_guide/img/malloc_heap.svg
 delete mode 100644 doc/guides/prog_guide/malloc_lib.rst
 create mode 100644 lib/librte_eal/common/include/rte_malloc.h
 create mode 100644 lib/librte_eal/common/malloc_elem.c
 create mode 100644 lib/librte_eal/common/malloc_elem.h
 create mode 100644 lib/librte_eal/common/malloc_heap.c
 create mode 100644 lib/librte_eal/common/malloc_heap.h
 create mode 100644 lib/librte_eal/common/rte_malloc.c
 delete mode 100644 lib/librte_malloc/malloc_elem.c
 delete mode 100644 lib/librte_malloc/malloc_elem.h
 delete mode 100644 lib/librte_malloc/malloc_heap.c
 delete mode 100644 lib/librte_malloc/malloc_heap.h
 delete mode 100644 lib/librte_malloc/rte_malloc.h

-- 
1.9.3

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v10 2/9] eal: memzone allocated by malloc
  2015-07-15 16:32  3%     ` [dpdk-dev] [PATCH v10 0/9] Dynamic memzones Sergio Gonzalez Monroy
@ 2015-07-15 16:32  1%       ` Sergio Gonzalez Monroy
  2015-07-15 16:32 19%       ` [dpdk-dev] [PATCH v10 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
  1 sibling, 0 replies; 200+ results
From: Sergio Gonzalez Monroy @ 2015-07-15 16:32 UTC (permalink / raw)
  To: dev

In the current memory hierarchy, memsegs are groups of physically
contiguous hugepages, memzones are slices of memsegs and malloc further
slices memzones into smaller memory chunks.

This patch modifies malloc so it partitions memsegs instead of memzones.
Thus memzones now call malloc internally for memory allocation while
maintaining the existing memzone ABI.

This makes it possible to free memzones, and therefore any other structure
based on memzones, e.g. mempools.
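
A hedged usage sketch of the boundary/alignment handling that now lives in the
malloc path (malloc_heap_alloc()/malloc_elem_alloc() take align and bound); the
existing rte_memzone_reserve_bounded() API is assumed as the user-visible entry
point:

#include <rte_memory.h>
#include <rte_memzone.h>
#include <rte_lcore.h>

/* Reserve 8 KB, 64-byte aligned, that must not cross a 2 MB boundary.
 * With this patch the constraint is enforced while carving the block out
 * of the malloc heap instead of by the old memzone-specific code. */
static const struct rte_memzone *
reserve_bounded_example(void)
{
	return rte_memzone_reserve_bounded("bounded_mz", 8192,
			rte_socket_id(), 0 /* flags */, 64 /* align */,
			RTE_PGSIZE_2M /* bound */);
}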

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c        | 290 ++++++----------------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   2 +-
 lib/librte_eal/common/include/rte_malloc_heap.h   |   3 +-
 lib/librte_eal/common/malloc_elem.c               |  68 +++--
 lib/librte_eal/common/malloc_elem.h               |  14 +-
 lib/librte_eal/common/malloc_heap.c               | 161 ++++++------
 lib/librte_eal/common/malloc_heap.h               |   6 +-
 lib/librte_eal/common/rte_malloc.c                |  10 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c          |   2 +-
 9 files changed, 226 insertions(+), 330 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 9c1da71..31bf6d8 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -50,15 +50,15 @@
 #include <rte_string_fns.h>
 #include <rte_common.h>
 
+#include "malloc_heap.h"
+#include "malloc_elem.h"
 #include "eal_private.h"
 
-/* internal copy of free memory segments */
-static struct rte_memseg *free_memseg = NULL;
-
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
 	const struct rte_mem_config *mcfg;
+	const struct rte_memzone *mz;
 	unsigned i = 0;
 
 	/* get pointer to global configuration */
@@ -68,62 +68,50 @@ memzone_lookup_thread_unsafe(const char *name)
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE && mcfg->memzone[i].addr != NULL; i++) {
-		if (!strncmp(name, mcfg->memzone[i].name, RTE_MEMZONE_NAMESIZE))
+	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
+		mz = &mcfg->memzone[i];
+		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
 			return &mcfg->memzone[i];
 	}
 
 	return NULL;
 }
 
-/*
- * Helper function for memzone_reserve_aligned_thread_unsafe().
- * Calculate address offset from the start of the segment.
- * Align offset in that way that it satisfy istart alignmnet and
- * buffer of the  requested length would not cross specified boundary.
- */
-static inline phys_addr_t
-align_phys_boundary(const struct rte_memseg *ms, size_t len, size_t align,
-	size_t bound)
+/* This function will return the greatest free block if a heap has been
+ * specified. If no heap has been specified, it will return the heap and
+ * length of the greatest free block available in all heaps */
+static size_t
+find_heap_max_free_elem(int *s, unsigned align)
 {
-	phys_addr_t addr_offset, bmask, end, start;
-	size_t step;
-
-	step = RTE_MAX(align, bound);
-	bmask = ~((phys_addr_t)bound - 1);
-
-	/* calculate offset to closest alignment */
-	start = RTE_ALIGN_CEIL(ms->phys_addr, align);
-	addr_offset = start - ms->phys_addr;
+	struct rte_mem_config *mcfg;
+	struct rte_malloc_socket_stats stats;
+	int i, socket = *s;
+	size_t len = 0;
 
-	while (addr_offset + len < ms->len) {
+	/* get pointer to global configuration */
+	mcfg = rte_eal_get_configuration()->mem_config;
 
-		/* check, do we meet boundary condition */
-		end = start + len - (len != 0);
-		if ((start & bmask) == (end & bmask))
-			break;
+	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+		if ((socket != SOCKET_ID_ANY) && (socket != i))
+			continue;
 
-		/* calculate next offset */
-		start = RTE_ALIGN_CEIL(start + 1, step);
-		addr_offset = start - ms->phys_addr;
+		malloc_heap_get_stats(&mcfg->malloc_heaps[i], &stats);
+		if (stats.greatest_free_size > len) {
+			len = stats.greatest_free_size;
+			*s = i;
+		}
 	}
 
-	return addr_offset;
+	return (len - MALLOC_ELEM_OVERHEAD - align);
 }
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, uint64_t size_mask, unsigned align,
-		unsigned bound)
+		int socket_id, unsigned flags, unsigned align, unsigned bound)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-	int memseg_idx = -1;
-	uint64_t addr_offset, seg_offset = 0;
 	size_t requested_len;
-	size_t memseg_len = 0;
-	phys_addr_t memseg_physaddr;
-	void *memseg_addr;
+	int socket, i;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -155,7 +143,6 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	if (align < RTE_CACHE_LINE_SIZE)
 		align = RTE_CACHE_LINE_SIZE;
 
-
 	/* align length on cache boundary. Check for overflow before doing so */
 	if (len > SIZE_MAX - RTE_CACHE_LINE_MASK) {
 		rte_errno = EINVAL; /* requested size too big */
@@ -169,108 +156,65 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	requested_len = RTE_MAX((size_t)RTE_CACHE_LINE_SIZE,  len);
 
 	/* check that boundary condition is valid */
-	if (bound != 0 &&
-			(requested_len > bound || !rte_is_power_of_2(bound))) {
+	if (bound != 0 && (requested_len > bound || !rte_is_power_of_2(bound))) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
 
-	/* find the smallest segment matching requirements */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		/* last segment */
-		if (free_memseg[i].addr == NULL)
-			break;
-
-		/* empty segment, skip it */
-		if (free_memseg[i].len == 0)
-			continue;
-
-		/* bad socket ID */
-		if (socket_id != SOCKET_ID_ANY &&
-		    free_memseg[i].socket_id != SOCKET_ID_ANY &&
-		    socket_id != free_memseg[i].socket_id)
-			continue;
-
-		/*
-		 * calculate offset to closest alignment that
-		 * meets boundary conditions.
-		 */
-		addr_offset = align_phys_boundary(free_memseg + i,
-			requested_len, align, bound);
+	if ((socket_id != SOCKET_ID_ANY) && (socket_id >= RTE_MAX_NUMA_NODES)) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
 
-		/* check len */
-		if ((requested_len + addr_offset) > free_memseg[i].len)
-			continue;
+	if (!rte_eal_has_hugepages())
+		socket_id = SOCKET_ID_ANY;
 
-		if ((size_mask & free_memseg[i].hugepage_sz) == 0)
-			continue;
+	if (len == 0) {
+		if (bound != 0)
+			requested_len = bound;
+		else
+			requested_len = find_heap_max_free_elem(&socket_id, align);
+	}
 
-		/* this segment is the best until now */
-		if (memseg_idx == -1) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
-		}
-		/* find the biggest contiguous zone */
-		else if (len == 0) {
-			if (free_memseg[i].len > memseg_len) {
-				memseg_idx = i;
-				memseg_len = free_memseg[i].len;
-				seg_offset = addr_offset;
-			}
-		}
-		/*
-		 * find the smallest (we already checked that current
-		 * zone length is > len
-		 */
-		else if (free_memseg[i].len + align < memseg_len ||
-				(free_memseg[i].len <= memseg_len + align &&
-				addr_offset < seg_offset)) {
-			memseg_idx = i;
-			memseg_len = free_memseg[i].len;
-			seg_offset = addr_offset;
+	if (socket_id == SOCKET_ID_ANY)
+		socket = malloc_get_numa_socket();
+	else
+		socket = socket_id;
+
+	/* allocate memory on heap */
+	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
+			requested_len, flags, align, bound);
+
+	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
+		/* try other heaps */
+		for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
+			if (socket == i)
+				continue;
+
+			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
+					NULL, requested_len, flags, align, bound);
+			if (mz_addr != NULL)
+				break;
 		}
 	}
 
-	/* no segment found */
-	if (memseg_idx == -1) {
+	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
 		return NULL;
 	}
 
-	/* save aligned physical and virtual addresses */
-	memseg_physaddr = free_memseg[memseg_idx].phys_addr + seg_offset;
-	memseg_addr = RTE_PTR_ADD(free_memseg[memseg_idx].addr,
-			(uintptr_t) seg_offset);
-
-	/* if we are looking for a biggest memzone */
-	if (len == 0) {
-		if (bound == 0)
-			requested_len = memseg_len - seg_offset;
-		else
-			requested_len = RTE_ALIGN_CEIL(memseg_physaddr + 1,
-				bound) - memseg_physaddr;
-	}
-
-	/* set length to correct value */
-	len = (size_t)seg_offset + requested_len;
-
-	/* update our internal state */
-	free_memseg[memseg_idx].len -= len;
-	free_memseg[memseg_idx].phys_addr += len;
-	free_memseg[memseg_idx].addr =
-		(char *)free_memseg[memseg_idx].addr + len;
+	const struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
 	struct rte_memzone *mz = &mcfg->memzone[mcfg->memzone_idx++];
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
-	mz->phys_addr = memseg_physaddr;
-	mz->addr = memseg_addr;
-	mz->len = requested_len;
-	mz->hugepage_sz = free_memseg[memseg_idx].hugepage_sz;
-	mz->socket_id = free_memseg[memseg_idx].socket_id;
+	mz->phys_addr = rte_malloc_virt2phy(mz_addr);
+	mz->addr = mz_addr;
+	mz->len = (requested_len == 0 ? elem->size : requested_len);
+	mz->hugepage_sz = elem->ms->hugepage_sz;
+	mz->socket_id = elem->ms->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = memseg_idx;
+	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -282,26 +226,6 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
-	uint64_t size_mask = 0;
-
-	if (flags & RTE_MEMZONE_256KB)
-		size_mask |= RTE_PGSIZE_256K;
-	if (flags & RTE_MEMZONE_2MB)
-		size_mask |= RTE_PGSIZE_2M;
-	if (flags & RTE_MEMZONE_16MB)
-		size_mask |= RTE_PGSIZE_16M;
-	if (flags & RTE_MEMZONE_256MB)
-		size_mask |= RTE_PGSIZE_256M;
-	if (flags & RTE_MEMZONE_512MB)
-		size_mask |= RTE_PGSIZE_512M;
-	if (flags & RTE_MEMZONE_1GB)
-		size_mask |= RTE_PGSIZE_1G;
-	if (flags & RTE_MEMZONE_4GB)
-		size_mask |= RTE_PGSIZE_4G;
-	if (flags & RTE_MEMZONE_16GB)
-		size_mask |= RTE_PGSIZE_16G;
-	if (!size_mask)
-		size_mask = UINT64_MAX;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -309,18 +233,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, size_mask, align, bound);
-
-	/*
-	 * If we failed to allocate the requested page size, and the
-	 * RTE_MEMZONE_SIZE_HINT_ONLY flag is specified, try allocating
-	 * again.
-	 */
-	if (!mz && rte_errno == ENOMEM && size_mask != UINT64_MAX &&
-	    flags & RTE_MEMZONE_SIZE_HINT_ONLY) {
-		mz = memzone_reserve_aligned_thread_unsafe(
-			name, len, socket_id, UINT64_MAX, align, bound);
-	}
+		name, len, socket_id, flags, align, bound);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -412,45 +325,6 @@ rte_memzone_dump(FILE *f)
 }
 
 /*
- * called by init: modify the free memseg list to have cache-aligned
- * addresses and cache-aligned lengths
- */
-static int
-memseg_sanitize(struct rte_memseg *memseg)
-{
-	unsigned phys_align;
-	unsigned virt_align;
-	unsigned off;
-
-	phys_align = memseg->phys_addr & RTE_CACHE_LINE_MASK;
-	virt_align = (unsigned long)memseg->addr & RTE_CACHE_LINE_MASK;
-
-	/*
-	 * sanity check: phys_addr and addr must have the same
-	 * alignment
-	 */
-	if (phys_align != virt_align)
-		return -1;
-
-	/* memseg is really too small, don't bother with it */
-	if (memseg->len < (2 * RTE_CACHE_LINE_SIZE)) {
-		memseg->len = 0;
-		return 0;
-	}
-
-	/* align start address */
-	off = (RTE_CACHE_LINE_SIZE - phys_align) & RTE_CACHE_LINE_MASK;
-	memseg->phys_addr += off;
-	memseg->addr = (char *)memseg->addr + off;
-	memseg->len -= off;
-
-	/* align end address */
-	memseg->len &= ~((uint64_t)RTE_CACHE_LINE_MASK);
-
-	return 0;
-}
-
-/*
  * Init the memzone subsystem
  */
 int
@@ -458,14 +332,10 @@ rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memseg *memseg;
-	unsigned i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* mirror the runtime memsegs from config */
-	free_memseg = mcfg->free_memseg;
-
 	/* secondary processes don't need to initialise anything */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
@@ -478,33 +348,13 @@ rte_eal_memzone_init(void)
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* fill in uninitialized free_memsegs */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (memseg[i].addr == NULL)
-			break;
-		if (free_memseg[i].addr != NULL)
-			continue;
-		memcpy(&free_memseg[i], &memseg[i], sizeof(struct rte_memseg));
-	}
-
-	/* make all zones cache-aligned */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (free_memseg[i].addr == NULL)
-			break;
-		if (memseg_sanitize(&free_memseg[i]) < 0) {
-			RTE_LOG(ERR, EAL, "%s(): Sanity check failed\n", __func__);
-			rte_rwlock_write_unlock(&mcfg->mlock);
-			return -1;
-		}
-	}
-
 	/* delete all zones */
 	mcfg->memzone_idx = 0;
 	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return 0;
+	return rte_eal_malloc_heap_init();
 }
 
 /* Walk all reserved memory zones */
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 34f5abc..055212a 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -73,7 +73,7 @@ struct rte_mem_config {
 	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
-	/* Runtime Physmem descriptors. */
+	/* Runtime Physmem descriptors - NOT USED */
 	struct rte_memseg free_memseg[RTE_MAX_MEMSEG];
 
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index 716216f..b270356 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -40,7 +40,7 @@
 #include <rte_memory.h>
 
 /* Number of free lists per heap, grouped by size. */
-#define RTE_HEAP_NUM_FREELISTS  5
+#define RTE_HEAP_NUM_FREELISTS  13
 
 /**
  * Structure to hold malloc heap
@@ -48,7 +48,6 @@
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
-	unsigned mz_count;
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index a5e1248..b54ee33 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -37,7 +37,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_launch.h>
 #include <rte_per_lcore.h>
@@ -56,10 +55,10 @@
  */
 void
 malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memzone *mz, size_t size)
+		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
 {
 	elem->heap = heap;
-	elem->mz = mz;
+	elem->ms = ms;
 	elem->prev = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
@@ -70,12 +69,12 @@ malloc_elem_init(struct malloc_elem *elem,
 }
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
 {
-	malloc_elem_init(elem, prev->heap, prev->mz, 0);
+	malloc_elem_init(elem, prev->heap, prev->ms, 0);
 	elem->prev = prev;
 	elem->state = ELEM_BUSY; /* mark busy so its never merged */
 }
@@ -86,12 +85,24 @@ malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
  * fit, return NULL.
  */
 static void *
-elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
+elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	const uintptr_t end_pt = (uintptr_t)elem +
+	const size_t bmask = ~(bound - 1);
+	uintptr_t end_pt = (uintptr_t)elem +
 			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	const uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	const uintptr_t new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+	uintptr_t new_elem_start;
+
+	/* check boundary */
+	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
+		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+			return NULL;
+	}
+
+	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
 
 	/* if the new start point is before the exist start, it won't fit */
 	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
@@ -102,9 +113,10 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align)
  * alignment request from the current element
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
+		size_t bound)
 {
-	return elem_start_pt(elem, size, align) != NULL;
+	return elem_start_pt(elem, size, align, bound) != NULL;
 }
 
 /*
@@ -115,10 +127,10 @@ static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
 	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
-	const unsigned old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
-	const unsigned new_elem_size = elem->size - old_elem_size;
+	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
+	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->mz, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
 	next_elem->prev = split_pt;
 	elem->size = old_elem_size;
@@ -168,8 +180,9 @@ malloc_elem_free_list_index(size_t size)
 void
 malloc_elem_free_list_insert(struct malloc_elem *elem)
 {
-	size_t idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
+	size_t idx;
 
+	idx = malloc_elem_free_list_index(elem->size - MALLOC_ELEM_HEADER_LEN);
 	elem->state = ELEM_FREE;
 	LIST_INSERT_HEAD(&elem->heap->free_head[idx], elem, free_list);
 }
@@ -190,12 +203,26 @@ elem_free_list_remove(struct malloc_elem *elem)
  * is not done here, as it's done there previously.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
+malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
+		size_t bound)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align);
-	const unsigned old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
+	const size_t trailer_size = elem->size - old_elem_size - size -
+		MALLOC_ELEM_OVERHEAD;
+
+	elem_free_list_remove(elem);
 
-	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE){
+	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split it, too much free space after elem */
+		struct malloc_elem *new_free_elem =
+				RTE_PTR_ADD(new_elem, size + MALLOC_ELEM_OVERHEAD);
+
+		split_elem(elem, new_free_elem);
+		malloc_elem_free_list_insert(new_free_elem);
+	}
+
+	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* don't split it, pad the element instead */
 		elem->state = ELEM_BUSY;
 		elem->pad = old_elem_size;
@@ -208,8 +235,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 			new_elem->size = elem->size - elem->pad;
 			set_header(new_elem);
 		}
-		/* remove element from free list */
-		elem_free_list_remove(elem);
 
 		return new_elem;
 	}
@@ -219,7 +244,6 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align)
 	 * Re-insert original element, in case its new size makes it
 	 * belong on a different list.
 	 */
-	elem_free_list_remove(elem);
 	split_elem(elem, new_elem);
 	new_elem->state = ELEM_BUSY;
 	malloc_elem_free_list_insert(elem);
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9790b1a..e05d2ea 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -47,9 +47,9 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memzone */
+	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
-	const struct rte_memzone *mz;
+	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -136,11 +136,11 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memzone *mz,
+		const struct rte_memseg *ms,
 		size_t size);
 
 /*
- * initialise a dummy malloc_elem header for the end-of-memzone marker
+ * initialise a dummy malloc_elem header for the end-of-memseg marker
  */
 void
 malloc_elem_mkend(struct malloc_elem *elem,
@@ -151,14 +151,16 @@ malloc_elem_mkend(struct malloc_elem *elem,
  * of the requested size and with the requested alignment
  */
 int
-malloc_elem_can_hold(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
  * is much larger than the data block requested, we split the element in two.
  */
 struct malloc_elem *
-malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align);
+malloc_elem_alloc(struct malloc_elem *elem, size_t size,
+		unsigned align, size_t bound);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 8861d27..21d8914 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -39,7 +39,6 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -54,123 +53,125 @@
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
-/* since the memzone size starts with a digit, it will appear unquoted in
- * rte_config.h, so quote it so it can be passed to rte_str_to_size */
-#define MALLOC_MEMZONE_SIZE RTE_STR(RTE_MALLOC_MEMZONE_SIZE)
-
-/*
- * returns the configuration setting for the memzone size as a size_t value
- */
-static inline size_t
-get_malloc_memzone_size(void)
+static unsigned
+check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 {
-	return rte_str_to_size(MALLOC_MEMZONE_SIZE);
+	unsigned check_flag = 0;
+
+	if (!(flags & ~RTE_MEMZONE_SIZE_HINT_ONLY))
+		return 1;
+
+	switch (hugepage_sz) {
+	case RTE_PGSIZE_256K:
+		check_flag = RTE_MEMZONE_256KB;
+		break;
+	case RTE_PGSIZE_2M:
+		check_flag = RTE_MEMZONE_2MB;
+		break;
+	case RTE_PGSIZE_16M:
+		check_flag = RTE_MEMZONE_16MB;
+		break;
+	case RTE_PGSIZE_256M:
+		check_flag = RTE_MEMZONE_256MB;
+		break;
+	case RTE_PGSIZE_512M:
+		check_flag = RTE_MEMZONE_512MB;
+		break;
+	case RTE_PGSIZE_1G:
+		check_flag = RTE_MEMZONE_1GB;
+		break;
+	case RTE_PGSIZE_4G:
+		check_flag = RTE_MEMZONE_4GB;
+		break;
+	case RTE_PGSIZE_16G:
+		check_flag = RTE_MEMZONE_16GB;
+	}
+
+	return (check_flag & flags);
 }
 
 /*
- * reserve an extra memory zone and make it available for use by a particular
- * heap. This reserves the zone and sets a dummy malloc_elem header at the end
+ * Expand the heap with a memseg.
+ * This reserves the zone and sets a dummy malloc_elem header at the end
  * to prevent overflow. The rest of the zone is added to free list as a single
  * large free block
  */
-static int
-malloc_heap_add_memzone(struct malloc_heap *heap, size_t size, unsigned align)
+static void
+malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	const unsigned mz_flags = 0;
-	const size_t block_size = get_malloc_memzone_size();
-	/* ensure the data we want to allocate will fit in the memzone */
-	const size_t min_size = size + align + MALLOC_ELEM_OVERHEAD * 2;
-	const struct rte_memzone *mz = NULL;
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned numa_socket = heap - mcfg->malloc_heaps;
-
-	size_t mz_size = min_size;
-	if (mz_size < block_size)
-		mz_size = block_size;
-
-	char mz_name[RTE_MEMZONE_NAMESIZE];
-	snprintf(mz_name, sizeof(mz_name), "MALLOC_S%u_HEAP_%u",
-		     numa_socket, heap->mz_count++);
-
-	/* try getting a block. if we fail and we don't need as big a block
-	 * as given in the config, we can shrink our request and try again
-	 */
-	do {
-		mz = rte_memzone_reserve(mz_name, mz_size, numa_socket,
-					 mz_flags);
-		if (mz == NULL)
-			mz_size /= 2;
-	} while (mz == NULL && mz_size > min_size);
-	if (mz == NULL)
-		return -1;
-
 	/* allocate the memory block headers, one at end, one at start */
-	struct malloc_elem *start_elem = (struct malloc_elem *)mz->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(mz->addr,
-			mz_size - MALLOC_ELEM_OVERHEAD);
+	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
+	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
+			ms->len - MALLOC_ELEM_OVERHEAD);
 	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
+	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
 
-	const unsigned elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
-	malloc_elem_init(start_elem, heap, mz, elem_size);
+	malloc_elem_init(start_elem, heap, ms, elem_size);
 	malloc_elem_mkend(end_elem, start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
-	/* increase heap total size by size of new memzone */
-	heap->total_size+=mz_size - MALLOC_ELEM_OVERHEAD;
-	return 0;
+	heap->total_size += elem_size;
 }
 
 /*
  * Iterates through the freelist for a heap to find a free element
  * which can store data of the required size and with the requested alignment.
+ * If size is 0, find the biggest available elem.
  * Returns null on failure, or pointer to element on success.
  */
 static struct malloc_elem *
-find_suitable_element(struct malloc_heap *heap, size_t size, unsigned align)
+find_suitable_element(struct malloc_heap *heap, size_t size,
+		unsigned flags, size_t align, size_t bound)
 {
 	size_t idx;
-	struct malloc_elem *elem;
+	struct malloc_elem *elem, *alt_elem = NULL;
 
 	for (idx = malloc_elem_free_list_index(size);
-		idx < RTE_HEAP_NUM_FREELISTS; idx++)
-	{
+			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
-			!!elem; elem = LIST_NEXT(elem, free_list))
-		{
-			if (malloc_elem_can_hold(elem, size, align))
-				return elem;
+				!!elem; elem = LIST_NEXT(elem, free_list)) {
+			if (malloc_elem_can_hold(elem, size, align, bound)) {
+				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+					return elem;
+				if (alt_elem == NULL)
+					alt_elem = elem;
+			}
 		}
 	}
+
+	if ((alt_elem != NULL) && (flags & RTE_MEMZONE_SIZE_HINT_ONLY))
+		return alt_elem;
+
 	return NULL;
 }
 
 /*
- * Main function called by malloc to allocate a block of memory from the
- * heap. It locks the free list, scans it, and adds a new memzone if the
- * scan fails. Once the new memzone is added, it re-scans and should return
+ * Main function to allocate a block of memory from the heap.
+ * It locks the free list, scans it, and adds a new memseg if the
+ * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
 void *
 malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned align)
+		const char *type __attribute__((unused)), size_t size, unsigned flags,
+		size_t align, size_t bound)
 {
+	struct malloc_elem *elem;
+
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
+
 	rte_spinlock_lock(&heap->lock);
-	struct malloc_elem *elem = find_suitable_element(heap, size, align);
-	if (elem == NULL){
-		if ((malloc_heap_add_memzone(heap, size, align)) == 0)
-			elem = find_suitable_element(heap, size, align);
-	}
 
-	if (elem != NULL){
-		elem = malloc_elem_alloc(elem, size, align);
+	elem = find_suitable_element(heap, size, flags, align, bound);
+	if (elem != NULL) {
+		elem = malloc_elem_alloc(elem, size, align, bound);
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
 	rte_spinlock_unlock(&heap->lock);
-	return elem == NULL ? NULL : (void *)(&elem[1]);
 
+	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
 /*
@@ -206,3 +207,21 @@ malloc_heap_get_stats(const struct malloc_heap *heap,
 	socket_stats->alloc_count = heap->alloc_count;
 	return 0;
 }
+
+int
+rte_eal_malloc_heap_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned ms_cnt;
+	struct rte_memseg *ms;
+
+	if (mcfg == NULL)
+		return -1;
+
+	for (ms = &mcfg->memseg[0], ms_cnt = 0;
+			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
+			ms_cnt++, ms++)
+		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
+
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index a47136d..3ccbef0 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -53,15 +53,15 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap, const char *type,
-		size_t size, unsigned align);
+malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
+		unsigned flags, size_t align, size_t bound);
 
 int
 malloc_heap_get_stats(const struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
 int
-rte_eal_heap_memzone_init(void);
+rte_eal_malloc_heap_init(void);
 
 #ifdef __cplusplus
 }
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index c313a57..47deb00 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -39,7 +39,6 @@
 
 #include <rte_memcpy.h>
 #include <rte_memory.h>
-#include <rte_memzone.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_branch_prediction.h>
@@ -77,6 +76,9 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 	if (size == 0 || (align && !rte_is_power_of_2(align)))
 		return NULL;
 
+	if (!rte_eal_has_hugepages())
+		socket_arg = SOCKET_ID_ANY;
+
 	if (socket_arg == SOCKET_ID_ANY)
 		socket = malloc_get_numa_socket();
 	else
@@ -87,7 +89,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 		return NULL;
 
 	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, align == 0 ? 1 : align);
+				size, 0, align == 0 ? 1 : align, 0);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -98,7 +100,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 			continue;
 
 		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, align == 0 ? 1 : align);
+					size, 0, align == 0 ? 1 : align, 0);
 		if (ret != NULL)
 			return ret;
 	}
@@ -256,5 +258,5 @@ rte_malloc_virt2phy(const void *addr)
 	const struct malloc_elem *elem = malloc_elem_from_data(addr);
 	if (elem == NULL)
 		return 0;
-	return elem->mz->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->mz->addr);
+	return elem->ms->phys_addr + ((uintptr_t)addr - (uintptr_t)elem->ms->addr);
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 4fd63bb..80ee78f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1071,7 +1071,7 @@ rte_eal_hugepage_init(void)
 		mcfg->memseg[0].addr = addr;
 		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
 		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = SOCKET_ID_ANY;
+		mcfg->memseg[0].socket_id = 0;
 		return 0;
 	}
 
-- 
1.9.3
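
For context, a minimal usage sketch of the reworked allocation path above (not part of the patch): rte_malloc_socket() first scans the heap of the requested socket and, as the hunk shows, falls back to the other socket heaps when that fails. The header names and the RTE_CACHE_LINE_SIZE alignment choice are assumptions for illustration only.

#include <stdio.h>
#include <rte_memory.h>
#include <rte_malloc.h>
#include <rte_lcore.h>

/* Sketch: request cache-line aligned memory near the caller's NUMA node.
 * With the rework above, the allocator tries the requested socket first
 * and then silently falls back to the remaining socket heaps. */
static void *
alloc_near_me(size_t len)
{
	int socket = rte_socket_id();   /* preferred NUMA node of the caller */
	void *p = rte_malloc_socket("example", len, RTE_CACHE_LINE_SIZE, socket);

	if (p == NULL)
		printf("allocation of %zu bytes failed on all sockets\n", len);
	return p;
}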

^ permalink raw reply	[relevance 1%]

* Re: [dpdk-dev] [PATCH v10 00/19] unified packet type
  2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
                     ` (18 preceding siblings ...)
  2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 19/19] mbuf: remove old packet type bit masks Helin Zhang
@ 2015-07-15 23:00  0%   ` Thomas Monjalon
  2015-07-15 23:51  0%     ` Zhang, Helin
  19 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-15 23:00 UTC (permalink / raw)
  To: Helin Zhang; +Cc: dev

2015-07-10 00:31, Helin Zhang:
> Currently only 6 bits which are stored in ol_flags are used to indicate the
> packet types. This is not enough, as some NIC hardware can recognize quite
> a lot of packet types, e.g i40e hardware can recognize more than 150 packet
> types. Hiding those packet types hides hardware offload capabilities which
> could be quite useful for improving performance and for end users.
> So a unified packet type is needed to support all possible PMDs. The 16-bit
> packet_type in the mbuf structure can be changed to 32 bits and used for
> this purpose. In addition, all packet types stored in the ol_flags field should
> be removed entirely, and 6 bits of ol_flags can be saved as the benefit.
> 
> Initially, 32 bits of packet_type can be divided into several sub fields to
> indicate different packet type information of a packet. The initial design
> is to divide those bits into fields for L2 types, L3 types, L4 types, tunnel
> types, inner L2 types, inner L3 types and inner L4 types. All PMDs should
> translate the offloaded packet types into these 7 fields of information, for
> user applications.
> 
> To avoid breaking ABI compatibility, currently all the code changes for
> unified packet type are disabled at compile time by default. Users can enable
> it manually by defining the macro of RTE_NEXT_ABI. The code changes will be
> valid by default in a future release, and the old version will be deleted
> accordingly, after the ABI change process is done.

Applied with fixes for cxgbe and mlx4, thanks everyone

The macro RTE_ETH_IS_TUNNEL_PKT may need to take RTE_PTYPE_INNER_* into account.
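
For illustration, a rough sketch of how an application might consume the unified packet type once the 32-bit packet_type field is enabled; the RTE_PTYPE_* mask names are assumed from this series rather than quoted from it, and the counters are placeholders.

#include <stdint.h>
#include <rte_mbuf.h>

static uint64_t n_ipv4, n_ipv6, n_tunnel, n_other;

/* Sketch: dispatch on the unified packet type written by the PMD.  Assumes an
 * RTE_NEXT_ABI build where mbuf->packet_type is 32 bits wide and split into
 * L2/L3/L4/tunnel/inner sub-fields. */
static void
count_packet_type(const struct rte_mbuf *m)
{
	uint32_t ptype = m->packet_type;

	if (ptype & RTE_PTYPE_TUNNEL_MASK)
		n_tunnel++;   /* inner layers live in the RTE_PTYPE_INNER_* fields */
	else if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV4)
		n_ipv4++;
	else if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV6)
		n_ipv6++;
	else
		n_other++;
}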

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v3] doc: announce ABI changes planned for unified packet type
  2015-07-09  0:56  4%   ` Wu, Jingjing
@ 2015-07-15 23:37  4%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-15 23:37 UTC (permalink / raw)
  To: Zhang, Helin; +Cc: dev

> > The significant ABI changes of all shared libraries are planned to support
> > unified packet type which will be taken effect from release 2.2. Here
> > announces that ABI changes in detail.
> > 
> > Signed-off-by: Helin Zhang <helin.zhang@intel.com>
> Acked-by: Jingjing Wu <jingjing.wu@intel.com>

Applied with rewording to take new NEXT_ABI option into account.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v10 00/19] unified packet type
  2015-07-15 23:00  0%   ` [dpdk-dev] [PATCH v10 00/19] unified packet type Thomas Monjalon
@ 2015-07-15 23:51  0%     ` Zhang, Helin
  0 siblings, 0 replies; 200+ results
From: Zhang, Helin @ 2015-07-15 23:51 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, July 15, 2015 4:01 PM
> To: Zhang, Helin
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v10 00/19] unified packet type
> 
> 2015-07-10 00:31, Helin Zhang:
> > Currently only 6 bits which are stored in ol_flags are used to
> > indicate the packet types. This is not enough, as some NIC hardware
> > can recognize quite a lot of packet types, e.g i40e hardware can
> > recognize more than 150 packet types. Hiding those packet types hides
> > hardware offload capabilities which could be quite useful for improving
> performance and for end users.
> > So an unified packet types are needed to support all possible PMDs. A
> > 16 bits packet_type in mbuf structure can be changed to 32 bits and
> > used for this purpose. In addition, all packet types stored in ol_flag
> > field should be deleted at all, and 6 bits of ol_flags can be save as the benifit.
> >
> > Initially, 32 bits of packet_type can be divided into several sub
> > fields to indicate different packet type information of a packet. The
> > initial design is to divide those bits into fields for L2 types, L3
> > types, L4 types, tunnel types, inner L2 types, inner L3 types and
> > inner L4 types. All PMDs should translate the offloaded packet types
> > into these 7 fields of information, for user applications.
> >
> > To avoid breaking ABI compatibility, currently all the code changes
> > for unified packet type are disabled at compile time by default. Users
> > can enable it manually by defining the macro of RTE_NEXT_ABI. The code
> > changes will be valid by default in a future release, and the old
> > version will be deleted accordingly, after the ABI change process is done.
> 
> Applied with fixes for cxgbe and mlx4, thanks everyone
> 
> The macro RTE_ETH_IS_TUNNEL_PKT may need to take RTE_PTYPE_INNER_*
> into account.

Thank you so much!
Thanks to all the contributors!

Helin

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps
  2015-07-15 13:11  3% [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Maryam Tahhan
  2015-07-15 13:11  9% ` [dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in stats structs Maryam Tahhan
@ 2015-07-16  7:54  0% ` Olivier MATZ
  1 sibling, 0 replies; 200+ results
From: Olivier MATZ @ 2015-07-16  7:54 UTC (permalink / raw)
  To: Maryam Tahhan, dev

Hi Maryam,

On 07/15/2015 03:11 PM, Maryam Tahhan wrote:
> This patch set implements xstats_get() and xstats_reset() in dev_ops for
> ixgbe to expose detailed error statistics to DPDK applications. The
> dump_cfg application was extended to demonstrate the usage of
> retrieving statistics for DPDK interfaces and renamed to proc_info
> in order to reflect this new functionality. This patch set also removes
> non-generic statistics from the statistics strings at the ethdev level and
> marks the relevant registers as deprecated in struct rte_eth_stats.
>
> v2:
>   - Fixed patch dependencies.
>   - Broke down patches into smaller logical changes.
>
> v3:
>   - Removes non-generic stats fields in rte_stats_strings and deprecates
>     the fields related to them in struct rte_eth_stats.
>   - Modifies rte_eth_xstats_get() to return generic stats and extended
>     stats.
>
> v4:
>   - Replace count use in the loop in ixgbe_dev_xstats_get() function
>     definition with i.
>   - Breakdown "ixgbe: add NIC specific stats removed from ethdev" into
>     two patches, one that adds the stats and another that extends
>     ierrors to include more error stats.
>   - Remove second call to ixgbe_dev_xstats_get() from
>     rte_eth_xstats_get().
>
> v5:
>   - Added documentation for proc_info.
>   - Fixed proc_info copyright year.
>   - Display queue stats for all devices in proc_info.
>
> v6:
>   - Modified the driver implementation of ixgbe_dev_xstats_get() so that
>     it doesn't worry about the generic stats written by the generic layer.
>
> Maryam Tahhan (9):
>    ixgbe: move stats register reads to a new function
>    ixgbe: add functions to get and reset xstats
>    ethdev: expose extended error stats
>    ethdev: remove HW specific stats in stats structs
>    ixgbe: add NIC specific stats removed from ethdev
>    ixgbe: return more errors in ierrors
>    app: remove dump_cfg
>    app: add a new app proc_info
>    doc: Add documentation for proc_info
>
>   MAINTAINERS                            |   4 +
>   app/Makefile                           |   2 +-
>   app/dump_cfg/Makefile                  |  45 -----
>   app/dump_cfg/main.c                    |  92 ---------
>   app/proc_info/Makefile                 |  45 +++++
>   app/proc_info/main.c                   | 354 +++++++++++++++++++++++++++++++++
>   doc/guides/rel_notes/abi.rst           |  12 ++
>   doc/guides/sample_app_ug/index.rst     |   1 +
>   doc/guides/sample_app_ug/proc_info.rst |  71 +++++++
>   drivers/net/ixgbe/ixgbe_ethdev.c       | 193 ++++++++++++++----
>   lib/librte_ether/rte_ethdev.c          |  40 ++--
>   lib/librte_ether/rte_ethdev.h          |  30 ++-
>   mk/rte.sdktest.mk                      |   4 +-
>   13 files changed, 685 insertions(+), 208 deletions(-)
>   delete mode 100644 app/dump_cfg/Makefile
>   delete mode 100644 app/dump_cfg/main.c
>   create mode 100644 app/proc_info/Makefile
>   create mode 100644 app/proc_info/main.c
>   create mode 100644 doc/guides/sample_app_ug/proc_info.rst
>   mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
@ 2015-07-16 11:36 14% Cristian Dumitrescu
  2015-07-16 11:50  4% ` Gajdzica, MaciejX T
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-16 11:36 UTC (permalink / raw)
  To: dev


Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/abi.rst |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 126f73e..942f3ea 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -29,3 +29,8 @@ Deprecation Notices
   and several ``PKT_RX_`` flags will be removed, to support unified packet type
   from release 2.1. Those changes may be enabled in the upcoming release 2.1
   with CONFIG_RTE_NEXT_ABI.
+
+* librte_cfgfile (rte_cfgfile.h): In order to allow for longer names and values,
+  the value of macros CFG_NAME_LEN and CFG_NAME_VAL will be increased. Most
+  likely, the new values will be 64 and 256, respectively.
+
-- 
1.7.4.1

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
  2015-07-16 11:36 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile Cristian Dumitrescu
@ 2015-07-16 11:50  4% ` Gajdzica, MaciejX T
  2015-07-16 12:28  4% ` Mrzyglod, DanielX T
  2015-07-16 12:49  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-16 11:50 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 1:37 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
@ 2015-07-16 12:19 14% Cristian Dumitrescu
  2015-07-16 12:25  4% ` Gajdzica, MaciejX T
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-16 12:19 UTC (permalink / raw)
  To: dev


Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/abi.rst |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9e98d62..271e08e 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -34,3 +34,15 @@ Deprecation Notices
   creates a dummy/empty malloc library to fulfill binaries with dynamic linking
   dependencies on librte_malloc.so. Such dummy library will not be created from
   release 2.2 so binaries will need to be rebuilt.
+
+* librte_port (rte_port.h): Macros to access the packet meta-data stored within
+  the packet buffer will be adjusted to cover the packet mbuf structure as well,
+  as currently they are able to access any packet buffer location except the
+  packet mbuf structure. The consequence is that applications currently using
+  these macros will have to adjust the value of the offset parameter of these
+  macros by increasing it by sizeof(struct rte_mbuf). The affected macros are:
+  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>_PTR and
+  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>. In terms of code changes, most likely
+  only the definition of RTE_MBUF_METADATA_UINT8_PTR macro will be changed from
+  ``(&((uint8_t *) &(mbuf)[1])[offset])`` to
+  ``(&((uint8_t *) (mbuf))[offset])``.
-- 
1.7.4.1
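
To make the impact concrete, a small hypothetical sketch of application code touched by this change; the flow-id field and the 64-byte offset are invented for illustration, and only the sizeof(struct rte_mbuf) adjustment is the point.

#include <rte_mbuf.h>
#include <rte_port.h>

/* Hypothetical application layout: a 32-bit flow id kept 64 bytes into the
 * metadata area that follows the mbuf header.  Only the offset constant has
 * to change once the macro starts counting from the start of the mbuf. */
#define APP_FLOWID_OFFSET_OLD  64                              /* before the change */
#define APP_FLOWID_OFFSET_NEW  (sizeof(struct rte_mbuf) + 64)  /* after the change  */

static inline uint32_t
app_get_flow_id(struct rte_mbuf *pkt)
{
	/* old macro: return *RTE_MBUF_METADATA_UINT32_PTR(pkt, APP_FLOWID_OFFSET_OLD);
	 * new macro reaches the same byte by adding sizeof(struct rte_mbuf): */
	return *RTE_MBUF_METADATA_UINT32_PTR(pkt, APP_FLOWID_OFFSET_NEW);
}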

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
  2015-07-16 12:19 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_port Cristian Dumitrescu
@ 2015-07-16 12:25  4% ` Gajdzica, MaciejX T
  2015-07-16 12:25  4% ` Thomas Monjalon
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-16 12:25 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 2:20 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
  2015-07-16 12:19 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_port Cristian Dumitrescu
  2015-07-16 12:25  4% ` Gajdzica, MaciejX T
@ 2015-07-16 12:25  4% ` Thomas Monjalon
  2015-07-16 15:09  4%   ` Dumitrescu, Cristian
  2015-07-16 12:30  4% ` Mrzyglod, DanielX T
  2015-07-16 12:49  4% ` Singh, Jasvinder
  3 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-16 12:25 UTC (permalink / raw)
  To: Cristian Dumitrescu; +Cc: dev

2015-07-16 13:19, Cristian Dumitrescu:
> +* librte_port (rte_port.h): Macros to access the packet meta-data stored within
> +  the packet buffer will be adjusted to cover the packet mbuf structure as well,
> +  as currently they are able to access any packet buffer location except the
> +  packet mbuf structure. The consequence is that applications currently using
> +  these macros will have to adjust the value of the offset parameter of these
> +  macros by increasing it with sizeof(struc rte_mbuf). The affected macros are:
> +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>_PTR and
> +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>. In terms of code changes, most likely
> +  only the definition of RTE_MBUF_METADATA_UINT8_PTR macro will be changed from
> +  ``(&((uint8_t *) &(mbuf)[1])[offset])`` to
> +  ``(&((uint8_t *) (mbuf))[offset])``.

Cristian,
General comment: you are too verbose :)
Specifically on this patch: same comment ;)

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
  2015-07-16 12:19 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_port Cristian Dumitrescu
  2015-07-16 12:25  4% ` Gajdzica, MaciejX T
  2015-07-16 12:25  4% ` Thomas Monjalon
@ 2015-07-16 12:30  4% ` Mrzyglod, DanielX T
  2015-07-16 12:49  4% ` Singh, Jasvinder
  3 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-16 12:30 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 2:20 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---
>  doc/guides/rel_notes/abi.rst |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
> index 9e98d62..271e08e 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/abi.rst
> @@ -34,3 +34,15 @@ Deprecation Notices
>    creates a dummy/empty malloc library to fulfill binaries with dynamic linking
>    dependencies on librte_malloc.so. Such dummy library will not be created from
>    release 2.2 so binaries will need to be rebuilt.
> +
> +* librte_port (rte_port.h): Macros to access the packet meta-data stored within
> +  the packet buffer will be adjusted to cover the packet mbuf structure as well,
> +  as currently they are able to access any packet buffer location except the
> +  packet mbuf structure. The consequence is that applications currently using
> +  these macros will have to adjust the value of the offset parameter of these
> +  macros by increasing it with sizeof(struc rte_mbuf). The affected macros are:
> +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>_PTR and
> +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>. In terms of code changes, most
> likely
> +  only the definition of RTE_MBUF_METADATA_UINT8_PTR macro will be
> changed from
> +  ``(&((uint8_t *) &(mbuf)[1])[offset])`` to
> +  ``(&((uint8_t *) (mbuf))[offset])``.
> --
> 1.7.4.1

> 1.7.4.1


Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
  2015-07-16 11:36 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile Cristian Dumitrescu
  2015-07-16 11:50  4% ` Gajdzica, MaciejX T
@ 2015-07-16 12:28  4% ` Mrzyglod, DanielX T
  2015-07-16 12:49  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-16 12:28 UTC (permalink / raw)
  To: dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 1:37 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---
>  doc/guides/rel_notes/abi.rst |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
> index 126f73e..942f3ea 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/abi.rst
> @@ -29,3 +29,8 @@ Deprecation Notices
>    and several ``PKT_RX_`` flags will be removed, to support unified packet type
>    from release 2.1. Those changes may be enabled in the upcoming release 2.1
>    with CONFIG_RTE_NEXT_ABI.
> +
> +* librte_cfgfile (rte_cfgfile.h): In order to allow for longer names and values,
> +  the value of macros CFG_NAME_LEN and CFG_NAME_VAL will be increased.
> Most
> +  likely, the new values will be 64 and 256, respectively.
> +
> --
> 1.7.4.1

Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com> 

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
  2015-07-16 11:36 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile Cristian Dumitrescu
  2015-07-16 11:50  4% ` Gajdzica, MaciejX T
  2015-07-16 12:28  4% ` Mrzyglod, DanielX T
@ 2015-07-16 12:49  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-16 12:49 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 12:37 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
  2015-07-16 12:19 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_port Cristian Dumitrescu
                   ` (2 preceding siblings ...)
  2015-07-16 12:30  4% ` Mrzyglod, DanielX T
@ 2015-07-16 12:49  4% ` Singh, Jasvinder
  3 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-16 12:49 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 1:20 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
  2015-07-16 12:25  4% ` Thomas Monjalon
@ 2015-07-16 15:09  4%   ` Dumitrescu, Cristian
  0 siblings, 0 replies; 200+ results
From: Dumitrescu, Cristian @ 2015-07-16 15:09 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Thursday, July 16, 2015 1:26 PM
> To: Dumitrescu, Cristian
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_port
> 
> 2015-07-16 13:19, Cristian Dumitrescu:
> > +* librte_port (rte_port.h): Macros to access the packet meta-data stored
> within
> > +  the packet buffer will be adjusted to cover the packet mbuf structure as
> well,
> > +  as currently they are able to access any packet buffer location except the
> > +  packet mbuf structure. The consequence is that applications currently
> using
> > +  these macros will have to adjust the value of the offset parameter of
> these
> > +  macros by increasing it with sizeof(struc rte_mbuf). The affected macros
> are:
> > +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>_PTR and
> > +  RTE_MBUF_METADATA_UINT<8, 16, 32, 64>. In terms of code changes,
> most likely
> > +  only the definition of RTE_MBUF_METADATA_UINT8_PTR macro will be
> changed from
> > +  ``(&((uint8_t *) &(mbuf)[1])[offset])`` to
> > +  ``(&((uint8_t *) (mbuf))[offset])``.
> 
> Cristian,
> General comment: you are too verbose :)
> Specifically on this patch: same comment ;)

No problem, will simplify and resend. Thanks, Thomas.

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
@ 2015-07-16 15:27 14% Cristian Dumitrescu
  2015-07-16 15:51  4% ` Singh, Jasvinder
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-16 15:27 UTC (permalink / raw)
  To: dev

v2 changes: 
-text simplification

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/abi.rst |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9e98d62..6c064e2 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -34,3 +34,8 @@ Deprecation Notices
   creates a dummy/empty malloc library to fulfill binaries with dynamic linking
   dependencies on librte_malloc.so. Such dummy library will not be created from
   release 2.2 so binaries will need to be rebuilt.
+
+* librte_port (rte_port.h): Macros to access the packet meta-data stored within
+  the packet buffer will be adjusted to cover the packet mbuf structure as well,
+  as currently they are able to access any packet buffer location except the
+  packet mbuf structure.
-- 
1.7.4.1

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
  2015-07-16 15:27 14% [dpdk-dev] [PATCH v2] " Cristian Dumitrescu
@ 2015-07-16 15:51  4% ` Singh, Jasvinder
  2015-07-17  7:56  4% ` Gajdzica, MaciejX T
  2015-07-17  8:08  4% ` Mrzyglod, DanielX T
  2 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-16 15:51 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 4:27 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
> 
> v2 changes:
> -text simplification
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
@ 2015-07-16 16:59 14% Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-16 16:59 UTC (permalink / raw)
  To: dev


Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/abi.rst |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9e98d62..aa7c036 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -34,3 +34,11 @@ Deprecation Notices
   creates a dummy/empty malloc library to fulfill binaries with dynamic linking
   dependencies on librte_malloc.so. Such dummy library will not be created from
   release 2.2 so binaries will need to be rebuilt.
+
+* librte_table (rte_table_lpm.h): A new parameter to hold the table name (const
+  char *name) will be added to the LPM table parameter structure
+  (struct rte_table_lpm_params).
+
+* librte_table (rte_table_acl.h): Structures rte_table_acl_rule_add_params and
+  rte_table_acl_rule_delete_params will change to store an array of rules as
+  opposed to a single rule.
-- 
1.7.4.1

^ permalink raw reply	[relevance 14%]

* [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
@ 2015-07-16 17:07 14% Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-16 17:07 UTC (permalink / raw)
  To: dev


Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/abi.rst |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9e98d62..194e8c6 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -34,3 +34,8 @@ Deprecation Notices
   creates a dummy/empty malloc library to fulfill binaries with dynamic linking
   dependencies on librte_malloc.so. Such dummy library will not be created from
   release 2.2 so binaries will need to be rebuilt.
+
+* librte_pipeline (rte_pipeline.h): The prototype for the pipeline input port,
+  output port and table action handlers will be updated: the pipeline parameter
+  will be added, the packets mask parameter will be either removed (for input
+  port action handler) or made input-only.
-- 
1.7.4.1

^ permalink raw reply	[relevance 14%]

* [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched
@ 2015-07-16 21:21 14% Stephen Hemminger
  2015-07-16 21:25  4% ` Dumitrescu, Cristian
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Stephen Hemminger @ 2015-07-16 21:21 UTC (permalink / raw)
  To: Cristian Dumitrescu; +Cc: dev

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 doc/guides/rel_notes/abi.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
index 9e98d62..a4d100b 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/abi.rst
@@ -34,3 +34,12 @@ Deprecation Notices
   creates a dummy/empty malloc library to fulfill binaries with dynamic linking
   dependencies on librte_malloc.so. Such dummy library will not be created from
   release 2.2 so binaries will need to be rebuilt.
+
+* librte_sched (rte_sched.h): The scheduler hierarchy structure
+  (rte_sched_port_hierarchy) will change to allow for a larger number of subport
+  entries. The number of available traffic_classes and queues may also change.
+  The mbuf structure element for sched hierarchy will also change from a single
+  32 bit to a 64 bit structure.
+
+* librte_sched (rte_sched.h): The scheduler statistics structure will change
+  to allow keeping track of RED actions.
-- 
2.1.4

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched
  2015-07-16 21:21 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched Stephen Hemminger
@ 2015-07-16 21:25  4% ` Dumitrescu, Cristian
  2015-07-16 21:28  4% ` Neil Horman
  2015-07-23 10:18  4% ` Dumitrescu, Cristian
  2 siblings, 0 replies; 200+ results
From: Dumitrescu, Cristian @ 2015-07-16 21:25 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev



> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, July 16, 2015 10:22 PM
> To: Dumitrescu, Cristian
> Cc: dev@dpdk.org; Stephen Hemminger
> Subject: [PATCH] doc: announce ABI change for librte_sched
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  doc/guides/rel_notes/abi.rst | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
> index 9e98d62..a4d100b 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/abi.rst
> @@ -34,3 +34,12 @@ Deprecation Notices
>    creates a dummy/empty malloc library to fulfill binaries with dynamic linking
>    dependencies on librte_malloc.so. Such dummy library will not be created
> from
>    release 2.2 so binaries will need to be rebuilt.
> +
> +* librte_sched (rte_sched.h): The scheduler hierarchy structure
> +  (rte_sched_port_hierarchy) will change to allow for a larger number of
> subport
> +  entries. The number of available traffic_classes and queues may also
> change.
> +  The mbuf structure element for sched hierarchy will also change from a
> single
> +  32 bit to a 64 bit structure.
> +
> +* librte_sched (rte_sched.h): The scheduler statistics structure will change
> +  to allow keeping track of RED actions.
> --
> 2.1.4

Hi Stephen,

Agree with both, how about the new clear flag to the stats read function, shall we add a separate note on this as well?

Thanks,
Cristian

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched
  2015-07-16 21:21 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched Stephen Hemminger
  2015-07-16 21:25  4% ` Dumitrescu, Cristian
@ 2015-07-16 21:28  4% ` Neil Horman
  2015-07-23 10:18  4% ` Dumitrescu, Cristian
  2 siblings, 0 replies; 200+ results
From: Neil Horman @ 2015-07-16 21:28 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

On Thu, Jul 16, 2015 at 02:21:39PM -0700, Stephen Hemminger wrote:
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  doc/guides/rel_notes/abi.rst | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/abi.rst
> index 9e98d62..a4d100b 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/abi.rst
> @@ -34,3 +34,12 @@ Deprecation Notices
>    creates a dummy/empty malloc library to fulfill binaries with dynamic linking
>    dependencies on librte_malloc.so. Such dummy library will not be created from
>    release 2.2 so binaries will need to be rebuilt.
> +
> +* librte_sched (rte_sched.h): The scheduler hierarchy structure
> +  (rte_sched_port_hierarchy) will change to allow for a larger number of subport
> +  entries. The number of available traffic_classes and queues may also change.
> +  The mbuf structure element for sched hierarchy will also change from a single
> +  32 bit to a 64 bit structure.
> +
> +* librte_sched (rte_sched.h): The scheduler statistics structure will change
> +  to allow keeping track of RED actions.
> -- 
> 2.1.4
> 
> 
ACK

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v5 4/4] rte_sched: hide structure of port hierarchy
  @ 2015-07-16 21:34  3% ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2015-07-16 21:34 UTC (permalink / raw)
  To: Cristian Dumitrescu; +Cc: dev

Right now the scheduler hierarchy is encoded as a bitfield
that is visible as part of the ABI. This creates a barrier
limiting future expansion of the hierarchy.

As a transitional step, hide the actual layout of the hierarchy
and mark the exposed structure as deprecated. This will allow for
expansion in a later release.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/librte_sched/rte_sched.c           | 54 ++++++++++++++++++++++++++++++++++
 lib/librte_sched/rte_sched.h           | 54 ++++++++++------------------------
 lib/librte_sched/rte_sched_version.map |  9 ++++++
 3 files changed, 79 insertions(+), 38 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index ec565d2..4593af8 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -184,6 +184,21 @@ enum grinder_state {
 	e_GRINDER_READ_MBUF
 };
 
+/*
+ * Path through the scheduler hierarchy used by the scheduler enqueue
+ * operation to identify the destination queue for the current
+ * packet. Stored in the field pkt.hash.sched of struct rte_mbuf of
+ * each packet, typically written by the classification stage and read
+ * by scheduler enqueue.
+ */
+struct __rte_sched_port_hierarchy {
+	uint32_t queue:2;                /**< Queue ID (0 .. 3) */
+	uint32_t traffic_class:2;        /**< Traffic class ID (0 .. 3)*/
+	uint32_t pipe:20;                /**< Pipe ID */
+	uint32_t subport:6;              /**< Subport ID */
+	uint32_t color:2;                /**< Color */
+};
+
 struct rte_sched_grinder {
 	/* Pipe cache */
 	uint16_t pcache_qmask[RTE_SCHED_GRINDER_PCACHE_SIZE];
@@ -910,6 +925,45 @@ rte_sched_pipe_config(struct rte_sched_port *port,
 	return 0;
 }
 
+void
+rte_sched_port_pkt_write(struct rte_mbuf *pkt,
+			 uint32_t subport, uint32_t pipe, uint32_t traffic_class,
+			 uint32_t queue, enum rte_meter_color color)
+{
+	struct __rte_sched_port_hierarchy *sched
+		= (struct __rte_sched_port_hierarchy *) &pkt->hash.sched;
+
+	sched->color = (uint32_t) color;
+	sched->subport = subport;
+	sched->pipe = pipe;
+	sched->traffic_class = traffic_class;
+	sched->queue = queue;
+}
+
+void
+rte_sched_port_pkt_read_tree_path(const struct rte_mbuf *pkt,
+				  uint32_t *subport, uint32_t *pipe,
+				  uint32_t *traffic_class, uint32_t *queue)
+{
+	const struct __rte_sched_port_hierarchy *sched
+		= (const struct __rte_sched_port_hierarchy *) &pkt->hash.sched;
+
+	*subport = sched->subport;
+	*pipe = sched->pipe;
+	*traffic_class = sched->traffic_class;
+	*queue = sched->queue;
+}
+
+
+enum rte_meter_color
+rte_sched_port_pkt_read_color(const struct rte_mbuf *pkt)
+{
+	const struct __rte_sched_port_hierarchy *sched
+		= (const struct __rte_sched_port_hierarchy *) &pkt->hash.sched;
+
+	return (enum rte_meter_color) sched->color;
+}
+
 int
 rte_sched_subport_read_stats(struct rte_sched_port *port,
 	uint32_t subport_id,
diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
index 729f8c8..1ead267 100644
--- a/lib/librte_sched/rte_sched.h
+++ b/lib/librte_sched/rte_sched.h
@@ -195,17 +195,19 @@ struct rte_sched_port_params {
 #endif
 };
 
-/** Path through the scheduler hierarchy used by the scheduler enqueue operation to
-identify the destination queue for the current packet. Stored in the field hash.sched
-of struct rte_mbuf of each packet, typically written by the classification stage and read by
-scheduler enqueue.*/
+/*
+ * Path through scheduler hierarchy
+ *
+ * Note: direct access to internal bitfields is deprecated to allow for future expansion.
+ * Use rte_sched_port_pkt_read/write API instead
+ */
 struct rte_sched_port_hierarchy {
 	uint32_t queue:2;                /**< Queue ID (0 .. 3) */
 	uint32_t traffic_class:2;        /**< Traffic class ID (0 .. 3)*/
 	uint32_t pipe:20;                /**< Pipe ID */
 	uint32_t subport:6;              /**< Subport ID */
 	uint32_t color:2;                /**< Color */
-};
+} __attribute__ ((deprecated));
 
 /*
  * Configuration
@@ -328,11 +330,6 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
 	struct rte_sched_queue_stats *stats,
 	uint16_t *qlen);
 
-/*
- * Run-time
- *
- ***/
-
 /**
  * Scheduler hierarchy path write to packet descriptor. Typically called by the
  * packet classification stage.
@@ -350,18 +347,10 @@ rte_sched_queue_read_stats(struct rte_sched_port *port,
  * @param color
  *   Packet color set
  */
-static inline void
+void
 rte_sched_port_pkt_write(struct rte_mbuf *pkt,
-	uint32_t subport, uint32_t pipe, uint32_t traffic_class, uint32_t queue, enum rte_meter_color color)
-{
-	struct rte_sched_port_hierarchy *sched = (struct rte_sched_port_hierarchy *) &pkt->hash.sched;
-
-	sched->color = (uint32_t) color;
-	sched->subport = subport;
-	sched->pipe = pipe;
-	sched->traffic_class = traffic_class;
-	sched->queue = queue;
-}
+			 uint32_t subport, uint32_t pipe, uint32_t traffic_class,
+			 uint32_t queue, enum rte_meter_color color);
 
 /**
  * Scheduler hierarchy path read from packet descriptor (struct rte_mbuf). Typically
@@ -380,24 +369,13 @@ rte_sched_port_pkt_write(struct rte_mbuf *pkt,
  *   Queue ID within pipe traffic class (0 .. 3)
  *
  */
-static inline void
-rte_sched_port_pkt_read_tree_path(struct rte_mbuf *pkt, uint32_t *subport, uint32_t *pipe, uint32_t *traffic_class, uint32_t *queue)
-{
-	struct rte_sched_port_hierarchy *sched = (struct rte_sched_port_hierarchy *) &pkt->hash.sched;
-
-	*subport = sched->subport;
-	*pipe = sched->pipe;
-	*traffic_class = sched->traffic_class;
-	*queue = sched->queue;
-}
-
-static inline enum rte_meter_color
-rte_sched_port_pkt_read_color(struct rte_mbuf *pkt)
-{
-	struct rte_sched_port_hierarchy *sched = (struct rte_sched_port_hierarchy *) &pkt->hash.sched;
+void
+rte_sched_port_pkt_read_tree_path(const struct rte_mbuf *pkt,
+				  uint32_t *subport, uint32_t *pipe,
+				  uint32_t *traffic_class, uint32_t *queue);
 
-	return (enum rte_meter_color) sched->color;
-}
+enum rte_meter_color
+rte_sched_port_pkt_read_color(const struct rte_mbuf *pkt);
 
 /**
  * Hierarchical scheduler port enqueue. Writes up to n_pkts to port scheduler and
diff --git a/lib/librte_sched/rte_sched_version.map b/lib/librte_sched/rte_sched_version.map
index 9f74e8b..3aa159a 100644
--- a/lib/librte_sched/rte_sched_version.map
+++ b/lib/librte_sched/rte_sched_version.map
@@ -20,3 +20,12 @@ DPDK_2.0 {
 
 	local: *;
 };
+
+DPDK_2.1 {
+	global:
+
+	rte_sched_port_pkt_write;
+	rte_sched_port_pkt_read_tree_path;
+	rte_sched_port_pkt_read_color;
+
+} DPDK_2.0;
-- 
2.1.4
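
For reference, a short sketch of classifier-side usage that keeps working unchanged once the bitfield is hidden, since it only goes through the accessors exported above; the path values are placeholders and the e_RTE_METER_GREEN constant is assumed from rte_meter.h.

#include <stdio.h>
#include <rte_mbuf.h>
#include <rte_meter.h>
#include <rte_sched.h>

/* Sketch: the classification stage records the hierarchy path and a later
 * stage reads it back, both through the exported accessors instead of
 * poking the (now deprecated) struct rte_sched_port_hierarchy directly. */
static void
classify(struct rte_mbuf *pkt)
{
	rte_sched_port_pkt_write(pkt, /* subport */ 0, /* pipe */ 7,
				 /* traffic class */ 2, /* queue */ 1,
				 e_RTE_METER_GREEN);
}

static void
dump_path(const struct rte_mbuf *pkt)
{
	uint32_t subport, pipe, tc, queue;

	rte_sched_port_pkt_read_tree_path(pkt, &subport, &pipe, &tc, &queue);
	printf("subport %u pipe %u tc %u queue %u color %d\n",
	       subport, pipe, tc, queue,
	       (int)rte_sched_port_pkt_read_color(pkt));
}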

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 10:26  7% [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code John McNamara
  2015-07-13 10:42  7% ` Neil Horman
@ 2015-07-16 22:22  4% ` Vlad Zolotarov
  2015-08-02 21:06  7%   ` Thomas Monjalon
  1 sibling, 1 reply; 200+ results
From: Vlad Zolotarov @ 2015-07-16 22:22 UTC (permalink / raw)
  To: John McNamara, dev



On 07/13/15 13:26, John McNamara wrote:
> Fix for ABI breakage introduced in LRO addition. Moves
> lro bitfield to the end of the struct/member.
>
> Fixes: 8eecb3295aed (ixgbe: add LRO support)
>
> Signed-off-by: John McNamara <john.mcnamara@intel.com>
> ---
>   lib/librte_ether/rte_ethdev.h | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 79bde89..1c3ace1 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
>   	uint8_t port_id;           /**< Device [external] port identifier. */
>   	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
>   		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
> -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
>   		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
> -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */

Acked-by: Vlad Zolotarov <vladz@cloudius-systems.com>

>   };
>   
>   /**

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v13 00/14] Interrupt mode PMD
  2015-07-09 13:58  3%   ` David Marchand
@ 2015-07-17  6:04  0%     ` Liang, Cunming
  0 siblings, 0 replies; 200+ results
From: Liang, Cunming @ 2015-07-17  6:04 UTC (permalink / raw)
  To: David Marchand; +Cc: Stephen Hemminger, dev, Wang, Liang-min


> -----Original Message-----
> From: David Marchand [mailto:david.marchand@6wind.com] 
> Sent: Thursday, July 09, 2015 9:59 PM
> To: Liang, Cunming
> Cc: dev@dpdk.org; Stephen Hemminger; Thomas Monjalon; Zhou, Danny; Wang, Liang-min; Richardson, Bruce; Liu, Yong; Neil Horman
> Subject: Re: [PATCH v13 00/14] Interrupt mode PMD

> On Fri, Jun 19, 2015 at 6:00 AM, Cunming Liang <cunming.liang@intel.com> wrote:
> v13 changes
> - version map cleanup for v2.1
> - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility
>
> Please, this patchset ends with a patch that deals with ABI compatibility while it should do so on a per-patch basis.
> Besides, some patches are introducing stuff that is reworked in other patches without a clear reason.
> 
> Can you rework this to ease review and ensure patch atomicity ?
> 
> Thanks.
> 
> -- 
> David Marchand

Will split it, thanks.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v14 00/13] Interrupt mode PMD
    2015-07-09 13:58  3%   ` David Marchand
@ 2015-07-17  6:16  4%   ` Cunming Liang
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
                       ` (10 more replies)
  1 sibling, 11 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map
 - minor comments rework

v13 changes
 - version map cleanup for v2.1
 - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility

Patch series v12
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Danny Zhou <danny.zhou@intel.com>

v12 changes
 - bsd cleanup for unused variable warning
 - fix awkward line split in debug message

v11 changes
 - typo cleanup and check kernel style

v10 changes
 - code rework to return actual error code
 - bug fix for lsc when using uio_pci_generic

v9 changes
 - code rework to fix open comment
 - bug fix for igb lsc when both lsc and rxq are enabled in vfio-msix
 - new patch to turn off the feature by default so as to avoid v2.1 abi broken

v8 changes
 - remove condition check for only vfio-msix
 - add multiplex intr support when only one intr vector allowed
 - lsc and rxq interrupt runtime enable decision
 - add safe event delete while the event wakeup execution happens

v7 changes
 - decouple epoll event and intr operation
 - add condition check in the case intr vector is disabled
 - renaming some APIs

v6 changes
 - split rte_intr_wait_rx_pkt into two APIs 'wait' and 'set'.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set.
 - using vector number instead of queue_id as interrupt API params.
 - patch reorder and split.

v5 changes
 - Rebase the patchset onto the HEAD
 - Isolate ethdev from EAL for new-added wait-for-rx interrupt function
 - Export wait-for-rx interrupt function for shared libraries
 - Split-off a new patch file for changed struct rte_intr_handle that
   other patches depend on, to avoid breaking git bisect
 - Change sample application to accommodate EAL function spec change
   accordingly

v4 changes
 - Export interrupt enable/disable functions for shared libraries
 - Adjust position of new-added structure fields and functions to
   avoid breaking ABI

v3 changes
 - Add return value for interrupt enable/disable functions
 - Move spinlok from PMD to L3fwd-power
 - Remove unnecessary variables in e1000_mac_info
 - Fix miscellaneous review comments

v2 changes
 - Fix compilation issue in Makefile for missed header file.
 - Consolidate internal and community review comments of v1 patch set.

The patch series introduces low-latency one-shot rx interrupts into DPDK, with
a polling/interrupt mode switch control example.

The DPDK userspace interrupt notification and handling mechanism is based on UIO
with the following limitations:
1) It is designed to handle LSC interrupt only with inefficient suspended
   pthread wakeup procedure (e.g. UIO wakes up LSC interrupt handling thread
   which then wakes up DPDK polling thread). In this way, it introduces
   non-deterministic wakeup latency for DPDK polling thread as well as packet
   latency if it is used to handle Rx interrupt.
2) UIO only supports a single interrupt vector, which has to be shared by the
   LSC interrupt and interrupts assigned to dedicated rx queues.

This patchset includes the following features:
1) Enable one-shot rx queue interrupts in the ixgbe PMD (PF & VF) and igb PMD
   (PF only).
2) Build on top of the VFIO mechanism instead of UIO, so it can support
   up to 64 interrupt vectors for rx queue interrupts.
3) Have one DPDK polling thread handle each Rx queue interrupt with a dedicated
   VFIO eventfd, which eliminates non-deterministic pthread wakeup latency in
   user space.
4) Demonstrate interrupt control APIs and userspace NAPI-like polling/interrupt
   mode switch algorithms in the L3fwd-power example (a rough usage sketch
   follows at the end of this mail).

Known limitations:
1) It does not work for UIO, because a single interrupt eventfd shared by the LSC
   and rx queue interrupt handlers causes a mess. [FIXED]
2) The LSC interrupt is not supported by the VF driver, so it is disabled by default
   in L3fwd-power now. Feel free to turn it on if you want to support both LSC
   and rx queue interrupts on a PF.

Cunming Liang (13):
  eal/linux: add interrupt vectors support in intr_handle
  eal/linux: add rte_epoll_wait/ctl support
  eal/linux: add API to set rx interrupt event monitor
  eal/linux: fix comments typo on vfio msi
  eal/linux: map eventfd to VFIO MSI-X intr vector
  eal/linux: standalone intr event fd create support
  eal/linux: fix lsc read error in uio_pci_generic
  eal/bsd: dummy for new intr definition
  eal/bsd: fix inappropriate linuxapp referred in bsd
  ethdev: add rx intr enable, disable and ctl functions
  ixgbe: enable rx queue interrupts for both PF and VF
  igb: enable rx queue interrupts for PF
  l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode
    switch

 drivers/net/e1000/igb_ethdev.c                     | 311 ++++++++++--
 drivers/net/ixgbe/ixgbe_ethdev.c                   | 527 ++++++++++++++++++++-
 drivers/net/ixgbe/ixgbe_ethdev.h                   |   4 +
 examples/l3fwd-power/main.c                        | 202 ++++++--
 lib/librte_eal/bsdapp/eal/eal_interrupts.c         |  28 ++
 .../bsdapp/eal/include/exec-env/rte_interrupts.h   |  91 +++-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map      |   5 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 362 ++++++++++++--
 .../linuxapp/eal/include/exec-env/rte_interrupts.h | 218 +++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   8 +
 lib/librte_ether/rte_ethdev.c                      | 109 +++++
 lib/librte_ether/rte_ethdev.h                      | 154 ++++++
 lib/librte_ether/rte_ether_version.map             |   4 +
 13 files changed, 1895 insertions(+), 128 deletions(-)

-- 
1.8.1.4
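
As a rough sketch of the polling/interrupt mode switch described in feature 4), assuming the rte_epoll_* API added by this series and the rte_eth_dev_rx_intr_* helpers from patch 10/13 (names assumed here; see l3fwd-power for the real implementation):

#include <rte_ethdev.h>
#include <rte_interrupts.h>
#include <rte_mbuf.h>

#define BURST_SIZE   32
#define MAX_EVENTS    8
#define IDLE_LIMIT  300   /* arbitrary idle threshold before sleeping */

/* Sketch of the NAPI-like switch: poll while traffic flows, arm the rx
 * interrupt and block on epoll when the queue goes idle.  Error handling
 * is elided and packet processing is stubbed out as a free. */
static void
rx_loop(uint8_t port, uint16_t queue)
{
	struct rte_epoll_event ev[MAX_EVENTS];
	struct rte_mbuf *pkts[BURST_SIZE];
	unsigned int idle = 0;

	/* register this queue's interrupt eventfd with the per-thread epoll fd */
	rte_eth_dev_rx_intr_ctl_q(port, queue, RTE_EPOLL_PER_THREAD,
				  RTE_INTR_EVENT_ADD, NULL);

	for (;;) {
		uint16_t i, n = rte_eth_rx_burst(port, queue, pkts, BURST_SIZE);

		if (n > 0) {
			idle = 0;
			for (i = 0; i < n; i++)
				rte_pktmbuf_free(pkts[i]);   /* stand-in for real work */
			continue;
		}
		if (++idle < IDLE_LIMIT)
			continue;

		/* queue looks idle: switch to interrupt mode and sleep */
		rte_eth_dev_rx_intr_enable(port, queue);
		rte_epoll_wait(RTE_EPOLL_PER_THREAD, ev, MAX_EVENTS, -1);
		rte_eth_dev_rx_intr_disable(port, queue);
		idle = 0;
	}
}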

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
@ 2015-07-17  6:16  8%     ` Cunming Liang
  2015-07-19 23:31  0%       ` Thomas Monjalon
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
                       ` (9 subsequent siblings)
  10 siblings, 1 reply; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

This patch adds interrupt vector support to rte_intr_handle.
'vec_en' is set when interrupt vectors are detected and the associated event fds are set.
Those event fds are stored in efds[].
'intr_vec' is reserved for the device driver to initialize the vector mapping table.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v7 changes:
 - add eptrs[], it's used to store the register rte_epoll_event instances.
 - add vec_en, to log the vector capability status.

v6 changes:
 - add mapping table between irq vector number and queue id.

v5 changes:
 - Create this new patch file for changed struct rte_intr_handle that
   other patches depend on, to avoid breaking git bisect.

 .../linuxapp/eal/include/exec-env/rte_interrupts.h          | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index bdeb3fc..12b33c9 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -38,6 +38,8 @@
 #ifndef _RTE_LINUXAPP_INTERRUPTS_H_
 #define _RTE_LINUXAPP_INTERRUPTS_H_
 
+#define RTE_MAX_RXTX_INTR_VEC_ID     32
+
 enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_UNKNOWN = 0,
 	RTE_INTR_HANDLE_UIO,          /**< uio device handle */
@@ -58,6 +60,17 @@ struct rte_intr_handle {
 	};
 	int fd;	 /**< interrupt event file descriptor */
 	enum rte_intr_handle_type type;  /**< handle type */
+#ifdef RTE_NEXT_ABI
+	/**
+	 * RTE_NEXT_ABI will be removed from v2.2.
+	 * It's only used to avoid ABI(unannounced) broken in v2.1.
+	 * Make sure being aware of the impact before turning on the feature.
+	 */
+	uint32_t max_intr;             /**< max interrupt requested */
+	uint32_t nb_efd;               /**< number of available efd(event fd) */
+	int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
+	int *intr_vec;                 /**< intr vector number array */
+#endif
 };
 
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
-- 
1.8.1.4
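
A condensed, purely illustrative sketch of how a PMD would be expected to populate the new fields during rx interrupt setup (the real logic lands in the ixgbe/igb patches later in this series, and the fields only exist in an RTE_NEXT_ABI build):

#include <stdlib.h>
#include <rte_interrupts.h>

/* Illustrative only: map each rx queue to an interrupt vector.  Keeping
 * vector 0 for the misc/LSC interrupt and starting rx queues at vector 1
 * is an assumption here, not something this patch mandates; the eventfds
 * themselves (efds[]) are created by the VFIO setup code elsewhere. */
static int
setup_rx_intr_vec(struct rte_intr_handle *ih, unsigned int nb_rx_queues)
{
	unsigned int q;

	if (nb_rx_queues >= RTE_MAX_RXTX_INTR_VEC_ID)
		return -1;

	ih->intr_vec = malloc(nb_rx_queues * sizeof(int));
	if (ih->intr_vec == NULL)
		return -1;

	for (q = 0; q < nb_rx_queues; q++)
		ih->intr_vec[q] = q + 1;         /* queue -> vector mapping */

	ih->nb_efd = nb_rx_queues;
	ih->max_intr = nb_rx_queues + 1;         /* rx vectors + misc interrupt */
	return 0;
}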

^ permalink raw reply	[relevance 8%]

* [dpdk-dev] [PATCH v14 02/13] eal/linux: add rte_epoll_wait/ctl support
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
@ 2015-07-17  6:16  2%     ` Cunming Liang
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
                       ` (8 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

This patch adds 'rte_epoll_wait' and 'rte_epoll_ctl' for async event wakeup.
It defines 'struct rte_epoll_event' as the event param.
When event fds are added to a specified epoll instance, 'eptrs' will hold the rte_epoll_event object pointers.
The 'op' uses the same enum as epoll_wait/ctl does.
The epoll event supports carrying raw user data and registering a callback which is executed during wakeup.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v11 changes
 - cleanup spelling error

v9 changes
 - rework on coding style

v8 changes
 - support delete event in safety during the wakeup execution
 - add EINTR process during epoll_wait

v7 changes
 - split v6[4/8] into two patches, one for epoll event(this one)
   another for rx intr(next patch)
 - introduce rte_epoll_event definition
 - rte_epoll_wait/ctl for more generic RTE epoll API

v6 changes
 - split rte_intr_wait_rx_pkt into two function, wait and set.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal.
 - rte_intr_rx_wait to support multiplexing.
 - allow epfd as input to support flexible event fd combination.

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 139 +++++++++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  80 ++++++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   3 +
 3 files changed, 222 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index b5f369e..5fe5b99 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -69,6 +69,8 @@
 
 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
 
+static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
+
 /**
  * union for pipe fds.
  */
@@ -894,3 +896,140 @@ rte_eal_intr_init(void)
 
 	return -ret;
 }
+
+static int
+eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
+			struct rte_epoll_event *events)
+{
+	unsigned int i, count = 0;
+	struct rte_epoll_event *rev;
+
+	for (i = 0; i < n; i++) {
+		rev = evs[i].data.ptr;
+		if (!rev || !rte_atomic32_cmpset(&rev->status, RTE_EPOLL_VALID,
+						 RTE_EPOLL_EXEC))
+			continue;
+
+		events[count].status        = RTE_EPOLL_VALID;
+		events[count].fd            = rev->fd;
+		events[count].epfd          = rev->epfd;
+		events[count].epdata.event  = rev->epdata.event;
+		events[count].epdata.data   = rev->epdata.data;
+		if (rev->epdata.cb_fun)
+			rev->epdata.cb_fun(rev->fd,
+					   rev->epdata.cb_arg);
+
+		rte_compiler_barrier();
+		rev->status = RTE_EPOLL_VALID;
+		count++;
+	}
+	return count;
+}
+
+static inline int
+eal_init_tls_epfd(void)
+{
+	int pfd = epoll_create(255);
+
+	if (pfd < 0) {
+		RTE_LOG(ERR, EAL,
+			"Cannot create epoll instance\n");
+		return -1;
+	}
+	return pfd;
+}
+
+int
+rte_intr_tls_epfd(void)
+{
+	if (RTE_PER_LCORE(_epfd) == -1)
+		RTE_PER_LCORE(_epfd) = eal_init_tls_epfd();
+
+	return RTE_PER_LCORE(_epfd);
+}
+
+int
+rte_epoll_wait(int epfd, struct rte_epoll_event *events,
+	       int maxevents, int timeout)
+{
+	struct epoll_event evs[maxevents];
+	int rc;
+
+	if (!events) {
+		RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
+		return -1;
+	}
+
+	/* using per thread epoll fd */
+	if (epfd == RTE_EPOLL_PER_THREAD)
+		epfd = rte_intr_tls_epfd();
+
+	while (1) {
+		rc = epoll_wait(epfd, evs, maxevents, timeout);
+		if (likely(rc > 0)) {
+			/* epoll_wait has at least one fd ready to read */
+			rc = eal_epoll_process_event(evs, rc, events);
+			break;
+		} else if (rc < 0) {
+			if (errno == EINTR)
+				continue;
+			/* epoll_wait fail */
+			RTE_LOG(ERR, EAL, "epoll_wait returns with fail %s\n",
+				strerror(errno));
+			rc = -1;
+			break;
+		}
+	}
+
+	return rc;
+}
+
+static inline void
+eal_epoll_data_safe_free(struct rte_epoll_event *ev)
+{
+	while (!rte_atomic32_cmpset(&ev->status, RTE_EPOLL_VALID,
+				    RTE_EPOLL_INVALID))
+		while (ev->status != RTE_EPOLL_VALID)
+			rte_pause();
+	memset(&ev->epdata, 0, sizeof(ev->epdata));
+	ev->fd = -1;
+	ev->epfd = -1;
+}
+
+int
+rte_epoll_ctl(int epfd, int op, int fd,
+	      struct rte_epoll_event *event)
+{
+	struct epoll_event ev;
+
+	if (!event) {
+		RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
+		return -1;
+	}
+
+	/* using per thread epoll fd */
+	if (epfd == RTE_EPOLL_PER_THREAD)
+		epfd = rte_intr_tls_epfd();
+
+	if (op == EPOLL_CTL_ADD) {
+		event->status = RTE_EPOLL_VALID;
+		event->fd = fd;  /* ignore fd in event */
+		event->epfd = epfd;
+		ev.data.ptr = (void *)event;
+	}
+
+	ev.events = event->epdata.event;
+	if (epoll_ctl(epfd, op, fd, &ev) < 0) {
+		RTE_LOG(ERR, EAL, "Error op %d fd %d epoll_ctl, %s\n",
+			op, fd, strerror(errno));
+		if (op == EPOLL_CTL_ADD)
+			/* rollback status when CTL_ADD fail */
+			event->status = RTE_EPOLL_INVALID;
+		return -1;
+	}
+
+	if (op == EPOLL_CTL_DEL && event->status != RTE_EPOLL_INVALID)
+		eal_epoll_data_safe_free(event);
+
+	return 0;
+}
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index 12b33c9..b55b4ee 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -51,6 +51,32 @@ enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_MAX
 };
 
+#define RTE_INTR_EVENT_ADD            1UL
+#define RTE_INTR_EVENT_DEL            2UL
+
+typedef void (*rte_intr_event_cb_t)(int fd, void *arg);
+
+struct rte_epoll_data {
+	uint32_t event;               /**< event type */
+	void *data;                   /**< User data */
+	rte_intr_event_cb_t cb_fun;   /**< IN: callback fun */
+	void *cb_arg;	              /**< IN: callback arg */
+};
+
+enum {
+	RTE_EPOLL_INVALID = 0,
+	RTE_EPOLL_VALID,
+	RTE_EPOLL_EXEC,
+};
+
+/** interrupt epoll event obj, taken by epoll_event.ptr */
+struct rte_epoll_event {
+	volatile uint32_t status;  /**< OUT: event status */
+	int fd;                    /**< OUT: event fd */
+	int epfd;       /**< OUT: epoll instance the ev associated with */
+	struct rte_epoll_data epdata;
+};
+
 /** Handle for interrupts. */
 struct rte_intr_handle {
 	union {
@@ -69,8 +95,62 @@ struct rte_intr_handle {
 	uint32_t max_intr;             /**< max interrupt requested */
 	uint32_t nb_efd;               /**< number of available efd(event fd) */
 	int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
+	struct rte_epoll_event elist[RTE_MAX_RXTX_INTR_VEC_ID];
+				       /**< intr vector epoll event */
 	int *intr_vec;                 /**< intr vector number array */
 #endif
 };
 
+#define RTE_EPOLL_PER_THREAD        -1  /**< to hint using per thread epfd */
+
+/**
+ * It waits for events on the epoll instance.
+ *
+ * @param epfd
+ *   Epoll instance fd on which the caller wait for events.
+ * @param events
+ *   Memory area contains the events that will be available for the caller.
+ * @param maxevents
+ *   Up to maxevents are returned, must be greater than zero.
+ * @param timeout
+ *   Specifying a timeout of -1 causes it to block indefinitely.
+ *   Specifying a timeout of zero causes it to return immediately.
+ * @return
+ *   - On success, returns the number of available events.
+ *   - On failure, a negative value.
+ */
+int
+rte_epoll_wait(int epfd, struct rte_epoll_event *events,
+	       int maxevents, int timeout);
+
+/**
+ * It performs control operations on epoll instance referred by the epfd.
+ * It requests that the operation op be performed for the target fd.
+ *
+ * @param epfd
+ *   Epoll instance fd on which the caller perform control operations.
+ * @param op
+ *   The operation be performed for the target fd.
+ * @param fd
+ *   The target fd on which the control ops perform.
+ * @param event
+ *   Describes the object linked to the fd.
+ *   Note: The caller must take care the object deletion after CTL_DEL.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_epoll_ctl(int epfd, int op, int fd,
+	      struct rte_epoll_event *event);
+
+/**
+ * The function returns the per thread epoll instance.
+ *
+ * @return
+ *   epfd the epoll instance referred to.
+ */
+int
+rte_intr_tls_epfd(void);
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index e537b42..3c4c710 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -115,5 +115,8 @@ DPDK_2.0 {
 DPDK_2.1 {
 	global:
 
+	rte_epoll_ctl;
+	rte_epoll_wait;
+	rte_intr_tls_epfd;
 	rte_memzone_free;
 } DPDK_2.0;
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v14 03/13] eal/linux: add API to set rx interrupt event monitor
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
@ 2015-07-17  6:16  2%     ` Cunming Liang
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
                       ` (7 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch adds 'rte_intr_rx_ctl' to add or delete an interrupt vector event monitor on a specified epoll instance.
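
A minimal sketch of the intended use (assuming the PMD has already set up
intr_handle->efds, e.g. via the later patches of this series; the helper
name is illustrative and error handling is trimmed):

#include <rte_interrupts.h>

static int
monitor_rx_vector(struct rte_intr_handle *intr_handle, void *user_data)
{
	struct rte_epoll_event ev[1];

	/* vector 0 maps to intr_handle->efds[0]; user_data is returned
	 * in ev[0].epdata.data when the vector fires */
	if (rte_intr_rx_ctl(intr_handle, RTE_EPOLL_PER_THREAD,
			    RTE_INTR_EVENT_ADD, 0, user_data) < 0)
		return -1;

	/* block until the interrupt vector fires */
	return rte_epoll_wait(RTE_EPOLL_PER_THREAD, ev, 1, -1);
}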

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v12 changes:
 - fix awkward line split in using RTE_LOG

v10 changes:
 - add RTE_INTR_HANDLE_UIO_INTX for uio_pci_generic

v8 changes
 - fix EWOULDBLOCK and EINTR processing
 - add event status check

v7 changes
 - rename rte_intr_rx_set to rte_intr_rx_ctl.
 - rte_intr_rx_ctl uses rte_epoll_ctl to register epoll event instance.
 - the intr rx event instance includes a intr process callback.

v6 changes
 - split rte_intr_wait_rx_pkt into two functions, wait and set.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal.
 - rte_intr_rx_wait to support multiplexing.
 - allow epfd as input to support flexible event fd combination.

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 105 +++++++++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  38 ++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   1 +
 3 files changed, 144 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 5fe5b99..4e34abc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -897,6 +897,51 @@ rte_eal_intr_init(void)
 	return -ret;
 }
 
+#ifdef RTE_NEXT_ABI
+static void
+eal_intr_proc_rxtx_intr(int fd, const struct rte_intr_handle *intr_handle)
+{
+	union rte_intr_read_buffer buf;
+	int bytes_read = 1;
+
+	switch (intr_handle->type) {
+	case RTE_INTR_HANDLE_UIO:
+	case RTE_INTR_HANDLE_UIO_INTX:
+		bytes_read = sizeof(buf.uio_intr_count);
+		break;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		bytes_read = sizeof(buf.vfio_intr_count);
+		break;
+#endif
+	default:
+		bytes_read = 1;
+		RTE_LOG(INFO, EAL, "unexpected intr type\n");
+		break;
+	}
+
+	/**
+	 * read out to clear the ready-to-be-read flag
+	 * for epoll_wait.
+	 */
+	do {
+		bytes_read = read(fd, &buf, bytes_read);
+		if (bytes_read < 0) {
+			if (errno == EINTR || errno == EWOULDBLOCK ||
+			    errno == EAGAIN)
+				continue;
+			RTE_LOG(ERR, EAL,
+				"Error reading from fd %d: %s\n",
+				fd, strerror(errno));
+		} else if (bytes_read == 0)
+			RTE_LOG(ERR, EAL, "Read nothing from fd %d\n", fd);
+		return;
+	} while (1);
+}
+#endif
+
 static int
 eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
 			struct rte_epoll_event *events)
@@ -1033,3 +1078,63 @@ rte_epoll_ctl(int epfd, int op, int fd,
 
 	return 0;
 }
+
+#ifdef RTE_NEXT_ABI
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
+		int op, unsigned int vec, void *data)
+{
+	struct rte_epoll_event *rev;
+	struct rte_epoll_data *epdata;
+	int epfd_op;
+	int rc = 0;
+
+	if (!intr_handle || intr_handle->nb_efd == 0 ||
+	    vec >= intr_handle->nb_efd) {
+		RTE_LOG(ERR, EAL, "Wrong intr vector number.\n");
+		return -EPERM;
+	}
+
+	switch (op) {
+	case RTE_INTR_EVENT_ADD:
+		epfd_op = EPOLL_CTL_ADD;
+		rev = &intr_handle->elist[vec];
+		if (rev->status != RTE_EPOLL_INVALID) {
+			RTE_LOG(INFO, EAL, "Event already been added.\n");
+			return -EEXIST;
+		}
+
+		/* attach to intr vector fd */
+		epdata = &rev->epdata;
+		epdata->event  = EPOLLIN | EPOLLPRI | EPOLLET;
+		epdata->data   = data;
+		epdata->cb_fun = (rte_intr_event_cb_t)eal_intr_proc_rxtx_intr;
+		epdata->cb_arg = (void *)intr_handle;
+		rc = rte_epoll_ctl(epfd, epfd_op, intr_handle->efds[vec], rev);
+		if (!rc)
+			RTE_LOG(DEBUG, EAL,
+				"efd %d associated with vec %d added on epfd %d"
+				"\n", rev->fd, vec, epfd);
+		else
+			rc = -EPERM;
+		break;
+	case RTE_INTR_EVENT_DEL:
+		epfd_op = EPOLL_CTL_DEL;
+		rev = &intr_handle->elist[vec];
+		if (rev->status == RTE_EPOLL_INVALID) {
+			RTE_LOG(INFO, EAL, "Event does not exist.\n");
+			return -EPERM;
+		}
+
+		rc = rte_epoll_ctl(rev->epfd, epfd_op, rev->fd, rev);
+		if (rc)
+			rc = -EPERM;
+		break;
+	default:
+		RTE_LOG(ERR, EAL, "event op type mismatch\n");
+		rc = -EPERM;
+	}
+
+	return rc;
+}
+#endif
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index b55b4ee..918246f 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -38,6 +38,10 @@
 #ifndef _RTE_LINUXAPP_INTERRUPTS_H_
 #define _RTE_LINUXAPP_INTERRUPTS_H_
 
+#ifndef RTE_NEXT_ABI
+#include <rte_common.h>
+#endif
+
 #define RTE_MAX_RXTX_INTR_VEC_ID     32
 
 enum rte_intr_handle_type {
@@ -153,4 +157,38 @@ rte_epoll_ctl(int epfd, int op, int fd,
 int
 rte_intr_tls_epfd(void);
 
+/**
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {ADD, DEL}.
+ * @param vec
+ *   RX intr vector number added to the epoll instance wait list.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data);
+#else
+static inline int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(vec);
+	RTE_SET_USED(data);
+	return -ENOTSUP;
+}
+#endif
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 3c4c710..1cd4cc5 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -117,6 +117,7 @@ DPDK_2.1 {
 
 	rte_epoll_ctl;
 	rte_epoll_wait;
+	rte_intr_rx_ctl;
 	rte_intr_tls_epfd;
 	rte_memzone_free;
 } DPDK_2.0;
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v14 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (2 preceding siblings ...)
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
@ 2015-07-17  6:16  3%     ` Cunming Liang
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 06/13] eal/linux: standalone intr event fd create support Cunming Liang
                       ` (6 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch assigns an event fd to each VFIO MSI-X interrupt vector via ioctl.
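
For reference, the ioctl payload built by the reworked vfio_enable_msix()
looks roughly like the standalone sketch below (illustrative only, using
the standard linux/vfio.h definitions; the real code works on the
intr_handle fields shown in the diff):

#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static int
vfio_msix_bind_eventfds(int vfio_dev_fd, const int *efds, int nb_efd,
			int misc_efd)
{
	char buf[sizeof(struct vfio_irq_set) + sizeof(int) * (nb_efd + 1)];
	struct vfio_irq_set *irq_set = (struct vfio_irq_set *)buf;
	int *fd_ptr = (int *)&irq_set->data;

	memset(buf, 0, sizeof(buf));
	irq_set->argsz = sizeof(buf);
	irq_set->count = nb_efd + 1;	/* RX vectors plus one misc/LSC vector */
	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
	irq_set->start = 0;

	memcpy(fd_ptr, efds, sizeof(int) * nb_efd);	/* per-queue eventfds */
	fd_ptr[nb_efd] = misc_efd;	/* last slot: other causes (e.g. LSC) */

	return ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
}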

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - reword commit comments

v8 changes
 - move eventfd creation out of the setup_interrupts to a standalone function

v7 changes
 - cleanup unnecessary code change
 - split event and intr operation to other patches

 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 56 ++++++++++------------------
 1 file changed, 20 insertions(+), 36 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index cca2efd..b18ab86 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -128,6 +128,9 @@ static pthread_t intr_thread;
 #ifdef VFIO_PRESENT
 
 #define IRQ_SET_BUF_LEN  (sizeof(struct vfio_irq_set) + sizeof(int))
+/* irq set buffer length for queue interrupts and LSC interrupt */
+#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
+			      sizeof(int) * (RTE_MAX_RXTX_INTR_VEC_ID + 1))
 
 /* enable legacy (INTx) interrupts */
 static int
@@ -245,23 +248,6 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {
 						intr_handle->fd);
 		return -1;
 	}
-
-	/* manually trigger interrupt to enable it */
-	memset(irq_set, 0, len);
-	len = sizeof(struct vfio_irq_set);
-	irq_set->argsz = len;
-	irq_set->count = 1;
-	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
-	irq_set->start = 0;
-
-	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-	if (ret) {
-		RTE_LOG(ERR, EAL, "Error triggering MSI interrupts for fd %d\n",
-						intr_handle->fd);
-		return -1;
-	}
 	return 0;
 }
 
@@ -294,7 +280,7 @@ vfio_disable_msi(struct rte_intr_handle *intr_handle) {
 static int
 vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 	int len, ret;
-	char irq_set_buf[IRQ_SET_BUF_LEN];
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
 	struct vfio_irq_set *irq_set;
 	int *fd_ptr;
 
@@ -302,12 +288,26 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 
 	irq_set = (struct vfio_irq_set *) irq_set_buf;
 	irq_set->argsz = len;
+#ifdef RTE_NEXT_ABI
+	if (!intr_handle->max_intr)
+		intr_handle->max_intr = 1;
+	else if (intr_handle->max_intr > RTE_MAX_RXTX_INTR_VEC_ID)
+		intr_handle->max_intr = RTE_MAX_RXTX_INTR_VEC_ID + 1;
+
+	irq_set->count = intr_handle->max_intr;
+#else
 	irq_set->count = 1;
+#endif
 	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
 	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
 	irq_set->start = 0;
 	fd_ptr = (int *) &irq_set->data;
-	*fd_ptr = intr_handle->fd;
+#ifdef RTE_NEXT_ABI
+	memcpy(fd_ptr, intr_handle->efds, sizeof(intr_handle->efds));
+	fd_ptr[intr_handle->max_intr - 1] = intr_handle->fd;
+#else
+	fd_ptr[0] = intr_handle->fd;
+#endif
 
 	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
 
@@ -317,22 +317,6 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 		return -1;
 	}
 
-	/* manually trigger interrupt to enable it */
-	memset(irq_set, 0, len);
-	len = sizeof(struct vfio_irq_set);
-	irq_set->argsz = len;
-	irq_set->count = 1;
-	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-	irq_set->start = 0;
-
-	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-	if (ret) {
-		RTE_LOG(ERR, EAL, "Error triggering MSI-X interrupts for fd %d\n",
-						intr_handle->fd);
-		return -1;
-	}
 	return 0;
 }
 
@@ -340,7 +324,7 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 static int
 vfio_disable_msix(struct rte_intr_handle *intr_handle) {
 	struct vfio_irq_set *irq_set;
-	char irq_set_buf[IRQ_SET_BUF_LEN];
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
 	int len, ret;
 
 	len = sizeof(struct vfio_irq_set);
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v14 06/13] eal/linux: standalone intr event fd create support
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (3 preceding siblings ...)
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
@ 2015-07-17  6:16  3%     ` Cunming Liang
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 08/13] eal/bsd: dummy for new intr definition Cunming Liang
                       ` (5 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch exposes interrupt event fd creation and release for PMDs.
The device driver can assign the number of event fds associated with the interrupt vectors.
It also provides misc functions to check whether 1) other slowpath interrupts (e.g. LSC) are allowed;
2) the interrupt event on the fastpath is enabled or not.
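
A sketch of the expected PMD-side flow (function and variable names are
illustrative, not part of the patch):

#include <stdint.h>
#include <rte_interrupts.h>

static int
example_dev_start_intr(struct rte_intr_handle *intr_handle, uint16_t nb_rxq)
{
	/* one eventfd per RX queue when VFIO MSI-X is used, otherwise the
	 * single interrupt fd is multiplexed by rte_intr_efd_enable() */
	if (rte_intr_efd_enable(intr_handle, nb_rxq) < 0)
		return -1;

	if (rte_intr_dp_is_en(intr_handle) &&
	    !rte_intr_allow_others(intr_handle)) {
		/* no spare vector for slowpath causes (e.g. LSC); a driver
		 * would skip registering the LSC callback in this case */
	}

	/* ... program the HW queue/vector mapping here; on dev_stop the
	 * driver calls rte_intr_efd_disable(intr_handle) to release fds */
	return 0;
}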

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - minor changes on API description comments

v13 changes
 - version map cleanup for v2.1

v11 changes
 - typo cleanup

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 57 ++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h | 87 ++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |  4 +
 3 files changed, 148 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index b18ab86..0266d98 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -44,6 +44,7 @@
 #include <sys/epoll.h>
 #include <sys/signalfd.h>
 #include <sys/ioctl.h>
+#include <sys/eventfd.h>
 
 #include <rte_common.h>
 #include <rte_interrupts.h>
@@ -68,6 +69,7 @@
 #include "eal_vfio.h"
 
 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
+#define NB_OTHER_INTR               1
 
 static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
 
@@ -1121,4 +1123,59 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
 
 	return rc;
 }
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	uint32_t i;
+	int fd;
+	uint32_t n = RTE_MIN(nb_efd, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+
+	if (intr_handle->type == RTE_INTR_HANDLE_VFIO_MSIX) {
+		for (i = 0; i < n; i++) {
+			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+			if (fd < 0) {
+				RTE_LOG(ERR, EAL,
+					"cannot setup eventfd,"
+					"error %i (%s)\n",
+					errno, strerror(errno));
+				return -1;
+			}
+			intr_handle->efds[i] = fd;
+		}
+		intr_handle->nb_efd   = n;
+		intr_handle->max_intr = NB_OTHER_INTR + n;
+	} else {
+		intr_handle->efds[0]  = intr_handle->fd;
+		intr_handle->nb_efd   = RTE_MIN(nb_efd, 1U);
+		intr_handle->max_intr = NB_OTHER_INTR;
+	}
+
+	return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	uint32_t i;
+	struct rte_epoll_event *rev;
+
+	for (i = 0; i < intr_handle->nb_efd; i++) {
+		rev = &intr_handle->elist[i];
+		if (rev->status == RTE_EPOLL_INVALID)
+			continue;
+		if (rte_epoll_ctl(rev->epfd, EPOLL_CTL_DEL, rev->fd, rev)) {
+			/* force free if the entry valid */
+			eal_epoll_data_safe_free(rev);
+			rev->status = RTE_EPOLL_INVALID;
+		}
+	}
+
+	if (intr_handle->max_intr > intr_handle->nb_efd) {
+		for (i = 0; i < intr_handle->nb_efd; i++)
+			close(intr_handle->efds[i]);
+	}
+	intr_handle->nb_efd = 0;
+	intr_handle->max_intr = 0;
+}
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index 918246f..3f17f29 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -191,4 +191,91 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
 }
 #endif
 
+/**
+ * It enables the packet I/O interrupt event if it's necessary.
+ * It creates event fd for each interrupt vector when MSIX is used,
+ * otherwise it multiplexes a single event fd.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param nb_efd
+ *   Number of interrupt vectors to enable.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd);
+#else
+static inline int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(nb_efd);
+	return 0;
+}
+#endif
+
+/**
+ * It disables the packet I/O interrupt event.
+ * It deletes registered eventfds and closes the open fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+#ifdef RTE_NEXT_ABI
+extern void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle);
+#else
+static inline void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+}
+#endif
+
+/**
+ * The packet I/O interrupt on datapath is enabled or not.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+#ifdef RTE_NEXT_ABI
+static inline int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	return !(!intr_handle->nb_efd);
+}
+#else
+static inline int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 0;
+}
+#endif
+
+/**
+ * The interrupt handle instance allows other causes or not.
+ * Other causes stand for any non packet I/O interrupts.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+#ifdef RTE_NEXT_ABI
+static inline int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	return !!(intr_handle->max_intr - intr_handle->nb_efd);
+}
+#else
+static inline int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 1;
+}
+#endif
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 1cd4cc5..a0d9cb2 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -117,6 +117,10 @@ DPDK_2.1 {
 
 	rte_epoll_ctl;
 	rte_epoll_wait;
+	rte_intr_allow_others;
+	rte_intr_dp_is_en;
+	rte_intr_efd_enable;
+	rte_intr_efd_disable;
 	rte_intr_rx_ctl;
 	rte_intr_tls_epfd;
 	rte_memzone_free;
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v14 08/13] eal/bsd: dummy for new intr definition
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (4 preceding siblings ...)
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 06/13] eal/linux: standalone intr event fd create support Cunming Liang
@ 2015-07-17  6:16  8%     ` Cunming Liang
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
                       ` (4 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

Add stubs to keep the BSD build compiling with the new interrupt changes.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v13 changes
 - version map cleanup for v2.1

v12 changes
 - fix unused variables compiling warning

v8 changes
 - add stub for new function

v7 changes
 - remove stub 'linux only' function from source file

 lib/librte_eal/bsdapp/eal/eal_interrupts.c         | 28 +++++++
 .../bsdapp/eal/include/exec-env/rte_interrupts.h   | 85 ++++++++++++++++++++++
 lib/librte_eal/bsdapp/eal/rte_eal_version.map      |  5 ++
 3 files changed, 118 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_interrupts.c b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
index 26a55c7..a550ece 100644
--- a/lib/librte_eal/bsdapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
@@ -68,3 +68,31 @@ rte_eal_intr_init(void)
 {
 	return 0;
 }
+
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(vec);
+	RTE_SET_USED(data);
+
+	return -ENOTSUP;
+}
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(nb_efd);
+
+	return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+}
diff --git a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
index d4c388f..eaf5410 100644
--- a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
@@ -38,6 +38,8 @@
 #ifndef _RTE_LINUXAPP_INTERRUPTS_H_
 #define _RTE_LINUXAPP_INTERRUPTS_H_
 
+#include <rte_common.h>
+
 enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_UNKNOWN = 0,
 	RTE_INTR_HANDLE_UIO,      /**< uio device handle */
@@ -50,6 +52,89 @@ struct rte_intr_handle {
 	int fd;                          /**< file descriptor */
 	int uio_cfg_fd;                  /**< UIO config file descriptor */
 	enum rte_intr_handle_type type;  /**< handle type */
+#ifdef RTE_NEXT_ABI
+	/**
+	 * RTE_NEXT_ABI will be removed from v2.2.
+	 * It's only used to avoid ABI(unannounced) broken in v2.1.
+	 * Make sure being aware of the impact before turning on the feature.
+	 */
+	int max_intr;                    /**< max interrupt requested */
+	uint32_t nb_efd;                 /**< number of available efds */
+	int *intr_vec;               /**< intr vector number array */
+#endif
 };
 
+/**
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {ADD, DEL}.
+ * @param vec
+ *   RX intr vector number added to the epoll instance wait list.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data);
+
+/**
+ * It enables the fastpath event fds if it's necessary.
+ * It creates event fds when multi-vectors allowed,
+ * otherwise it multiplexes the single event fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param nb_efd
+ *   Number of interrupt vectors to enable.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd);
+
+/**
+ * It disables the fastpath event fds.
+ * It deletes registered eventfds and closes the open fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle);
+
+/**
+ * The fastpath interrupt is enabled or not.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+static inline int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 0;
+}
+
+/**
+ * The interrupt handle instance allows other causes or not.
+ * Other causes stand for non-fastpath interrupts.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+static inline int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 1;
+}
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index e537b42..b527ad4 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -116,4 +116,9 @@ DPDK_2.1 {
 	global:
 
 	rte_memzone_free;
+	rte_intr_allow_others;
+	rte_intr_dp_is_en;
+	rte_intr_efd_enable;
+	rte_intr_efd_disable;
+	rte_intr_rx_ctl;
 } DPDK_2.0;
-- 
1.8.1.4

^ permalink raw reply	[relevance 8%]

* [dpdk-dev] [PATCH v14 10/13] ethdev: add rx intr enable, disable and ctl functions
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (5 preceding siblings ...)
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 08/13] eal/bsd: dummy for new intr definition Cunming Liang
@ 2015-07-17  6:16  3%     ` Cunming Liang
  2015-07-17  6:16  1%     ` [dpdk-dev] [PATCH v14 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
                       ` (3 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch adds two dev_ops functions to enable and disable rx queue interrupts.
In addition, it adds rte_eth_dev_rx_intr_ctl/rte_eth_dev_rx_intr_ctl_q to support per-port or per-queue rx interrupt event setup.
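
Putting the new ethdev calls together, an application RX loop could look
like this sketch (assumes the port was configured with
dev_conf.intr_conf.rxq = 1 and already started; error handling omitted):

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_interrupts.h>

static void
rx_intr_loop(uint8_t port_id, uint16_t queue_id)
{
	struct rte_mbuf *pkts[32];
	struct rte_epoll_event ev;

	/* bind the queue's interrupt vector to this thread's epoll fd */
	rte_eth_dev_rx_intr_ctl_q(port_id, queue_id, RTE_EPOLL_PER_THREAD,
				  RTE_INTR_EVENT_ADD, NULL);

	for (;;) {
		uint16_t nb_rx = rte_eth_rx_burst(port_id, queue_id, pkts, 32);

		if (nb_rx > 0) {
			/* ... process and free the mbufs ... */
			continue;
		}
		/* idle: arm the interrupt and sleep until packets arrive */
		rte_eth_dev_rx_intr_enable(port_id, queue_id);
		rte_epoll_wait(RTE_EPOLL_PER_THREAD, &ev, 1, -1);
		rte_eth_dev_rx_intr_disable(port_id, queue_id);
	}
}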

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v9 changes
 - remove unnecessary check after rte_eth_dev_is_valid_port.
   the same as http://www.dpdk.org/dev/patchwork/patch/4784

v8 changes
 - add additional check for EEXIST

v7 changes
 - remove rx_intr_vec_get
 - add rx_intr_ctl and rx_intr_ctl_q

v6 changes
 - add rx_intr_vec_get to retrieve the vector num of the queue.

v5 changes
 - Rebase the patchset onto the HEAD

v4 changes
 - Export interrupt enable/disable functions for shared libraries
 - Put new functions at the end of eth_dev_ops to avoid breaking ABI

v3 changes
 - Add return value for interrupt enable/disable functions

 lib/librte_ether/rte_ethdev.c          | 109 +++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h          | 154 +++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ether_version.map |   4 +
 3 files changed, 267 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ddf3658..d7aa840 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3006,6 +3006,115 @@ _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
 	}
 	rte_spinlock_unlock(&rte_eth_dev_cb_lock);
 }
+
+#ifdef RTE_NEXT_ABI
+int
+rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
+{
+	uint32_t vec;
+	struct rte_eth_dev *dev;
+	struct rte_intr_handle *intr_handle;
+	uint16_t qid;
+	int rc;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%u\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	intr_handle = &dev->pci_dev->intr_handle;
+	if (!intr_handle->intr_vec) {
+		PMD_DEBUG_TRACE("RX Intr vector unset\n");
+		return -EPERM;
+	}
+
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		vec = intr_handle->intr_vec[qid];
+		rc = rte_intr_rx_ctl(intr_handle, epfd, op, vec, data);
+		if (rc && rc != -EEXIST) {
+			PMD_DEBUG_TRACE("p %u q %u rx ctl error"
+					" op %d epfd %d vec %u\n",
+					port_id, qid, op, epfd, vec);
+		}
+	}
+
+	return 0;
+}
+
+int
+rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			  int epfd, int op, void *data)
+{
+	uint32_t vec;
+	struct rte_eth_dev *dev;
+	struct rte_intr_handle *intr_handle;
+	int rc;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%u\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_rx_queues) {
+		PMD_DEBUG_TRACE("Invalid RX queue_id=%u\n", queue_id);
+		return -EINVAL;
+	}
+
+	intr_handle = &dev->pci_dev->intr_handle;
+	if (!intr_handle->intr_vec) {
+		PMD_DEBUG_TRACE("RX Intr vector unset\n");
+		return -EPERM;
+	}
+
+	vec = intr_handle->intr_vec[queue_id];
+	rc = rte_intr_rx_ctl(intr_handle, epfd, op, vec, data);
+	if (rc && rc != -EEXIST) {
+		PMD_DEBUG_TRACE("p %u q %u rx ctl error"
+				" op %d epfd %d vec %u\n",
+				port_id, queue_id, op, epfd, vec);
+		return rc;
+	}
+
+	return 0;
+}
+
+int
+rte_eth_dev_rx_intr_enable(uint8_t port_id,
+			   uint16_t queue_id)
+{
+	struct rte_eth_dev *dev;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
+	return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
+}
+
+int
+rte_eth_dev_rx_intr_disable(uint8_t port_id,
+			    uint16_t queue_id)
+{
+	struct rte_eth_dev *dev;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
+	return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
+}
+#endif
+
 #ifdef RTE_NIC_BYPASS
 int rte_eth_dev_bypass_init(uint8_t port_id)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index d76bbb3..602bd2b 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -834,6 +834,10 @@ struct rte_eth_fdir {
 struct rte_intr_conf {
 	/** enable/disable lsc interrupt. 0 (default) - disable, 1 enable */
 	uint16_t lsc;
+#ifdef RTE_NEXT_ABI
+	/** enable/disable rxq interrupt. 0 (default) - disable, 1 enable */
+	uint16_t rxq;
+#endif
 };
 
 /**
@@ -1042,6 +1046,14 @@ typedef int (*eth_tx_queue_setup_t)(struct rte_eth_dev *dev,
 				    const struct rte_eth_txconf *tx_conf);
 /**< @internal Setup a transmit queue of an Ethernet device. */
 
+typedef int (*eth_rx_enable_intr_t)(struct rte_eth_dev *dev,
+				    uint16_t rx_queue_id);
+/**< @internal Enable interrupt of a receive queue of an Ethernet device. */
+
+typedef int (*eth_rx_disable_intr_t)(struct rte_eth_dev *dev,
+				    uint16_t rx_queue_id);
+/**< @internal Disable interrupt of a receive queue of an Ethernet device. */
+
 typedef void (*eth_queue_release_t)(void *queue);
 /**< @internal Release memory resources allocated by given RX/TX queue. */
 
@@ -1351,6 +1363,12 @@ struct eth_dev_ops {
 	eth_queue_release_t        rx_queue_release;/**< Release RX queue.*/
 	eth_rx_queue_count_t       rx_queue_count; /**< Get Rx queue count. */
 	eth_rx_descriptor_done_t   rx_descriptor_done;  /**< Check rxd DD bit */
+#ifdef RTE_NEXT_ABI
+	/**< Enable Rx queue interrupt. */
+	eth_rx_enable_intr_t       rx_queue_intr_enable;
+	/**< Disable Rx queue interrupt.*/
+	eth_rx_disable_intr_t      rx_queue_intr_disable;
+#endif
 	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue.*/
 	eth_queue_release_t        tx_queue_release;/**< Release TX queue.*/
 	eth_dev_led_on_t           dev_led_on;    /**< Turn on LED. */
@@ -2907,6 +2925,142 @@ void _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
 				enum rte_eth_event_type event);
 
 /**
+ * When there is no rx packet coming into the Rx Queue for a long time, the
+ * lcore related to the Rx Queue can sleep to save power, with the rx interrupt
+ * enabled so it is triggered when an rx packet arrives.
+ *
+ * The rte_eth_dev_rx_intr_enable() function enables rx queue
+ * interrupt on specific rx queue of a port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ *     that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_eth_dev_rx_intr_enable(uint8_t port_id, uint16_t queue_id);
+#else
+static inline int
+rte_eth_dev_rx_intr_enable(uint8_t port_id, uint16_t queue_id)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	return -ENOTSUP;
+}
+#endif
+
+/**
+ * When the lcore wakes up from the rx interrupt indicating packet arrival,
+ * it disables the rx interrupt and returns to polling mode.
+ *
+ * The rte_eth_dev_rx_intr_disable() function disables rx queue
+ * interrupt on specific rx queue of a port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ *     that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_eth_dev_rx_intr_disable(uint8_t port_id, uint16_t queue_id);
+#else
+static inline int
+rte_eth_dev_rx_intr_disable(uint8_t port_id, uint16_t queue_id)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	return -ENOTSUP;
+}
+#endif
+
+/**
+ * RX Interrupt control per port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ *   Using RTE_EPOLL_PER_THREAD allows to use per thread epoll instance.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data);
+#else
+static inline int
+rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(data);
+	return -1;
+}
+#endif
+
+/**
+ * RX Interrupt control per queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ *   Using RTE_EPOLL_PER_THREAD allows to use per thread epoll instance.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+extern int
+rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			  int epfd, int op, void *data);
+#else
+static inline int
+rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			  int epfd, int op, void *data)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(data);
+	return -1;
+}
+#endif
+
+/**
  * Turn on the LED on the Ethernet device.
  * This function turns on the LED on the Ethernet device.
  *
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 39baf11..fa09d75 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -109,6 +109,10 @@ DPDK_2.0 {
 DPDK_2.1 {
 	global:
 
+	rte_eth_dev_rx_intr_ctl;
+	rte_eth_dev_rx_intr_ctl_q;
+	rte_eth_dev_rx_intr_disable;
+	rte_eth_dev_rx_intr_enable;
 	rte_eth_dev_set_mc_addr_list;
 	rte_eth_timesync_disable;
 	rte_eth_timesync_enable;
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v14 11/13] ixgbe: enable rx queue interrupts for both PF and VF
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (6 preceding siblings ...)
  2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
@ 2015-07-17  6:16  1%     ` Cunming Liang
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 12/13] igb: enable rx queue interrupts for PF Cunming Liang
                       ` (2 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch does the following for the ixgbe PF and VF:
- Setup NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions
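
From the application's point of view the feature is switched on through
the new intr_conf.rxq flag (sketch only; as the ifdefs in the diff show,
this requires a build with RTE_NEXT_ABI enabled):

#include <string.h>
#include <rte_ethdev.h>

static int
configure_port_with_rxq_intr(uint8_t port_id, uint16_t nb_rxq, uint16_t nb_txq)
{
	struct rte_eth_conf conf;

	memset(&conf, 0, sizeof(conf));
	conf.intr_conf.lsc = 1;	/* link status change interrupt */
	conf.intr_conf.rxq = 1;	/* per-RX-queue interrupt added by this series */

	/* on start, ixgbe_dev_start() then allocates intr_handle->intr_vec
	 * and programs the MSI-X IVAR mapping as the diff below shows */
	return rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &conf);
}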

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v10 changes
 - return an actual error code rather than -1

v9 changes
 - move queue-vec mapping init from dev_configure to dev_start

v8 changes
 - add vfio-msi/vfio-legacy and uio-legacy support

v7 changes
 - add condition check when intr vector is not enabled

v6 changes
 - fill queue-vector mapping table

v5 changes
 - Rebase the patchset onto the HEAD

v3 changes
 - Remove spinlok from PMD

v2 changes
 - Consolidate review comments related to coding style

 drivers/net/ixgbe/ixgbe_ethdev.c | 527 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/ixgbe/ixgbe_ethdev.h |   4 +
 2 files changed, 518 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 8d68125..8145da9 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -82,6 +82,9 @@
  */
 #define IXGBE_FC_LO    0x40
 
+/* Default minimum inter-interrupt interval for EITR configuration */
+#define IXGBE_MIN_INTER_INTERRUPT_INTERVAL_DEFAULT    0x79E
+
 /* Timer value included in XOFF frames. */
 #define IXGBE_FC_PAUSE 0x680
 
@@ -179,6 +182,9 @@ static int ixgbe_dev_rss_reta_query(struct rte_eth_dev *dev,
 			uint16_t reta_size);
 static void ixgbe_dev_link_status_print(struct rte_eth_dev *dev);
 static int ixgbe_dev_lsc_interrupt_setup(struct rte_eth_dev *dev);
+#ifdef RTE_NEXT_ABI
+static int ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev);
+#endif
 static int ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev);
 static int ixgbe_dev_interrupt_action(struct rte_eth_dev *dev);
 static void ixgbe_dev_interrupt_handler(struct rte_intr_handle *handle,
@@ -191,11 +197,14 @@ static void ixgbe_dcb_init(struct ixgbe_hw *hw,struct ixgbe_dcb_config *dcb_conf
 
 /* For Virtual Function support */
 static int eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev);
+static int ixgbevf_dev_interrupt_get_status(struct rte_eth_dev *dev);
+static int ixgbevf_dev_interrupt_action(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_configure(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_start(struct rte_eth_dev *dev);
 static void ixgbevf_dev_stop(struct rte_eth_dev *dev);
 static void ixgbevf_dev_close(struct rte_eth_dev *dev);
 static void ixgbevf_intr_disable(struct ixgbe_hw *hw);
+static void ixgbevf_intr_enable(struct ixgbe_hw *hw);
 static void ixgbevf_dev_stats_get(struct rte_eth_dev *dev,
 		struct rte_eth_stats *stats);
 static void ixgbevf_dev_stats_reset(struct rte_eth_dev *dev);
@@ -205,6 +214,17 @@ static void ixgbevf_vlan_strip_queue_set(struct rte_eth_dev *dev,
 		uint16_t queue, int on);
 static void ixgbevf_vlan_offload_set(struct rte_eth_dev *dev, int mask);
 static void ixgbevf_set_vfta_all(struct rte_eth_dev *dev, bool on);
+static void ixgbevf_dev_interrupt_handler(struct rte_intr_handle *handle,
+					  void *param);
+#ifdef RTE_NEXT_ABI
+static int ixgbevf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					    uint16_t queue_id);
+static int ixgbevf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					     uint16_t queue_id);
+static void ixgbevf_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+				 uint8_t queue, uint8_t msix_vector);
+#endif
+static void ixgbevf_configure_msix(struct rte_eth_dev *dev);
 
 /* For Eth VMDQ APIs support */
 static int ixgbe_uc_hash_table_set(struct rte_eth_dev *dev, struct
@@ -221,6 +241,15 @@ static int ixgbe_mirror_rule_set(struct rte_eth_dev *dev,
 		uint8_t rule_id, uint8_t on);
 static int ixgbe_mirror_rule_reset(struct rte_eth_dev *dev,
 		uint8_t	rule_id);
+#ifdef RTE_NEXT_ABI
+static int ixgbe_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					  uint16_t queue_id);
+static int ixgbe_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					   uint16_t queue_id);
+static void ixgbe_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+			       uint8_t queue, uint8_t msix_vector);
+#endif
+static void ixgbe_configure_msix(struct rte_eth_dev *dev);
 
 static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev,
 		uint16_t queue_idx, uint16_t tx_rate);
@@ -282,7 +311,7 @@ static int ixgbe_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
  */
 #define UPDATE_VF_STAT(reg, last, cur)	                        \
 {                                                               \
-	u32 latest = IXGBE_READ_REG(hw, reg);                   \
+	uint32_t latest = IXGBE_READ_REG(hw, reg);                   \
 	cur += latest - last;                                   \
 	last = latest;                                          \
 }
@@ -363,6 +392,10 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
 	.tx_queue_start	      = ixgbe_dev_tx_queue_start,
 	.tx_queue_stop        = ixgbe_dev_tx_queue_stop,
 	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = ixgbe_dev_rx_queue_intr_enable,
+	.rx_queue_intr_disable = ixgbe_dev_rx_queue_intr_disable,
+#endif
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_queue_count       = ixgbe_dev_rx_queue_count,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
@@ -427,8 +460,13 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
 	.vlan_offload_set     = ixgbevf_vlan_offload_set,
 	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
+	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
 	.tx_queue_setup       = ixgbe_dev_tx_queue_setup,
 	.tx_queue_release     = ixgbe_dev_tx_queue_release,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = ixgbevf_dev_rx_queue_intr_enable,
+	.rx_queue_intr_disable = ixgbevf_dev_rx_queue_intr_disable,
+#endif
 	.mac_addr_add         = ixgbevf_add_mac_addr,
 	.mac_addr_remove      = ixgbevf_remove_mac_addr,
 	.set_mc_addr_list     = ixgbe_dev_set_mc_addr_list,
@@ -928,12 +966,6 @@ eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev)
 			eth_dev->data->port_id, pci_dev->id.vendor_id,
 			pci_dev->id.device_id);
 
-	rte_intr_callback_register(&(pci_dev->intr_handle),
-		ixgbe_dev_interrupt_handler, (void *)eth_dev);
-
-	/* enable uio intr after callback register */
-	rte_intr_enable(&(pci_dev->intr_handle));
-
 	/* enable support intr */
 	ixgbe_enable_intr(eth_dev);
 
@@ -1489,6 +1521,10 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct ixgbe_vf_info *vfinfo =
 		*IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private);
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
 	int err, link_up = 0, negotiate = 0;
 	uint32_t speed = 0;
 	int mask = 0;
@@ -1521,6 +1557,30 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 	/* configure PF module if SRIOV enabled */
 	ixgbe_pf_host_configure(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle) && !intr_handle->intr_vec) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int),
+				    0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+
+	/* configure msix to sleep until rx interrupt */
+	ixgbe_configure_msix(dev);
+
 	/* initialize transmission unit */
 	ixgbe_dev_tx_init(dev);
 
@@ -1598,8 +1658,25 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 skip_link_setup:
 
 	/* check if lsc interrupt is enabled */
-	if (dev->data->dev_conf.intr_conf.lsc != 0)
-		ixgbe_dev_lsc_interrupt_setup(dev);
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle)) {
+			rte_intr_callback_register(intr_handle,
+						   ixgbe_dev_interrupt_handler,
+						   (void *)dev);
+			ixgbe_dev_lsc_interrupt_setup(dev);
+		} else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+#ifdef RTE_NEXT_ABI
+	/* check if rxq interrupt is enabled */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		ixgbe_dev_rxq_interrupt_setup(dev);
+#endif
+
+	/* enable uio/vfio intr/eventfd mapping */
+	rte_intr_enable(intr_handle);
 
 	/* resume enabled intr since hw reset */
 	ixgbe_enable_intr(dev);
@@ -1656,6 +1733,7 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	struct ixgbe_filter_info *filter_info =
 		IXGBE_DEV_PRIVATE_TO_FILTER_INFO(dev->data->dev_private);
 	struct ixgbe_5tuple_filter *p_5tuple, *p_5tuple_next;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 	int vf;
 
 	PMD_INIT_FUNC_TRACE();
@@ -1663,6 +1741,9 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	/* disable interrupts */
 	ixgbe_disable_intr(hw);
 
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
 	/* reset the NIC */
 	ixgbe_pf_reset_hw(hw);
 	hw->adapter_stopped = FALSE;
@@ -1703,6 +1784,14 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	memset(filter_info->fivetuple_mask, 0,
 		sizeof(uint32_t) * IXGBE_5TUPLE_ARRAY_SIZE);
 
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 /*
@@ -2298,6 +2387,30 @@ ixgbe_dev_lsc_interrupt_setup(struct rte_eth_dev *dev)
 	return 0;
 }
 
+/**
+ * It clears the interrupt causes and enables the interrupt.
+ * It will be called once only during nic initialized.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+static int
+ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev)
+{
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	intr->mask |= IXGBE_EICR_RTX_QUEUE;
+
+	return 0;
+}
+#endif
+
 /*
  * It reads ICR and sets flag (IXGBE_EICR_LSC) for the link_update.
  *
@@ -2324,10 +2437,10 @@ ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev)
 	PMD_DRV_LOG(INFO, "eicr %x", eicr);
 
 	intr->flags = 0;
-	if (eicr & IXGBE_EICR_LSC) {
-		/* set flag for async link update */
+
+	/* set flag for async link update */
+	if (eicr & IXGBE_EICR_LSC)
 		intr->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
-	}
 
 	if (eicr & IXGBE_EICR_MAILBOX)
 		intr->flags |= IXGBE_FLAG_MAILBOX;
@@ -2335,6 +2448,30 @@ ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static int
+ixgbevf_dev_interrupt_get_status(struct rte_eth_dev *dev)
+{
+	uint32_t eicr;
+	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	/* clear all cause mask */
+	ixgbevf_intr_disable(hw);
+
+	/* read-on-clear nic registers here */
+	eicr = IXGBE_READ_REG(hw, IXGBE_VTEICR);
+	PMD_DRV_LOG(INFO, "eicr %x", eicr);
+
+	intr->flags = 0;
+
+	/* set flag for async link update */
+	if (eicr & IXGBE_EICR_LSC)
+		intr->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
+
+	return 0;
+}
+
 /**
  * It gets and then prints the link status.
  *
@@ -2430,6 +2567,18 @@ ixgbe_dev_interrupt_action(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static int
+ixgbevf_dev_interrupt_action(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	PMD_DRV_LOG(DEBUG, "enable intr immediately");
+	ixgbevf_intr_enable(hw);
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+	return 0;
+}
+
 /**
  * Interrupt handler which shall be registered for alarm callback for delayed
  * handling specific interrupt to wait for the stable nic state. As the
@@ -2484,13 +2633,24 @@ ixgbe_dev_interrupt_delayed_handler(void *param)
  */
 static void
 ixgbe_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
-							void *param)
+			    void *param)
 {
 	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
 	ixgbe_dev_interrupt_get_status(dev);
 	ixgbe_dev_interrupt_action(dev);
 }
 
+static void
+ixgbevf_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+			      void *param)
+{
+	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+	ixgbevf_dev_interrupt_get_status(dev);
+	ixgbevf_dev_interrupt_action(dev);
+}
+
 static int
 ixgbe_dev_led_on(struct rte_eth_dev *dev)
 {
@@ -2988,6 +3148,19 @@ ixgbevf_intr_disable(struct ixgbe_hw *hw)
 	IXGBE_WRITE_FLUSH(hw);
 }
 
+static void
+ixgbevf_intr_enable(struct ixgbe_hw *hw)
+{
+	PMD_INIT_FUNC_TRACE();
+
+	/* VF enable interrupt autoclean */
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIAM, IXGBE_VF_IRQ_ENABLE_MASK);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIAC, IXGBE_VF_IRQ_ENABLE_MASK);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, IXGBE_VF_IRQ_ENABLE_MASK);
+
+	IXGBE_WRITE_FLUSH(hw);
+}
+
 static int
 ixgbevf_dev_configure(struct rte_eth_dev *dev)
 {
@@ -3029,6 +3202,11 @@ ixgbevf_dev_start(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw =
 		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+
 	int err, mask = 0;
 
 	PMD_INIT_FUNC_TRACE();
@@ -3059,6 +3237,42 @@ ixgbevf_dev_start(struct rte_eth_dev *dev)
 
 	ixgbevf_dev_rxtx_start(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle) && !intr_handle->intr_vec) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int), 0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+	ixgbevf_configure_msix(dev);
+
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle))
+			rte_intr_callback_register(intr_handle,
+					       ixgbevf_dev_interrupt_handler,
+					       (void *)dev);
+		else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+	rte_intr_enable(intr_handle);
+
+	/* Re-enable interrupt for VF */
+	ixgbevf_intr_enable(hw);
+
 	return 0;
 }
 
@@ -3066,6 +3280,7 @@ static void
 ixgbevf_dev_stop(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -3082,12 +3297,27 @@ ixgbevf_dev_stop(struct rte_eth_dev *dev)
 	dev->data->scattered_rx = 0;
 
 	ixgbe_dev_clear_queues(dev);
+
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 static void
 ixgbevf_dev_close(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+#ifdef RTE_NEXT_ABI
+	struct rte_pci_device *pci_dev;
+#endif
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -3097,6 +3327,14 @@ ixgbevf_dev_close(struct rte_eth_dev *dev)
 
 	/* reprogram the RAR[0] in case user changed it. */
 	ixgbe_set_rar(hw, 0, hw->mac.addr, 0, IXGBE_RAH_AV);
+
+#ifdef RTE_NEXT_ABI
+	pci_dev = dev->pci_dev;
+	if (pci_dev->intr_handle.intr_vec) {
+		rte_free(pci_dev->intr_handle.intr_vec);
+		pci_dev->intr_handle.intr_vec = NULL;
+	}
+#endif
 }
 
 static void ixgbevf_set_vfta_all(struct rte_eth_dev *dev, bool on)
@@ -3614,6 +3852,269 @@ ixgbe_mirror_rule_reset(struct rte_eth_dev *dev, uint8_t rule_id)
 	return 0;
 }
 
+#ifdef RTE_NEXT_ABI
+static int
+ixgbevf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	mask = IXGBE_READ_REG(hw, IXGBE_VTEIMS);
+	mask |= (1 << queue_id);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, mask);
+
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static int
+ixgbevf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	mask = IXGBE_READ_REG(hw, IXGBE_VTEIMS);
+	mask &= ~(1 << queue_id);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, mask);
+
+	return 0;
+}
+
+static int
+ixgbe_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	if (queue_id < 16) {
+		ixgbe_disable_intr(hw);
+		intr->mask |= (1 << queue_id);
+		ixgbe_enable_intr(dev);
+	} else if (queue_id < 32) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(0));
+		mask &= (1 << queue_id);
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(0), mask);
+	} else if (queue_id < 64) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(1));
+		mask &= (1 << (queue_id - 32));
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(1), mask);
+	}
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static int
+ixgbe_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	if (queue_id < 16) {
+		ixgbe_disable_intr(hw);
+		intr->mask &= ~(1 << queue_id);
+		ixgbe_enable_intr(dev);
+	} else if (queue_id < 32) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(0));
+		mask &= ~(1 << queue_id);
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(0), mask);
+	} else if (queue_id < 64) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(1));
+		mask &= ~(1 << (queue_id - 32));
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(1), mask);
+	}
+
+	return 0;
+}
+
+static void
+ixgbevf_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+		     uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp, idx;
+
+	if (direction == -1) {
+		/* other causes */
+		msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+		tmp = IXGBE_READ_REG(hw, IXGBE_VTIVAR_MISC);
+		tmp &= ~0xFF;
+		tmp |= msix_vector;
+		IXGBE_WRITE_REG(hw, IXGBE_VTIVAR_MISC, tmp);
+	} else {
+		/* rx or tx cause */
+		msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+		idx = ((16 * (queue & 1)) + (8 * direction));
+		tmp = IXGBE_READ_REG(hw, IXGBE_VTIVAR(queue >> 1));
+		tmp &= ~(0xFF << idx);
+		tmp |= (msix_vector << idx);
+		IXGBE_WRITE_REG(hw, IXGBE_VTIVAR(queue >> 1), tmp);
+	}
+}
+
+/**
+ * set the IVAR registers, mapping interrupt causes to vectors
+ * @param hw
+ *  pointer to ixgbe_hw struct
+ * @direction
+ *  0 for Rx, 1 for Tx, -1 for other causes
+ * @queue
+ *  queue to map the corresponding interrupt to
+ * @msix_vector
+ *  the vector to map to the corresponding queue
+ */
+static void
+ixgbe_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+		   uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp, idx;
+
+	msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+	if (hw->mac.type == ixgbe_mac_82598EB) {
+		if (direction == -1)
+			direction = 0;
+		idx = (((direction * 64) + queue) >> 2) & 0x1F;
+		tmp = IXGBE_READ_REG(hw, IXGBE_IVAR(idx));
+		tmp &= ~(0xFF << (8 * (queue & 0x3)));
+		tmp |= (msix_vector << (8 * (queue & 0x3)));
+		IXGBE_WRITE_REG(hw, IXGBE_IVAR(idx), tmp);
+	} else if ((hw->mac.type == ixgbe_mac_82599EB) ||
+			(hw->mac.type == ixgbe_mac_X540)) {
+		if (direction == -1) {
+			/* other causes */
+			idx = ((queue & 1) * 8);
+			tmp = IXGBE_READ_REG(hw, IXGBE_IVAR_MISC);
+			tmp &= ~(0xFF << idx);
+			tmp |= (msix_vector << idx);
+			IXGBE_WRITE_REG(hw, IXGBE_IVAR_MISC, tmp);
+		} else {
+			/* rx or tx causes */
+			idx = ((16 * (queue & 1)) + (8 * direction));
+			tmp = IXGBE_READ_REG(hw, IXGBE_IVAR(queue >> 1));
+			tmp &= ~(0xFF << idx);
+			tmp |= (msix_vector << idx);
+			IXGBE_WRITE_REG(hw, IXGBE_IVAR(queue >> 1), tmp);
+		}
+	}
+}
+#endif
+
+static void
+ixgbevf_configure_msix(struct rte_eth_dev *dev)
+{
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t q_idx;
+	uint32_t vector_idx = 0;
+#endif
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd.
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* Configure all RX queues of VF */
+	for (q_idx = 0; q_idx < dev->data->nb_rx_queues; q_idx++) {
+		/* Force all queues to use vector 0,
+		 * as IXGBE_VF_MAXMSIVECTOR = 1
+		 */
+		ixgbevf_set_ivar_map(hw, 0, q_idx, vector_idx);
+		intr_handle->intr_vec[q_idx] = vector_idx;
+	}
+
+	/* Configure VF Rx queue ivar */
+	ixgbevf_set_ivar_map(hw, -1, 1, vector_idx);
+#endif
+}
+
+/**
+ * Sets up the hardware to properly generate MSI-X interrupts
+ * @hw
+ *  board private structure
+ */
+static void
+ixgbe_configure_msix(struct rte_eth_dev *dev)
+{
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t queue_id, vec = 0;
+	uint32_t mask;
+	uint32_t gpie;
+#endif
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* setup GPIE for MSI-x mode */
+	gpie = IXGBE_READ_REG(hw, IXGBE_GPIE);
+	gpie |= IXGBE_GPIE_MSIX_MODE | IXGBE_GPIE_PBA_SUPPORT |
+		IXGBE_GPIE_OCD | IXGBE_GPIE_EIAME;
+	/* auto clearing and auto setting corresponding bits in EIMS
+	 * when MSI-X interrupt is triggered
+	 */
+	if (hw->mac.type == ixgbe_mac_82598EB) {
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM, IXGBE_EICS_RTX_QUEUE);
+	} else {
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM_EX(0), 0xFFFFFFFF);
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM_EX(1), 0xFFFFFFFF);
+	}
+	IXGBE_WRITE_REG(hw, IXGBE_GPIE, gpie);
+
+	/* Populate the IVAR table and set the ITR values to the
+	 * corresponding register.
+	 */
+	for (queue_id = 0; queue_id < dev->data->nb_rx_queues;
+	     queue_id++) {
+		/* by default, 1:1 mapping */
+		ixgbe_set_ivar_map(hw, 0, queue_id, vec);
+		intr_handle->intr_vec[queue_id] = vec;
+		if (vec < intr_handle->nb_efd - 1)
+			vec++;
+	}
+
+	switch (hw->mac.type) {
+	case ixgbe_mac_82598EB:
+		ixgbe_set_ivar_map(hw, -1, IXGBE_IVAR_OTHER_CAUSES_INDEX,
+				   intr_handle->max_intr - 1);
+		break;
+	case ixgbe_mac_82599EB:
+	case ixgbe_mac_X540:
+		ixgbe_set_ivar_map(hw, -1, 1, intr_handle->max_intr - 1);
+		break;
+	default:
+		break;
+	}
+	IXGBE_WRITE_REG(hw, IXGBE_EITR(queue_id),
+			IXGBE_MIN_INTER_INTERRUPT_INTERVAL_DEFAULT & 0xFFF);
+
+	/* set up to autoclear timer, and the vectors */
+	mask = IXGBE_EIMS_ENABLE_MASK;
+	mask &= ~(IXGBE_EIMS_OTHER |
+		  IXGBE_EIMS_MAILBOX |
+		  IXGBE_EIMS_LSC);
+
+	IXGBE_WRITE_REG(hw, IXGBE_EIAC, mask);
+#endif
+}
+
 static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev,
 	uint16_t queue_idx, uint16_t tx_rate)
 {
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 755b674..d813f65 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -117,6 +117,9 @@
 	ETH_RSS_IPV6_TCP_EX | \
 	ETH_RSS_IPV6_UDP_EX)
 
+#define IXGBE_VF_IRQ_ENABLE_MASK        3          /* vf irq enable mask */
+#define IXGBE_VF_MAXMSIVECTOR           1
+
 /*
  * Information about the fdir mode.
  */
@@ -330,6 +333,7 @@ uint32_t ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev,
 		uint16_t rx_queue_id);
 
 int ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
+int ixgbevf_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
 int ixgbe_dev_rx_init(struct rte_eth_dev *dev);
 
-- 
1.8.1.4

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v14 12/13] igb: enable rx queue interrupts for PF
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (7 preceding siblings ...)
  2015-07-17  6:16  1%     ` [dpdk-dev] [PATCH v14 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
@ 2015-07-17  6:16  2%     ` Cunming Liang
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch does the following for the igb PF:
- Set up the NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions
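
For reference, below is a minimal sketch of how an application is expected to
consume such a per-queue interrupt through the ethdev/eal APIs added earlier
in this series (wait_on_rxq() is only an illustrative helper name, error
handling is trimmed, and the epoll registration would normally be done once
at startup rather than on every wait):

static int
wait_on_rxq(uint8_t port_id, uint16_t queue_id)
{
	struct rte_epoll_event ev;
	int ret;

	/* register this queue's eventfd with the calling thread's epoll fd */
	ret = rte_eth_dev_rx_intr_ctl_q(port_id, queue_id,
					RTE_EPOLL_PER_THREAD,
					RTE_INTR_EVENT_ADD,
					(void *)(uintptr_t)queue_id);
	if (ret)
		return ret;

	/* unmask the queue's one-shot interrupt (an EIMS write in this PMD) */
	rte_eth_dev_rx_intr_enable(port_id, queue_id);

	/* block until the queue's MSI-X vector fires */
	rte_epoll_wait(RTE_EPOLL_PER_THREAD, &ev, 1, -1);

	/* mask it again (an EIMC write) and go back to polling */
	return rte_eth_dev_rx_intr_disable(port_id, queue_id);
}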

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v9 changes
 - move queue-vec mapping init from dev_configure to dev_start
 - fix link interrupt not working issue in vfio-msix

v8 changes
 - add vfio-msi/vfio-legacy and uio-legacy support

v7 changes
 - add condition check when intr vector is not enabled

v6 changes
 - fill queue-vector mapping table

v5 changes
 - Rebase the patchset onto the HEAD

v3 changes
 - Remove unnecessary variables in e1000_mac_info
 - Remove spinlok from PMD

v2 changes
 - Consolidate review comments related to coding style

 drivers/net/e1000/igb_ethdev.c | 311 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 277 insertions(+), 34 deletions(-)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index eb97218..fd92c80 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -104,6 +104,9 @@ static int  eth_igb_flow_ctrl_get(struct rte_eth_dev *dev,
 static int  eth_igb_flow_ctrl_set(struct rte_eth_dev *dev,
 				struct rte_eth_fc_conf *fc_conf);
 static int eth_igb_lsc_interrupt_setup(struct rte_eth_dev *dev);
+#ifdef RTE_NEXT_ABI
+static int eth_igb_rxq_interrupt_setup(struct rte_eth_dev *dev);
+#endif
 static int eth_igb_interrupt_get_status(struct rte_eth_dev *dev);
 static int eth_igb_interrupt_action(struct rte_eth_dev *dev);
 static void eth_igb_interrupt_handler(struct rte_intr_handle *handle,
@@ -201,7 +204,6 @@ static int eth_igb_filter_ctrl(struct rte_eth_dev *dev,
 		     enum rte_filter_type filter_type,
 		     enum rte_filter_op filter_op,
 		     void *arg);
-
 static int eth_igb_set_mc_addr_list(struct rte_eth_dev *dev,
 				    struct ether_addr *mc_addr_set,
 				    uint32_t nb_mc_addr);
@@ -212,6 +214,17 @@ static int igb_timesync_read_rx_timestamp(struct rte_eth_dev *dev,
 					  uint32_t flags);
 static int igb_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
 					  struct timespec *timestamp);
+#ifdef RTE_NEXT_ABI
+static int eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					uint16_t queue_id);
+static int eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					 uint16_t queue_id);
+static void eth_igb_assign_msix_vector(struct e1000_hw *hw, int8_t direction,
+				       uint8_t queue, uint8_t msix_vector);
+static void eth_igb_write_ivar(struct e1000_hw *hw, uint8_t msix_vector,
+			       uint8_t index, uint8_t offset);
+#endif
+static void eth_igb_configure_msix_intr(struct rte_eth_dev *dev);
 
 /*
  * Define VF Stats MACRO for Non "cleared on read" register
@@ -272,6 +285,10 @@ static const struct eth_dev_ops eth_igb_ops = {
 	.vlan_tpid_set        = eth_igb_vlan_tpid_set,
 	.vlan_offload_set     = eth_igb_vlan_offload_set,
 	.rx_queue_setup       = eth_igb_rx_queue_setup,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = eth_igb_rx_queue_intr_enable,
+	.rx_queue_intr_disable = eth_igb_rx_queue_intr_disable,
+#endif
 	.rx_queue_release     = eth_igb_rx_queue_release,
 	.rx_queue_count       = eth_igb_rx_queue_count,
 	.rx_descriptor_done   = eth_igb_rx_descriptor_done,
@@ -609,12 +626,6 @@ eth_igb_dev_init(struct rte_eth_dev *eth_dev)
 		     eth_dev->data->port_id, pci_dev->id.vendor_id,
 		     pci_dev->id.device_id);
 
-	rte_intr_callback_register(&(pci_dev->intr_handle),
-		eth_igb_interrupt_handler, (void *)eth_dev);
-
-	/* enable uio intr after callback register */
-	rte_intr_enable(&(pci_dev->intr_handle));
-
 	/* enable support intr */
 	igb_intr_enable(eth_dev);
 
@@ -777,7 +788,11 @@ eth_igb_start(struct rte_eth_dev *dev)
 {
 	struct e1000_hw *hw =
 		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-	int ret, i, mask;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
+	int ret, mask;
 	uint32_t ctrl_ext;
 
 	PMD_INIT_FUNC_TRACE();
@@ -817,6 +832,29 @@ eth_igb_start(struct rte_eth_dev *dev)
 	/* configure PF module if SRIOV enabled */
 	igb_pf_host_configure(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle)) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int), 0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+
+	/* configure msix for rx interrupt */
+	eth_igb_configure_msix_intr(dev);
+
 	/* Configure for OS presence */
 	igb_init_manageability(hw);
 
@@ -844,33 +882,9 @@ eth_igb_start(struct rte_eth_dev *dev)
 		igb_vmdq_vlan_hw_filter_enable(dev);
 	}
 
-	/*
-	 * Configure the Interrupt Moderation register (EITR) with the maximum
-	 * possible value (0xFFFF) to minimize "System Partial Write" issued by
-	 * spurious [DMA] memory updates of RX and TX ring descriptors.
-	 *
-	 * With a EITR granularity of 2 microseconds in the 82576, only 7/8
-	 * spurious memory updates per second should be expected.
-	 * ((65535 * 2) / 1000.1000 ~= 0.131 second).
-	 *
-	 * Because interrupts are not used at all, the MSI-X is not activated
-	 * and interrupt moderation is controlled by EITR[0].
-	 *
-	 * Note that having [almost] disabled memory updates of RX and TX ring
-	 * descriptors through the Interrupt Moderation mechanism, memory
-	 * updates of ring descriptors are now moderated by the configurable
-	 * value of Write-Back Threshold registers.
-	 */
 	if ((hw->mac.type == e1000_82576) || (hw->mac.type == e1000_82580) ||
 		(hw->mac.type == e1000_i350) || (hw->mac.type == e1000_i210) ||
 		(hw->mac.type == e1000_i211)) {
-		uint32_t ivar;
-
-		/* Enable all RX & TX queues in the IVAR registers */
-		ivar = (uint32_t) ((E1000_IVAR_VALID << 16) | E1000_IVAR_VALID);
-		for (i = 0; i < 8; i++)
-			E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, i, ivar);
-
 		/* Configure EITR with the maximum possible value (0xFFFF) */
 		E1000_WRITE_REG(hw, E1000_EITR(0), 0xFFFF);
 	}
@@ -921,8 +935,25 @@ eth_igb_start(struct rte_eth_dev *dev)
 	e1000_setup_link(hw);
 
 	/* check if lsc interrupt feature is enabled */
-	if (dev->data->dev_conf.intr_conf.lsc != 0)
-		ret = eth_igb_lsc_interrupt_setup(dev);
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle)) {
+			rte_intr_callback_register(intr_handle,
+						   eth_igb_interrupt_handler,
+						   (void *)dev);
+			eth_igb_lsc_interrupt_setup(dev);
+		} else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+#ifdef RTE_NEXT_ABI
+	/* check if rxq interrupt is enabled */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		eth_igb_rxq_interrupt_setup(dev);
+#endif
+
+	/* enable uio/vfio intr/eventfd mapping */
+	rte_intr_enable(intr_handle);
 
 	/* resume enabled intr since hw reset */
 	igb_intr_enable(dev);
@@ -955,8 +986,13 @@ eth_igb_stop(struct rte_eth_dev *dev)
 	struct e1000_flex_filter *p_flex;
 	struct e1000_5tuple_filter *p_5tuple, *p_5tuple_next;
 	struct e1000_2tuple_filter *p_2tuple, *p_2tuple_next;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 
 	igb_intr_disable(hw);
+
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
 	igb_pf_reset_hw(hw);
 	E1000_WRITE_REG(hw, E1000_WUC, 0);
 
@@ -1005,6 +1041,15 @@ eth_igb_stop(struct rte_eth_dev *dev)
 		rte_free(p_2tuple);
 	}
 	filter_info->twotuple_mask = 0;
+
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 static void
@@ -1012,6 +1057,9 @@ eth_igb_close(struct rte_eth_dev *dev)
 {
 	struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct rte_eth_link link;
+#ifdef RTE_NEXT_ABI
+	struct rte_pci_device *pci_dev;
+#endif
 
 	eth_igb_stop(dev);
 	e1000_phy_hw_reset(hw);
@@ -1029,6 +1077,14 @@ eth_igb_close(struct rte_eth_dev *dev)
 
 	igb_dev_clear_queues(dev);
 
+#ifdef RTE_NEXT_ABI
+	pci_dev = dev->pci_dev;
+	if (pci_dev->intr_handle.intr_vec) {
+		rte_free(pci_dev->intr_handle.intr_vec);
+		pci_dev->intr_handle.intr_vec = NULL;
+	}
+#endif
+
 	memset(&link, 0, sizeof(link));
 	rte_igb_dev_atomic_write_link_status(dev, &link);
 }
@@ -1853,6 +1909,35 @@ eth_igb_lsc_interrupt_setup(struct rte_eth_dev *dev)
 	return 0;
 }
 
+#ifdef RTE_NEXT_ABI
+/* It clears the interrupt causes and enables the interrupt.
+ * It will be called once only during nic initialized.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+static int eth_igb_rxq_interrupt_setup(struct rte_eth_dev *dev)
+{
+	uint32_t mask, regval;
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct rte_eth_dev_info dev_info;
+
+	memset(&dev_info, 0, sizeof(dev_info));
+	eth_igb_infos_get(dev, &dev_info);
+
+	mask = 0xFFFFFFFF >> (32 - dev_info.max_rx_queues);
+	regval = E1000_READ_REG(hw, E1000_EIMS);
+	E1000_WRITE_REG(hw, E1000_EIMS, regval | mask);
+
+	return 0;
+}
+#endif
+
 /*
  * It reads ICR and gets interrupt causes, check it and set a bit flag
  * to update link status.
@@ -3788,5 +3873,163 @@ static struct rte_driver pmd_igbvf_drv = {
 	.init = rte_igbvf_pmd_init,
 };
 
+#ifdef RTE_NEXT_ABI
+static int
+eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t mask = 1 << queue_id;
+
+	E1000_WRITE_REG(hw, E1000_EIMC, mask);
+	E1000_WRITE_FLUSH(hw);
+
+	return 0;
+}
+
+static int
+eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t mask = 1 << queue_id;
+	uint32_t regval;
+
+	regval = E1000_READ_REG(hw, E1000_EIMS);
+	E1000_WRITE_REG(hw, E1000_EIMS, regval | mask);
+	E1000_WRITE_FLUSH(hw);
+
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static void
+eth_igb_write_ivar(struct e1000_hw *hw, uint8_t  msix_vector,
+		   uint8_t index, uint8_t offset)
+{
+	uint32_t val = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index);
+
+	/* clear bits */
+	val &= ~((uint32_t)0xFF << offset);
+
+	/* write vector and valid bit */
+	val |= (msix_vector | E1000_IVAR_VALID) << offset;
+
+	E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, val);
+}
+
+static void
+eth_igb_assign_msix_vector(struct e1000_hw *hw, int8_t direction,
+			   uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp = 0;
+
+	if (hw->mac.type == e1000_82575) {
+		if (direction == 0)
+			tmp = E1000_EICR_RX_QUEUE0 << queue;
+		else if (direction == 1)
+			tmp = E1000_EICR_TX_QUEUE0 << queue;
+		E1000_WRITE_REG(hw, E1000_MSIXBM(msix_vector), tmp);
+	} else if (hw->mac.type == e1000_82576) {
+		if ((direction == 0) || (direction == 1))
+			eth_igb_write_ivar(hw, msix_vector, queue & 0x7,
+					   ((queue & 0x8) << 1) +
+					   8 * direction);
+	} else if ((hw->mac.type == e1000_82580) ||
+			(hw->mac.type == e1000_i350) ||
+			(hw->mac.type == e1000_i354) ||
+			(hw->mac.type == e1000_i210) ||
+			(hw->mac.type == e1000_i211)) {
+		if ((direction == 0) || (direction == 1))
+			eth_igb_write_ivar(hw, msix_vector,
+					   queue >> 1,
+					   ((queue & 0x1) << 4) +
+					   8 * direction);
+	}
+}
+#endif
+
+/* Sets up the hardware to generate MSI-X interrupts properly
+ * @hw
+ *  board private structure
+ */
+static void
+eth_igb_configure_msix_intr(struct rte_eth_dev *dev)
+{
+#ifdef RTE_NEXT_ABI
+	int queue_id;
+	uint32_t tmpval, regval, intr_mask;
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t vec = 0;
+#endif
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* set interrupt vector for other causes */
+	if (hw->mac.type == e1000_82575) {
+		tmpval = E1000_READ_REG(hw, E1000_CTRL_EXT);
+		/* enable MSI-X PBA support */
+		tmpval |= E1000_CTRL_EXT_PBA_CLR;
+
+		/* Auto-Mask interrupts upon ICR read */
+		tmpval |= E1000_CTRL_EXT_EIAME;
+		tmpval |= E1000_CTRL_EXT_IRCA;
+
+		E1000_WRITE_REG(hw, E1000_CTRL_EXT, tmpval);
+
+		/* enable msix_other interrupt */
+		E1000_WRITE_REG_ARRAY(hw, E1000_MSIXBM(0), 0, E1000_EIMS_OTHER);
+		regval = E1000_READ_REG(hw, E1000_EIAC);
+		E1000_WRITE_REG(hw, E1000_EIAC, regval | E1000_EIMS_OTHER);
+		regval = E1000_READ_REG(hw, E1000_EIAM);
+		E1000_WRITE_REG(hw, E1000_EIMS, regval | E1000_EIMS_OTHER);
+	} else if ((hw->mac.type == e1000_82576) ||
+			(hw->mac.type == e1000_82580) ||
+			(hw->mac.type == e1000_i350) ||
+			(hw->mac.type == e1000_i354) ||
+			(hw->mac.type == e1000_i210) ||
+			(hw->mac.type == e1000_i211)) {
+		/* turn on MSI-X capability first */
+		E1000_WRITE_REG(hw, E1000_GPIE, E1000_GPIE_MSIX_MODE |
+					E1000_GPIE_PBA | E1000_GPIE_EIAME |
+					E1000_GPIE_NSICR);
+
+		intr_mask = (1 << intr_handle->max_intr) - 1;
+		regval = E1000_READ_REG(hw, E1000_EIAC);
+		E1000_WRITE_REG(hw, E1000_EIAC, regval | intr_mask);
+
+		/* enable msix_other interrupt */
+		regval = E1000_READ_REG(hw, E1000_EIMS);
+		E1000_WRITE_REG(hw, E1000_EIMS, regval | intr_mask);
+		tmpval = (dev->data->nb_rx_queues | E1000_IVAR_VALID) << 8;
+		E1000_WRITE_REG(hw, E1000_IVAR_MISC, tmpval);
+	}
+
+	/* use EIAM to auto-mask when MSI-X interrupt
+	 * is asserted, this saves a register write for every interrupt
+	 */
+	intr_mask = (1 << intr_handle->nb_efd) - 1;
+	regval = E1000_READ_REG(hw, E1000_EIAM);
+	E1000_WRITE_REG(hw, E1000_EIAM, regval | intr_mask);
+
+	for (queue_id = 0; queue_id < dev->data->nb_rx_queues; queue_id++) {
+		eth_igb_assign_msix_vector(hw, 0, queue_id, vec);
+		intr_handle->intr_vec[queue_id] = vec;
+		if (vec < intr_handle->nb_efd - 1)
+			vec++;
+	}
+
+	E1000_WRITE_FLUSH(hw);
+#endif
+}
+
 PMD_REGISTER_DRIVER(pmd_igb_drv);
 PMD_REGISTER_DRIVER(pmd_igbvf_drv);
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v14 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (8 preceding siblings ...)
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 12/13] igb: enable rx queue interrupts for PF Cunming Liang
@ 2015-07-17  6:16  2%     ` Cunming Liang
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-17  6:16 UTC (permalink / raw)
  To: dev, thomas.monjalon, david.marchand; +Cc: shemming

The patch demonstrates how to handle per-rx-queue interrupts in a NAPI-like
implementation in userspace. The working thread mainly runs in polling mode
and switches to interrupt mode only if no packet is received in recent polls.
The working thread returns to polling mode immediately once it receives an
interrupt notification caused by incoming packets.
The sample keeps running in polling mode if the bound PMD does not support
rx interrupts yet. Currently only ixgbe (PF/VF) and igb support them.
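
In outline, the per-queue logic added to the main loop is roughly the
following (a simplified sketch only, with an illustrative helper name; the
real code below also drains tx buffers, scales the CPU frequency, handles
several queues per lcore and registers the eventfds with
rte_eth_dev_rx_intr_ctl_q() beforehand):

static void
rxq_napi_loop(uint8_t port_id, uint16_t queue_id)
{
	struct rte_mbuf *pkts[MAX_PKT_BURST];
	struct rte_epoll_event ev;
	uint32_t idle = 0;
	uint16_t nb_rx;

	for (;;) {
		nb_rx = rte_eth_rx_burst(port_id, queue_id, pkts, MAX_PKT_BURST);
		if (nb_rx > 0) {
			/* forward/free the burst here, stay in polling mode */
			idle = 0;
			continue;
		}
		if (++idle < SUSPEND_THRESHOLD) {
			/* short quiet period: pause briefly, keep polling */
			rte_delay_us(MINIMUM_SLEEP_TIME);
			continue;
		}
		/* quiet for a while: arm the one-shot interrupt and sleep */
		rte_eth_dev_rx_intr_enable(port_id, queue_id);
		rte_epoll_wait(RTE_EPOLL_PER_THREAD, &ev, 1, -1);
		idle = 0;	/* traffic arrived, resume polling at once */
	}
}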

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - reword commit comments

v7 changes
 - using new APIs
 - demo multiple port/queue pair wait on the same epoll instance

v6 changes
 - Split event fd add and wait

v5 changes
 - Change invoked function name and parameter to accomodate EAL change

v3 changes
 - Add spinlock to ensure thread safe when accessing interrupt mask
   register

v2 changes
 - Remove unused function which is for debug purpose

 examples/l3fwd-power/main.c | 202 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 162 insertions(+), 40 deletions(-)

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index b3c5f43..bec78e1 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -74,12 +74,14 @@
 #include <rte_string_fns.h>
 #include <rte_timer.h>
 #include <rte_power.h>
+#include <rte_eal.h>
+#include <rte_spinlock.h>
 
 #define RTE_LOGTYPE_L3FWD_POWER RTE_LOGTYPE_USER1
 
 #define MAX_PKT_BURST 32
 
-#define MIN_ZERO_POLL_COUNT 5
+#define MIN_ZERO_POLL_COUNT 10
 
 /* around 100ms at 2 Ghz */
 #define TIMER_RESOLUTION_CYCLES           200000000ULL
@@ -153,6 +155,9 @@ static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
 /* ethernet addresses of ports */
 static struct ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];
 
+/* per-port spinlocks serializing rx interrupt enabling */
+static rte_spinlock_t locks[RTE_MAX_ETHPORTS];
+
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;
 /* Ports set in promiscuous mode off by default. */
@@ -185,6 +190,9 @@ struct lcore_rx_queue {
 #define MAX_TX_QUEUE_PER_PORT RTE_MAX_ETHPORTS
 #define MAX_RX_QUEUE_PER_PORT 128
 
+#define MAX_RX_QUEUE_INTERRUPT_PER_PORT 16
+
+
 #define MAX_LCORE_PARAMS 1024
 struct lcore_params {
 	uint8_t port_id;
@@ -211,7 +219,7 @@ static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
 
 static struct rte_eth_conf port_conf = {
 	.rxmode = {
-		.mq_mode	= ETH_MQ_RX_RSS,
+		.mq_mode = ETH_MQ_RX_RSS,
 		.max_rx_pkt_len = ETHER_MAX_LEN,
 		.split_hdr_size = 0,
 		.header_split   = 0, /**< Header Split disabled */
@@ -223,11 +231,14 @@ static struct rte_eth_conf port_conf = {
 	.rx_adv_conf = {
 		.rss_conf = {
 			.rss_key = NULL,
-			.rss_hf = ETH_RSS_IP,
+			.rss_hf = ETH_RSS_UDP,
 		},
 	},
 	.txmode = {
-		.mq_mode = ETH_DCB_NONE,
+		.mq_mode = ETH_MQ_TX_NONE,
+	},
+	.intr_conf = {
+		.lsc = 1,
 	},
 };
 
@@ -399,19 +410,22 @@ power_timer_cb(__attribute__((unused)) struct rte_timer *tim,
 	/* accumulate total execution time in us when callback is invoked */
 	sleep_time_ratio = (float)(stats[lcore_id].sleep_time) /
 					(float)SCALING_PERIOD;
-
 	/**
 	 * check whether need to scale down frequency a step if it sleep a lot.
 	 */
-	if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD)
-		rte_power_freq_down(lcore_id);
+	if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
+		if (rte_power_freq_down)
+			rte_power_freq_down(lcore_id);
+	}
 	else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
-		stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST)
+		stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
 		/**
 		 * scale down a step if average packet per iteration less
 		 * than expectation.
 		 */
-		rte_power_freq_down(lcore_id);
+		if (rte_power_freq_down)
+			rte_power_freq_down(lcore_id);
+	}
 
 	/**
 	 * initialize another timer according to current frequency to ensure
@@ -712,22 +726,20 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,
 
 }
 
-#define SLEEP_GEAR1_THRESHOLD            100
-#define SLEEP_GEAR2_THRESHOLD            1000
+#define MINIMUM_SLEEP_TIME         1
+#define SUSPEND_THRESHOLD          300
 
 static inline uint32_t
 power_idle_heuristic(uint32_t zero_rx_packet_count)
 {
-	/* If zero count is less than 100, use it as the sleep time in us */
-	if (zero_rx_packet_count < SLEEP_GEAR1_THRESHOLD)
-		return zero_rx_packet_count;
-	/* If zero count is less than 1000, sleep time should be 100 us */
-	else if ((zero_rx_packet_count >= SLEEP_GEAR1_THRESHOLD) &&
-			(zero_rx_packet_count < SLEEP_GEAR2_THRESHOLD))
-		return SLEEP_GEAR1_THRESHOLD;
-	/* If zero count is greater than 1000, sleep time should be 1000 us */
-	else if (zero_rx_packet_count >= SLEEP_GEAR2_THRESHOLD)
-		return SLEEP_GEAR2_THRESHOLD;
+	/* If zero count is below SUSPEND_THRESHOLD, sleep 1 us */
+	if (zero_rx_packet_count < SUSPEND_THRESHOLD)
+		return MINIMUM_SLEEP_TIME;
+	/* Otherwise sleep SUSPEND_THRESHOLD us, roughly the minimum
+		latency of switching from C3/C6 back to C0
+	*/
+	else
+		return SUSPEND_THRESHOLD;
 
 	return 0;
 }
@@ -767,6 +779,84 @@ power_freq_scaleup_heuristic(unsigned lcore_id,
 	return FREQ_CURRENT;
 }
 
+/**
+ * force polling thread sleep until one-shot rx interrupt triggers
+ * @param num
+ *  Number of rx queue interrupt events to wait for,
+ *  i.e. the number of queues registered on this
+ *  thread's epoll instance.
+ * @return
+ *  0 on success
+ */
+static int
+sleep_until_rx_interrupt(int num)
+{
+	struct rte_epoll_event event[num];
+	int n, i;
+	uint8_t port_id, queue_id;
+	void *data;
+
+	RTE_LOG(INFO, L3FWD_POWER,
+		"lcore %u sleeps until interrupt triggers\n",
+		rte_lcore_id());
+
+	n = rte_epoll_wait(RTE_EPOLL_PER_THREAD, event, num, -1);
+	for (i = 0; i < n; i++) {
+		data = event[i].epdata.data;
+		port_id = ((uintptr_t)data) >> CHAR_BIT;
+		queue_id = ((uintptr_t)data) &
+			RTE_LEN2MASK(CHAR_BIT, uint8_t);
+		RTE_LOG(INFO, L3FWD_POWER,
+			"lcore %u is waked up from rx interrupt on"
+			" port %d queue %d\n",
+			rte_lcore_id(), port_id, queue_id);
+	}
+
+	return 0;
+}
+
+static void turn_on_intr(struct lcore_conf *qconf)
+{
+	int i;
+	struct lcore_rx_queue *rx_queue;
+	uint8_t port_id, queue_id;
+
+	for (i = 0; i < qconf->n_rx_queue; ++i) {
+		rx_queue = &(qconf->rx_queue_list[i]);
+		port_id = rx_queue->port_id;
+		queue_id = rx_queue->queue_id;
+
+		rte_spinlock_lock(&(locks[port_id]));
+		rte_eth_dev_rx_intr_enable(port_id, queue_id);
+		rte_spinlock_unlock(&(locks[port_id]));
+	}
+}
+
+static int event_register(struct lcore_conf *qconf)
+{
+	struct lcore_rx_queue *rx_queue;
+	uint8_t portid, queueid;
+	uint32_t data;
+	int ret;
+	int i;
+
+	for (i = 0; i < qconf->n_rx_queue; ++i) {
+		rx_queue = &(qconf->rx_queue_list[i]);
+		portid = rx_queue->port_id;
+		queueid = rx_queue->queue_id;
+		data = portid << CHAR_BIT | queueid;
+
+		ret = rte_eth_dev_rx_intr_ctl_q(portid, queueid,
+						RTE_EPOLL_PER_THREAD,
+						RTE_INTR_EVENT_ADD,
+						(void *)((uintptr_t)data));
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 /* main processing loop */
 static int
 main_loop(__attribute__((unused)) void *dummy)
@@ -780,9 +870,9 @@ main_loop(__attribute__((unused)) void *dummy)
 	struct lcore_conf *qconf;
 	struct lcore_rx_queue *rx_queue;
 	enum freq_scale_hint_t lcore_scaleup_hint;
-
 	uint32_t lcore_rx_idle_count = 0;
 	uint32_t lcore_idle_hint = 0;
+	int intr_en = 0;
 
 	const uint64_t drain_tsc = (rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
 
@@ -799,13 +889,18 @@ main_loop(__attribute__((unused)) void *dummy)
 	RTE_LOG(INFO, L3FWD_POWER, "entering main loop on lcore %u\n", lcore_id);
 
 	for (i = 0; i < qconf->n_rx_queue; i++) {
-
 		portid = qconf->rx_queue_list[i].port_id;
 		queueid = qconf->rx_queue_list[i].queue_id;
 		RTE_LOG(INFO, L3FWD_POWER, " -- lcoreid=%u portid=%hhu "
 			"rxqueueid=%hhu\n", lcore_id, portid, queueid);
 	}
 
+	/* add into event wait list */
+	if (event_register(qconf) == 0)
+		intr_en = 1;
+	else
+		RTE_LOG(INFO, L3FWD_POWER, "RX interrupt won't enable.\n");
+
 	while (1) {
 		stats[lcore_id].nb_iteration_looped++;
 
@@ -840,6 +935,7 @@ main_loop(__attribute__((unused)) void *dummy)
 			prev_tsc_power = cur_tsc_power;
 		}
 
+start_rx:
 		/*
 		 * Read packet from RX queues
 		 */
@@ -853,6 +949,7 @@ main_loop(__attribute__((unused)) void *dummy)
 
 			nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
 								MAX_PKT_BURST);
+
 			stats[lcore_id].nb_rx_processed += nb_rx;
 			if (unlikely(nb_rx == 0)) {
 				/**
@@ -915,10 +1012,13 @@ main_loop(__attribute__((unused)) void *dummy)
 						rx_queue->freq_up_hint;
 			}
 
-			if (lcore_scaleup_hint == FREQ_HIGHEST)
-				rte_power_freq_max(lcore_id);
-			else if (lcore_scaleup_hint == FREQ_HIGHER)
-				rte_power_freq_up(lcore_id);
+			if (lcore_scaleup_hint == FREQ_HIGHEST) {
+				if (rte_power_freq_max)
+					rte_power_freq_max(lcore_id);
+			} else if (lcore_scaleup_hint == FREQ_HIGHER) {
+				if (rte_power_freq_up)
+					rte_power_freq_up(lcore_id);
+			}
 		} else {
 			/**
 			 * All Rx queues empty in recent consecutive polls,
@@ -933,16 +1033,23 @@ main_loop(__attribute__((unused)) void *dummy)
 					lcore_idle_hint = rx_queue->idle_hint;
 			}
 
-			if ( lcore_idle_hint < SLEEP_GEAR1_THRESHOLD)
+			if (lcore_idle_hint < SUSPEND_THRESHOLD)
 				/**
 				 * execute "pause" instruction to avoid context
-				 * switch for short sleep.
+				 * switch which generally take hundred of
+				 * microseconds for short sleep.
 				 */
 				rte_delay_us(lcore_idle_hint);
-			else
-				/* long sleep force runing thread to suspend */
-				usleep(lcore_idle_hint);
-
+			else {
+				/* suspend until rx interrupt triggers */
+				if (intr_en) {
+					turn_on_intr(qconf);
+					sleep_until_rx_interrupt(
+						qconf->n_rx_queue);
+				}
+				/* start receiving packets immediately */
+				goto start_rx;
+			}
 			stats[lcore_id].sleep_time += lcore_idle_hint;
 		}
 	}
@@ -1273,7 +1380,7 @@ setup_hash(int socketid)
 	char s[64];
 
 	/* create ipv4 hash */
-	snprintf(s, sizeof(s), "ipv4_l3fwd_hash_%d", socketid);
+	rte_snprintf(s, sizeof(s), "ipv4_l3fwd_hash_%d", socketid);
 	ipv4_l3fwd_hash_params.name = s;
 	ipv4_l3fwd_hash_params.socket_id = socketid;
 	ipv4_l3fwd_lookup_struct[socketid] =
@@ -1283,7 +1390,7 @@ setup_hash(int socketid)
 				"socket %d\n", socketid);
 
 	/* create ipv6 hash */
-	snprintf(s, sizeof(s), "ipv6_l3fwd_hash_%d", socketid);
+	rte_snprintf(s, sizeof(s), "ipv6_l3fwd_hash_%d", socketid);
 	ipv6_l3fwd_hash_params.name = s;
 	ipv6_l3fwd_hash_params.socket_id = socketid;
 	ipv6_l3fwd_lookup_struct[socketid] =
@@ -1477,6 +1584,7 @@ main(int argc, char **argv)
 	unsigned lcore_id;
 	uint64_t hz;
 	uint32_t n_tx_queue, nb_lcores;
+	uint32_t dev_rxq_num, dev_txq_num;
 	uint8_t portid, nb_rx_queue, queue, socketid;
 
 	/* catch SIGINT and restore cpufreq governor to ondemand */
@@ -1526,10 +1634,19 @@ main(int argc, char **argv)
 		printf("Initializing port %d ... ", portid );
 		fflush(stdout);
 
+		rte_eth_dev_info_get(portid, &dev_info);
+		dev_rxq_num = dev_info.max_rx_queues;
+		dev_txq_num = dev_info.max_tx_queues;
+
 		nb_rx_queue = get_port_n_rx_queues(portid);
+		if (nb_rx_queue > dev_rxq_num)
+			rte_exit(EXIT_FAILURE,
+				"Cannot configure not existed rxq: "
+				"port=%d\n", portid);
+
 		n_tx_queue = nb_lcores;
-		if (n_tx_queue > MAX_TX_QUEUE_PER_PORT)
-			n_tx_queue = MAX_TX_QUEUE_PER_PORT;
+		if (n_tx_queue > dev_txq_num)
+			n_tx_queue = dev_txq_num;
 		printf("Creating queues: nb_rxq=%d nb_txq=%u... ",
 			nb_rx_queue, (unsigned)n_tx_queue );
 		ret = rte_eth_dev_configure(portid, nb_rx_queue,
@@ -1553,6 +1670,9 @@ main(int argc, char **argv)
 			if (rte_lcore_is_enabled(lcore_id) == 0)
 				continue;
 
+			if (queueid >= dev_txq_num)
+				continue;
+
 			if (numa_on)
 				socketid = \
 				(uint8_t)rte_lcore_to_socket_id(lcore_id);
@@ -1587,8 +1707,9 @@ main(int argc, char **argv)
 		/* init power management library */
 		ret = rte_power_init(lcore_id);
 		if (ret)
-			rte_exit(EXIT_FAILURE, "Power management library "
-				"initialization failed on core%u\n", lcore_id);
+			rte_log(RTE_LOG_ERR, RTE_LOGTYPE_POWER,
+				"Power management library initialization "
+				"failed on core%u", lcore_id);
 
 		/* init timer structures for each enabled lcore */
 		rte_timer_init(&power_timers[lcore_id]);
@@ -1636,7 +1757,6 @@ main(int argc, char **argv)
 		if (ret < 0)
 			rte_exit(EXIT_FAILURE, "rte_eth_dev_start: err=%d, "
 						"port=%d\n", ret, portid);
-
 		/*
 		 * If enabled, put device in promiscuous mode.
 		 * This allows IO forwarding mode to forward packets
@@ -1645,6 +1765,8 @@ main(int argc, char **argv)
 		 */
 		if (promiscuous_on)
 			rte_eth_promiscuous_enable(portid);
+		/* initialize spinlock for each port */
+		rte_spinlock_init(&(locks[portid]));
 	}
 
 	check_all_ports_link_status((uint8_t)nb_ports, enabled_port_mask);
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
  2015-07-16 17:07 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline Cristian Dumitrescu
@ 2015-07-17  7:54  4% ` Gajdzica, MaciejX T
  2015-07-17  8:10  4% ` Mrzyglod, DanielX T
  2015-07-17 12:03  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-17  7:54 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:08 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
  2015-07-16 16:59 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_table Cristian Dumitrescu
@ 2015-07-17  7:54  4% ` Gajdzica, MaciejX T
  2015-07-17  8:09  4% ` Mrzyglod, DanielX T
  2015-07-17 12:02  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-17  7:54 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
  2015-07-16 15:27 14% [dpdk-dev] [PATCH v2] " Cristian Dumitrescu
  2015-07-16 15:51  4% ` Singh, Jasvinder
@ 2015-07-17  7:56  4% ` Gajdzica, MaciejX T
  2015-07-17  8:08  4% ` Mrzyglod, DanielX T
  2 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-17  7:56 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 5:27 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
> 
> v2 changes:
> -text simplification
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
  2015-07-16 15:27 14% [dpdk-dev] [PATCH v2] " Cristian Dumitrescu
  2015-07-16 15:51  4% ` Singh, Jasvinder
  2015-07-17  7:56  4% ` Gajdzica, MaciejX T
@ 2015-07-17  8:08  4% ` Mrzyglod, DanielX T
  2 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-17  8:08 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 5:27 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc: announce ABI change for librte_port
> 
> v2 changes:
> -text simplification
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
  2015-07-16 16:59 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_table Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
@ 2015-07-17  8:09  4% ` Mrzyglod, DanielX T
  2015-07-17 12:02  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-17  8:09 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
  2015-07-16 17:07 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
@ 2015-07-17  8:10  4% ` Mrzyglod, DanielX T
  2015-07-17 12:03  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-17  8:10 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 7:08 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 13:59  4%     ` Neil Horman
@ 2015-07-17 11:45  7%       ` Mcnamara, John
  2015-07-17 12:25  4%         ` Neil Horman
  2015-07-31  9:03  7%       ` Mcnamara, John
  1 sibling, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-17 11:45 UTC (permalink / raw)
  To: Neil Horman, Chao Zhu; +Cc: dev

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Monday, July 13, 2015 3:00 PM
> To: Mcnamara, John
> Cc: dev@dpdk.org; vladz@cloudius-systems.com
> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> 
> > > > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> > > >
> > > >
> >
> Thank you, I'll ack as soon as Chao confirms its not a problem on ppc Neil

Hi,

Just pinging Chao Zhu on this again so that it isn't forgotten.

Neil, just to be clear, are you looking for a validate-abi.sh check on PPC?

Just for context, the lro flag doesn't seem to be used anywhere that would be affected by endianness:

    $ ag -w "\->lro"             
    drivers/net/ixgbe/ixgbe_rxtx.c
    3767:   if (dev->data->lro) {
    3967:   dev->data->lro = 1;

    drivers/net/ixgbe/ixgbe_ethdev.c
    1689:   dev->data->lro = 0;

John.
-- 

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
  2015-07-16 16:59 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_table Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
  2015-07-17  8:09  4% ` Mrzyglod, DanielX T
@ 2015-07-17 12:02  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-17 12:02 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 6:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_table
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
  2015-07-16 17:07 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline Cristian Dumitrescu
  2015-07-17  7:54  4% ` Gajdzica, MaciejX T
  2015-07-17  8:10  4% ` Mrzyglod, DanielX T
@ 2015-07-17 12:03  4% ` Singh, Jasvinder
  2 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-17 12:03 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 16, 2015 6:08 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline
> 
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-17 11:45  7%       ` Mcnamara, John
@ 2015-07-17 12:25  4%         ` Neil Horman
  0 siblings, 0 replies; 200+ results
From: Neil Horman @ 2015-07-17 12:25 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

On Fri, Jul 17, 2015 at 11:45:10AM +0000, Mcnamara, John wrote:
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Monday, July 13, 2015 3:00 PM
> > To: Mcnamara, John
> > Cc: dev@dpdk.org; vladz@cloudius-systems.com
> > Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> > 
> > > > > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> > > > >
> > > > >
> > >
> > Thank you, I'll ack as soon as Chao confirms its not a problem on ppc Neil
> 
> Hi,
> 
> Just pinging Chao Zhu on this again so that it isn't forgotten.
> 
> Neil, just to be clear, are you looking for a validate-abi.sh check on PPC?
> 
Yes, correct.
> Just for context, the lro flag doesn't seem to be used anywhere that would be affected by endianness:
> 
>     $ ag -w "\->lro"             
>     drivers/net/ixgbe/ixgbe_rxtx.c
>     3767:   if (dev->data->lro) {
>     3967:   dev->data->lro = 1;
> 
>     drivers/net/ixgbe/ixgbe_ethdev.c
>     1689:   dev->data->lro = 0;
> 
But this data is visible to the outside application, correct?  If so then we
can't rely on internal-only usage as a guide.  If it is only internally visible,
then yes, you are correct, endianness is not an issue then
neil

> John.
> -- 
> 
> 

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes
@ 2015-07-19 10:52  4% Thomas Monjalon
  2015-07-19 10:52 36% ` [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation Thomas Monjalon
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Thomas Monjalon @ 2015-07-19 10:52 UTC (permalink / raw)
  To: dev

The main change of these patches is to improve naming consistency
across ethdev and EAL.
It should be applied shortly to be part of rc1. If some comments arise,
it can be fixed/improved in rc2.

Thomas Monjalon (4):
  doc: rename ABI chapter to deprecation
  pci: fix detach and uninit naming
  ethdev: refactor port release
  ethdev: fix doxygen internal comments

 MAINTAINERS                                       |  2 +-
 doc/guides/rel_notes/{abi.rst => deprecation.rst} | 19 ++++++++-----------
 doc/guides/rel_notes/index.rst                    |  2 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map     |  2 ++
 lib/librte_eal/common/eal_common_pci.c            | 20 ++++++++++++--------
 lib/librte_eal/common/include/rte_pci.h           |  6 ++++--
 lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  2 ++
 lib/librte_ether/rte_ethdev.c                     | 11 +++++------
 lib/librte_ether/rte_ethdev.h                     |  9 ++++-----
 9 files changed, 39 insertions(+), 34 deletions(-)
 rename doc/guides/rel_notes/{abi.rst => deprecation.rst} (51%)

-- 
2.4.2

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation
  2015-07-19 10:52  4% [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
@ 2015-07-19 10:52 36% ` Thomas Monjalon
  2015-07-21 13:20  7%   ` Dumitrescu, Cristian
  2015-07-19 21:32  0% ` [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
  2015-07-20 10:45  0% ` Neil Horman
  2 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-19 10:52 UTC (permalink / raw)
  To: dev

This chapter is for ABI and API. That's why a renaming is required.

Remove also the examples which are now in the referenced guidelines.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
 MAINTAINERS                                       |  2 +-
 doc/guides/rel_notes/{abi.rst => deprecation.rst} | 16 +++++-----------
 doc/guides/rel_notes/index.rst                    |  2 +-
 3 files changed, 7 insertions(+), 13 deletions(-)
 rename doc/guides/rel_notes/{abi.rst => deprecation.rst} (51%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 2a32659..6531900 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -60,7 +60,7 @@ F: doc/guides/prog_guide/ext_app_lib_make_help.rst
 ABI versioning
 M: Neil Horman <nhorman@tuxdriver.com>
 F: lib/librte_compat/
-F: doc/guides/rel_notes/abi.rst
+F: doc/guides/rel_notes/deprecation.rst
 F: scripts/validate-abi.sh
 
 
diff --git a/doc/guides/rel_notes/abi.rst b/doc/guides/rel_notes/deprecation.rst
similarity index 51%
rename from doc/guides/rel_notes/abi.rst
rename to doc/guides/rel_notes/deprecation.rst
index 7a08830..eef01f1 100644
--- a/doc/guides/rel_notes/abi.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -1,17 +1,11 @@
-ABI policy
-==========
+Deprecation
+===========
 
 See the :doc:`guidelines document for details of the ABI policy </guidelines/versioning>`.
-ABI deprecation notices are to be posted here.
+API and ABI deprecation notices are to be posted here.
 
-
-Examples of Deprecation Notices
--------------------------------
-
-* The Macro #RTE_FOO is deprecated and will be removed with version 2.0, to be replaced with the inline function rte_bar()
-* The function rte_mbuf_grok has been updated to include new parameter in version 2.0.  Backwards compatibility will be maintained for this function until the release of version 2.1
-* The members struct foo have been reorganized in release 2.0.  Existing binary applications will have backwards compatibility in release 2.0, while newly built binaries will need to reference new structure variant struct foo2.  Compatibility will be removed in release 2.2, and all applications will require updating and rebuilding to the new structure at that time, which will be renamed to the original struct foo.
-* Significant ABI changes are planned for the librte_dostuff library.  The upcoming release 2.0 will not contain these changes, but release 2.1 will, and no backwards compatibility is planned due to the invasive nature of these changes.  Binaries using this library built prior to version 2.1 will require updating and recompilation.
+Help to update from a previous release is provided in
+:doc:`another section </rel_notes/updating_apps>`.
 
 
 Deprecation Notices
diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst
index d790783..9d66cd8 100644
--- a/doc/guides/rel_notes/index.rst
+++ b/doc/guides/rel_notes/index.rst
@@ -48,5 +48,5 @@ Contents
     updating_apps
     known_issues
     resolved_issues
-    abi
+    deprecation
     faq
-- 
2.4.2

^ permalink raw reply	[relevance 36%]

* Re: [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes
  2015-07-19 10:52  4% [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
  2015-07-19 10:52 36% ` [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation Thomas Monjalon
@ 2015-07-19 21:32  0% ` Thomas Monjalon
  2015-07-20 10:45  0% ` Neil Horman
  2 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-19 21:32 UTC (permalink / raw)
  To: dev

2015-07-19 12:52, Thomas Monjalon:
> The main change of these patches is to improve naming consistency
> across ethdev and EAL.
> It should be applied shortly to be part of rc1. If some comments arise,
> it can be fixed/improved in rc2.
> 
> Thomas Monjalon (4):
>   doc: rename ABI chapter to deprecation
>   pci: fix detach and uninit naming
>   ethdev: refactor port release
>   ethdev: fix doxygen internal comments

Applied, do not hesitate to comment if something must be fixed.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 0/3] fix the issue sctp flow cannot be matched in FVL FDIR
  @ 2015-07-19 22:54  3%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-19 22:54 UTC (permalink / raw)
  To: Wu, Jingjing; +Cc: dev

> > This patch set fixes the issue SCTP flow cannot be matched by FVL's flow
> > director. The issue's root cause is that due to the NIC's firmware update,
> > the input set of sctp flow are changed to source IP, destination IP,
> > source port, destination port and Verification-Tag, which are source IP,
> > destination IP and Verification-Tag previously.
> > And because this fix will affect the struct rte_eth_fdir_flow, use
> > RTE_NEXT_ABI to avoid ABI breaking.
> > 
> > Jingjing Wu (3):
> >   ethdev: change the input set of sctp flow
> >   i40e: make sport and dport of sctp flow involved in match
> >   testpmd: add sport and dport configuration for sctp flow
> 
> Tested-by: Marvin Liu <yong.liu@intel.com>

Applied, thanks

An ABI deprecation announce must be sent.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle
  2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
@ 2015-07-19 23:31  0%       ` Thomas Monjalon
  2015-07-20  2:02  0%         ` Liang, Cunming
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-19 23:31 UTC (permalink / raw)
  To: Cunming Liang; +Cc: dev, shemming

2015-07-17 14:16, Cunming Liang:
> +#ifdef RTE_NEXT_ABI
> +	/**
> +	 * RTE_NEXT_ABI will be removed from v2.2.
> +	 * It's only used to avoid ABI(unannounced) broken in v2.1.
> +	 * Make sure being aware of the impact before turning on the feature.
> +	 */

We are not going to put this comment each time NEXT_ABI is used with ifdef.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle
  2015-07-19 23:31  0%       ` Thomas Monjalon
@ 2015-07-20  2:02  0%         ` Liang, Cunming
  0 siblings, 0 replies; 200+ results
From: Liang, Cunming @ 2015-07-20  2:02 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, shemming



On 7/20/2015 7:31 AM, Thomas Monjalon wrote:
> 2015-07-17 14:16, Cunming Liang:
>> +#ifdef RTE_NEXT_ABI
>> +	/**
>> +	 * RTE_NEXT_ABI will be removed from v2.2.
>> +	 * It's only used to avoid ABI(unannounced) broken in v2.1.
>> +	 * Make sure being aware of the impact before turning on the feature.
>> +	 */
> We are not going to put this comment each time NEXT_ABI is used with ifdef.
Ok, will remove the comment.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD
  2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
                       ` (9 preceding siblings ...)
  2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
@ 2015-07-20  3:02  4%     ` Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
                         ` (10 more replies)
  10 siblings, 11 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

v15 changes
 - remove unnecessary RTE_NEXT_ABI comment
 - remove ifdef RTE_NEXT_ABI from header file

v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map
 - minor comments rework

v13 changes
 - version map cleanup for v2.1
 - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility

Patch series v12
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Danny Zhou <danny.zhou@intel.com>

v12 changes
 - bsd cleanup for unused variable warning
 - fix awkward line split in debug message

v11 changes
 - typo cleanup and check kernel style

v10 changes
 - code rework to return actual error code
 - bug fix for lsc when using uio_pci_generic

v9 changes
 - code rework to fix open comment
 - bug fix for igb lsc when both lsc and rxq are enabled in vfio-msix
 - new patch to turn off the feature by default so as to avoid v2.1 abi broken

v8 changes
 - remove condition check for only vfio-msix
 - add multiplex intr support when only one intr vector allowed
 - lsc and rxq interrupt runtime enable decision
 - add safe event delete while the event wakeup execution happens

v7 changes
 - decouple epoll event and intr operation
 - add condition check in the case intr vector is disabled
 - renaming some APIs

v6 changes
 - split rte_intr_wait_rx_pkt into two APIs 'wait' and 'set'.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set.
 - using vector number instead of queue_id as interrupt API params.
 - patch reorder and split.

v5 changes
 - Rebase the patchset onto the HEAD
 - Isolate ethdev from EAL for newly-added wait-for-rx interrupt function
 - Export wait-for-rx interrupt function for shared libraries
 - Split-off a new patch file for changed struct rte_intr_handle that
   other patches depend on, to avoid breaking git bisect
 - Change sample application to accommodate EAL function spec change
   accordingly

v4 changes
 - Export interrupt enable/disable functions for shared libraries
 - Adjust position of newly-added structure fields and functions to
   avoid breaking ABI

v3 changes
 - Add return value for interrupt enable/disable functions
 - Move spinlock from PMD to L3fwd-power
 - Remove unnecessary variables in e1000_mac_info
 - Fix miscellaneous review comments

v2 changes
 - Fix compilation issue in Makefile for missed header file.
 - Consolidate internal and community review comments of v1 patch set.

The patch series introduce low-latency one-shot rx interrupt into DPDK with
polling and interrupt mode switch control example.

DPDK userspace interrupt notification and handling mechanism is based on UIO
with below limitation:
1) It is designed to handle LSC interrupt only with inefficient suspended
   pthread wakeup procedure (e.g. UIO wakes up LSC interrupt handling thread
   which then wakes up DPDK polling thread). In this way, it introduces
   non-deterministic wakeup latency for DPDK polling thread as well as packet
   latency if it is used to handle Rx interrupt.
2) UIO only supports a single interrupt vector which has to be shared by
   LSC interrupt and interrupts assigned to dedicated rx queues.

This patchset includes below features:
1) Enable one-shot rx queue interrupt in ixgbe PMD (PF & VF) and igb PMD (PF only).
2) Build on top of the VFIO mechanism instead of UIO, so it could support
   up to 64 interrupt vectors for rx queue interrupts.
3) Have 1 DPDK polling thread handle per Rx queue interrupt with a dedicated
   VFIO eventfd, which eliminates non-deterministic pthread wakeup latency in
   user space.
4) Demonstrate interrupt control APIs and userspace NAPI-like polling/interrupt
   switch algorithms in L3fwd-power example.

Known limitations:
1) It does not work for UIO because a single interrupt eventfd shared by LSC
   and rx queue interrupt handlers causes a mess. [FIXED]
2) LSC interrupt is not supported by VF driver, so it is by default disabled
   in L3fwd-power now. Feel free to turn it on if you want to support both LSC
   and rx queue interrupts on a PF.
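
For illustration, a minimal sketch of the NAPI-like polling/interrupt switch
loop an application could build on these APIs (single rx queue per lcore;
error handling omitted; IDLE_THRESHOLD and the burst size are arbitrary
values chosen for the example, not part of the API):

#include <rte_ethdev.h>
#include <rte_interrupts.h>

#define IDLE_THRESHOLD 300   /* empty polls before switching to interrupt mode */

static void
lcore_rx_loop(uint8_t port, uint16_t queue)
{
	struct rte_mbuf *pkts[32];
	struct rte_epoll_event ev;
	unsigned int idle = 0;
	uint16_t i, n;

	/* register the rx queue interrupt on this thread's epoll instance */
	rte_eth_dev_rx_intr_ctl_q(port, queue, RTE_EPOLL_PER_THREAD,
				  RTE_INTR_EVENT_ADD, NULL);

	for (;;) {
		n = rte_eth_rx_burst(port, queue, pkts, 32);
		if (n > 0) {
			idle = 0;
			for (i = 0; i < n; i++)
				rte_pktmbuf_free(pkts[i]);  /* or real processing */
			continue;
		}
		if (++idle < IDLE_THRESHOLD)
			continue;
		/* quiet for a while: sleep until the NIC raises the rx interrupt */
		rte_eth_dev_rx_intr_enable(port, queue);
		rte_epoll_wait(RTE_EPOLL_PER_THREAD, &ev, 1, 10 /* ms */);
		rte_eth_dev_rx_intr_disable(port, queue);
		idle = 0;
	}
}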

Cunming Liang (13):
  eal/linux: add interrupt vectors support in intr_handle
  eal/linux: add rte_epoll_wait/ctl support
  eal/linux: add API to set rx interrupt event monitor
  eal/linux: fix comments typo on vfio msi
  eal/linux: map eventfd to VFIO MSI-X intr vector
  eal/linux: standalone intr event fd create support
  eal/linux: fix lsc read error in uio_pci_generic
  eal/bsd: dummy for new intr definition
  eal/bsd: fix inappropriate linuxapp referred in bsd
  ethdev: add rx intr enable, disable and ctl functions
  ixgbe: enable rx queue interrupts for both PF and VF
  igb: enable rx queue interrupts for PF
  l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode
    switch

 drivers/net/e1000/igb_ethdev.c                     | 311 ++++++++++--
 drivers/net/ixgbe/ixgbe_ethdev.c                   | 527 ++++++++++++++++++++-
 drivers/net/ixgbe/ixgbe_ethdev.h                   |   4 +
 examples/l3fwd-power/main.c                        | 205 ++++++--
 lib/librte_eal/bsdapp/eal/eal_interrupts.c         |  42 ++
 .../bsdapp/eal/include/exec-env/rte_interrupts.h   |  74 ++-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map      |   5 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 414 ++++++++++++++--
 .../linuxapp/eal/include/exec-env/rte_interrupts.h | 153 ++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   8 +
 lib/librte_ether/rte_ethdev.c                      | 147 ++++++
 lib/librte_ether/rte_ethdev.h                      | 104 ++++
 lib/librte_ether/rte_ether_version.map             |   4 +
 13 files changed, 1870 insertions(+), 128 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v15 01/13] eal/linux: add interrupt vectors support in intr_handle
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
@ 2015-07-20  3:02  3%       ` Cunming Liang
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
                         ` (9 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch adds interrupt vectors support in rte_intr_handle.
'vec_en' is set when interrupt vectors are detected and associated event fds are set.
Those event fds are stored in efds[].
'intr_vec' is reserved for device driver to initialize the vector mapping table.
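
For orientation only (fragment not taken from the patch): once a driver has
filled these fields, the event fd that fires for a given rx queue 'q' is
found by going queue -> vector -> event fd:

/* illustrative fragment, assuming 'intr_handle' and 'q' are in scope */
int vec = intr_handle->intr_vec[q];   /* rx queue -> interrupt vector */
int efd = intr_handle->efds[vec];     /* interrupt vector -> event fd */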

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v15 changes
 - remove unnecessary RTE_NEXT_ABI comment

v14 changes
 - per-patch basis ABI compatibility rework

v7 changes:
 - add eptrs[], it's used to store the register rte_epoll_event instances.
 - add vec_en, to log the vector capability status.

v6 changes:
 - add mapping table between irq vector number and queue id.

v5 changes:
 - Create this new patch file for changed struct rte_intr_handle that
   other patches depend on, to avoid breaking git bisect.

 lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index bdeb3fc..ac33eda 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -38,6 +38,8 @@
 #ifndef _RTE_LINUXAPP_INTERRUPTS_H_
 #define _RTE_LINUXAPP_INTERRUPTS_H_
 
+#define RTE_MAX_RXTX_INTR_VEC_ID     32
+
 enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_UNKNOWN = 0,
 	RTE_INTR_HANDLE_UIO,          /**< uio device handle */
@@ -58,6 +60,12 @@ struct rte_intr_handle {
 	};
 	int fd;	 /**< interrupt event file descriptor */
 	enum rte_intr_handle_type type;  /**< handle type */
+#ifdef RTE_NEXT_ABI
+	uint32_t max_intr;             /**< max interrupt requested */
+	uint32_t nb_efd;               /**< number of available efd(event fd) */
+	int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
+	int *intr_vec;                 /**< intr vector number array */
+#endif
 };
 
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v15 02/13] eal/linux: add rte_epoll_wait/ctl support
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
@ 2015-07-20  3:02  2%       ` Cunming Liang
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
                         ` (8 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch adds 'rte_epoll_wait' and 'rte_epoll_ctl' for async event wakeup.
It defines 'struct rte_epoll_event' as the event param.
When the event fds are added to a specified epoll instance, 'eptrs' will hold the rte_epoll_event object pointer.
The 'op' uses the same enum as epoll_wait/ctl does.
The epoll event supports carrying raw user data and registering a callback which is executed during wakeup.
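
A minimal usage sketch (illustrative only; 'fd' is any event fd the caller
owns, 'cookie' is arbitrary user data):

#include <string.h>
#include <sys/epoll.h>
#include <rte_interrupts.h>

static int
watch_fd(int fd, void *cookie)
{
	struct rte_epoll_event ev, out;
	int n;

	memset(&ev, 0, sizeof(ev));
	ev.epdata.event  = EPOLLIN | EPOLLET;
	ev.epdata.data   = cookie;   /* raw user data, carried back on wakeup */
	ev.epdata.cb_fun = NULL;     /* optional callback executed on wakeup */

	/* add fd to this thread's private epoll instance */
	if (rte_epoll_ctl(RTE_EPOLL_PER_THREAD, EPOLL_CTL_ADD, fd, &ev) < 0)
		return -1;

	/* block for up to 100 ms waiting for a single event */
	n = rte_epoll_wait(RTE_EPOLL_PER_THREAD, &out, 1, 100);

	/* unregister before 'ev' goes out of scope */
	rte_epoll_ctl(RTE_EPOLL_PER_THREAD, EPOLL_CTL_DEL, fd, &ev);
	return n;
}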

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v11 changes
 - cleanup spelling error

v9 changes
 - rework on coding style

v8 changes
 - support delete event in safety during the wakeup execution
 - add EINTR process during epoll_wait

v7 changes
 - split v6[4/8] into two patches, one for epoll event(this one)
   another for rx intr(next patch)
 - introduce rte_epoll_event definition
 - rte_epoll_wait/ctl for more generic RTE epoll API

v6 changes
 - split rte_intr_wait_rx_pkt into two function, wait and set.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal.
 - rte_intr_rx_wait to support multiplexing.
 - allow epfd as input to support flexible event fd combination.

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 139 +++++++++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  80 ++++++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   3 +
 3 files changed, 222 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 61e7c85..55be263 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -69,6 +69,8 @@
 
 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
 
+static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
+
 /**
  * union for pipe fds.
  */
@@ -896,3 +898,140 @@ rte_eal_intr_init(void)
 
 	return -ret;
 }
+
+static int
+eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
+			struct rte_epoll_event *events)
+{
+	unsigned int i, count = 0;
+	struct rte_epoll_event *rev;
+
+	for (i = 0; i < n; i++) {
+		rev = evs[i].data.ptr;
+		if (!rev || !rte_atomic32_cmpset(&rev->status, RTE_EPOLL_VALID,
+						 RTE_EPOLL_EXEC))
+			continue;
+
+		events[count].status        = RTE_EPOLL_VALID;
+		events[count].fd            = rev->fd;
+		events[count].epfd          = rev->epfd;
+		events[count].epdata.event  = rev->epdata.event;
+		events[count].epdata.data   = rev->epdata.data;
+		if (rev->epdata.cb_fun)
+			rev->epdata.cb_fun(rev->fd,
+					   rev->epdata.cb_arg);
+
+		rte_compiler_barrier();
+		rev->status = RTE_EPOLL_VALID;
+		count++;
+	}
+	return count;
+}
+
+static inline int
+eal_init_tls_epfd(void)
+{
+	int pfd = epoll_create(255);
+
+	if (pfd < 0) {
+		RTE_LOG(ERR, EAL,
+			"Cannot create epoll instance\n");
+		return -1;
+	}
+	return pfd;
+}
+
+int
+rte_intr_tls_epfd(void)
+{
+	if (RTE_PER_LCORE(_epfd) == -1)
+		RTE_PER_LCORE(_epfd) = eal_init_tls_epfd();
+
+	return RTE_PER_LCORE(_epfd);
+}
+
+int
+rte_epoll_wait(int epfd, struct rte_epoll_event *events,
+	       int maxevents, int timeout)
+{
+	struct epoll_event evs[maxevents];
+	int rc;
+
+	if (!events) {
+		RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
+		return -1;
+	}
+
+	/* using per thread epoll fd */
+	if (epfd == RTE_EPOLL_PER_THREAD)
+		epfd = rte_intr_tls_epfd();
+
+	while (1) {
+		rc = epoll_wait(epfd, evs, maxevents, timeout);
+		if (likely(rc > 0)) {
+			/* epoll_wait has at least one fd ready to read */
+			rc = eal_epoll_process_event(evs, rc, events);
+			break;
+		} else if (rc < 0) {
+			if (errno == EINTR)
+				continue;
+			/* epoll_wait fail */
+			RTE_LOG(ERR, EAL, "epoll_wait returns with fail %s\n",
+				strerror(errno));
+			rc = -1;
+			break;
+		}
+	}
+
+	return rc;
+}
+
+static inline void
+eal_epoll_data_safe_free(struct rte_epoll_event *ev)
+{
+	while (!rte_atomic32_cmpset(&ev->status, RTE_EPOLL_VALID,
+				    RTE_EPOLL_INVALID))
+		while (ev->status != RTE_EPOLL_VALID)
+			rte_pause();
+	memset(&ev->epdata, 0, sizeof(ev->epdata));
+	ev->fd = -1;
+	ev->epfd = -1;
+}
+
+int
+rte_epoll_ctl(int epfd, int op, int fd,
+	      struct rte_epoll_event *event)
+{
+	struct epoll_event ev;
+
+	if (!event) {
+		RTE_LOG(ERR, EAL, "rte_epoll_event can't be NULL\n");
+		return -1;
+	}
+
+	/* using per thread epoll fd */
+	if (epfd == RTE_EPOLL_PER_THREAD)
+		epfd = rte_intr_tls_epfd();
+
+	if (op == EPOLL_CTL_ADD) {
+		event->status = RTE_EPOLL_VALID;
+		event->fd = fd;  /* ignore fd in event */
+		event->epfd = epfd;
+		ev.data.ptr = (void *)event;
+	}
+
+	ev.events = event->epdata.event;
+	if (epoll_ctl(epfd, op, fd, &ev) < 0) {
+		RTE_LOG(ERR, EAL, "Error op %d fd %d epoll_ctl, %s\n",
+			op, fd, strerror(errno));
+		if (op == EPOLL_CTL_ADD)
+			/* rollback status when CTL_ADD fail */
+			event->status = RTE_EPOLL_INVALID;
+		return -1;
+	}
+
+	if (op == EPOLL_CTL_DEL && event->status != RTE_EPOLL_INVALID)
+		eal_epoll_data_safe_free(event);
+
+	return 0;
+}
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index ac33eda..886608c 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -51,6 +51,32 @@ enum rte_intr_handle_type {
 	RTE_INTR_HANDLE_MAX
 };
 
+#define RTE_INTR_EVENT_ADD            1UL
+#define RTE_INTR_EVENT_DEL            2UL
+
+typedef void (*rte_intr_event_cb_t)(int fd, void *arg);
+
+struct rte_epoll_data {
+	uint32_t event;               /**< event type */
+	void *data;                   /**< User data */
+	rte_intr_event_cb_t cb_fun;   /**< IN: callback fun */
+	void *cb_arg;	              /**< IN: callback arg */
+};
+
+enum {
+	RTE_EPOLL_INVALID = 0,
+	RTE_EPOLL_VALID,
+	RTE_EPOLL_EXEC,
+};
+
+/** interrupt epoll event obj, taken by epoll_event.ptr */
+struct rte_epoll_event {
+	volatile uint32_t status;  /**< OUT: event status */
+	int fd;                    /**< OUT: event fd */
+	int epfd;       /**< OUT: epoll instance the ev associated with */
+	struct rte_epoll_data epdata;
+};
+
 /** Handle for interrupts. */
 struct rte_intr_handle {
 	union {
@@ -64,8 +90,62 @@ struct rte_intr_handle {
 	uint32_t max_intr;             /**< max interrupt requested */
 	uint32_t nb_efd;               /**< number of available efd(event fd) */
 	int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
+	struct rte_epoll_event elist[RTE_MAX_RXTX_INTR_VEC_ID];
+				       /**< intr vector epoll event */
 	int *intr_vec;                 /**< intr vector number array */
 #endif
 };
 
+#define RTE_EPOLL_PER_THREAD        -1  /**< to hint using per thread epfd */
+
+/**
+ * It waits for events on the epoll instance.
+ *
+ * @param epfd
+ *   Epoll instance fd on which the caller wait for events.
+ * @param events
+ *   Memory area contains the events that will be available for the caller.
+ * @param maxevents
+ *   Up to maxevents are returned, must greater than zero.
+ * @param timeout
+ *   Specifying a timeout of -1 causes a block indefinitely.
+ *   Specifying a timeout equal to zero cause to return immediately.
+ * @return
+ *   - On success, returns the number of available event.
+ *   - On failure, a negative value.
+ */
+int
+rte_epoll_wait(int epfd, struct rte_epoll_event *events,
+	       int maxevents, int timeout);
+
+/**
+ * It performs control operations on epoll instance referred by the epfd.
+ * It requests that the operation op be performed for the target fd.
+ *
+ * @param epfd
+ *   Epoll instance fd on which the caller perform control operations.
+ * @param op
+ *   The operation be performed for the target fd.
+ * @param fd
+ *   The target fd on which the control ops perform.
+ * @param event
+ *   Describes the object linked to the fd.
+ *   Note: The caller must take care the object deletion after CTL_DEL.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_epoll_ctl(int epfd, int op, int fd,
+	      struct rte_epoll_event *event);
+
+/**
+ * The function returns the per thread epoll instance.
+ *
+ * @return
+ *   epfd the epoll instance referred to.
+ */
+int
+rte_intr_tls_epfd(void);
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index b2d4441..39cc2d2 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -116,6 +116,9 @@ DPDK_2.1 {
 	global:
 
 	rte_eal_pci_detach;
+	rte_epoll_ctl;
+	rte_epoll_wait;
+	rte_intr_tls_epfd;
 	rte_memzone_free;
 
 } DPDK_2.0;
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v15 03/13] eal/linux: add API to set rx interrupt event monitor
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
@ 2015-07-20  3:02  2%       ` Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
                         ` (7 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch adds 'rte_intr_rx_ctl' to add or delete an interrupt vector event monitor on a specified epoll instance.
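
For example (illustrative fragment; 'intr_handle' comes from the device and
its efds[] must already be populated), monitoring vector 0 on the calling
thread's epoll instance:

/* start monitoring rx interrupt vector 0 */
rte_intr_rx_ctl(intr_handle, RTE_EPOLL_PER_THREAD,
		RTE_INTR_EVENT_ADD, 0, NULL);
/* ... rte_epoll_wait() as usual ... */
/* stop monitoring it again */
rte_intr_rx_ctl(intr_handle, RTE_EPOLL_PER_THREAD,
		RTE_INTR_EVENT_DEL, 0, NULL);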

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v15 changes
 - remove ifdef RTE_NEXT_ABI from header file

v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v12 changes:
 - fix awkward line split in using RTE_LOG

v10 changes:
 - add RTE_INTR_HANDLE_UIO_INTX for uio_pci_generic

v8 changes
 - fix EWOULDBLOCK and EINTR processing
 - add event status check

v7 changes
 - rename rte_intr_rx_set to rte_intr_rx_ctl.
 - rte_intr_rx_ctl uses rte_epoll_ctl to register epoll event instance.
 - the intr rx event instance includes a intr process callback.

v6 changes
 - split rte_intr_wait_rx_pkt into two function, wait and set.
 - rewrite rte_intr_rx_wait/rte_intr_rx_set to remove queue visibility on eal.
 - rte_intr_rx_wait to support multiplexing.
 - allow epfd as input to support flexible event fd combination.

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 117 +++++++++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |  20 ++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   1 +
 3 files changed, 138 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 55be263..ffccb0e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -899,6 +899,51 @@ rte_eal_intr_init(void)
 	return -ret;
 }
 
+#ifdef RTE_NEXT_ABI
+static void
+eal_intr_proc_rxtx_intr(int fd, const struct rte_intr_handle *intr_handle)
+{
+	union rte_intr_read_buffer buf;
+	int bytes_read = 1;
+
+	switch (intr_handle->type) {
+	case RTE_INTR_HANDLE_UIO:
+	case RTE_INTR_HANDLE_UIO_INTX:
+		bytes_read = sizeof(buf.uio_intr_count);
+		break;
+#ifdef VFIO_PRESENT
+	case RTE_INTR_HANDLE_VFIO_MSIX:
+	case RTE_INTR_HANDLE_VFIO_MSI:
+	case RTE_INTR_HANDLE_VFIO_LEGACY:
+		bytes_read = sizeof(buf.vfio_intr_count);
+		break;
+#endif
+	default:
+		bytes_read = 1;
+		RTE_LOG(INFO, EAL, "unexpected intr type\n");
+		break;
+	}
+
+	/**
+	 * read out to clear the ready-to-be-read flag
+	 * for epoll_wait.
+	 */
+	do {
+		bytes_read = read(fd, &buf, bytes_read);
+		if (bytes_read < 0) {
+			if (errno == EINTR || errno == EWOULDBLOCK ||
+			    errno == EAGAIN)
+				continue;
+			RTE_LOG(ERR, EAL,
+				"Error reading from fd %d: %s\n",
+				fd, strerror(errno));
+		} else if (bytes_read == 0)
+			RTE_LOG(ERR, EAL, "Read nothing from fd %d\n", fd);
+		return;
+	} while (1);
+}
+#endif
+
 static int
 eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
 			struct rte_epoll_event *events)
@@ -1035,3 +1080,75 @@ rte_epoll_ctl(int epfd, int op, int fd,
 
 	return 0;
 }
+
+#ifdef RTE_NEXT_ABI
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
+		int op, unsigned int vec, void *data)
+{
+	struct rte_epoll_event *rev;
+	struct rte_epoll_data *epdata;
+	int epfd_op;
+	int rc = 0;
+
+	if (!intr_handle || intr_handle->nb_efd == 0 ||
+	    vec >= intr_handle->nb_efd) {
+		RTE_LOG(ERR, EAL, "Wrong intr vector number.\n");
+		return -EPERM;
+	}
+
+	switch (op) {
+	case RTE_INTR_EVENT_ADD:
+		epfd_op = EPOLL_CTL_ADD;
+		rev = &intr_handle->elist[vec];
+		if (rev->status != RTE_EPOLL_INVALID) {
+			RTE_LOG(INFO, EAL, "Event already been added.\n");
+			return -EEXIST;
+		}
+
+		/* attach to intr vector fd */
+		epdata = &rev->epdata;
+		epdata->event  = EPOLLIN | EPOLLPRI | EPOLLET;
+		epdata->data   = data;
+		epdata->cb_fun = (rte_intr_event_cb_t)eal_intr_proc_rxtx_intr;
+		epdata->cb_arg = (void *)intr_handle;
+		rc = rte_epoll_ctl(epfd, epfd_op, intr_handle->efds[vec], rev);
+		if (!rc)
+			RTE_LOG(DEBUG, EAL,
+				"efd %d associated with vec %d added on epfd %d"
+				"\n", rev->fd, vec, epfd);
+		else
+			rc = -EPERM;
+		break;
+	case RTE_INTR_EVENT_DEL:
+		epfd_op = EPOLL_CTL_DEL;
+		rev = &intr_handle->elist[vec];
+		if (rev->status == RTE_EPOLL_INVALID) {
+			RTE_LOG(INFO, EAL, "Event does not exist.\n");
+			return -EPERM;
+		}
+
+		rc = rte_epoll_ctl(rev->epfd, epfd_op, rev->fd, rev);
+		if (rc)
+			rc = -EPERM;
+		break;
+	default:
+		RTE_LOG(ERR, EAL, "event op type mismatch\n");
+		rc = -EPERM;
+	}
+
+	return rc;
+}
+#else
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(vec);
+	RTE_SET_USED(data);
+	return -ENOTSUP;
+}
+#endif
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index 886608c..acf4be9 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -148,4 +148,24 @@ rte_epoll_ctl(int epfd, int op, int fd,
 int
 rte_intr_tls_epfd(void);
 
+/**
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {ADD, DEL}.
+ * @param vec
+ *   RX intr vector number added to the epoll instance wait list.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data);
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 39cc2d2..095b2c5 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -118,6 +118,7 @@ DPDK_2.1 {
 	rte_eal_pci_detach;
 	rte_epoll_ctl;
 	rte_epoll_wait;
+	rte_intr_rx_ctl;
 	rte_intr_tls_epfd;
 	rte_memzone_free;
 
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v15 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (2 preceding siblings ...)
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
@ 2015-07-20  3:02  3%       ` Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 06/13] eal/linux: standalone intr event fd create support Cunming Liang
                         ` (6 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch maps each of the eventfd to the interrupt vector of VFIO MSI-X.
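
In other words (illustration, not text from the patch), the eventfd array
handed to VFIO_DEVICE_SET_IRQS ends up laid out as:

/*
 * fd_ptr[0 .. nb_efd-1]  = intr_handle->efds[]  -> rx queue MSI-X vectors
 * fd_ptr[max_intr-1]     = intr_handle->fd      -> misc/LSC vector
 */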

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - reword commit comments

v8 changes
 - move eventfd creation out of the setup_interrupts to a standalone function

v7 changes
 - cleanup unnecessary code change
 - split event and intr operation to other patches

 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 56 ++++++++++------------------
 1 file changed, 20 insertions(+), 36 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 5acc3b7..12105cc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -128,6 +128,9 @@ static pthread_t intr_thread;
 #ifdef VFIO_PRESENT
 
 #define IRQ_SET_BUF_LEN  (sizeof(struct vfio_irq_set) + sizeof(int))
+/* irq set buffer length for queue interrupts and LSC interrupt */
+#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
+			      sizeof(int) * (RTE_MAX_RXTX_INTR_VEC_ID + 1))
 
 /* enable legacy (INTx) interrupts */
 static int
@@ -245,23 +248,6 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {
 						intr_handle->fd);
 		return -1;
 	}
-
-	/* manually trigger interrupt to enable it */
-	memset(irq_set, 0, len);
-	len = sizeof(struct vfio_irq_set);
-	irq_set->argsz = len;
-	irq_set->count = 1;
-	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
-	irq_set->start = 0;
-
-	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-	if (ret) {
-		RTE_LOG(ERR, EAL, "Error triggering MSI interrupts for fd %d\n",
-						intr_handle->fd);
-		return -1;
-	}
 	return 0;
 }
 
@@ -294,7 +280,7 @@ vfio_disable_msi(struct rte_intr_handle *intr_handle) {
 static int
 vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 	int len, ret;
-	char irq_set_buf[IRQ_SET_BUF_LEN];
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
 	struct vfio_irq_set *irq_set;
 	int *fd_ptr;
 
@@ -302,12 +288,26 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 
 	irq_set = (struct vfio_irq_set *) irq_set_buf;
 	irq_set->argsz = len;
+#ifdef RTE_NEXT_ABI
+	if (!intr_handle->max_intr)
+		intr_handle->max_intr = 1;
+	else if (intr_handle->max_intr > RTE_MAX_RXTX_INTR_VEC_ID)
+		intr_handle->max_intr = RTE_MAX_RXTX_INTR_VEC_ID + 1;
+
+	irq_set->count = intr_handle->max_intr;
+#else
 	irq_set->count = 1;
+#endif
 	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
 	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
 	irq_set->start = 0;
 	fd_ptr = (int *) &irq_set->data;
-	*fd_ptr = intr_handle->fd;
+#ifdef RTE_NEXT_ABI
+	memcpy(fd_ptr, intr_handle->efds, sizeof(intr_handle->efds));
+	fd_ptr[intr_handle->max_intr - 1] = intr_handle->fd;
+#else
+	fd_ptr[0] = intr_handle->fd;
+#endif
 
 	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
 
@@ -317,22 +317,6 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 		return -1;
 	}
 
-	/* manually trigger interrupt to enable it */
-	memset(irq_set, 0, len);
-	len = sizeof(struct vfio_irq_set);
-	irq_set->argsz = len;
-	irq_set->count = 1;
-	irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
-	irq_set->start = 0;
-
-	ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-	if (ret) {
-		RTE_LOG(ERR, EAL, "Error triggering MSI-X interrupts for fd %d\n",
-						intr_handle->fd);
-		return -1;
-	}
 	return 0;
 }
 
@@ -340,7 +324,7 @@ vfio_enable_msix(struct rte_intr_handle *intr_handle) {
 static int
 vfio_disable_msix(struct rte_intr_handle *intr_handle) {
 	struct vfio_irq_set *irq_set;
-	char irq_set_buf[IRQ_SET_BUF_LEN];
+	char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
 	int len, ret;
 
 	len = sizeof(struct vfio_irq_set);
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v15 06/13] eal/linux: standalone intr event fd create support
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (3 preceding siblings ...)
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
@ 2015-07-20  3:02  3%       ` Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 08/13] eal/bsd: dummy for new intr definition Cunming Liang
                         ` (5 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch exposes interrupt event fd creation and release for PMDs.
The device driver can assign the number of event fds associated with the interrupt vector.
It also provides misc functions to check whether 1) other slowpath interrupts (e.g. LSC) are allowed;
2) the interrupt event on the fastpath is enabled or not.
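
A rough sketch (not taken verbatim from any PMD in this series) of the
expected call sequence in a driver's dev_start, assuming one event fd per
rx queue is wanted:

struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
uint32_t q;

/* create one event fd per rx queue (multiplexed when MSI-X is unavailable) */
if (rte_intr_efd_enable(intr_handle, dev->data->nb_rx_queues) < 0)
	return -1;

if (rte_intr_dp_is_en(intr_handle)) {
	/* datapath interrupts are on: build the queue -> vector mapping */
	intr_handle->intr_vec = rte_zmalloc("intr_vec",
			dev->data->nb_rx_queues * sizeof(int), 0);
	for (q = 0; q < dev->data->nb_rx_queues; q++)
		intr_handle->intr_vec[q] = q;
}

if (!rte_intr_allow_others(intr_handle))
	/* every vector is taken by rx queues: LSC cannot be serviced */
	dev->data->dev_conf.intr_conf.lsc = 0;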

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v15 changes
 - remove ifdef RTE_NEXT_ABI from header file

v14 changes
 - per-patch basis ABI compatibility rework
 - minor changes on API decription comments

v13 changes
 - version map cleanup for v2.1

v11 changes
 - typo cleanup

 lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 97 ++++++++++++++++++++++
 .../linuxapp/eal/include/exec-env/rte_interrupts.h | 45 ++++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map    |  4 +
 3 files changed, 146 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 12105cc..1cea4bf 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -44,6 +44,7 @@
 #include <sys/epoll.h>
 #include <sys/signalfd.h>
 #include <sys/ioctl.h>
+#include <sys/eventfd.h>
 
 #include <rte_common.h>
 #include <rte_interrupts.h>
@@ -68,6 +69,7 @@
 #include "eal_vfio.h"
 
 #define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
+#define NB_OTHER_INTR               1
 
 static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
 
@@ -1123,6 +1125,73 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
 
 	return rc;
 }
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	uint32_t i;
+	int fd;
+	uint32_t n = RTE_MIN(nb_efd, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+
+	if (intr_handle->type == RTE_INTR_HANDLE_VFIO_MSIX) {
+		for (i = 0; i < n; i++) {
+			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+			if (fd < 0) {
+				RTE_LOG(ERR, EAL,
+					"can't setup eventfd, error %i (%s)\n",
+					errno, strerror(errno));
+				return -1;
+			}
+			intr_handle->efds[i] = fd;
+		}
+		intr_handle->nb_efd   = n;
+		intr_handle->max_intr = NB_OTHER_INTR + n;
+	} else {
+		intr_handle->efds[0]  = intr_handle->fd;
+		intr_handle->nb_efd   = RTE_MIN(nb_efd, 1U);
+		intr_handle->max_intr = NB_OTHER_INTR;
+	}
+
+	return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	uint32_t i;
+	struct rte_epoll_event *rev;
+
+	for (i = 0; i < intr_handle->nb_efd; i++) {
+		rev = &intr_handle->elist[i];
+		if (rev->status == RTE_EPOLL_INVALID)
+			continue;
+		if (rte_epoll_ctl(rev->epfd, EPOLL_CTL_DEL, rev->fd, rev)) {
+			/* force free if the entry valid */
+			eal_epoll_data_safe_free(rev);
+			rev->status = RTE_EPOLL_INVALID;
+		}
+	}
+
+	if (intr_handle->max_intr > intr_handle->nb_efd) {
+		for (i = 0; i < intr_handle->nb_efd; i++)
+			close(intr_handle->efds[i]);
+	}
+	intr_handle->nb_efd = 0;
+	intr_handle->max_intr = 0;
+}
+
+int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	return !(!intr_handle->nb_efd);
+}
+
+int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	return !!(intr_handle->max_intr - intr_handle->nb_efd);
+}
+
 #else
 int
 rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
@@ -1135,4 +1204,32 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
 	RTE_SET_USED(data);
 	return -ENOTSUP;
 }
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(nb_efd);
+	return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+}
+
+int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 0;
+}
+
+int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 1;
+}
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
index acf4be9..b05f4c8 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_interrupts.h
@@ -168,4 +168,49 @@ int
 rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
 		int epfd, int op, unsigned int vec, void *data);
 
+/**
+ * It enables the packet I/O interrupt event if it's necessary.
+ * It creates event fd for each interrupt vector when MSIX is used,
+ * otherwise it multiplexes a single event fd.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param nb_vec
+ *   Number of interrupt vector trying to enable.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd);
+
+/**
+ * It disables the packet I/O interrupt event.
+ * It deletes registered eventfds and closes the open fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle);
+
+/**
+ * The packet I/O interrupt on datapath is enabled or not.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle);
+
+/**
+ * The interrupt handle instance allows other causes or not.
+ * Other causes stand for any none packet I/O interrupts.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle);
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 095b2c5..f44bc34 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -118,6 +118,10 @@ DPDK_2.1 {
 	rte_eal_pci_detach;
 	rte_epoll_ctl;
 	rte_epoll_wait;
+	rte_intr_allow_others;
+	rte_intr_dp_is_en;
+	rte_intr_efd_enable;
+	rte_intr_efd_disable;
 	rte_intr_rx_ctl;
 	rte_intr_tls_epfd;
 	rte_memzone_free;
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v15 08/13] eal/bsd: dummy for new intr definition
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (4 preceding siblings ...)
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 06/13] eal/linux: standalone intr event fd create support Cunming Liang
@ 2015-07-20  3:02  3%       ` Cunming Liang
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
                         ` (4 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

To make the BSD build happy with the new interrupt changes.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v15 changes
 - remove ifdef RTE_NEXT_ABI from header file

v14 changes
 - per-patch basis ABI compatibility rework

v13 changes
 - version map cleanup for v2.1

v12 changes
 - fix unused variables compiling warning

v8 changes
 - add stub for new function

v7 changes
 - remove stub 'linux only' function from source file

 lib/librte_eal/bsdapp/eal/eal_interrupts.c         | 42 +++++++++++++
 .../bsdapp/eal/include/exec-env/rte_interrupts.h   | 68 ++++++++++++++++++++++
 lib/librte_eal/bsdapp/eal/rte_eal_version.map      |  5 ++
 3 files changed, 115 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_interrupts.c b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
index 26a55c7..51a13fa 100644
--- a/lib/librte_eal/bsdapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/bsdapp/eal/eal_interrupts.c
@@ -68,3 +68,45 @@ rte_eal_intr_init(void)
 {
 	return 0;
 }
+
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(vec);
+	RTE_SET_USED(data);
+
+	return -ENOTSUP;
+}
+
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd)
+{
+	RTE_SET_USED(intr_handle);
+	RTE_SET_USED(nb_efd);
+
+	return 0;
+}
+
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+}
+
+int
+rte_intr_dp_is_en(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 0;
+}
+
+int
+rte_intr_allow_others(struct rte_intr_handle *intr_handle)
+{
+	RTE_SET_USED(intr_handle);
+	return 1;
+}
diff --git a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
index d4c388f..91d1900 100644
--- a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
+++ b/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h
@@ -50,6 +50,74 @@ struct rte_intr_handle {
 	int fd;                          /**< file descriptor */
 	int uio_cfg_fd;                  /**< UIO config file descriptor */
 	enum rte_intr_handle_type type;  /**< handle type */
+#ifdef RTE_NEXT_ABI
+	int max_intr;                    /**< max interrupt requested */
+	uint32_t nb_efd;                 /**< number of available efds */
+	int *intr_vec;               /**< intr vector number array */
+#endif
 };
 
+/**
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {ADD, DEL}.
+ * @param vec
+ *   RX intr vector number added to the epoll instance wait list.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+		int epfd, int op, unsigned int vec, void *data);
+
+/**
+ * It enables the fastpath event fds if it's necessary.
+ * It creates event fds when multi-vectors allowed,
+ * otherwise it multiplexes the single event fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ * @param nb_vec
+ *   Number of interrupt vector trying to enable.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+rte_intr_efd_enable(struct rte_intr_handle *intr_handle, uint32_t nb_efd);
+
+/**
+ * It disable the fastpath event fds.
+ * It deletes registered eventfds and closes the open fds.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+void
+rte_intr_efd_disable(struct rte_intr_handle *intr_handle);
+
+/**
+ * The fastpath interrupt is enabled or not.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+int rte_intr_dp_is_en(struct rte_intr_handle *intr_handle);
+
+/**
+ * The interrupt handle instance allows other cause or not.
+ * Other cause stands for none fastpath interrupt.
+ *
+ * @param intr_handle
+ *   Pointer to the interrupt handle.
+ */
+int rte_intr_allow_others(struct rte_intr_handle *intr_handle);
+
 #endif /* _RTE_LINUXAPP_INTERRUPTS_H_ */
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index b2d4441..cfeb0fb 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -116,6 +116,11 @@ DPDK_2.1 {
 	global:
 
 	rte_eal_pci_detach;
+	rte_intr_allow_others;
+	rte_intr_dp_is_en;
+	rte_intr_efd_enable;
+	rte_intr_efd_disable;
+	rte_intr_rx_ctl;
 	rte_memzone_free;
 
 } DPDK_2.0;
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v15 10/13] ethdev: add rx intr enable, disable and ctl functions
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (5 preceding siblings ...)
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 08/13] eal/bsd: dummy for new intr definition Cunming Liang
@ 2015-07-20  3:02  3%       ` Cunming Liang
  2015-07-20  3:02  1%       ` [dpdk-dev] [PATCH v15 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
                         ` (3 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch adds two dev_ops functions to enable and disable rx queue interrupts.
In addition, it adds rte_eth_dev_rx_intr_ctl/rx_intr_ctl_q to support per-port or per-queue rx interrupt event setup.
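
As a usage sketch (illustrative; assumes the PMD has already set up the
queue/vector mapping), an application can register every rx queue of 'port'
on its per-thread epoll instance in one call, then wait:

/* add all rx queues of the port to this thread's epoll instance */
rte_eth_dev_rx_intr_ctl(port, RTE_EPOLL_PER_THREAD,
			RTE_INTR_EVENT_ADD, NULL);

/* wait for up to 16 rx interrupts, 10 ms timeout */
struct rte_epoll_event events[16];
int n = rte_epoll_wait(RTE_EPOLL_PER_THREAD, events, 16, 10);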

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v15 changes
 - remove ifdef RTE_NEXT_ABI from header file

v14 changes
 - per-patch basis ABI compatibility rework
 - remove unnecessary 'local: *' from version map

v13 changes
 - version map cleanup for v2.1

v9 changes
 - remove unnecessary check after rte_eth_dev_is_valid_port.
   the same as http://www.dpdk.org/dev/patchwork/patch/4784

v8 changes
 - add addtion check for EEXIT

v7 changes
 - remove rx_intr_vec_get
 - add rx_intr_ctl and rx_intr_ctl_q

v6 changes
 - add rx_intr_vec_get to retrieve the vector num of the queue.

v5 changes
 - Rebase the patchset onto the HEAD

v4 changes
 - Export interrupt enable/disable functions for shared libraries
 - Put new functions at the end of eth_dev_ops to avoid breaking ABI

v3 changes
 - Add return value for interrupt enable/disable functions

 lib/librte_ether/rte_ethdev.c          | 147 +++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h          | 104 +++++++++++++++++++++++
 lib/librte_ether/rte_ether_version.map |   4 +
 3 files changed, 255 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 94104ce..a24c399 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3031,6 +3031,153 @@ _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
 	}
 	rte_spinlock_unlock(&rte_eth_dev_cb_lock);
 }
+
+#ifdef RTE_NEXT_ABI
+int
+rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
+{
+	uint32_t vec;
+	struct rte_eth_dev *dev;
+	struct rte_intr_handle *intr_handle;
+	uint16_t qid;
+	int rc;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%u\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	intr_handle = &dev->pci_dev->intr_handle;
+	if (!intr_handle->intr_vec) {
+		PMD_DEBUG_TRACE("RX Intr vector unset\n");
+		return -EPERM;
+	}
+
+	for (qid = 0; qid < dev->data->nb_rx_queues; qid++) {
+		vec = intr_handle->intr_vec[qid];
+		rc = rte_intr_rx_ctl(intr_handle, epfd, op, vec, data);
+		if (rc && rc != -EEXIST) {
+			PMD_DEBUG_TRACE("p %u q %u rx ctl error"
+					" op %d epfd %d vec %u\n",
+					port_id, qid, op, epfd, vec);
+		}
+	}
+
+	return 0;
+}
+
+int
+rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			  int epfd, int op, void *data)
+{
+	uint32_t vec;
+	struct rte_eth_dev *dev;
+	struct rte_intr_handle *intr_handle;
+	int rc;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%u\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_rx_queues) {
+		PMD_DEBUG_TRACE("Invalid RX queue_id=%u\n", queue_id);
+		return -EINVAL;
+	}
+
+	intr_handle = &dev->pci_dev->intr_handle;
+	if (!intr_handle->intr_vec) {
+		PMD_DEBUG_TRACE("RX Intr vector unset\n");
+		return -EPERM;
+	}
+
+	vec = intr_handle->intr_vec[queue_id];
+	rc = rte_intr_rx_ctl(intr_handle, epfd, op, vec, data);
+	if (rc && rc != -EEXIST) {
+		PMD_DEBUG_TRACE("p %u q %u rx ctl error"
+				" op %d epfd %d vec %u\n",
+				port_id, queue_id, op, epfd, vec);
+		return rc;
+	}
+
+	return 0;
+}
+
+int
+rte_eth_dev_rx_intr_enable(uint8_t port_id,
+			   uint16_t queue_id)
+{
+	struct rte_eth_dev *dev;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
+	return (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
+}
+
+int
+rte_eth_dev_rx_intr_disable(uint8_t port_id,
+			    uint16_t queue_id)
+{
+	struct rte_eth_dev *dev;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
+	return (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
+}
+#else
+int
+rte_eth_dev_rx_intr_enable(uint8_t port_id, uint16_t queue_id)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	return -ENOTSUP;
+}
+
+int
+rte_eth_dev_rx_intr_disable(uint8_t port_id, uint16_t queue_id)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	return -ENOTSUP;
+}
+
+int
+rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(data);
+	return -1;
+}
+
+int
+rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			  int epfd, int op, void *data)
+{
+	RTE_SET_USED(port_id);
+	RTE_SET_USED(queue_id);
+	RTE_SET_USED(epfd);
+	RTE_SET_USED(op);
+	RTE_SET_USED(data);
+	return -1;
+}
+#endif
+
 #ifdef RTE_NIC_BYPASS
 int rte_eth_dev_bypass_init(uint8_t port_id)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index c901a2c..662e106 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -845,6 +845,10 @@ struct rte_eth_fdir {
 struct rte_intr_conf {
 	/** enable/disable lsc interrupt. 0 (default) - disable, 1 enable */
 	uint16_t lsc;
+#ifdef RTE_NEXT_ABI
+	/** enable/disable rxq interrupt. 0 (default) - disable, 1 enable */
+	uint16_t rxq;
+#endif
 };
 
 /**
@@ -1053,6 +1057,14 @@ typedef int (*eth_tx_queue_setup_t)(struct rte_eth_dev *dev,
 				    const struct rte_eth_txconf *tx_conf);
 /**< @internal Setup a transmit queue of an Ethernet device. */
 
+typedef int (*eth_rx_enable_intr_t)(struct rte_eth_dev *dev,
+				    uint16_t rx_queue_id);
+/**< @internal Enable interrupt of a receive queue of an Ethernet device. */
+
+typedef int (*eth_rx_disable_intr_t)(struct rte_eth_dev *dev,
+				    uint16_t rx_queue_id);
+/**< @internal Disable interrupt of a receive queue of an Ethernet device. */
+
 typedef void (*eth_queue_release_t)(void *queue);
 /**< @internal Release memory resources allocated by given RX/TX queue. */
 
@@ -1380,6 +1392,12 @@ struct eth_dev_ops {
 	eth_queue_release_t        rx_queue_release;/**< Release RX queue.*/
 	eth_rx_queue_count_t       rx_queue_count; /**< Get Rx queue count. */
 	eth_rx_descriptor_done_t   rx_descriptor_done;  /**< Check rxd DD bit */
+#ifdef RTE_NEXT_ABI
+	/**< Enable Rx queue interrupt. */
+	eth_rx_enable_intr_t       rx_queue_intr_enable;
+	/**< Disable Rx queue interrupt.*/
+	eth_rx_disable_intr_t      rx_queue_intr_disable;
+#endif
 	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue.*/
 	eth_queue_release_t        tx_queue_release;/**< Release TX queue.*/
 	eth_dev_led_on_t           dev_led_on;    /**< Turn on LED. */
@@ -2957,6 +2975,92 @@ void _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
 				enum rte_eth_event_type event);
 
 /**
+ * When there is no rx packet coming in Rx Queue for a long time, we can
+ * sleep lcore related to RX Queue for power saving, and enable rx interrupt
+ * to be triggered when rx packect arrives.
+ *
+ * The rte_eth_dev_rx_intr_enable() function enables rx queue
+ * interrupt on specific rx queue of a port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ *     that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ */
+int rte_eth_dev_rx_intr_enable(uint8_t port_id, uint16_t queue_id);
+
+/**
+ * When lcore wakes up from rx interrupt indicating packet coming, disable rx
+ * interrupt and returns to polling mode.
+ *
+ * The rte_eth_dev_rx_intr_disable() function disables rx queue
+ * interrupt on specific rx queue of a port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ *     that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ */
+int rte_eth_dev_rx_intr_disable(uint8_t port_id, uint16_t queue_id);
+
+/**
+ * RX Interrupt control per port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ *   Using RTE_EPOLL_PER_THREAD allows to use per thread epoll instance.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data);
+
+/**
+ * RX Interrupt control per queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param epfd
+ *   Epoll instance fd which the intr vector associated to.
+ *   Using RTE_EPOLL_PER_THREAD allows to use per thread epoll instance.
+ * @param op
+ *   The operation be performed for the vector.
+ *   Operation type of {RTE_INTR_EVENT_ADD, RTE_INTR_EVENT_DEL}.
+ * @param data
+ *   User raw data.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
+			      int epfd, int op, void *data);
+
+/**
  * Turn on the LED on the Ethernet device.
  * This function turns on the LED on the Ethernet device.
  *
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 23cfee9..8345a6c 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -115,6 +115,10 @@ DPDK_2.1 {
 	rte_eth_dev_get_reg_info;
 	rte_eth_dev_get_reg_length;
 	rte_eth_dev_is_valid_port;
+	rte_eth_dev_rx_intr_ctl;
+	rte_eth_dev_rx_intr_ctl_q;
+	rte_eth_dev_rx_intr_disable;
+	rte_eth_dev_rx_intr_enable;
 	rte_eth_dev_set_eeprom;
 	rte_eth_dev_set_mc_addr_list;
 	rte_eth_timesync_disable;
-- 
1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v15 11/13] ixgbe: enable rx queue interrupts for both PF and VF
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (6 preceding siblings ...)
  2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
@ 2015-07-20  3:02  1%       ` Cunming Liang
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 12/13] igb: enable rx queue interrupts for PF Cunming Liang
                         ` (2 subsequent siblings)
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch does below things for ixgbe PF and VF:
- Setup NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v10 changes
 - return an actual error code rather than -1

v9 changes
 - move queue-vec mapping init from dev_configure to dev_start

v8 changes
 - add vfio-msi/vfio-legacy and uio-legacy support

v7 changes
 - add condition check when intr vector is not enabled

v6 changes
 - fill queue-vector mapping table

v5 changes
 - Rebase the patchset onto the HEAD

v3 changes
 - Remove spinlock from PMD

v2 changes
 - Consolidate review comments related to coding style

 drivers/net/ixgbe/ixgbe_ethdev.c | 527 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/ixgbe/ixgbe_ethdev.h |   4 +
 2 files changed, 518 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 3a8cff0..7f43fb6 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -85,6 +85,9 @@
  */
 #define IXGBE_FC_LO    0x40
 
+/* Default minimum inter-interrupt interval for EITR configuration */
+#define IXGBE_MIN_INTER_INTERRUPT_INTERVAL_DEFAULT    0x79E
+
 /* Timer value included in XOFF frames. */
 #define IXGBE_FC_PAUSE 0x680
 
@@ -187,6 +190,9 @@ static int ixgbe_dev_rss_reta_query(struct rte_eth_dev *dev,
 			uint16_t reta_size);
 static void ixgbe_dev_link_status_print(struct rte_eth_dev *dev);
 static int ixgbe_dev_lsc_interrupt_setup(struct rte_eth_dev *dev);
+#ifdef RTE_NEXT_ABI
+static int ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev);
+#endif
 static int ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev);
 static int ixgbe_dev_interrupt_action(struct rte_eth_dev *dev);
 static void ixgbe_dev_interrupt_handler(struct rte_intr_handle *handle,
@@ -202,11 +208,14 @@ static void ixgbe_dcb_init(struct ixgbe_hw *hw,struct ixgbe_dcb_config *dcb_conf
 /* For Virtual Function support */
 static int eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev);
 static int eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev);
+static int ixgbevf_dev_interrupt_get_status(struct rte_eth_dev *dev);
+static int ixgbevf_dev_interrupt_action(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_configure(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_start(struct rte_eth_dev *dev);
 static void ixgbevf_dev_stop(struct rte_eth_dev *dev);
 static void ixgbevf_dev_close(struct rte_eth_dev *dev);
 static void ixgbevf_intr_disable(struct ixgbe_hw *hw);
+static void ixgbevf_intr_enable(struct ixgbe_hw *hw);
 static void ixgbevf_dev_stats_get(struct rte_eth_dev *dev,
 		struct rte_eth_stats *stats);
 static void ixgbevf_dev_stats_reset(struct rte_eth_dev *dev);
@@ -216,6 +225,17 @@ static void ixgbevf_vlan_strip_queue_set(struct rte_eth_dev *dev,
 		uint16_t queue, int on);
 static void ixgbevf_vlan_offload_set(struct rte_eth_dev *dev, int mask);
 static void ixgbevf_set_vfta_all(struct rte_eth_dev *dev, bool on);
+static void ixgbevf_dev_interrupt_handler(struct rte_intr_handle *handle,
+					  void *param);
+#ifdef RTE_NEXT_ABI
+static int ixgbevf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					    uint16_t queue_id);
+static int ixgbevf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					     uint16_t queue_id);
+static void ixgbevf_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+				 uint8_t queue, uint8_t msix_vector);
+#endif
+static void ixgbevf_configure_msix(struct rte_eth_dev *dev);
 
 /* For Eth VMDQ APIs support */
 static int ixgbe_uc_hash_table_set(struct rte_eth_dev *dev, struct
@@ -232,6 +252,15 @@ static int ixgbe_mirror_rule_set(struct rte_eth_dev *dev,
 		uint8_t rule_id, uint8_t on);
 static int ixgbe_mirror_rule_reset(struct rte_eth_dev *dev,
 		uint8_t	rule_id);
+#ifdef RTE_NEXT_ABI
+static int ixgbe_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					  uint16_t queue_id);
+static int ixgbe_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					   uint16_t queue_id);
+static void ixgbe_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+			       uint8_t queue, uint8_t msix_vector);
+#endif
+static void ixgbe_configure_msix(struct rte_eth_dev *dev);
 
 static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev,
 		uint16_t queue_idx, uint16_t tx_rate);
@@ -308,7 +337,7 @@ static int ixgbe_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
  */
 #define UPDATE_VF_STAT(reg, last, cur)	                        \
 {                                                               \
-	u32 latest = IXGBE_READ_REG(hw, reg);                   \
+	uint32_t latest = IXGBE_READ_REG(hw, reg);              \
 	cur += latest - last;                                   \
 	last = latest;                                          \
 }
@@ -391,6 +420,10 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
 	.tx_queue_start	      = ixgbe_dev_tx_queue_start,
 	.tx_queue_stop        = ixgbe_dev_tx_queue_stop,
 	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = ixgbe_dev_rx_queue_intr_enable,
+	.rx_queue_intr_disable = ixgbe_dev_rx_queue_intr_disable,
+#endif
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
 	.rx_queue_count       = ixgbe_dev_rx_queue_count,
 	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
@@ -461,8 +494,13 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
 	.vlan_offload_set     = ixgbevf_vlan_offload_set,
 	.rx_queue_setup       = ixgbe_dev_rx_queue_setup,
 	.rx_queue_release     = ixgbe_dev_rx_queue_release,
+	.rx_descriptor_done   = ixgbe_dev_rx_descriptor_done,
 	.tx_queue_setup       = ixgbe_dev_tx_queue_setup,
 	.tx_queue_release     = ixgbe_dev_tx_queue_release,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = ixgbevf_dev_rx_queue_intr_enable,
+	.rx_queue_intr_disable = ixgbevf_dev_rx_queue_intr_disable,
+#endif
 	.mac_addr_add         = ixgbevf_add_mac_addr,
 	.mac_addr_remove      = ixgbevf_remove_mac_addr,
 	.set_mc_addr_list     = ixgbe_dev_set_mc_addr_list,
@@ -1000,12 +1038,6 @@ eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev)
 			eth_dev->data->port_id, pci_dev->id.vendor_id,
 			pci_dev->id.device_id);
 
-	rte_intr_callback_register(&(pci_dev->intr_handle),
-		ixgbe_dev_interrupt_handler, (void *)eth_dev);
-
-	/* enable uio intr after callback register */
-	rte_intr_enable(&(pci_dev->intr_handle));
-
 	/* enable support intr */
 	ixgbe_enable_intr(eth_dev);
 
@@ -1647,6 +1679,10 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct ixgbe_vf_info *vfinfo =
 		*IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private);
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
 	int err, link_up = 0, negotiate = 0;
 	uint32_t speed = 0;
 	int mask = 0;
@@ -1679,6 +1715,30 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 	/* configure PF module if SRIOV enabled */
 	ixgbe_pf_host_configure(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle) && !intr_handle->intr_vec) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int),
+				    0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+
+	/* configure MSI-X for sleeping until Rx interrupt arrives */
+	ixgbe_configure_msix(dev);
+
 	/* initialize transmission unit */
 	ixgbe_dev_tx_init(dev);
 
@@ -1756,8 +1816,25 @@ ixgbe_dev_start(struct rte_eth_dev *dev)
 skip_link_setup:
 
 	/* check if lsc interrupt is enabled */
-	if (dev->data->dev_conf.intr_conf.lsc != 0)
-		ixgbe_dev_lsc_interrupt_setup(dev);
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle)) {
+			rte_intr_callback_register(intr_handle,
+						   ixgbe_dev_interrupt_handler,
+						   (void *)dev);
+			ixgbe_dev_lsc_interrupt_setup(dev);
+		} else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+#ifdef RTE_NEXT_ABI
+	/* check if rxq interrupt is enabled */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		ixgbe_dev_rxq_interrupt_setup(dev);
+#endif
+
+	/* enable uio/vfio intr/eventfd mapping */
+	rte_intr_enable(intr_handle);
 
 	/* resume enabled intr since hw reset */
 	ixgbe_enable_intr(dev);
@@ -1814,6 +1891,7 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	struct ixgbe_filter_info *filter_info =
 		IXGBE_DEV_PRIVATE_TO_FILTER_INFO(dev->data->dev_private);
 	struct ixgbe_5tuple_filter *p_5tuple, *p_5tuple_next;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 	int vf;
 
 	PMD_INIT_FUNC_TRACE();
@@ -1821,6 +1899,9 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	/* disable interrupts */
 	ixgbe_disable_intr(hw);
 
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
 	/* reset the NIC */
 	ixgbe_pf_reset_hw(hw);
 	hw->adapter_stopped = 0;
@@ -1861,6 +1942,14 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
 	memset(filter_info->fivetuple_mask, 0,
 		sizeof(uint32_t) * IXGBE_5TUPLE_ARRAY_SIZE);
 
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 /*
@@ -2535,6 +2624,30 @@ ixgbe_dev_lsc_interrupt_setup(struct rte_eth_dev *dev)
 	return 0;
 }
 
+/**
+ * It clears the interrupt causes and enables the interrupt.
+ * It is called only once, during NIC initialization.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+#ifdef RTE_NEXT_ABI
+static int
+ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev)
+{
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	intr->mask |= IXGBE_EICR_RTX_QUEUE;
+
+	return 0;
+}
+#endif
+
 /*
  * It reads ICR and sets flag (IXGBE_EICR_LSC) for the link_update.
  *
@@ -2561,10 +2674,10 @@ ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev)
 	PMD_DRV_LOG(INFO, "eicr %x", eicr);
 
 	intr->flags = 0;
-	if (eicr & IXGBE_EICR_LSC) {
-		/* set flag for async link update */
+
+	/* set flag for async link update */
+	if (eicr & IXGBE_EICR_LSC)
 		intr->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
-	}
 
 	if (eicr & IXGBE_EICR_MAILBOX)
 		intr->flags |= IXGBE_FLAG_MAILBOX;
@@ -2572,6 +2685,30 @@ ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static int
+ixgbevf_dev_interrupt_get_status(struct rte_eth_dev *dev)
+{
+	uint32_t eicr;
+	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	/* clear all cause mask */
+	ixgbevf_intr_disable(hw);
+
+	/* read-on-clear nic registers here */
+	eicr = IXGBE_READ_REG(hw, IXGBE_VTEICR);
+	PMD_DRV_LOG(INFO, "eicr %x", eicr);
+
+	intr->flags = 0;
+
+	/* set flag for async link update */
+	if (eicr & IXGBE_EICR_LSC)
+		intr->flags |= IXGBE_FLAG_NEED_LINK_UPDATE;
+
+	return 0;
+}
+
 /**
  * It gets and then prints the link status.
  *
@@ -2667,6 +2804,18 @@ ixgbe_dev_interrupt_action(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static int
+ixgbevf_dev_interrupt_action(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	PMD_DRV_LOG(DEBUG, "enable intr immediately");
+	ixgbevf_intr_enable(hw);
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+	return 0;
+}
+
 /**
  * Interrupt handler which shall be registered for alarm callback for delayed
  * handling specific interrupt to wait for the stable nic state. As the
@@ -2721,13 +2870,24 @@ ixgbe_dev_interrupt_delayed_handler(void *param)
  */
 static void
 ixgbe_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
-							void *param)
+			    void *param)
 {
 	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
 	ixgbe_dev_interrupt_get_status(dev);
 	ixgbe_dev_interrupt_action(dev);
 }
 
+static void
+ixgbevf_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+			      void *param)
+{
+	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+	ixgbevf_dev_interrupt_get_status(dev);
+	ixgbevf_dev_interrupt_action(dev);
+}
+
 static int
 ixgbe_dev_led_on(struct rte_eth_dev *dev)
 {
@@ -3233,6 +3393,19 @@ ixgbevf_intr_disable(struct ixgbe_hw *hw)
 	IXGBE_WRITE_FLUSH(hw);
 }
 
+static void
+ixgbevf_intr_enable(struct ixgbe_hw *hw)
+{
+	PMD_INIT_FUNC_TRACE();
+
+	/* VF enable interrupt autoclean */
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIAM, IXGBE_VF_IRQ_ENABLE_MASK);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIAC, IXGBE_VF_IRQ_ENABLE_MASK);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, IXGBE_VF_IRQ_ENABLE_MASK);
+
+	IXGBE_WRITE_FLUSH(hw);
+}
+
 static int
 ixgbevf_dev_configure(struct rte_eth_dev *dev)
 {
@@ -3274,6 +3447,11 @@ ixgbevf_dev_start(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw =
 		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+
 	int err, mask = 0;
 
 	PMD_INIT_FUNC_TRACE();
@@ -3304,6 +3482,42 @@ ixgbevf_dev_start(struct rte_eth_dev *dev)
 
 	ixgbevf_dev_rxtx_start(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle) && !intr_handle->intr_vec) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int), 0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+	ixgbevf_configure_msix(dev);
+
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle))
+			rte_intr_callback_register(intr_handle,
+					ixgbevf_dev_interrupt_handler,
+					(void *)dev);
+		else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+	rte_intr_enable(intr_handle);
+
+	/* Re-enable interrupt for VF */
+	ixgbevf_intr_enable(hw);
+
 	return 0;
 }
 
@@ -3311,6 +3525,7 @@ static void
 ixgbevf_dev_stop(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -3327,12 +3542,27 @@ ixgbevf_dev_stop(struct rte_eth_dev *dev)
 	dev->data->scattered_rx = 0;
 
 	ixgbe_dev_clear_queues(dev);
+
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 static void
 ixgbevf_dev_close(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+#ifdef RTE_NEXT_ABI
+	struct rte_pci_device *pci_dev;
+#endif
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -3344,6 +3574,14 @@ ixgbevf_dev_close(struct rte_eth_dev *dev)
 
 	/* reprogram the RAR[0] in case user changed it. */
 	ixgbe_set_rar(hw, 0, hw->mac.addr, 0, IXGBE_RAH_AV);
+
+#ifdef RTE_NEXT_ABI
+	pci_dev = dev->pci_dev;
+	if (pci_dev->intr_handle.intr_vec) {
+		rte_free(pci_dev->intr_handle.intr_vec);
+		pci_dev->intr_handle.intr_vec = NULL;
+	}
+#endif
 }
 
 static void ixgbevf_set_vfta_all(struct rte_eth_dev *dev, bool on)
@@ -3861,6 +4099,269 @@ ixgbe_mirror_rule_reset(struct rte_eth_dev *dev, uint8_t rule_id)
 	return 0;
 }
 
+#ifdef RTE_NEXT_ABI
+static int
+ixgbevf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	mask = IXGBE_READ_REG(hw, IXGBE_VTEIMS);
+	mask |= (1 << queue_id);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, mask);
+
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static int
+ixgbevf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	mask = IXGBE_READ_REG(hw, IXGBE_VTEIMS);
+	mask &= ~(1 << queue_id);
+	IXGBE_WRITE_REG(hw, IXGBE_VTEIMS, mask);
+
+	return 0;
+}
+
+static int
+ixgbe_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	if (queue_id < 16) {
+		ixgbe_disable_intr(hw);
+		intr->mask |= (1 << queue_id);
+		ixgbe_enable_intr(dev);
+	} else if (queue_id < 32) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(0));
+		mask &= (1 << queue_id);
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(0), mask);
+	} else if (queue_id < 64) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(1));
+		mask &= (1 << (queue_id - 32));
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(1), mask);
+	}
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static int
+ixgbe_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	uint32_t mask;
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct ixgbe_interrupt *intr =
+		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
+
+	if (queue_id < 16) {
+		ixgbe_disable_intr(hw);
+		intr->mask &= ~(1 << queue_id);
+		ixgbe_enable_intr(dev);
+	} else if (queue_id < 32) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(0));
+		mask &= ~(1 << queue_id);
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(0), mask);
+	} else if (queue_id < 64) {
+		mask = IXGBE_READ_REG(hw, IXGBE_EIMS_EX(1));
+		mask &= ~(1 << (queue_id - 32));
+		IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(1), mask);
+	}
+
+	return 0;
+}
+
+static void
+ixgbevf_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+		     uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp, idx;
+
+	if (direction == -1) {
+		/* other causes */
+		msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+		tmp = IXGBE_READ_REG(hw, IXGBE_VTIVAR_MISC);
+		tmp &= ~0xFF;
+		tmp |= msix_vector;
+		IXGBE_WRITE_REG(hw, IXGBE_VTIVAR_MISC, tmp);
+	} else {
+		/* rx or tx cause */
+		msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+		idx = ((16 * (queue & 1)) + (8 * direction));
+		tmp = IXGBE_READ_REG(hw, IXGBE_VTIVAR(queue >> 1));
+		tmp &= ~(0xFF << idx);
+		tmp |= (msix_vector << idx);
+		IXGBE_WRITE_REG(hw, IXGBE_VTIVAR(queue >> 1), tmp);
+	}
+}
+
+/**
+ * set the IVAR registers, mapping interrupt causes to vectors
+ * @param hw
+ *  pointer to ixgbe_hw struct
+ * @direction
+ *  0 for Rx, 1 for Tx, -1 for other causes
+ * @queue
+ *  queue to map the corresponding interrupt to
+ * @msix_vector
+ *  the vector to map to the corresponding queue
+ */
+static void
+ixgbe_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+		   uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp, idx;
+
+	msix_vector |= IXGBE_IVAR_ALLOC_VAL;
+	if (hw->mac.type == ixgbe_mac_82598EB) {
+		if (direction == -1)
+			direction = 0;
+		idx = (((direction * 64) + queue) >> 2) & 0x1F;
+		tmp = IXGBE_READ_REG(hw, IXGBE_IVAR(idx));
+		tmp &= ~(0xFF << (8 * (queue & 0x3)));
+		tmp |= (msix_vector << (8 * (queue & 0x3)));
+		IXGBE_WRITE_REG(hw, IXGBE_IVAR(idx), tmp);
+	} else if ((hw->mac.type == ixgbe_mac_82599EB) ||
+			(hw->mac.type == ixgbe_mac_X540)) {
+		if (direction == -1) {
+			/* other causes */
+			idx = ((queue & 1) * 8);
+			tmp = IXGBE_READ_REG(hw, IXGBE_IVAR_MISC);
+			tmp &= ~(0xFF << idx);
+			tmp |= (msix_vector << idx);
+			IXGBE_WRITE_REG(hw, IXGBE_IVAR_MISC, tmp);
+		} else {
+			/* rx or tx causes */
+			idx = ((16 * (queue & 1)) + (8 * direction));
+			tmp = IXGBE_READ_REG(hw, IXGBE_IVAR(queue >> 1));
+			tmp &= ~(0xFF << idx);
+			tmp |= (msix_vector << idx);
+			IXGBE_WRITE_REG(hw, IXGBE_IVAR(queue >> 1), tmp);
+		}
+	}
+}
+#endif
+
+static void
+ixgbevf_configure_msix(struct rte_eth_dev *dev)
+{
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t q_idx;
+	uint32_t vector_idx = 0;
+#endif
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd.
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* Configure all RX queues of VF */
+	for (q_idx = 0; q_idx < dev->data->nb_rx_queues; q_idx++) {
+		/* Force all queues to use vector 0,
+		 * as IXGBE_VF_MAXMSIVECTOR = 1
+		 */
+		ixgbevf_set_ivar_map(hw, 0, q_idx, vector_idx);
+		intr_handle->intr_vec[q_idx] = vector_idx;
+	}
+
+	/* Configure VF Rx queue ivar */
+	ixgbevf_set_ivar_map(hw, -1, 1, vector_idx);
+#endif
+}
+
+/**
+ * Sets up the hardware to properly generate MSI-X interrupts
+ * @hw
+ *  board private structure
+ */
+static void
+ixgbe_configure_msix(struct rte_eth_dev *dev)
+{
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+#ifdef RTE_NEXT_ABI
+	struct ixgbe_hw *hw =
+		IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t queue_id, vec = 0;
+	uint32_t mask;
+	uint32_t gpie;
+#endif
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* setup GPIE for MSI-x mode */
+	gpie = IXGBE_READ_REG(hw, IXGBE_GPIE);
+	gpie |= IXGBE_GPIE_MSIX_MODE | IXGBE_GPIE_PBA_SUPPORT |
+		IXGBE_GPIE_OCD | IXGBE_GPIE_EIAME;
+	/* auto clearing and auto setting corresponding bits in EIMS
+	 * when MSI-X interrupt is triggered
+	 */
+	if (hw->mac.type == ixgbe_mac_82598EB) {
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM, IXGBE_EICS_RTX_QUEUE);
+	} else {
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM_EX(0), 0xFFFFFFFF);
+		IXGBE_WRITE_REG(hw, IXGBE_EIAM_EX(1), 0xFFFFFFFF);
+	}
+	IXGBE_WRITE_REG(hw, IXGBE_GPIE, gpie);
+
+	/* Populate the IVAR table and set the ITR values to the
+	 * corresponding register.
+	 */
+	for (queue_id = 0; queue_id < dev->data->nb_rx_queues;
+	     queue_id++) {
+		/* by default, 1:1 mapping */
+		ixgbe_set_ivar_map(hw, 0, queue_id, vec);
+		intr_handle->intr_vec[queue_id] = vec;
+		if (vec < intr_handle->nb_efd - 1)
+			vec++;
+	}
+
+	switch (hw->mac.type) {
+	case ixgbe_mac_82598EB:
+		ixgbe_set_ivar_map(hw, -1, IXGBE_IVAR_OTHER_CAUSES_INDEX,
+				   intr_handle->max_intr - 1);
+		break;
+	case ixgbe_mac_82599EB:
+	case ixgbe_mac_X540:
+		ixgbe_set_ivar_map(hw, -1, 1, intr_handle->max_intr - 1);
+		break;
+	default:
+		break;
+	}
+	IXGBE_WRITE_REG(hw, IXGBE_EITR(queue_id),
+			IXGBE_MIN_INTER_INTERRUPT_INTERVAL_DEFAULT & 0xFFF);
+
+	/* set up to autoclear timer, and the vectors */
+	mask = IXGBE_EIMS_ENABLE_MASK;
+	mask &= ~(IXGBE_EIMS_OTHER |
+		  IXGBE_EIMS_MAILBOX |
+		  IXGBE_EIMS_LSC);
+
+	IXGBE_WRITE_REG(hw, IXGBE_EIAC, mask);
+#endif
+}
+
 static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev,
 	uint16_t queue_idx, uint16_t tx_rate)
 {
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index c16c11d..c3d4f4f 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -117,6 +117,9 @@
 	ETH_RSS_IPV6_TCP_EX | \
 	ETH_RSS_IPV6_UDP_EX)
 
+#define IXGBE_VF_IRQ_ENABLE_MASK        3          /* vf irq enable mask */
+#define IXGBE_VF_MAXMSIVECTOR           1
+
 /*
  * Information about the fdir mode.
  */
@@ -332,6 +335,7 @@ uint32_t ixgbe_dev_rx_queue_count(struct rte_eth_dev *dev,
 		uint16_t rx_queue_id);
 
 int ixgbe_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
+int ixgbevf_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
 
 int ixgbe_dev_rx_init(struct rte_eth_dev *dev);
 
-- 
1.8.1.4

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v15 12/13] igb: enable rx queue interrupts for PF
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (7 preceding siblings ...)
  2015-07-20  3:02  1%       ` [dpdk-dev] [PATCH v15 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
@ 2015-07-20  3:02  2%       ` Cunming Liang
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
  2015-07-23 14:18  3%       ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Liang, Cunming
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch does the following for the igb PF (a worked IVAR-mapping example follows the list):
- Setup NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions
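
As a worked illustration of the second bullet (not part of the patch text), this is how
eth_igb_write_ivar() below resolves a queue/direction pair into an IVAR register byte on
the 82576; the numbers are derived directly from the code in this diff:

	/* 82576 call from eth_igb_assign_msix_vector():
	 *   eth_igb_write_ivar(hw, vec, queue & 0x7, ((queue & 0x8) << 1) + 8 * direction);
	 *
	 * Example: Rx queue 9 (direction = 0) mapped to MSI-X vector 3:
	 *   index  = 9 & 0x7              = 1    -> E1000_IVAR0 array entry 1
	 *   offset = ((9 & 0x8) << 1) + 0 = 16   -> bits 23:16 of that entry
	 *   value  = (3 | E1000_IVAR_VALID) << 16, written after clearing bits 23:16
	 */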

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework

v9 changes
 - move queue-vec mapping init from dev_configure to dev_start
 - fix link interrupt not working issue in vfio-msix

v8 changes
 - add vfio-msi/vfio-legacy and uio-legacy support

v7 changes
 - add condition check when intr vector is not enabled

v6 changes
 - fill queue-vector mapping table

v5 changes
 - Rebase the patchset onto the HEAD

v3 changes
 - Remove unnecessary variables in e1000_mac_info
 - Remove spinlok from PMD

v2 changes
 - Consolidate review comments related to coding style

 drivers/net/e1000/igb_ethdev.c | 311 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 277 insertions(+), 34 deletions(-)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index ddc7186..56734a3 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -105,6 +105,9 @@ static int  eth_igb_flow_ctrl_get(struct rte_eth_dev *dev,
 static int  eth_igb_flow_ctrl_set(struct rte_eth_dev *dev,
 				struct rte_eth_fc_conf *fc_conf);
 static int eth_igb_lsc_interrupt_setup(struct rte_eth_dev *dev);
+#ifdef RTE_NEXT_ABI
+static int eth_igb_rxq_interrupt_setup(struct rte_eth_dev *dev);
+#endif
 static int eth_igb_interrupt_get_status(struct rte_eth_dev *dev);
 static int eth_igb_interrupt_action(struct rte_eth_dev *dev);
 static void eth_igb_interrupt_handler(struct rte_intr_handle *handle,
@@ -218,7 +221,6 @@ static int eth_igb_get_eeprom(struct rte_eth_dev *dev,
 		struct rte_dev_eeprom_info *eeprom);
 static int eth_igb_set_eeprom(struct rte_eth_dev *dev,
 		struct rte_dev_eeprom_info *eeprom);
-
 static int eth_igb_set_mc_addr_list(struct rte_eth_dev *dev,
 				    struct ether_addr *mc_addr_set,
 				    uint32_t nb_mc_addr);
@@ -229,6 +231,17 @@ static int igb_timesync_read_rx_timestamp(struct rte_eth_dev *dev,
 					  uint32_t flags);
 static int igb_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
 					  struct timespec *timestamp);
+#ifdef RTE_NEXT_ABI
+static int eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev,
+					uint16_t queue_id);
+static int eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev,
+					 uint16_t queue_id);
+static void eth_igb_assign_msix_vector(struct e1000_hw *hw, int8_t direction,
+				       uint8_t queue, uint8_t msix_vector);
+static void eth_igb_write_ivar(struct e1000_hw *hw, uint8_t msix_vector,
+			       uint8_t index, uint8_t offset);
+#endif
+static void eth_igb_configure_msix_intr(struct rte_eth_dev *dev);
 
 /*
  * Define VF Stats MACRO for Non "cleared on read" register
@@ -289,6 +302,10 @@ static const struct eth_dev_ops eth_igb_ops = {
 	.vlan_tpid_set        = eth_igb_vlan_tpid_set,
 	.vlan_offload_set     = eth_igb_vlan_offload_set,
 	.rx_queue_setup       = eth_igb_rx_queue_setup,
+#ifdef RTE_NEXT_ABI
+	.rx_queue_intr_enable = eth_igb_rx_queue_intr_enable,
+	.rx_queue_intr_disable = eth_igb_rx_queue_intr_disable,
+#endif
 	.rx_queue_release     = eth_igb_rx_queue_release,
 	.rx_queue_count       = eth_igb_rx_queue_count,
 	.rx_descriptor_done   = eth_igb_rx_descriptor_done,
@@ -639,12 +656,6 @@ eth_igb_dev_init(struct rte_eth_dev *eth_dev)
 		     eth_dev->data->port_id, pci_dev->id.vendor_id,
 		     pci_dev->id.device_id);
 
-	rte_intr_callback_register(&(pci_dev->intr_handle),
-		eth_igb_interrupt_handler, (void *)eth_dev);
-
-	/* enable uio intr after callback register */
-	rte_intr_enable(&(pci_dev->intr_handle));
-
 	/* enable support intr */
 	igb_intr_enable(eth_dev);
 
@@ -879,7 +890,11 @@ eth_igb_start(struct rte_eth_dev *dev)
 		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct e1000_adapter *adapter =
 		E1000_DEV_PRIVATE(dev->data->dev_private);
-	int ret, i, mask;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+	int ret, mask;
+#ifdef RTE_NEXT_ABI
+	uint32_t intr_vector = 0;
+#endif
 	uint32_t ctrl_ext;
 
 	PMD_INIT_FUNC_TRACE();
@@ -920,6 +935,29 @@ eth_igb_start(struct rte_eth_dev *dev)
 	/* configure PF module if SRIOV enabled */
 	igb_pf_host_configure(dev);
 
+#ifdef RTE_NEXT_ABI
+	/* check and configure queue intr-vector mapping */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		intr_vector = dev->data->nb_rx_queues;
+
+	if (rte_intr_efd_enable(intr_handle, intr_vector))
+		return -1;
+
+	if (rte_intr_dp_is_en(intr_handle)) {
+		intr_handle->intr_vec =
+			rte_zmalloc("intr_vec",
+				    dev->data->nb_rx_queues * sizeof(int), 0);
+		if (intr_handle->intr_vec == NULL) {
+			PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+				     " intr_vec\n", dev->data->nb_rx_queues);
+			return -ENOMEM;
+		}
+	}
+#endif
+
+	/* configure MSI-X for Rx interrupts */
+	eth_igb_configure_msix_intr(dev);
+
 	/* Configure for OS presence */
 	igb_init_manageability(hw);
 
@@ -947,33 +985,9 @@ eth_igb_start(struct rte_eth_dev *dev)
 		igb_vmdq_vlan_hw_filter_enable(dev);
 	}
 
-	/*
-	 * Configure the Interrupt Moderation register (EITR) with the maximum
-	 * possible value (0xFFFF) to minimize "System Partial Write" issued by
-	 * spurious [DMA] memory updates of RX and TX ring descriptors.
-	 *
-	 * With a EITR granularity of 2 microseconds in the 82576, only 7/8
-	 * spurious memory updates per second should be expected.
-	 * ((65535 * 2) / 1000.1000 ~= 0.131 second).
-	 *
-	 * Because interrupts are not used at all, the MSI-X is not activated
-	 * and interrupt moderation is controlled by EITR[0].
-	 *
-	 * Note that having [almost] disabled memory updates of RX and TX ring
-	 * descriptors through the Interrupt Moderation mechanism, memory
-	 * updates of ring descriptors are now moderated by the configurable
-	 * value of Write-Back Threshold registers.
-	 */
 	if ((hw->mac.type == e1000_82576) || (hw->mac.type == e1000_82580) ||
 		(hw->mac.type == e1000_i350) || (hw->mac.type == e1000_i210) ||
 		(hw->mac.type == e1000_i211)) {
-		uint32_t ivar;
-
-		/* Enable all RX & TX queues in the IVAR registers */
-		ivar = (uint32_t) ((E1000_IVAR_VALID << 16) | E1000_IVAR_VALID);
-		for (i = 0; i < 8; i++)
-			E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, i, ivar);
-
 		/* Configure EITR with the maximum possible value (0xFFFF) */
 		E1000_WRITE_REG(hw, E1000_EITR(0), 0xFFFF);
 	}
@@ -1024,8 +1038,25 @@ eth_igb_start(struct rte_eth_dev *dev)
 	e1000_setup_link(hw);
 
 	/* check if lsc interrupt feature is enabled */
-	if (dev->data->dev_conf.intr_conf.lsc != 0)
-		ret = eth_igb_lsc_interrupt_setup(dev);
+	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+		if (rte_intr_allow_others(intr_handle)) {
+			rte_intr_callback_register(intr_handle,
+						   eth_igb_interrupt_handler,
+						   (void *)dev);
+			eth_igb_lsc_interrupt_setup(dev);
+		} else
+			PMD_INIT_LOG(INFO, "lsc won't enable because of"
+				     " no intr multiplex\n");
+	}
+
+#ifdef RTE_NEXT_ABI
+	/* check if rxq interrupt is enabled */
+	if (dev->data->dev_conf.intr_conf.rxq != 0)
+		eth_igb_rxq_interrupt_setup(dev);
+#endif
+
+	/* enable uio/vfio intr/eventfd mapping */
+	rte_intr_enable(intr_handle);
 
 	/* resume enabled intr since hw reset */
 	igb_intr_enable(dev);
@@ -1058,8 +1089,13 @@ eth_igb_stop(struct rte_eth_dev *dev)
 	struct e1000_flex_filter *p_flex;
 	struct e1000_5tuple_filter *p_5tuple, *p_5tuple_next;
 	struct e1000_2tuple_filter *p_2tuple, *p_2tuple_next;
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
 
 	igb_intr_disable(hw);
+
+	/* disable intr eventfd mapping */
+	rte_intr_disable(intr_handle);
+
 	igb_pf_reset_hw(hw);
 	E1000_WRITE_REG(hw, E1000_WUC, 0);
 
@@ -1108,6 +1144,15 @@ eth_igb_stop(struct rte_eth_dev *dev)
 		rte_free(p_2tuple);
 	}
 	filter_info->twotuple_mask = 0;
+
+#ifdef RTE_NEXT_ABI
+	/* Clean datapath event and queue/vec mapping */
+	rte_intr_efd_disable(intr_handle);
+	if (intr_handle->intr_vec != NULL) {
+		rte_free(intr_handle->intr_vec);
+		intr_handle->intr_vec = NULL;
+	}
+#endif
 }
 
 static void
@@ -1117,6 +1162,9 @@ eth_igb_close(struct rte_eth_dev *dev)
 	struct e1000_adapter *adapter =
 		E1000_DEV_PRIVATE(dev->data->dev_private);
 	struct rte_eth_link link;
+#ifdef RTE_NEXT_ABI
+	struct rte_pci_device *pci_dev;
+#endif
 
 	eth_igb_stop(dev);
 	adapter->stopped = 1;
@@ -1136,6 +1184,14 @@ eth_igb_close(struct rte_eth_dev *dev)
 
 	igb_dev_free_queues(dev);
 
+#ifdef RTE_NEXT_ABI
+	pci_dev = dev->pci_dev;
+	if (pci_dev->intr_handle.intr_vec) {
+		rte_free(pci_dev->intr_handle.intr_vec);
+		pci_dev->intr_handle.intr_vec = NULL;
+	}
+#endif
+
 	memset(&link, 0, sizeof(link));
 	rte_igb_dev_atomic_write_link_status(dev, &link);
 }
@@ -1960,6 +2016,35 @@ eth_igb_lsc_interrupt_setup(struct rte_eth_dev *dev)
 	return 0;
 }
 
+#ifdef RTE_NEXT_ABI
+/* It clears the interrupt causes and enables the interrupt.
+ * It is called only once, during NIC initialization.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+static int eth_igb_rxq_interrupt_setup(struct rte_eth_dev *dev)
+{
+	uint32_t mask, regval;
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct rte_eth_dev_info dev_info;
+
+	memset(&dev_info, 0, sizeof(dev_info));
+	eth_igb_infos_get(dev, &dev_info);
+
+	mask = 0xFFFFFFFF >> (32 - dev_info.max_rx_queues);
+	regval = E1000_READ_REG(hw, E1000_EIMS);
+	E1000_WRITE_REG(hw, E1000_EIMS, regval | mask);
+
+	return 0;
+}
+#endif
+
 /*
  * It reads ICR and gets interrupt causes, check it and set a bit flag
  * to update link status.
@@ -4051,5 +4136,163 @@ static struct rte_driver pmd_igbvf_drv = {
 	.init = rte_igbvf_pmd_init,
 };
 
+#ifdef RTE_NEXT_ABI
+static int
+eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t mask = 1 << queue_id;
+
+	E1000_WRITE_REG(hw, E1000_EIMC, mask);
+	E1000_WRITE_FLUSH(hw);
+
+	return 0;
+}
+
+static int
+eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t mask = 1 << queue_id;
+	uint32_t regval;
+
+	regval = E1000_READ_REG(hw, E1000_EIMS);
+	E1000_WRITE_REG(hw, E1000_EIMS, regval | mask);
+	E1000_WRITE_FLUSH(hw);
+
+	rte_intr_enable(&dev->pci_dev->intr_handle);
+
+	return 0;
+}
+
+static void
+eth_igb_write_ivar(struct e1000_hw *hw, uint8_t  msix_vector,
+		   uint8_t index, uint8_t offset)
+{
+	uint32_t val = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index);
+
+	/* clear bits */
+	val &= ~((uint32_t)0xFF << offset);
+
+	/* write vector and valid bit */
+	val |= (msix_vector | E1000_IVAR_VALID) << offset;
+
+	E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, val);
+}
+
+static void
+eth_igb_assign_msix_vector(struct e1000_hw *hw, int8_t direction,
+			   uint8_t queue, uint8_t msix_vector)
+{
+	uint32_t tmp = 0;
+
+	if (hw->mac.type == e1000_82575) {
+		if (direction == 0)
+			tmp = E1000_EICR_RX_QUEUE0 << queue;
+		else if (direction == 1)
+			tmp = E1000_EICR_TX_QUEUE0 << queue;
+		E1000_WRITE_REG(hw, E1000_MSIXBM(msix_vector), tmp);
+	} else if (hw->mac.type == e1000_82576) {
+		if ((direction == 0) || (direction == 1))
+			eth_igb_write_ivar(hw, msix_vector, queue & 0x7,
+					   ((queue & 0x8) << 1) +
+					   8 * direction);
+	} else if ((hw->mac.type == e1000_82580) ||
+			(hw->mac.type == e1000_i350) ||
+			(hw->mac.type == e1000_i354) ||
+			(hw->mac.type == e1000_i210) ||
+			(hw->mac.type == e1000_i211)) {
+		if ((direction == 0) || (direction == 1))
+			eth_igb_write_ivar(hw, msix_vector,
+					   queue >> 1,
+					   ((queue & 0x1) << 4) +
+					   8 * direction);
+	}
+}
+#endif
+
+/* Sets up the hardware to generate MSI-X interrupts properly
+ * @hw
+ *  board private structure
+ */
+static void
+eth_igb_configure_msix_intr(struct rte_eth_dev *dev)
+{
+#ifdef RTE_NEXT_ABI
+	int queue_id;
+	uint32_t tmpval, regval, intr_mask;
+	struct e1000_hw *hw =
+		E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	uint32_t vec = 0;
+#endif
+	struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
+
+	/* won't configure msix register if no mapping is done
+	 * between intr vector and event fd
+	 */
+	if (!rte_intr_dp_is_en(intr_handle))
+		return;
+
+#ifdef RTE_NEXT_ABI
+	/* set interrupt vector for other causes */
+	if (hw->mac.type == e1000_82575) {
+		tmpval = E1000_READ_REG(hw, E1000_CTRL_EXT);
+		/* enable MSI-X PBA support */
+		tmpval |= E1000_CTRL_EXT_PBA_CLR;
+
+		/* Auto-Mask interrupts upon ICR read */
+		tmpval |= E1000_CTRL_EXT_EIAME;
+		tmpval |= E1000_CTRL_EXT_IRCA;
+
+		E1000_WRITE_REG(hw, E1000_CTRL_EXT, tmpval);
+
+		/* enable msix_other interrupt */
+		E1000_WRITE_REG_ARRAY(hw, E1000_MSIXBM(0), 0, E1000_EIMS_OTHER);
+		regval = E1000_READ_REG(hw, E1000_EIAC);
+		E1000_WRITE_REG(hw, E1000_EIAC, regval | E1000_EIMS_OTHER);
+		regval = E1000_READ_REG(hw, E1000_EIAM);
+		E1000_WRITE_REG(hw, E1000_EIMS, regval | E1000_EIMS_OTHER);
+	} else if ((hw->mac.type == e1000_82576) ||
+			(hw->mac.type == e1000_82580) ||
+			(hw->mac.type == e1000_i350) ||
+			(hw->mac.type == e1000_i354) ||
+			(hw->mac.type == e1000_i210) ||
+			(hw->mac.type == e1000_i211)) {
+		/* turn on MSI-X capability first */
+		E1000_WRITE_REG(hw, E1000_GPIE, E1000_GPIE_MSIX_MODE |
+					E1000_GPIE_PBA | E1000_GPIE_EIAME |
+					E1000_GPIE_NSICR);
+
+		intr_mask = (1 << intr_handle->max_intr) - 1;
+		regval = E1000_READ_REG(hw, E1000_EIAC);
+		E1000_WRITE_REG(hw, E1000_EIAC, regval | intr_mask);
+
+		/* enable msix_other interrupt */
+		regval = E1000_READ_REG(hw, E1000_EIMS);
+		E1000_WRITE_REG(hw, E1000_EIMS, regval | intr_mask);
+		tmpval = (dev->data->nb_rx_queues | E1000_IVAR_VALID) << 8;
+		E1000_WRITE_REG(hw, E1000_IVAR_MISC, tmpval);
+	}
+
+	/* use EIAM to auto-mask when MSI-X interrupt
+	 * is asserted, this saves a register write for every interrupt
+	 */
+	intr_mask = (1 << intr_handle->nb_efd) - 1;
+	regval = E1000_READ_REG(hw, E1000_EIAM);
+	E1000_WRITE_REG(hw, E1000_EIAM, regval | intr_mask);
+
+	for (queue_id = 0; queue_id < dev->data->nb_rx_queues; queue_id++) {
+		eth_igb_assign_msix_vector(hw, 0, queue_id, vec);
+		intr_handle->intr_vec[queue_id] = vec;
+		if (vec < intr_handle->nb_efd - 1)
+			vec++;
+	}
+
+	E1000_WRITE_FLUSH(hw);
+#endif
+}
+
 PMD_REGISTER_DRIVER(pmd_igb_drv);
 PMD_REGISTER_DRIVER(pmd_igbvf_drv);
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v15 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (8 preceding siblings ...)
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 12/13] igb: enable rx queue interrupts for PF Cunming Liang
@ 2015-07-20  3:02  2%       ` Cunming Liang
  2015-07-23 14:18  3%       ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Liang, Cunming
  10 siblings, 0 replies; 200+ results
From: Cunming Liang @ 2015-07-20  3:02 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

The patch demonstrates how to handle per-Rx-queue interrupts in a NAPI-like
implementation in userspace (see the condensed sketch below). The working thread
mainly runs in polling mode and switches to interrupt mode only if no packet is
received over several consecutive polls. The working thread returns to polling mode
immediately once it receives an interrupt notification triggered by incoming packets.
The sample keeps running in polling mode if the bound PMD does not yet support
Rx interrupts. Currently only ixgbe (PF/VF) and igb support it.
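
Condensed sketch of that poll/interrupt switch (illustration only; the real main_loop()
below also handles Tx draining, frequency scaling and per-queue statistics, and
process_burst()/idle_polls are illustrative names, not symbols from the patch):

	for (;;) {
		nb_rx = rte_eth_rx_burst(portid, queueid, pkts, MAX_PKT_BURST);
		if (nb_rx > 0) {
			process_burst(pkts, nb_rx);   /* packets seen: stay in polling mode */
			idle_polls = 0;
			continue;
		}
		if (++idle_polls < SUSPEND_THRESHOLD) {
			rte_delay_us(1);              /* short pause, keep polling */
			continue;
		}
		/* idle long enough: arm one-shot Rx interrupts and block */
		turn_on_intr(qconf);
		sleep_until_rx_interrupt(qconf->n_rx_queue);
		idle_polls = 0;                       /* interrupt fired: poll again */
	}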

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
v14 changes
 - per-patch basis ABI compatibility rework
 - reword commit comments

v7 changes
 - using new APIs
 - demo multiple port/queue pair wait on the same epoll instance

v6 changes
 - Split event fd add and wait

v5 changes
 - Change invoked function name and parameter to accommodate EAL change

v3 changes
 - Add spinlock to ensure thread safe when accessing interrupt mask
   register

v2 changes
 - Remove unused function which is for debug purpose

 examples/l3fwd-power/main.c | 205 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 165 insertions(+), 40 deletions(-)

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index b3c5f43..14f6fba 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -74,12 +74,14 @@
 #include <rte_string_fns.h>
 #include <rte_timer.h>
 #include <rte_power.h>
+#include <rte_eal.h>
+#include <rte_spinlock.h>
 
 #define RTE_LOGTYPE_L3FWD_POWER RTE_LOGTYPE_USER1
 
 #define MAX_PKT_BURST 32
 
-#define MIN_ZERO_POLL_COUNT 5
+#define MIN_ZERO_POLL_COUNT 10
 
 /* around 100ms at 2 Ghz */
 #define TIMER_RESOLUTION_CYCLES           200000000ULL
@@ -153,6 +155,9 @@ static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
 /* ethernet addresses of ports */
 static struct ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];
 
+/* per-port spinlocks serializing access to the interrupt mask registers */
+static rte_spinlock_t locks[RTE_MAX_ETHPORTS];
+
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;
 /* Ports set in promiscuous mode off by default. */
@@ -185,6 +190,9 @@ struct lcore_rx_queue {
 #define MAX_TX_QUEUE_PER_PORT RTE_MAX_ETHPORTS
 #define MAX_RX_QUEUE_PER_PORT 128
 
+#define MAX_RX_QUEUE_INTERRUPT_PER_PORT 16
+
+
 #define MAX_LCORE_PARAMS 1024
 struct lcore_params {
 	uint8_t port_id;
@@ -211,7 +219,7 @@ static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
 
 static struct rte_eth_conf port_conf = {
 	.rxmode = {
-		.mq_mode	= ETH_MQ_RX_RSS,
+		.mq_mode        = ETH_MQ_RX_RSS,
 		.max_rx_pkt_len = ETHER_MAX_LEN,
 		.split_hdr_size = 0,
 		.header_split   = 0, /**< Header Split disabled */
@@ -223,11 +231,17 @@ static struct rte_eth_conf port_conf = {
 	.rx_adv_conf = {
 		.rss_conf = {
 			.rss_key = NULL,
-			.rss_hf = ETH_RSS_IP,
+			.rss_hf = ETH_RSS_UDP,
 		},
 	},
 	.txmode = {
-		.mq_mode = ETH_DCB_NONE,
+		.mq_mode = ETH_MQ_TX_NONE,
+	},
+	.intr_conf = {
+		.lsc = 1,
+#ifdef RTE_NEXT_ABI
+		.rxq = 1,
+#endif
 	},
 };
 
@@ -399,19 +413,22 @@ power_timer_cb(__attribute__((unused)) struct rte_timer *tim,
 	/* accumulate total execution time in us when callback is invoked */
 	sleep_time_ratio = (float)(stats[lcore_id].sleep_time) /
 					(float)SCALING_PERIOD;
-
 	/**
 	 * check whether need to scale down frequency a step if it sleep a lot.
 	 */
-	if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD)
-		rte_power_freq_down(lcore_id);
+	if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
+		if (rte_power_freq_down)
+			rte_power_freq_down(lcore_id);
+	}
 	else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
-		stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST)
+		stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
 		/**
 		 * scale down a step if average packet per iteration less
 		 * than expectation.
 		 */
-		rte_power_freq_down(lcore_id);
+		if (rte_power_freq_down)
+			rte_power_freq_down(lcore_id);
+	}
 
 	/**
 	 * initialize another timer according to current frequency to ensure
@@ -712,22 +729,20 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,
 
 }
 
-#define SLEEP_GEAR1_THRESHOLD            100
-#define SLEEP_GEAR2_THRESHOLD            1000
+#define MINIMUM_SLEEP_TIME         1
+#define SUSPEND_THRESHOLD          300
 
 static inline uint32_t
 power_idle_heuristic(uint32_t zero_rx_packet_count)
 {
-	/* If zero count is less than 100, use it as the sleep time in us */
-	if (zero_rx_packet_count < SLEEP_GEAR1_THRESHOLD)
-		return zero_rx_packet_count;
-	/* If zero count is less than 1000, sleep time should be 100 us */
-	else if ((zero_rx_packet_count >= SLEEP_GEAR1_THRESHOLD) &&
-			(zero_rx_packet_count < SLEEP_GEAR2_THRESHOLD))
-		return SLEEP_GEAR1_THRESHOLD;
-	/* If zero count is greater than 1000, sleep time should be 1000 us */
-	else if (zero_rx_packet_count >= SLEEP_GEAR2_THRESHOLD)
-		return SLEEP_GEAR2_THRESHOLD;
+	/* If zero count is below SUSPEND_THRESHOLD, sleep 1 us */
+	if (zero_rx_packet_count < SUSPEND_THRESHOLD)
+		return MINIMUM_SLEEP_TIME;
+	/* Otherwise sleep for SUSPEND_THRESHOLD us, which covers the
+	 * latency of switching from C3/C6 back to C0
+	 */
+	else
+		return SUSPEND_THRESHOLD;
 
 	return 0;
 }
@@ -767,6 +782,84 @@ power_freq_scaleup_heuristic(unsigned lcore_id,
 	return FREQ_CURRENT;
 }
 
+/**
+ * force polling thread sleep until one-shot rx interrupt triggers
+ * @param num
+ *  Maximum number of Rx interrupt events to wait for
+ *  (one per port/queue pair armed by the calling
+ *  lcore).
+ * @return
+ *  0 on success
+ */
+static int
+sleep_until_rx_interrupt(int num)
+{
+	struct rte_epoll_event event[num];
+	int n, i;
+	uint8_t port_id, queue_id;
+	void *data;
+
+	RTE_LOG(INFO, L3FWD_POWER,
+		"lcore %u sleeps until interrupt triggers\n",
+		rte_lcore_id());
+
+	n = rte_epoll_wait(RTE_EPOLL_PER_THREAD, event, num, -1);
+	for (i = 0; i < n; i++) {
+		data = event[i].epdata.data;
+		port_id = ((uintptr_t)data) >> CHAR_BIT;
+		queue_id = ((uintptr_t)data) &
+			RTE_LEN2MASK(CHAR_BIT, uint8_t);
+		RTE_LOG(INFO, L3FWD_POWER,
+			"lcore %u is woken up by rx interrupt on"
+			" port %d queue %d\n",
+			rte_lcore_id(), port_id, queue_id);
+	}
+
+	return 0;
+}
+
+static int turn_on_intr(struct lcore_conf *qconf)
+{
+	int i;
+	struct lcore_rx_queue *rx_queue;
+	uint8_t port_id, queue_id;
+
+	for (i = 0; i < qconf->n_rx_queue; ++i) {
+		rx_queue = &(qconf->rx_queue_list[i]);
+		port_id = rx_queue->port_id;
+		queue_id = rx_queue->queue_id;
+
+		rte_spinlock_lock(&(locks[port_id]));
+		rte_eth_dev_rx_intr_enable(port_id, queue_id);
+		rte_spinlock_unlock(&(locks[port_id]));
+	}
+}
+
+static int event_register(struct lcore_conf *qconf)
+{
+	struct lcore_rx_queue *rx_queue;
+	uint8_t portid, queueid;
+	uint32_t data;
+	int ret;
+	int i;
+
+	for (i = 0; i < qconf->n_rx_queue; ++i) {
+		rx_queue = &(qconf->rx_queue_list[i]);
+		portid = rx_queue->port_id;
+		queueid = rx_queue->queue_id;
+		data = portid << CHAR_BIT | queueid;
+
+		ret = rte_eth_dev_rx_intr_ctl_q(portid, queueid,
+						RTE_EPOLL_PER_THREAD,
+						RTE_INTR_EVENT_ADD,
+						(void *)((uintptr_t)data));
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 /* main processing loop */
 static int
 main_loop(__attribute__((unused)) void *dummy)
@@ -780,9 +873,9 @@ main_loop(__attribute__((unused)) void *dummy)
 	struct lcore_conf *qconf;
 	struct lcore_rx_queue *rx_queue;
 	enum freq_scale_hint_t lcore_scaleup_hint;
-
 	uint32_t lcore_rx_idle_count = 0;
 	uint32_t lcore_idle_hint = 0;
+	int intr_en = 0;
 
 	const uint64_t drain_tsc = (rte_get_tsc_hz() + US_PER_S - 1) / US_PER_S * BURST_TX_DRAIN_US;
 
@@ -799,13 +892,18 @@ main_loop(__attribute__((unused)) void *dummy)
 	RTE_LOG(INFO, L3FWD_POWER, "entering main loop on lcore %u\n", lcore_id);
 
 	for (i = 0; i < qconf->n_rx_queue; i++) {
-
 		portid = qconf->rx_queue_list[i].port_id;
 		queueid = qconf->rx_queue_list[i].queue_id;
 		RTE_LOG(INFO, L3FWD_POWER, " -- lcoreid=%u portid=%hhu "
 			"rxqueueid=%hhu\n", lcore_id, portid, queueid);
 	}
 
+	/* add into event wait list */
+	if (event_register(qconf) == 0)
+		intr_en = 1;
+	else
+		RTE_LOG(INFO, L3FWD_POWER, "RX interrupt won't enable.\n");
+
 	while (1) {
 		stats[lcore_id].nb_iteration_looped++;
 
@@ -840,6 +938,7 @@ main_loop(__attribute__((unused)) void *dummy)
 			prev_tsc_power = cur_tsc_power;
 		}
 
+start_rx:
 		/*
 		 * Read packet from RX queues
 		 */
@@ -853,6 +952,7 @@ main_loop(__attribute__((unused)) void *dummy)
 
 			nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
 								MAX_PKT_BURST);
+
 			stats[lcore_id].nb_rx_processed += nb_rx;
 			if (unlikely(nb_rx == 0)) {
 				/**
@@ -915,10 +1015,13 @@ main_loop(__attribute__((unused)) void *dummy)
 						rx_queue->freq_up_hint;
 			}
 
-			if (lcore_scaleup_hint == FREQ_HIGHEST)
-				rte_power_freq_max(lcore_id);
-			else if (lcore_scaleup_hint == FREQ_HIGHER)
-				rte_power_freq_up(lcore_id);
+			if (lcore_scaleup_hint == FREQ_HIGHEST) {
+				if (rte_power_freq_max)
+					rte_power_freq_max(lcore_id);
+			} else if (lcore_scaleup_hint == FREQ_HIGHER) {
+				if (rte_power_freq_up)
+					rte_power_freq_up(lcore_id);
+			}
 		} else {
 			/**
 			 * All Rx queues empty in recent consecutive polls,
@@ -933,16 +1036,23 @@ main_loop(__attribute__((unused)) void *dummy)
 					lcore_idle_hint = rx_queue->idle_hint;
 			}
 
-			if ( lcore_idle_hint < SLEEP_GEAR1_THRESHOLD)
+			if (lcore_idle_hint < SUSPEND_THRESHOLD)
 				/**
 				 * execute "pause" instruction to avoid context
-				 * switch for short sleep.
+				 * switch which generally take hundred of
+				 * microseconds for short sleep.
 				 */
 				rte_delay_us(lcore_idle_hint);
-			else
-				/* long sleep force runing thread to suspend */
-				usleep(lcore_idle_hint);
-
+			else {
+				/* suspend until an Rx interrupt triggers */
+				if (intr_en) {
+					turn_on_intr(qconf);
+					sleep_until_rx_interrupt(
+						qconf->n_rx_queue);
+				}
+				/* start receiving packets immediately */
+				goto start_rx;
+			}
 			stats[lcore_id].sleep_time += lcore_idle_hint;
 		}
 	}
@@ -1273,7 +1383,7 @@ setup_hash(int socketid)
 	char s[64];
 
 	/* create ipv4 hash */
-	snprintf(s, sizeof(s), "ipv4_l3fwd_hash_%d", socketid);
+	rte_snprintf(s, sizeof(s), "ipv4_l3fwd_hash_%d", socketid);
 	ipv4_l3fwd_hash_params.name = s;
 	ipv4_l3fwd_hash_params.socket_id = socketid;
 	ipv4_l3fwd_lookup_struct[socketid] =
@@ -1283,7 +1393,7 @@ setup_hash(int socketid)
 				"socket %d\n", socketid);
 
 	/* create ipv6 hash */
-	snprintf(s, sizeof(s), "ipv6_l3fwd_hash_%d", socketid);
+	rte_snprintf(s, sizeof(s), "ipv6_l3fwd_hash_%d", socketid);
 	ipv6_l3fwd_hash_params.name = s;
 	ipv6_l3fwd_hash_params.socket_id = socketid;
 	ipv6_l3fwd_lookup_struct[socketid] =
@@ -1477,6 +1587,7 @@ main(int argc, char **argv)
 	unsigned lcore_id;
 	uint64_t hz;
 	uint32_t n_tx_queue, nb_lcores;
+	uint32_t dev_rxq_num, dev_txq_num;
 	uint8_t portid, nb_rx_queue, queue, socketid;
 
 	/* catch SIGINT and restore cpufreq governor to ondemand */
@@ -1526,10 +1637,19 @@ main(int argc, char **argv)
 		printf("Initializing port %d ... ", portid );
 		fflush(stdout);
 
+		rte_eth_dev_info_get(portid, &dev_info);
+		dev_rxq_num = dev_info.max_rx_queues;
+		dev_txq_num = dev_info.max_tx_queues;
+
 		nb_rx_queue = get_port_n_rx_queues(portid);
+		if (nb_rx_queue > dev_rxq_num)
+			rte_exit(EXIT_FAILURE,
+				"Cannot configure non-existent rxq: "
+				"port=%d\n", portid);
+
 		n_tx_queue = nb_lcores;
-		if (n_tx_queue > MAX_TX_QUEUE_PER_PORT)
-			n_tx_queue = MAX_TX_QUEUE_PER_PORT;
+		if (n_tx_queue > dev_txq_num)
+			n_tx_queue = dev_txq_num;
 		printf("Creating queues: nb_rxq=%d nb_txq=%u... ",
 			nb_rx_queue, (unsigned)n_tx_queue );
 		ret = rte_eth_dev_configure(portid, nb_rx_queue,
@@ -1553,6 +1673,9 @@ main(int argc, char **argv)
 			if (rte_lcore_is_enabled(lcore_id) == 0)
 				continue;
 
+			if (queueid >= dev_txq_num)
+				continue;
+
 			if (numa_on)
 				socketid = \
 				(uint8_t)rte_lcore_to_socket_id(lcore_id);
@@ -1587,8 +1710,9 @@ main(int argc, char **argv)
 		/* init power management library */
 		ret = rte_power_init(lcore_id);
 		if (ret)
-			rte_exit(EXIT_FAILURE, "Power management library "
-				"initialization failed on core%u\n", lcore_id);
+			rte_log(RTE_LOG_ERR, RTE_LOGTYPE_POWER,
+				"Power management library initialization "
+				"failed on core%u", lcore_id);
 
 		/* init timer structures for each enabled lcore */
 		rte_timer_init(&power_timers[lcore_id]);
@@ -1636,7 +1760,6 @@ main(int argc, char **argv)
 		if (ret < 0)
 			rte_exit(EXIT_FAILURE, "rte_eth_dev_start: err=%d, "
 						"port=%d\n", ret, portid);
-
 		/*
 		 * If enabled, put device in promiscuous mode.
 		 * This allows IO forwarding mode to forward packets
@@ -1645,6 +1768,8 @@ main(int argc, char **argv)
 		 */
 		if (promiscuous_on)
 			rte_eth_promiscuous_enable(portid);
+		/* initialize spinlock for each port */
+		rte_spinlock_init(&(locks[portid]));
 	}
 
 	check_all_ports_link_status((uint8_t)nb_ports, enabled_port_mask);
-- 
1.8.1.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH] doc: announce ABI change for rte_eth_fdir_filter
@ 2015-07-20  7:03 13% Jingjing Wu
  2015-07-28  8:22  4% ` Lu, Wenzhuo
  2015-07-30  3:38  4% ` Liang, Cunming
  0 siblings, 2 replies; 200+ results
From: Jingjing Wu @ 2015-07-20  7:03 UTC (permalink / raw)
  To: dev

To fix the FVL flow director issue for SCTP flows, rte_eth_fdir_filter
needs to be changed to support an SCTP flow key extension. This announces
the ABI deprecation.

Signed-off-by: jingjing.wu <jingjing.wu@intel.com>
---
 doc/guides/rel_notes/deprecation.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 5330d3b..63e19c7 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -35,3 +35,7 @@ Deprecation Notices
 * The following fields have been deprecated in rte_eth_stats:
   imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
+
+* Significant ABI change is planned for struct rte_eth_fdir_filter to extend
+  the SCTP flow's key input from release 2.1. The change may be enabled in
+  the upcoming release 2.1 with CONFIG_RTE_NEXT_ABI.
-- 
2.4.0

^ permalink raw reply	[relevance 13%]

* Re: [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes
  2015-07-19 10:52  4% [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
  2015-07-19 10:52 36% ` [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation Thomas Monjalon
  2015-07-19 21:32  0% ` [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
@ 2015-07-20 10:45  0% ` Neil Horman
  2 siblings, 0 replies; 200+ results
From: Neil Horman @ 2015-07-20 10:45 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Sun, Jul 19, 2015 at 12:52:13PM +0200, Thomas Monjalon wrote:
> The main change of these patches is to improve naming consistency
> across ethdev and EAL.
> It should be applied shortly to be part of rc1. If some comments arise,
> it can be fixed/improved in rc2.
> 
> Thomas Monjalon (4):
>   doc: rename ABI chapter to deprecation
>   pci: fix detach and uninit naming
>   ethdev: refactor port release
>   ethdev: fix doxygen internal comments
> 
>  MAINTAINERS                                       |  2 +-
>  doc/guides/rel_notes/{abi.rst => deprecation.rst} | 19 ++++++++-----------
>  doc/guides/rel_notes/index.rst                    |  2 +-
>  lib/librte_eal/bsdapp/eal/rte_eal_version.map     |  2 ++
>  lib/librte_eal/common/eal_common_pci.c            | 20 ++++++++++++--------
>  lib/librte_eal/common/include/rte_pci.h           |  6 ++++--
>  lib/librte_eal/linuxapp/eal/rte_eal_version.map   |  2 ++
>  lib/librte_ether/rte_ethdev.c                     | 11 +++++------
>  lib/librte_ether/rte_ethdev.h                     |  9 ++++-----
>  9 files changed, 39 insertions(+), 34 deletions(-)
>  rename doc/guides/rel_notes/{abi.rst => deprecation.rst} (51%)
> 
> -- 
> 2.4.2
> 
> 

Series
Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue information
  @ 2015-07-20 12:19  2% ` Konstantin Ananyev
  2015-07-22 16:50  0%   ` Zhang, Helin
  2015-07-22 18:28  2%   ` [dpdk-dev] [PATCHv4 " Konstantin Ananyev
  0 siblings, 2 replies; 200+ results
From: Konstantin Ananyev @ 2015-07-20 12:19 UTC (permalink / raw)
  To: dev

Add the ability for the upper layer to query RX/TX queue information.

Add new structures:
struct rte_eth_rxq_info
struct rte_eth_txq_info

new functions:
rte_eth_rx_queue_info_get
rte_eth_tx_queue_info_get

into the rte_ethdev API.

Left extra free space in the queue info structures,
so extra fields could be added later without ABI breakage.
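
A minimal usage sketch for the new API (illustration only, not part of the patch;
it assumes a configured and started port whose PMD implements the new rxq_info_get op):

	#include <stdio.h>
	#include <rte_ethdev.h>

	static void
	show_rxq0(uint8_t port_id)
	{
		struct rte_eth_rxq_info qinfo;

		if (rte_eth_rx_queue_info_get(port_id, 0, &qinfo) != 0)
			return;  /* e.g. -ENOTSUP if the PMD lacks rxq_info_get */

		printf("port %u rxq 0: nb_desc=%u scattered_rx=%u mempool=%s\n",
		       port_id, qinfo.nb_desc, qinfo.scattered_rx, qinfo.mp->name);
	}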

v2 changes:
- Add formal check for the qinfo input parameter.
- As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'

v3 changes:
- Updated rte_ether_version.map 
- Merged with latest changes

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_ether/rte_ethdev.c          | 54 +++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h          | 87 +++++++++++++++++++++++++++++++---
 lib/librte_ether/rte_ether_version.map |  2 +
 3 files changed, 137 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 94104ce..a94c119 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3341,6 +3341,60 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 }
 
 int
+rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_rxq_info *qinfo)
+{
+	struct rte_eth_dev *dev;
+
+	if (qinfo == NULL)
+		return -EINVAL;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -EINVAL;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_rx_queues) {
+		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
+		return -EINVAL;
+	}
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rxq_info_get, -ENOTSUP);
+
+	memset(qinfo, 0, sizeof(*qinfo));
+	dev->dev_ops->rxq_info_get(dev, queue_id, qinfo);
+	return 0;
+}
+
+int
+rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_txq_info *qinfo)
+{
+	struct rte_eth_dev *dev;
+
+	if (qinfo == NULL)
+		return -EINVAL;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -EINVAL;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_tx_queues) {
+		PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
+		return -EINVAL;
+	}
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->txq_info_get, -ENOTSUP);
+
+	memset(qinfo, 0, sizeof(*qinfo));
+	dev->dev_ops->txq_info_get(dev, queue_id, qinfo);
+	return 0;
+}
+
+int
 rte_eth_dev_set_mc_addr_list(uint8_t port_id,
 			     struct ether_addr *mc_addr_set,
 			     uint32_t nb_mc_addr)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index c901a2c..0c6705e 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -960,6 +960,30 @@ struct rte_eth_xstats {
 	uint64_t value;
 };
 
+/**
+ * Ethernet device RX queue information structure.
+ * Used to retrieve information about a configured queue.
+ */
+struct rte_eth_rxq_info {
+	struct rte_mempool *mp;     /**< mempool used by that queue. */
+	struct rte_eth_rxconf conf; /**< queue config parameters. */
+	uint8_t scattered_rx;       /**< scattered packets RX supported. */
+	uint16_t nb_desc;           /**< configured number of RXDs. */
+	uint16_t max_desc;          /**< max allowed number of RXDs. */
+	uint16_t min_desc;          /**< min allowed number of RXDs. */
+} __rte_cache_aligned;
+
+/**
+ * Ethernet device TX queue information structure.
+ * Used to retrieve information about configured queue.
+ */
+struct rte_eth_txq_info {
+	struct rte_eth_txconf conf; /**< queue config parameters. */
+	uint16_t nb_desc;           /**< configured number of TXDs. */
+	uint16_t max_desc;          /**< max allowed number of TXDs. */
+	uint16_t min_desc;          /**< min allowed number of TXDs. */
+} __rte_cache_aligned;
+
 struct rte_eth_dev;
 
 struct rte_eth_dev_callback;
@@ -1063,6 +1087,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
 typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
 /**< @internal Check DD bit of specific RX descriptor */
 
+typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
+	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
+
+typedef void (*eth_txq_info_get_t)(struct rte_eth_dev *dev,
+	uint16_t tx_queue_id, struct rte_eth_txq_info *qinfo);
+
 typedef int (*mtu_set_t)(struct rte_eth_dev *dev, uint16_t mtu);
 /**< @internal Set MTU. */
 
@@ -1451,9 +1481,13 @@ struct eth_dev_ops {
 	rss_hash_update_t rss_hash_update;
 	/** Get current RSS hash configuration. */
 	rss_hash_conf_get_t rss_hash_conf_get;
-	eth_filter_ctrl_t              filter_ctrl;          /**< common filter control*/
+	eth_filter_ctrl_t              filter_ctrl;
+	/**< common filter control. */
 	eth_set_mc_addr_list_t set_mc_addr_list; /**< set list of mcast addrs */
-
+	eth_rxq_info_get_t rxq_info_get;
+	/**< retrieve RX queue information. */
+	eth_txq_info_get_t txq_info_get;
+	/**< retrieve TX queue information. */
 	/** Turn IEEE1588/802.1AS timestamping on. */
 	eth_timesync_enable_t timesync_enable;
 	/** Turn IEEE1588/802.1AS timestamping off. */
@@ -3721,6 +3755,46 @@ int rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 		struct rte_eth_rxtx_callback *user_cb);
 
 /**
+ * Retrieve information about given port's RX queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The RX queue on the Ethernet device for which information
+ *   will be retrieved.
+ * @param qinfo
+ *   A pointer to a structure of type *rte_eth_rxq_info* to be filled with
+ *   the information of the Ethernet device.
+ *
+ * @return
+ *   - 0: Success
+ *   - -ENOTSUP: routine is not supported by the device PMD.
+ *   - -EINVAL:  The port_id or the queue_id is out of range.
+ */
+int rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_rxq_info *qinfo);
+
+/**
+ * Retrieve information about given port's TX queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The TX queue on the Ethernet device for which information
+ *   will be retrieved.
+ * @param qinfo
+ *   A pointer to a structure of type *rte_eth_txq_info* to be filled with
+ *   the information of the Ethernet device.
+ *
+ * @return
+ *   - 0: Success
+ *   - -ENOTSUP: routine is not supported by the device PMD.
+ *   - -EINVAL:  The port_id or the queue_id is out of range.
+ */
+int rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_txq_info *qinfo);
+
+/*
  * Retrieve number of available registers for access
  *
  * @param port_id
@@ -3793,10 +3867,6 @@ int rte_eth_dev_get_eeprom(uint8_t port_id, struct rte_dev_eeprom_info *info);
  */
 int rte_eth_dev_set_eeprom(uint8_t port_id, struct rte_dev_eeprom_info *info);
 
-#ifdef __cplusplus
-}
-#endif
-
 /**
  * Set the list of multicast addresses to filter on an Ethernet device.
  *
@@ -3882,4 +3952,9 @@ extern int rte_eth_timesync_read_rx_timestamp(uint8_t port_id,
  */
 extern int rte_eth_timesync_read_tx_timestamp(uint8_t port_id,
 					      struct timespec *timestamp);
+
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _RTE_ETHDEV_H_ */
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 23cfee9..8de0928 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -92,6 +92,7 @@ DPDK_2.0 {
 	rte_eth_rx_burst;
 	rte_eth_rx_descriptor_done;
 	rte_eth_rx_queue_count;
+	rte_eth_rx_queue_info_get;
 	rte_eth_rx_queue_setup;
 	rte_eth_set_queue_rate_limit;
 	rte_eth_set_vf_rate_limit;
@@ -99,6 +100,7 @@ DPDK_2.0 {
 	rte_eth_stats_get;
 	rte_eth_stats_reset;
 	rte_eth_tx_burst;
+	rte_eth_tx_queue_info_get;
 	rte_eth_tx_queue_setup;
 	rte_eth_xstats_get;
 	rte_eth_xstats_reset;
-- 
1.8.3.1

^ permalink raw reply	[relevance 2%]

* Re: [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation
  2015-07-19 10:52 36% ` [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation Thomas Monjalon
@ 2015-07-21 13:20  7%   ` Dumitrescu, Cristian
  2015-07-21 14:03  7%     ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Dumitrescu, Cristian @ 2015-07-21 13:20 UTC (permalink / raw)
  To: Thomas Monjalon, dev


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Sunday, July 19, 2015 11:52 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation
> 
> This chapter is for ABI and API. That's why a renaming is required.
> 
> Remove also the examples which are now in the referenced guidelines.
> 
> Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> ---
>  MAINTAINERS                                       |  2 +-
>  doc/guides/rel_notes/{abi.rst => deprecation.rst} | 16 +++++-----------
>  doc/guides/rel_notes/index.rst                    |  2 +-
>  3 files changed, 7 insertions(+), 13 deletions(-)
>  rename doc/guides/rel_notes/{abi.rst => deprecation.rst} (51%)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 2a32659..6531900 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -60,7 +60,7 @@ F: doc/guides/prog_guide/ext_app_lib_make_help.rst
>  ABI versioning
>  M: Neil Horman <nhorman@tuxdriver.com>
>  F: lib/librte_compat/
> -F: doc/guides/rel_notes/abi.rst
> +F: doc/guides/rel_notes/deprecation.rst
>  F: scripts/validate-abi.sh
> 
> 
> diff --git a/doc/guides/rel_notes/abi.rst
> b/doc/guides/rel_notes/deprecation.rst
> similarity index 51%
> rename from doc/guides/rel_notes/abi.rst
> rename to doc/guides/rel_notes/deprecation.rst
> index 7a08830..eef01f1 100644
> --- a/doc/guides/rel_notes/abi.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -1,17 +1,11 @@
> -ABI policy
> -==========
> +Deprecation
> +===========
> 
>  See the :doc:`guidelines document for details of the ABI policy
> </guidelines/versioning>`.
> -ABI deprecation notices are to be posted here.
> +API and ABI deprecation notices are to be posted here.
> 
> -
> -Examples of Deprecation Notices
> --------------------------------
> -
> -* The Macro #RTE_FOO is deprecated and will be removed with version 2.0,
> to be replaced with the inline function rte_bar()
> -* The function rte_mbuf_grok has been updated to include new parameter
> in version 2.0.  Backwards compatibility will be maintained for this function
> until the release of version 2.1
> -* The members struct foo have been reorganized in release 2.0.  Existing
> binary applications will have backwards compatibility in release 2.0, while
> newly built binaries will need to reference new structure variant struct foo2.
> Compatibility will be removed in release 2.2, and all applications will require
> updating and rebuilding to the new structure at that time, which will be
> renamed to the original struct foo.
> -* Significant ABI changes are planned for the librte_dostuff library.  The
> upcoming release 2.0 will not contain these changes, but release 2.1 will, and
> no backwards compatibility is planned due to the invasive nature of these
> changes.  Binaries using this library built prior to version 2.1 will require
> updating and recompilation.
> +Help to update from a previous release is provided in
> +:doc:`another section </rel_notes/updating_apps>`.
> 
> 
>  Deprecation Notices
> diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst
> index d790783..9d66cd8 100644
> --- a/doc/guides/rel_notes/index.rst
> +++ b/doc/guides/rel_notes/index.rst
> @@ -48,5 +48,5 @@ Contents
>      updating_apps
>      known_issues
>      resolved_issues
> -    abi
> +    deprecation
>      faq
> --
> 2.4.2

Hi Thomas,

There are some pending doc patches on ABI changes that have been sent and ack-ed prior to this change.

Due to this change, they cannot be applied cleanly anymore. Are you OK to integrate them with the small local change required from your side?

Thanks,
Cristian

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation
  2015-07-21 13:20  7%   ` Dumitrescu, Cristian
@ 2015-07-21 14:03  7%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-21 14:03 UTC (permalink / raw)
  To: Dumitrescu, Cristian; +Cc: dev

2015-07-21 13:20, Dumitrescu, Cristian:
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> > --- a/doc/guides/rel_notes/abi.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> 
> There are some pending doc patches on ABI changes that have been sent and ack-ed prior to this change.
> 
> Due to this change, they cannot be applied cleanly anymore.
> Are you OK to integrate them with the small local change required from your side?

Yes it is a basic merge issue that is easily managed locally.

Though, it would be nice to have at least 3 acks on these patches,
as specified in the ABI policy.

^ permalink raw reply	[relevance 7%]

* [dpdk-dev] [PATCH] hash: move field hash_func_init_val in rte_hash struct
@ 2015-07-21 14:10  3% Pablo de Lara
  2015-07-22  9:08  0% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Pablo de Lara @ 2015-07-21 14:10 UTC (permalink / raw)
  To: dev

In order to keep the ABI consistent with the old hash library,
the hash_func_init_val field has been moved, so it remains
at the same offset as previously, since hash_func and
hash_func_init_val are fields accessed by the public function
rte_hash_hash and must keep the same offset as in older versions.
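
The reason the offset matters: rte_hash_hash() reads these fields
straight from the structure, roughly along these lines (paraphrased
for illustration, not copied verbatim from the sources):

        static inline hash_sig_t
        rte_hash_hash(const struct rte_hash *h, const void *key)
        {
                /* binaries built against the old layout keep working only
                 * if hash_func, key_len and hash_func_init_val stay at
                 * their previous offsets */
                return h->hash_func(key, h->key_len, h->hash_func_init_val);
        }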

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 lib/librte_hash/rte_cuckoo_hash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index dec18ce..5cf4af6 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -105,8 +105,8 @@ struct rte_hash {
 	uint32_t num_buckets;           /**< Number of buckets in table. */
 	uint32_t key_len;               /**< Length of hash key. */
 	rte_hash_function hash_func;    /**< Function used to calculate hash. */
-	rte_hash_cmp_eq_t rte_hash_cmp_eq; /**< Function used to compare keys. */
 	uint32_t hash_func_init_val;    /**< Init value used by hash_func. */
+	rte_hash_cmp_eq_t rte_hash_cmp_eq; /**< Function used to compare keys. */
 	uint32_t bucket_bitmask;        /**< Bitmask for getting bucket index
 						from hash signature. */
 	uint32_t key_entry_size;         /**< Size of each key entry. */
-- 
2.4.2

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] hash: move field hash_func_init_val in rte_hash struct
  2015-07-21 14:10  3% [dpdk-dev] [PATCH] hash: move field hash_func_init_val in rte_hash struct Pablo de Lara
@ 2015-07-22  9:08  0% ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-22  9:08 UTC (permalink / raw)
  To: Pablo de Lara; +Cc: dev

> In order to keep the ABI consistent with the old hash library,
> hash_func_init_val field has been moved, so it remains
> at the same offset as previously, since hash_func and
> hash_func_init_val are fields accesed by the public function
> rte_hash_hash and must keep the same offset as older versions.
> 
> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>

Applied, thanks

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-20 12:19  2% ` [dpdk-dev] [PATCHv3 1/5] " Konstantin Ananyev
@ 2015-07-22 16:50  0%   ` Zhang, Helin
  2015-07-22 17:00  0%     ` Ananyev, Konstantin
  2015-07-22 18:28  2%   ` [dpdk-dev] [PATCHv4 " Konstantin Ananyev
  1 sibling, 1 reply; 200+ results
From: Zhang, Helin @ 2015-07-22 16:50 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Konstantin Ananyev
> Sent: Monday, July 20, 2015 5:19 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue
> information
> 
> Add the ability for the upper layer to query RX/TX queue information.
> 
> Add new structures:
> struct rte_eth_rxq_info
> struct rte_eth_txq_info
> 
> new functions:
> rte_eth_rx_queue_info_get
> rte_eth_tx_queue_info_get
> 
> into rte_etdev API.
> 
> Left extra free space in the queue info structures, so extra fields could be added
> later without ABI breakage.
> 
> v2 changes:
> - Add formal check for the qinfo input parameter.
> - As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'
> 
> v3 changes:
> - Updated rte_ether_version.map
> - Merged with latest changes
> 
> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> ---
>  lib/librte_ether/rte_ethdev.c          | 54 +++++++++++++++++++++
>  lib/librte_ether/rte_ethdev.h          | 87
> +++++++++++++++++++++++++++++++---
>  lib/librte_ether/rte_ether_version.map |  2 +
>  3 files changed, 137 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index
> 94104ce..a94c119 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -3341,6 +3341,60 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t
> queue_id,  }
> 
>  int
> +rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> +	struct rte_eth_rxq_info *qinfo)
> +{
> +	struct rte_eth_dev *dev;
> +
> +	if (qinfo == NULL)
> +		return -EINVAL;
> +
> +	if (!rte_eth_dev_is_valid_port(port_id)) {
> +		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> +		return -EINVAL;
> +	}
> +
> +	dev = &rte_eth_devices[port_id];
> +	if (queue_id >= dev->data->nb_rx_queues) {
> +		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
> +		return -EINVAL;
> +	}
> +
> +	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rxq_info_get, -ENOTSUP);
> +
> +	memset(qinfo, 0, sizeof(*qinfo));
> +	dev->dev_ops->rxq_info_get(dev, queue_id, qinfo);
> +	return 0;
> +}
> +
> +int
> +rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> +	struct rte_eth_txq_info *qinfo)
> +{
> +	struct rte_eth_dev *dev;
> +
> +	if (qinfo == NULL)
> +		return -EINVAL;
> +
> +	if (!rte_eth_dev_is_valid_port(port_id)) {
> +		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> +		return -EINVAL;
> +	}
> +
> +	dev = &rte_eth_devices[port_id];
> +	if (queue_id >= dev->data->nb_tx_queues) {
> +		PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
> +		return -EINVAL;
> +	}
> +
> +	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->txq_info_get, -ENOTSUP);
> +
> +	memset(qinfo, 0, sizeof(*qinfo));
> +	dev->dev_ops->txq_info_get(dev, queue_id, qinfo);
> +	return 0;
> +}
> +
> +int
>  rte_eth_dev_set_mc_addr_list(uint8_t port_id,
>  			     struct ether_addr *mc_addr_set,
>  			     uint32_t nb_mc_addr)
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index
> c901a2c..0c6705e 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -960,6 +960,30 @@ struct rte_eth_xstats {
>  	uint64_t value;
>  };
> 
> +/**
> + * Ethernet device RX queue information strcuture.
> + * Used to retieve information about configured queue.
> + */
> +struct rte_eth_rxq_info {
> +	struct rte_mempool *mp;     /**< mempool used by that queue. */
> +	struct rte_eth_rxconf conf; /**< queue config parameters. */
> +	uint8_t scattered_rx;       /**< scattered packets RX supported. */
> +	uint16_t nb_desc;           /**< configured number of RXDs. */
> +	uint16_t max_desc;          /**< max allowed number of RXDs. */
> +	uint16_t min_desc;          /**< min allowed number of RXDs. */
> +} __rte_cache_aligned;
> +
> +/**
> + * Ethernet device TX queue information strcuture.
> + * Used to retieve information about configured queue.
> + */
> +struct rte_eth_txq_info {
> +	struct rte_eth_txconf conf; /**< queue config parameters. */
> +	uint16_t nb_desc;           /**< configured number of TXDs. */
> +	uint16_t max_desc;          /**< max allowed number of TXDs. */
> +	uint16_t min_desc;          /**< min allowed number of TXDs. */
> +} __rte_cache_aligned;
> +
>  struct rte_eth_dev;
> 
>  struct rte_eth_dev_callback;
> @@ -1063,6 +1087,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct
> rte_eth_dev *dev,  typedef int (*eth_rx_descriptor_done_t)(void *rxq,
> uint16_t offset);  /**< @internal Check DD bit of specific RX descriptor */
> 
> +typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
> +	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
> +
> +typedef void (*eth_txq_info_get_t)(struct rte_eth_dev *dev,
> +	uint16_t tx_queue_id, struct rte_eth_txq_info *qinfo);
> +
>  typedef int (*mtu_set_t)(struct rte_eth_dev *dev, uint16_t mtu);  /**<
> @internal Set MTU. */
> 
> @@ -1451,9 +1481,13 @@ struct eth_dev_ops {
>  	rss_hash_update_t rss_hash_update;
>  	/** Get current RSS hash configuration. */
>  	rss_hash_conf_get_t rss_hash_conf_get;
> -	eth_filter_ctrl_t              filter_ctrl;          /**< common filter
> control*/
> +	eth_filter_ctrl_t              filter_ctrl;
> +	/**< common filter control. */
>  	eth_set_mc_addr_list_t set_mc_addr_list; /**< set list of mcast addrs */
> -
> +	eth_rxq_info_get_t rxq_info_get;
> +	/**< retrieve RX queue information. */
> +	eth_txq_info_get_t txq_info_get;
> +	/**< retrieve TX queue information. */
>  	/** Turn IEEE1588/802.1AS timestamping on. */
>  	eth_timesync_enable_t timesync_enable;
>  	/** Turn IEEE1588/802.1AS timestamping off. */ @@ -3721,6 +3755,46 @@
> int rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
>  		struct rte_eth_rxtx_callback *user_cb);
Is it targeting R2.1? If not, should the new ops be added at the end of this structure?

> 
>  /**
> + * Retrieve information about given port's RX queue.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param queue_id
> + *   The RX queue on the Ethernet device for which information
> + *   will be retrieved.
> + * @param qinfo
> + *   A pointer to a structure of type *rte_eth_rxq_info_info* to be filled with
> + *   the information of the Ethernet device.
> + *
> + * @return
> + *   - 0: Success
> + *   - -ENOTSUP: routine is not supported by the device PMD.
> + *   - -EINVAL:  The port_id or the queue_id is out of range.
> + */
> +int rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> +	struct rte_eth_rxq_info *qinfo);
> +
> +/**
> + * Retrieve information about given port's TX queue.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param queue_id
> + *   The TX queue on the Ethernet device for which information
> + *   will be retrieved.
> + * @param qinfo
> + *   A pointer to a structure of type *rte_eth_txq_info_info* to be filled with
> + *   the information of the Ethernet device.
> + *
> + * @return
> + *   - 0: Success
> + *   - -ENOTSUP: routine is not supported by the device PMD.
> + *   - -EINVAL:  The port_id or the queue_id is out of range.
> + */
> +int rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> +	struct rte_eth_txq_info *qinfo);
> +
> +/*
>   * Retrieve number of available registers for access
>   *
>   * @param port_id
> @@ -3793,10 +3867,6 @@ int rte_eth_dev_get_eeprom(uint8_t port_id, struct
> rte_dev_eeprom_info *info);
>   */
>  int rte_eth_dev_set_eeprom(uint8_t port_id, struct rte_dev_eeprom_info
> *info);
> 
> -#ifdef __cplusplus
> -}
> -#endif


> -
>  /**
>   * Set the list of multicast addresses to filter on an Ethernet device.
>   *
> @@ -3882,4 +3952,9 @@ extern int
> rte_eth_timesync_read_rx_timestamp(uint8_t port_id,
>   */
>  extern int rte_eth_timesync_read_tx_timestamp(uint8_t port_id,
>  					      struct timespec *timestamp);
> +
> +#ifdef __cplusplus
> +}
> +#endif
This is a fix for the issue introduced by the new ieee1588 code. It must be added in R2.1
whether or not these patches are applied.

> +
>  #endif /* _RTE_ETHDEV_H_ */
> diff --git a/lib/librte_ether/rte_ether_version.map
> b/lib/librte_ether/rte_ether_version.map
> index 23cfee9..8de0928 100644
> --- a/lib/librte_ether/rte_ether_version.map
> +++ b/lib/librte_ether/rte_ether_version.map
> @@ -92,6 +92,7 @@ DPDK_2.0 {
>  	rte_eth_rx_burst;
>  	rte_eth_rx_descriptor_done;
>  	rte_eth_rx_queue_count;
> +	rte_eth_rx_queue_info_get;
>  	rte_eth_rx_queue_setup;
>  	rte_eth_set_queue_rate_limit;
>  	rte_eth_set_vf_rate_limit;
> @@ -99,6 +100,7 @@ DPDK_2.0 {
>  	rte_eth_stats_get;
>  	rte_eth_stats_reset;
>  	rte_eth_tx_burst;
> +	rte_eth_tx_queue_info_get;
>  	rte_eth_tx_queue_setup;
>  	rte_eth_xstats_get;
>  	rte_eth_xstats_reset;
I am not quite sure about the version map. But I have a question: does it need a new {} block for DPDK_2.1?
And should your changes be in DPDK_2.1?

Regards,
Helin

> --
> 1.8.3.1

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-22 16:50  0%   ` Zhang, Helin
@ 2015-07-22 17:00  0%     ` Ananyev, Konstantin
  0 siblings, 0 replies; 200+ results
From: Ananyev, Konstantin @ 2015-07-22 17:00 UTC (permalink / raw)
  To: Zhang, Helin; +Cc: dev



> -----Original Message-----
> From: Zhang, Helin
> Sent: Wednesday, July 22, 2015 5:51 PM
> To: Ananyev, Konstantin
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue information
> 
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Konstantin Ananyev
> > Sent: Monday, July 20, 2015 5:19 AM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCHv3 1/5] ethdev: add new API to retrieve RX/TX queue
> > information
> >
> > Add the ability for the upper layer to query RX/TX queue information.
> >
> > Add new structures:
> > struct rte_eth_rxq_info
> > struct rte_eth_txq_info
> >
> > new functions:
> > rte_eth_rx_queue_info_get
> > rte_eth_tx_queue_info_get
> >
> > into rte_etdev API.
> >
> > Left extra free space in the queue info structures, so extra fields could be added
> > later without ABI breakage.
> >
> > v2 changes:
> > - Add formal check for the qinfo input parameter.
> > - As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'
> >
> > v3 changes:
> > - Updated rte_ether_version.map
> > - Merged with latest changes
> >
> > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > ---
> >  lib/librte_ether/rte_ethdev.c          | 54 +++++++++++++++++++++
> >  lib/librte_ether/rte_ethdev.h          | 87
> > +++++++++++++++++++++++++++++++---
> >  lib/librte_ether/rte_ether_version.map |  2 +
> >  3 files changed, 137 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index
> > 94104ce..a94c119 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -3341,6 +3341,60 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t
> > queue_id,  }
> >
> >  int
> > +rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> > +	struct rte_eth_rxq_info *qinfo)
> > +{
> > +	struct rte_eth_dev *dev;
> > +
> > +	if (qinfo == NULL)
> > +		return -EINVAL;
> > +
> > +	if (!rte_eth_dev_is_valid_port(port_id)) {
> > +		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +		return -EINVAL;
> > +	}
> > +
> > +	dev = &rte_eth_devices[port_id];
> > +	if (queue_id >= dev->data->nb_rx_queues) {
> > +		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
> > +		return -EINVAL;
> > +	}
> > +
> > +	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rxq_info_get, -ENOTSUP);
> > +
> > +	memset(qinfo, 0, sizeof(*qinfo));
> > +	dev->dev_ops->rxq_info_get(dev, queue_id, qinfo);
> > +	return 0;
> > +}
> > +
> > +int
> > +rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> > +	struct rte_eth_txq_info *qinfo)
> > +{
> > +	struct rte_eth_dev *dev;
> > +
> > +	if (qinfo == NULL)
> > +		return -EINVAL;
> > +
> > +	if (!rte_eth_dev_is_valid_port(port_id)) {
> > +		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +		return -EINVAL;
> > +	}
> > +
> > +	dev = &rte_eth_devices[port_id];
> > +	if (queue_id >= dev->data->nb_tx_queues) {
> > +		PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
> > +		return -EINVAL;
> > +	}
> > +
> > +	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->txq_info_get, -ENOTSUP);
> > +
> > +	memset(qinfo, 0, sizeof(*qinfo));
> > +	dev->dev_ops->txq_info_get(dev, queue_id, qinfo);
> > +	return 0;
> > +}
> > +
> > +int
> >  rte_eth_dev_set_mc_addr_list(uint8_t port_id,
> >  			     struct ether_addr *mc_addr_set,
> >  			     uint32_t nb_mc_addr)
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index
> > c901a2c..0c6705e 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -960,6 +960,30 @@ struct rte_eth_xstats {
> >  	uint64_t value;
> >  };
> >
> > +/**
> > + * Ethernet device RX queue information strcuture.
> > + * Used to retieve information about configured queue.
> > + */
> > +struct rte_eth_rxq_info {
> > +	struct rte_mempool *mp;     /**< mempool used by that queue. */
> > +	struct rte_eth_rxconf conf; /**< queue config parameters. */
> > +	uint8_t scattered_rx;       /**< scattered packets RX supported. */
> > +	uint16_t nb_desc;           /**< configured number of RXDs. */
> > +	uint16_t max_desc;          /**< max allowed number of RXDs. */
> > +	uint16_t min_desc;          /**< min allowed number of RXDs. */
> > +} __rte_cache_aligned;
> > +
> > +/**
> > + * Ethernet device TX queue information strcuture.
> > + * Used to retieve information about configured queue.
> > + */
> > +struct rte_eth_txq_info {
> > +	struct rte_eth_txconf conf; /**< queue config parameters. */
> > +	uint16_t nb_desc;           /**< configured number of TXDs. */
> > +	uint16_t max_desc;          /**< max allowed number of TXDs. */
> > +	uint16_t min_desc;          /**< min allowed number of TXDs. */
> > +} __rte_cache_aligned;
> > +
> >  struct rte_eth_dev;
> >
> >  struct rte_eth_dev_callback;
> > @@ -1063,6 +1087,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct
> > rte_eth_dev *dev,  typedef int (*eth_rx_descriptor_done_t)(void *rxq,
> > uint16_t offset);  /**< @internal Check DD bit of specific RX descriptor */
> >
> > +typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
> > +	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
> > +
> > +typedef void (*eth_txq_info_get_t)(struct rte_eth_dev *dev,
> > +	uint16_t tx_queue_id, struct rte_eth_txq_info *qinfo);
> > +
> >  typedef int (*mtu_set_t)(struct rte_eth_dev *dev, uint16_t mtu);  /**<
> > @internal Set MTU. */
> >
> > @@ -1451,9 +1481,13 @@ struct eth_dev_ops {
> >  	rss_hash_update_t rss_hash_update;
> >  	/** Get current RSS hash configuration. */
> >  	rss_hash_conf_get_t rss_hash_conf_get;
> > -	eth_filter_ctrl_t              filter_ctrl;          /**< common filter
> > control*/
> > +	eth_filter_ctrl_t              filter_ctrl;
> > +	/**< common filter control. */
> >  	eth_set_mc_addr_list_t set_mc_addr_list; /**< set list of mcast addrs */
> > -
> > +	eth_rxq_info_get_t rxq_info_get;
> > +	/**< retrieve RX queue information. */
> > +	eth_txq_info_get_t txq_info_get;
> > +	/**< retrieve TX queue information. */
> >  	/** Turn IEEE1588/802.1AS timestamping on. */
> >  	eth_timesync_enable_t timesync_enable;
> >  	/** Turn IEEE1588/802.1AS timestamping off. */ @@ -3721,6 +3755,46 @@
> > int rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
> >  		struct rte_eth_rxtx_callback *user_cb);
> Is it targeting R2.1? If no, the new ops should be added at the end of this structure?
> 
> >
> >  /**
> > + * Retrieve information about given port's RX queue.
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @param queue_id
> > + *   The RX queue on the Ethernet device for which information
> > + *   will be retrieved.
> > + * @param qinfo
> > + *   A pointer to a structure of type *rte_eth_rxq_info_info* to be filled with
> > + *   the information of the Ethernet device.
> > + *
> > + * @return
> > + *   - 0: Success
> > + *   - -ENOTSUP: routine is not supported by the device PMD.
> > + *   - -EINVAL:  The port_id or the queue_id is out of range.
> > + */
> > +int rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> > +	struct rte_eth_rxq_info *qinfo);
> > +
> > +/**
> > + * Retrieve information about given port's TX queue.
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @param queue_id
> > + *   The TX queue on the Ethernet device for which information
> > + *   will be retrieved.
> > + * @param qinfo
> > + *   A pointer to a structure of type *rte_eth_txq_info_info* to be filled with
> > + *   the information of the Ethernet device.
> > + *
> > + * @return
> > + *   - 0: Success
> > + *   - -ENOTSUP: routine is not supported by the device PMD.
> > + *   - -EINVAL:  The port_id or the queue_id is out of range.
> > + */
> > +int rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
> > +	struct rte_eth_txq_info *qinfo);
> > +
> > +/*
> >   * Retrieve number of available registers for access
> >   *
> >   * @param port_id
> > @@ -3793,10 +3867,6 @@ int rte_eth_dev_get_eeprom(uint8_t port_id, struct
> > rte_dev_eeprom_info *info);
> >   */
> >  int rte_eth_dev_set_eeprom(uint8_t port_id, struct rte_dev_eeprom_info
> > *info);
> >
> > -#ifdef __cplusplus
> > -}
> > -#endif
> 
> 
> > -
> >  /**
> >   * Set the list of multicast addresses to filter on an Ethernet device.
> >   *
> > @@ -3882,4 +3952,9 @@ extern int
> > rte_eth_timesync_read_rx_timestamp(uint8_t port_id,
> >   */
> >  extern int rte_eth_timesync_read_tx_timestamp(uint8_t port_id,
> >  					      struct timespec *timestamp);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> This is fix for the issue introduced by new ieee1588. It must be added in R2.1 no matter
> these patches can be applied or not.
> 
> > +
> >  #endif /* _RTE_ETHDEV_H_ */
> > diff --git a/lib/librte_ether/rte_ether_version.map
> > b/lib/librte_ether/rte_ether_version.map
> > index 23cfee9..8de0928 100644
> > --- a/lib/librte_ether/rte_ether_version.map
> > +++ b/lib/librte_ether/rte_ether_version.map
> > @@ -92,6 +92,7 @@ DPDK_2.0 {
> >  	rte_eth_rx_burst;
> >  	rte_eth_rx_descriptor_done;
> >  	rte_eth_rx_queue_count;
> > +	rte_eth_rx_queue_info_get;
> >  	rte_eth_rx_queue_setup;
> >  	rte_eth_set_queue_rate_limit;
> >  	rte_eth_set_vf_rate_limit;
> > @@ -99,6 +100,7 @@ DPDK_2.0 {
> >  	rte_eth_stats_get;
> >  	rte_eth_stats_reset;
> >  	rte_eth_tx_burst;
> > +	rte_eth_tx_queue_info_get;
> >  	rte_eth_tx_queue_setup;
> >  	rte_eth_xstats_get;
> >  	rte_eth_xstats_reset;
> I am not quite sure about the version map. But I have a question: does it need to create new {} for DPDK_2.1?
> And should your changes be in DPDK_2.1?

Ah yes, I think you're right - it should be in 2.1, not 2.0.
Will re-spin.
Konstantin

> 
> Regards,
> Helin
> 
> > --
> > 1.8.3.1

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-20 12:19  2% ` [dpdk-dev] [PATCHv3 1/5] " Konstantin Ananyev
  2015-07-22 16:50  0%   ` Zhang, Helin
@ 2015-07-22 18:28  2%   ` Konstantin Ananyev
  2015-07-22 19:48  0%     ` Stephen Hemminger
    1 sibling, 2 replies; 200+ results
From: Konstantin Ananyev @ 2015-07-22 18:28 UTC (permalink / raw)
  To: dev

Add the ability for the upper layer to query RX/TX queue information.

Add new structures:
struct rte_eth_rxq_info
struct rte_eth_txq_info

new functions:
rte_eth_rx_queue_info_get
rte_eth_tx_queue_info_get

into the rte_ethdev API.

Left extra free space in the queue info structures,
so extra fields could be added later without ABI breakage.

v2 changes:
- Add formal check for the qinfo input parameter.
- As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'

v3 changes:
- Updated rte_ether_version.map
- Merged with latest changes

v4 changes:
- rte_ether_version.map: move new functions into DPDK_2.1 sub-space.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_ether/rte_ethdev.c          | 54 +++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h          | 87 +++++++++++++++++++++++++++++++---
 lib/librte_ether/rte_ether_version.map |  2 +
 3 files changed, 137 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 94104ce..a94c119 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3341,6 +3341,60 @@ rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 }
 
 int
+rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_rxq_info *qinfo)
+{
+	struct rte_eth_dev *dev;
+
+	if (qinfo == NULL)
+		return -EINVAL;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -EINVAL;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_rx_queues) {
+		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
+		return -EINVAL;
+	}
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rxq_info_get, -ENOTSUP);
+
+	memset(qinfo, 0, sizeof(*qinfo));
+	dev->dev_ops->rxq_info_get(dev, queue_id, qinfo);
+	return 0;
+}
+
+int
+rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_txq_info *qinfo)
+{
+	struct rte_eth_dev *dev;
+
+	if (qinfo == NULL)
+		return -EINVAL;
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		return -EINVAL;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	if (queue_id >= dev->data->nb_tx_queues) {
+		PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
+		return -EINVAL;
+	}
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->txq_info_get, -ENOTSUP);
+
+	memset(qinfo, 0, sizeof(*qinfo));
+	dev->dev_ops->txq_info_get(dev, queue_id, qinfo);
+	return 0;
+}
+
+int
 rte_eth_dev_set_mc_addr_list(uint8_t port_id,
 			     struct ether_addr *mc_addr_set,
 			     uint32_t nb_mc_addr)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index c901a2c..0c6705e 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -960,6 +960,30 @@ struct rte_eth_xstats {
 	uint64_t value;
 };
 
+/**
+ * Ethernet device RX queue information structure.
+ * Used to retrieve information about configured queue.
+ */
+struct rte_eth_rxq_info {
+	struct rte_mempool *mp;     /**< mempool used by that queue. */
+	struct rte_eth_rxconf conf; /**< queue config parameters. */
+	uint8_t scattered_rx;       /**< scattered packets RX supported. */
+	uint16_t nb_desc;           /**< configured number of RXDs. */
+	uint16_t max_desc;          /**< max allowed number of RXDs. */
+	uint16_t min_desc;          /**< min allowed number of RXDs. */
+} __rte_cache_aligned;
+
+/**
+ * Ethernet device TX queue information structure.
+ * Used to retrieve information about configured queue.
+ */
+struct rte_eth_txq_info {
+	struct rte_eth_txconf conf; /**< queue config parameters. */
+	uint16_t nb_desc;           /**< configured number of TXDs. */
+	uint16_t max_desc;          /**< max allowed number of TXDs. */
+	uint16_t min_desc;          /**< min allowed number of TXDs. */
+} __rte_cache_aligned;
+
 struct rte_eth_dev;
 
 struct rte_eth_dev_callback;
@@ -1063,6 +1087,12 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
 typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
 /**< @internal Check DD bit of specific RX descriptor */
 
+typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
+	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
+
+typedef void (*eth_txq_info_get_t)(struct rte_eth_dev *dev,
+	uint16_t tx_queue_id, struct rte_eth_txq_info *qinfo);
+
 typedef int (*mtu_set_t)(struct rte_eth_dev *dev, uint16_t mtu);
 /**< @internal Set MTU. */
 
@@ -1451,9 +1481,13 @@ struct eth_dev_ops {
 	rss_hash_update_t rss_hash_update;
 	/** Get current RSS hash configuration. */
 	rss_hash_conf_get_t rss_hash_conf_get;
-	eth_filter_ctrl_t              filter_ctrl;          /**< common filter control*/
+	eth_filter_ctrl_t              filter_ctrl;
+	/**< common filter control. */
 	eth_set_mc_addr_list_t set_mc_addr_list; /**< set list of mcast addrs */
-
+	eth_rxq_info_get_t rxq_info_get;
+	/**< retrieve RX queue information. */
+	eth_txq_info_get_t txq_info_get;
+	/**< retrieve TX queue information. */
 	/** Turn IEEE1588/802.1AS timestamping on. */
 	eth_timesync_enable_t timesync_enable;
 	/** Turn IEEE1588/802.1AS timestamping off. */
@@ -3721,6 +3755,46 @@ int rte_eth_remove_tx_callback(uint8_t port_id, uint16_t queue_id,
 		struct rte_eth_rxtx_callback *user_cb);
 
 /**
+ * Retrieve information about given port's RX queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The RX queue on the Ethernet device for which information
+ *   will be retrieved.
+ * @param qinfo
+ *   A pointer to a structure of type *rte_eth_rxq_info* to be filled with
+ *   the information of the Ethernet device.
+ *
+ * @return
+ *   - 0: Success
+ *   - -ENOTSUP: routine is not supported by the device PMD.
+ *   - -EINVAL:  The port_id or the queue_id is out of range.
+ */
+int rte_eth_rx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_rxq_info *qinfo);
+
+/**
+ * Retrieve information about given port's TX queue.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The TX queue on the Ethernet device for which information
+ *   will be retrieved.
+ * @param qinfo
+ *   A pointer to a structure of type *rte_eth_txq_info* to be filled with
+ *   the information of the Ethernet device.
+ *
+ * @return
+ *   - 0: Success
+ *   - -ENOTSUP: routine is not supported by the device PMD.
+ *   - -EINVAL:  The port_id or the queue_id is out of range.
+ */
+int rte_eth_tx_queue_info_get(uint8_t port_id, uint16_t queue_id,
+	struct rte_eth_txq_info *qinfo);
+
+/*
  * Retrieve number of available registers for access
  *
  * @param port_id
@@ -3793,10 +3867,6 @@ int rte_eth_dev_get_eeprom(uint8_t port_id, struct rte_dev_eeprom_info *info);
  */
 int rte_eth_dev_set_eeprom(uint8_t port_id, struct rte_dev_eeprom_info *info);
 
-#ifdef __cplusplus
-}
-#endif
-
 /**
  * Set the list of multicast addresses to filter on an Ethernet device.
  *
@@ -3882,4 +3952,9 @@ extern int rte_eth_timesync_read_rx_timestamp(uint8_t port_id,
  */
 extern int rte_eth_timesync_read_tx_timestamp(uint8_t port_id,
 					      struct timespec *timestamp);
+
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _RTE_ETHDEV_H_ */
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 23cfee9..2bae2cf 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -117,9 +117,11 @@ DPDK_2.1 {
 	rte_eth_dev_is_valid_port;
 	rte_eth_dev_set_eeprom;
 	rte_eth_dev_set_mc_addr_list;
+	rte_eth_rx_queue_info_get;
 	rte_eth_timesync_disable;
 	rte_eth_timesync_enable;
 	rte_eth_timesync_read_rx_timestamp;
 	rte_eth_timesync_read_tx_timestamp;
+	rte_eth_tx_queue_info_get;
 
 } DPDK_2.0;
-- 
1.8.3.1

^ permalink raw reply	[relevance 2%]

* Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-22 18:28  2%   ` [dpdk-dev] [PATCHv4 " Konstantin Ananyev
@ 2015-07-22 19:48  0%     ` Stephen Hemminger
  2015-07-23 10:52  0%       ` Ananyev, Konstantin
    1 sibling, 1 reply; 200+ results
From: Stephen Hemminger @ 2015-07-22 19:48 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: dev

On Wed, 22 Jul 2015 19:28:51 +0100
Konstantin Ananyev <konstantin.ananyev@intel.com> wrote:

> Add the ability for the upper layer to query RX/TX queue information.
> 
> Add new structures:
> struct rte_eth_rxq_info
> struct rte_eth_txq_info
> 
> new functions:
> rte_eth_rx_queue_info_get
> rte_eth_tx_queue_info_get
> 
> into rte_etdev API.
> 
> Left extra free space in the queue info structures,
> so extra fields could be added later without ABI breakage.
> 
> v2 changes:
> - Add formal check for the qinfo input parameter.
> - As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'
> 
> v3 changes:
> - Updated rte_ether_version.map
> - Merged with latest changes
> 
> v4 changes:
> - rte_ether_version.map: move new functions into DPDK_2.1 sub-space.
> 
> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Since all this data should be in rxconf already, is it possible
to do a generic version of this and not have to change every driver?

You only handled the Intel hardware drivers. But there are also
all the virtual drivers, other vendors, etc.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched
  2015-07-16 21:21 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched Stephen Hemminger
  2015-07-16 21:25  4% ` Dumitrescu, Cristian
  2015-07-16 21:28  4% ` Neil Horman
@ 2015-07-23 10:18  4% ` Dumitrescu, Cristian
  2 siblings, 0 replies; 200+ results
From: Dumitrescu, Cristian @ 2015-07-23 10:18 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-22 19:48  0%     ` Stephen Hemminger
@ 2015-07-23 10:52  0%       ` Ananyev, Konstantin
  0 siblings, 0 replies; 200+ results
From: Ananyev, Konstantin @ 2015-07-23 10:52 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev



> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, July 22, 2015 8:48 PM
> To: Ananyev, Konstantin
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
> 
> On Wed, 22 Jul 2015 19:28:51 +0100
> Konstantin Ananyev <konstantin.ananyev@intel.com> wrote:
> 
> > Add the ability for the upper layer to query RX/TX queue information.
> >
> > Add new structures:
> > struct rte_eth_rxq_info
> > struct rte_eth_txq_info
> >
> > new functions:
> > rte_eth_rx_queue_info_get
> > rte_eth_tx_queue_info_get
> >
> > into rte_etdev API.
> >
> > Left extra free space in the queue info structures,
> > so extra fields could be added later without ABI breakage.
> >
> > v2 changes:
> > - Add formal check for the qinfo input parameter.
> > - As suggested rename 'rx_qinfo/tx_qinfo' to 'rxq_info/txq_info'
> >
> > v3 changes:
> > - Updated rte_ether_version.map
> > - Merged with latest changes
> >
> > v4 changes:
> > - rte_ether_version.map: move new functions into DPDK_2.1 sub-space.
> >
> > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> 
> Since all this data should be rxconf already, Is it possible
> to do a generic version of this and not have to change every driver.

I don't think it is possible to implement these two functions at the rte_ethdev level only.
At least not with the current ethdev/PMD implementation:
- Inside struct rte_eth_dev_info we only have 'struct rte_eth_rxconf default_rxconf;'.
  We don't have an rxconf here for each configured rx queue.
  That information is maintained by the PMD, and inside the PMD different devices have
  different formats for their queue structures.
- rte_eth_rxq_info contains not only rxconf but also some extra information: the mempool
  in use by that queue, and the min/max possible number of descriptors.
  Also, my intention was that in future the structure would be extended to provide some
  runtime info about the queue (number of free/used descriptors from the SW point of view, etc).
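
As a rough illustration of why this has to live in each PMD, a driver-side
callback just copies from its own private queue structure (the 'mydrv_*'
names below are made up, not taken from any existing driver):

        static void
        mydrv_rxq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
                struct rte_eth_rxq_info *qinfo)
        {
                /* private queue type; its layout differs per device family */
                struct mydrv_rx_queue *rxq = dev->data->rx_queues[queue_id];

                qinfo->mp = rxq->mb_pool;
                qinfo->nb_desc = rxq->nb_rx_desc;
                qinfo->scattered_rx = dev->data->scattered_rx;
                qinfo->conf.rx_free_thresh = rxq->rx_free_thresh;
        }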

Konstantin

> 
> You only handled the Intel hardware drivers. But there also
> all the virtual drivers, other vendors etc.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v2] announce ABI change for librte_table
@ 2015-07-23 10:59  4% Cristian Dumitrescu
  2015-07-23 11:05  4% ` Singh, Jasvinder
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Cristian Dumitrescu @ 2015-07-23 10:59 UTC (permalink / raw)
  To: dev

v2 changes:
-changed item on LPM table to add LPM IPv6
-removed item for ACL table and replaced with item on table ops 
-added item for hash tables

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 doc/guides/rel_notes/deprecation.rst |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 5330d3b..677f111 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -35,3 +35,13 @@ Deprecation Notices
 * The following fields have been deprecated in rte_eth_stats:
   imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
+
+* librte_table (rte_table_lpm.h, rte_table_lpm_ipv6.h): A new parameter to hold
+  the table name will be added to the LPM table parameter structure.
+
+* librte_table (rte_table.h): New functions for table entry bulk add/delete will
+  be added to the table operations structure.
+
+* librte_table (rte_table_hash.h): Key mask parameter will be added to the hash
+  table parameter structure for 8-byte key and 16-byte key extendible bucket and
+  LRU tables.
-- 
1.7.4.1

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
  2015-07-23 10:59  4% [dpdk-dev] [PATCH v2] announce ABI change for librte_table Cristian Dumitrescu
  2015-07-23 11:05  4% ` Singh, Jasvinder
@ 2015-07-23 11:07  4% ` Mrzyglod, DanielX T
  2015-07-23 11:34  4% ` Gajdzica, MaciejX T
  2 siblings, 0 replies; 200+ results
From: Mrzyglod, DanielX T @ 2015-07-23 11:07 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 23, 2015 1:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
> 
> v2 changes:
> -changed item on LPM table to add LPM IPv6
> -removed item for ACL table and replaced with item on table ops
> -added item for hash tables
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---
>  doc/guides/rel_notes/deprecation.rst |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index 5330d3b..677f111 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -35,3 +35,13 @@ Deprecation Notices
>  * The following fields have been deprecated in rte_eth_stats:
>    imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
>    tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
> +
> +* librte_table (rte_table_lpm.h, rte_table_lpm_ipv6.h): A new parameter to
> hold
> +  the table name will be added to the LPM table parameter structure.
> +
> +* librte_table (rte_table.h): New functions for table entry bulk add/delete will
> +  be added to the table operations structure.
> +
> +* librte_table (rte_table_hash.h): Key mask parameter will be added to the hash
> +  table parameter structure for 8-byte key and 16-byte key extendible bucket
> and
> +  LRU tables.
> --
> 1.7.4.1

Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
  2015-07-23 10:59  4% [dpdk-dev] [PATCH v2] announce ABI change for librte_table Cristian Dumitrescu
@ 2015-07-23 11:05  4% ` Singh, Jasvinder
  2015-07-23 11:07  4% ` Mrzyglod, DanielX T
  2015-07-23 11:34  4% ` Gajdzica, MaciejX T
  2 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2015-07-23 11:05 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 23, 2015 12:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
> 
> v2 changes:
> -changed item on LPM table to add LPM IPv6 -removed item for ACL table
> and replaced with item on table ops -added item for hash tables
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
  2015-07-23 10:59  4% [dpdk-dev] [PATCH v2] announce ABI change for librte_table Cristian Dumitrescu
  2015-07-23 11:05  4% ` Singh, Jasvinder
  2015-07-23 11:07  4% ` Mrzyglod, DanielX T
@ 2015-07-23 11:34  4% ` Gajdzica, MaciejX T
  2 siblings, 0 replies; 200+ results
From: Gajdzica, MaciejX T @ 2015-07-23 11:34 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Thursday, July 23, 2015 1:00 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2] announce ABI change for librte_table
> 
> v2 changes:
> -changed item on LPM table to add LPM IPv6 -removed item for ACL table and
> replaced with item on table ops -added item for hash tables
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---

Acked-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD
  2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
                         ` (9 preceding siblings ...)
  2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
@ 2015-07-23 14:18  3%       ` Liang, Cunming
  2015-07-27 21:34  0%         ` Thomas Monjalon
  10 siblings, 1 reply; 200+ results
From: Liang, Cunming @ 2015-07-23 14:18 UTC (permalink / raw)
  To: dev, thomas.monjalon; +Cc: shemming

Hi Thomas and all,

This patch set was postponed from v2.0 and has been widely reviewed during this release cycle.
The packet I/O interrupt framework is the prerequisite for all PMDs to support packet I/O interrupts.
There is no significant change across the last three versions (v13~v15); the changes are about ABI and version map fixes.
It missed rc1 at the last minute, so I'm now asking for an exception to let it go into v2.1 rc2 if nobody objects.
Again, thanks for all the comments from Stephen, David, Neil and Thomas.

Thanks,
Steve

> -----Original Message-----
> From: Liang, Cunming
> Sent: Monday, July 20, 2015 11:02 AM
> To: dev@dpdk.org; thomas.monjalon@6wind.com
> Cc: shemming@brocade.com; david.marchand@6wind.com; Zhou, Danny; Liu,
> Yong; nhorman@tuxdriver.com; Liang, Cunming
> Subject: [PATCH v15 00/13] Interrupt mode PMD
> 
> v15 changes
>  - remove unnecessary RTE_NEXT_ABI comment
>  - remove ifdef RTE_NEXT_ABI from header file
> 
> v14 changes
>  - per-patch basis ABI compatibility rework
>  - remove unnecessary 'local: *' from version map
>  - minor comments rework
> 
> v13 changes
>  - version map cleanup for v2.1
>  - replace RTE_EAL_RX_INTR by RTE_NEXT_ABI for ABI compatibility
> 
> Patch series v12
> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
> Acked-by: Danny Zhou <danny.zhou@intel.com>
> 
> v12 changes
>  - bsd cleanup for unused variable warning
>  - fix awkward line split in debug message
> 
> v11 changes
>  - typo cleanup and check kernel style
> 
> v10 changes
>  - code rework to return actual error code
>  - bug fix for lsc when using uio_pci_generic
> 
> v9 changes
>  - code rework to fix open comment
>  - bug fix for igb lsc when both lsc and rxq are enabled in vfio-msix
>  - new patch to turn off the feature by default so as to avoid v2.1 abi broken
> 
> v8 changes
>  - remove condition check for only vfio-msix
>  - add multiplex intr support when only one intr vector allowed
>  - lsc and rxq interrupt runtime enable decision
>  - add safe event delete while the event wakeup execution happens
> 
> v7 changes
>  - decouple epoll event and intr operation
>  - add condition check in the case intr vector is disabled
>  - renaming some APIs
> 
> v6 changes
>  - split rte_intr_wait_rx_pkt into two APIs 'wait' and 'set'.
>  - rewrite rte_intr_rx_wait/rte_intr_rx_set.
>  - using vector number instead of queue_id as interrupt API params.
>  - patch reorder and split.
> 
> v5 changes
>  - Rebase the patchset onto the HEAD
>  - Isolate ethdev from EAL for new-added wait-for-rx interrupt function
>  - Export wait-for-rx interrupt function for shared libraries
>  - Split-off a new patch file for changed struct rte_intr_handle that
>    other patches depend on, to avoid breaking git bisect
>  - Change sample applicaiton to accomodate EAL function spec change
>    accordingly
> 
> v4 changes
>  - Export interrupt enable/disable functions for shared libraries
>  - Adjust position of new-added structure fields and functions to
>    avoid breaking ABI
> 
> v3 changes
>  - Add return value for interrupt enable/disable functions
>  - Move spinlok from PMD to L3fwd-power
>  - Remove unnecessary variables in e1000_mac_info
>  - Fix miscelleous review comments
> 
> v2 changes
>  - Fix compilation issue in Makefile for missed header file.
>  - Consolidate internal and community review comments of v1 patch set.
> 
> The patch series introduce low-latency one-shot rx interrupt into DPDK with
> polling and interrupt mode switch control example.
> 
> DPDK userspace interrupt notification and handling mechanism is based on UIO
> with below limitation:
> 1) It is designed to handle LSC interrupt only with inefficient suspended
>    pthread wakeup procedure (e.g. UIO wakes up LSC interrupt handling thread
>    which then wakes up DPDK polling thread). In this way, it introduces
>    non-deterministic wakeup latency for DPDK polling thread as well as packet
>    latency if it is used to handle Rx interrupt.
> 2) UIO only supports a single interrupt vector which has to been shared by
>    LSC interrupt and interrupts assigned to dedicated rx queues.
> 
> This patchset includes below features:
> 1) Enable one-shot rx queue interrupt in ixgbe PMD(PF & VF) and igb PMD(PF
> only)
> .
> 2) Build on top of the VFIO mechanism instead of UIO, so it could support
>    up to 64 interrupt vectors for rx queue interrupts.
> 3) Have 1 DPDK polling thread handle per Rx queue interrupt with a dedicated
>    VFIO eventfd, which eliminates non-deterministic pthread wakeup latency in
>    user space.
> 4) Demonstrate interrupts control APIs and userspace NAIP-like polling/interrupt
>    switch algorithms in L3fwd-power example.
> 
> Known limitations:
> 1) It does not work for UIO due to a single interrupt eventfd shared by LSC
>    and rx queue interrupt handlers causes a mess. [FIXED]
> 2) LSC interrupt is not supported by VF driver, so it is by default disabled
>    in L3fwd-power now. Feel free to turn in on if you want to support both LSC
>    and rx queue interrupts on a PF.
> 
> Cunming Liang (13):
>   eal/linux: add interrupt vectors support in intr_handle
>   eal/linux: add rte_epoll_wait/ctl support
>   eal/linux: add API to set rx interrupt event monitor
>   eal/linux: fix comments typo on vfio msi
>   eal/linux: map eventfd to VFIO MSI-X intr vector
>   eal/linux: standalone intr event fd create support
>   eal/linux: fix lsc read error in uio_pci_generic
>   eal/bsd: dummy for new intr definition
>   eal/bsd: fix inappropriate linuxapp referred in bsd
>   ethdev: add rx intr enable, disable and ctl functions
>   ixgbe: enable rx queue interrupts for both PF and VF
>   igb: enable rx queue interrupts for PF
>   l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode
>     switch
> 
>  drivers/net/e1000/igb_ethdev.c                     | 311 ++++++++++--
>  drivers/net/ixgbe/ixgbe_ethdev.c                   | 527
> ++++++++++++++++++++-
>  drivers/net/ixgbe/ixgbe_ethdev.h                   |   4 +
>  examples/l3fwd-power/main.c                        | 205 ++++++--
>  lib/librte_eal/bsdapp/eal/eal_interrupts.c         |  42 ++
>  .../bsdapp/eal/include/exec-env/rte_interrupts.h   |  74 ++-
>  lib/librte_eal/bsdapp/eal/rte_eal_version.map      |   5 +
>  lib/librte_eal/linuxapp/eal/eal_interrupts.c       | 414 ++++++++++++++--
>  .../linuxapp/eal/include/exec-env/rte_interrupts.h | 153 ++++++
>  lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   8 +
>  lib/librte_ether/rte_ethdev.c                      | 147 ++++++
>  lib/librte_ether/rte_ethdev.h                      | 104 ++++
>  lib/librte_ether/rte_ether_version.map             |   4 +
>  13 files changed, 1870 insertions(+), 128 deletions(-)
> 
> --
> 1.8.1.4

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2 0/2] Example: l2fwd-ethtool
  @ 2015-07-23 15:00  4% ` Liang-Min Larry Wang
  2015-07-23 15:00  4%   ` [dpdk-dev] [PATCH v2 1/2] Remove ABI requirement for external library builds Liang-Min Larry Wang
  0 siblings, 1 reply; 200+ results
From: Liang-Min Larry Wang @ 2015-07-23 15:00 UTC (permalink / raw)
  To: dev; +Cc: Liang-Min Larry Wang

This implementation is designed to provide an example illustrating how to create a
user-space ethtool library from existing ethdev APIs. In contrast to the kernel version
of the same API (such as the ops defined in KNI), the user-space APIs enable a fast path
(no kernel API calls) for device query and data return. This example implements 19 commonly
used Ethtool and Netdevice ops as described in examples/l2fwd-ethtool/lib/rte_ethtool.h,
and community support for unimplemented Ethtool and Netdevice ops is very welcome.
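
As a rough illustration of the fast path described above, the sketch below queries
link state in the style of ethtool's ETHTOOL_GLINK using only existing ethdev calls,
with no trip into the kernel. The function name and body are assumptions for
illustration and are not the code shipped in examples/l2fwd-ethtool.

#include <errno.h>
#include <string.h>
#include <rte_ethdev.h>

/* Assumed sketch: an ethtool-style link query done entirely in user space. */
static int
ethtool_style_get_link(uint8_t port_id)
{
	struct rte_eth_link link;

	if (!rte_eth_dev_is_valid_port(port_id))
		return -ENODEV;

	memset(&link, 0, sizeof(link));
	rte_eth_link_get_nowait(port_id, &link);  /* non-blocking, no kernel call */
	return link.link_status;                  /* 1 = link up, 0 = link down */
}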

v2 change:
- Separate changes on .mk files into a separate patch file
- Remove requirement of ABI version for external library build
- Fix example/l2fwd-ethtool shared object build

Andrew G. Harvey (1):
  Remove ABI requirement for external library builds.

Liang-Min Larry Wang (1):
  examples: new example: l2fwd-ethtool

 examples/Makefile                                |    1 +
 examples/l2fwd-ethtool/Makefile                  |   48 +
 examples/l2fwd-ethtool/l2fwd-app/Makefile        |   58 ++
 examples/l2fwd-ethtool/l2fwd-app/main.c          | 1025 ++++++++++++++++++++++
 examples/l2fwd-ethtool/l2fwd-app/netdev_api.h    |  770 ++++++++++++++++
 examples/l2fwd-ethtool/l2fwd-app/shared_fifo.h   |  159 ++++
 examples/l2fwd-ethtool/lib/Makefile              |   57 ++
 examples/l2fwd-ethtool/lib/rte_ethtool.c         |  336 +++++++
 examples/l2fwd-ethtool/lib/rte_ethtool.h         |  385 ++++++++
 examples/l2fwd-ethtool/nic-control/Makefile      |   55 ++
 examples/l2fwd-ethtool/nic-control/nic_control.c |  614 +++++++++++++
 mk/rte.extlib.mk                                 |    2 +
 mk/rte.lib.mk                                    |    6 +
 13 files changed, 3516 insertions(+)
 create mode 100644 examples/l2fwd-ethtool/Makefile
 create mode 100644 examples/l2fwd-ethtool/l2fwd-app/Makefile
 create mode 100644 examples/l2fwd-ethtool/l2fwd-app/main.c
 create mode 100644 examples/l2fwd-ethtool/l2fwd-app/netdev_api.h
 create mode 100644 examples/l2fwd-ethtool/l2fwd-app/shared_fifo.h
 create mode 100644 examples/l2fwd-ethtool/lib/Makefile
 create mode 100644 examples/l2fwd-ethtool/lib/rte_ethtool.c
 create mode 100644 examples/l2fwd-ethtool/lib/rte_ethtool.h
 create mode 100644 examples/l2fwd-ethtool/nic-control/Makefile
 create mode 100644 examples/l2fwd-ethtool/nic-control/nic_control.c

-- 
2.1.4

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 1/2] Remove ABI requirement for external library builds.
  2015-07-23 15:00  4% ` [dpdk-dev] [PATCH v2 0/2] Example: l2fwd-ethtool Liang-Min Larry Wang
@ 2015-07-23 15:00  4%   ` Liang-Min Larry Wang
  0 siblings, 0 replies; 200+ results
From: Liang-Min Larry Wang @ 2015-07-23 15:00 UTC (permalink / raw)
  To: dev

From: "Andrew G. Harvey" <agh@cisco.com>

Signed-off-by: Andrew G. Harvey <agh@cisco.com>
---
 mk/rte.extlib.mk | 2 ++
 mk/rte.lib.mk    | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/mk/rte.extlib.mk b/mk/rte.extlib.mk
index ba066bc..d2a9b6d 100644
--- a/mk/rte.extlib.mk
+++ b/mk/rte.extlib.mk
@@ -31,6 +31,8 @@
 
 MAKEFLAGS += --no-print-directory
 
+export EXTLIB_BUILD := 1
+
 # we must create the output dir first and recall the same Makefile
 # from this directory
 ifeq ($(NOT_FIRST_CALL),)
diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index 9ff5cce..63ca640 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -40,11 +40,13 @@ VPATH += $(SRCDIR)
 
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 LIB := $(patsubst %.a,%.so.$(LIBABIVER),$(LIB))
+ifndef EXTLIB_BUILD
 ifeq ($(CONFIG_RTE_NEXT_ABI),y)
 LIB := $(LIB).1
 endif
 CPU_LDFLAGS += --version-script=$(SRCDIR)/$(EXPORT_MAP)
 endif
+endif
 
 
 _BUILD = $(LIB)
@@ -173,12 +175,16 @@ $(RTE_OUTPUT)/lib/$(LIB): $(LIB)
 	@[ -d $(RTE_OUTPUT)/lib ] || mkdir -p $(RTE_OUTPUT)/lib
 	$(Q)cp -f $(LIB) $(RTE_OUTPUT)/lib
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+ifdef EXTLIB_BUILD
+	$(Q)ln -s -f $< $(basename $@)
+else
 ifeq ($(CONFIG_RTE_NEXT_ABI),y)
 	$(Q)ln -s -f $< $(basename $(basename $@))
 else
 	$(Q)ln -s -f $< $(basename $@)
 endif
 endif
+endif
 
 #
 # Clean all generated files
-- 
2.1.4

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
  @ 2015-07-24  9:15  3%       ` Ananyev, Konstantin
  2015-07-24  9:24  0%         ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2015-07-24  9:15 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Thursday, July 23, 2015 5:26 PM
> To: Ananyev, Konstantin
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
> 
> 2015-07-22 19:28, Konstantin Ananyev:
> > +	if (!rte_eth_dev_is_valid_port(port_id)) {
> > +		PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +		return -EINVAL;
> > +	}
> 
> Please use VALID_PORTID_OR_ERR_RET.
> 
> > + * Ethernet device RX queue information strcuture.
> 
> Typo here (and same for TX).
> 
> > +struct rte_eth_rxq_info {
> > +	struct rte_mempool *mp;     /**< mempool used by that queue. */
> > +	struct rte_eth_rxconf conf; /**< queue config parameters. */
> > +	uint8_t scattered_rx;       /**< scattered packets RX supported. */
> > +	uint16_t nb_desc;           /**< configured number of RXDs. */
> 
> Shouldn't we move nb_desc in rte_eth_rxconf?
> So rte_eth_rx_queue_setup() would have less parameters.

I thought about that too, but there seem to be more drawbacks than pluses with that idea:
1. Right now it is possible to call rte_eth_rx_queue_setup(..., rx_conf=NULL, ...);
in that case rte_eth_rx_queue_setup() will use the default rx_conf for that device.
If we move mempool into rxconf, it will break that ability.
2. A bit unclear what mempool should be returned as default_rxconf by rte_eth_dev_info_get().
Should it be just NULL?
3. ABI breakage, and we (and all customers) will need to modify each and every RX setup/configure code path.

For me it just seems like too much hassle, without any clear advantage.

> 
> > -#ifdef __cplusplus
> > -}
> > -#endif
> > -
> >  /**
> >   * Set the list of multicast addresses to filter on an Ethernet device.
> >   *
> > @@ -3882,4 +3952,9 @@ extern int rte_eth_timesync_read_rx_timestamp(uint8_t port_id,
> >   */
> >  extern int rte_eth_timesync_read_tx_timestamp(uint8_t port_id,
> >  					      struct timespec *timestamp);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> >  #endif /* _RTE_ETHDEV_H_ */
> 
> Please send this change in a separate patch alone.

Ok, will do.

Konstantin
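
For context, a sketch of how the API under review might be consumed once merged;
the structure and its field names follow the quoted patch, while the retrieval
function name, the error handling and the printed fields are assumptions based on
the patch subject.

#include <stdio.h>
#include <rte_ethdev.h>

static void
dump_rxq(uint8_t port, uint16_t queue)
{
	struct rte_eth_rxq_info qinfo;

	/* Expected to return a negative errno (e.g. -ENOTSUP) if the PMD lacks the callback. */
	if (rte_eth_rx_queue_info_get(port, queue, &qinfo) != 0)
		return;

	printf("port %u rxq %u: mempool=%s nb_desc=%u scattered_rx=%u\n",
	       port, queue, qinfo.mp != NULL ? qinfo.mp->name : "none",
	       qinfo.nb_desc, qinfo.scattered_rx);
}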

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-24  9:15  3%       ` Ananyev, Konstantin
@ 2015-07-24  9:24  0%         ` Thomas Monjalon
  2015-07-24 10:50  3%           ` Ananyev, Konstantin
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-24  9:24 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev

2015-07-24 09:15, Ananyev, Konstantin:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2015-07-22 19:28, Konstantin Ananyev:
> > > +struct rte_eth_rxq_info {
> > > +	struct rte_mempool *mp;     /**< mempool used by that queue. */
> > > +	struct rte_eth_rxconf conf; /**< queue config parameters. */
> > > +	uint8_t scattered_rx;       /**< scattered packets RX supported. */
> > > +	uint16_t nb_desc;           /**< configured number of RXDs. */
> > 
> > Shouldn't we move nb_desc in rte_eth_rxconf?
> > So rte_eth_rx_queue_setup() would have less parameters.
> 
> I thought about that too, but it seems more drawbacks then pluses with that idea:
> 1. Right now it is possible to call rte_eth_rx_queue_setup(..., rx_conf=NULL, ...);
> In that case rte_eth_rx_queue_setup()will use default for that device rx_conf.
> If we'll move mempool into rxconf, will break that ability.
> 2.  A bit unclear what mempool should be returned as default_rxconf by rte_eth_dev_info_get().
> Should it be just NULL.

I was only suggesting to move nb_desc, not mempool.
In case rx_conf==NULL, the nb_desc should be the max allowed by the driver.
By the way, we should allow nb_desc==0 as it is done in virtio.
Users shouldn't have to care about queue parameters (including nb_desc) for
a basic usage.

> 3. ABI breakage and we (and all customers) will need  to modify each and every RX setup/configure code.
> 
> For me it just seems like too much hassle, without any clear advanatage.

The advantage is to have a cleaner API. Seems important to me.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-24  9:24  0%         ` Thomas Monjalon
@ 2015-07-24 10:50  3%           ` Ananyev, Konstantin
  2015-07-24 12:40  3%             ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2015-07-24 10:50 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Friday, July 24, 2015 10:24 AM
> To: Ananyev, Konstantin
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
> 
> 2015-07-24 09:15, Ananyev, Konstantin:
> > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > 2015-07-22 19:28, Konstantin Ananyev:
> > > > +struct rte_eth_rxq_info {
> > > > +	struct rte_mempool *mp;     /**< mempool used by that queue. */
> > > > +	struct rte_eth_rxconf conf; /**< queue config parameters. */
> > > > +	uint8_t scattered_rx;       /**< scattered packets RX supported. */
> > > > +	uint16_t nb_desc;           /**< configured number of RXDs. */
> > >
> > > Shouldn't we move nb_desc in rte_eth_rxconf?
> > > So rte_eth_rx_queue_setup() would have less parameters.
> >
> > I thought about that too, but it seems more drawbacks then pluses with that idea:
> > 1. Right now it is possible to call rte_eth_rx_queue_setup(..., rx_conf=NULL, ...);
> > In that case rte_eth_rx_queue_setup()will use default for that device rx_conf.
> > If we'll move mempool into rxconf, will break that ability.
> > 2.  A bit unclear what mempool should be returned as default_rxconf by rte_eth_dev_info_get().
> > Should it be just NULL.
> 
> I was only suggesting to move nb_desc, not mempool.

Ah, sorry, I didn't read it properly the first time.
Yes, I think it makes sense to move nb_desc into rxconf, though that means ABI breakage,
and that patch would definitely not make it into 2.1.

> In case rx_conf==NULL, the nb_desc should be the max allowed by the driver.

In my opinion it should be the PMD's preferred default value.
Plus, rte_eth_dev_info should contain min_rx_desc and max_rx_desc,
so the user can select nb_rx_desc from the allowed interval, if needed.

Konstantin

> By the way, we should allow nb_desc==0 as it is done in virtio.
> Users shouldn't have to care about queue parameters (including nb_desc) for
> a basic usage.
> 
> > 3. ABI breakage and we (and all customers) will need  to modify each and every RX setup/configure code.
> >
> > For me it just seems like too much hassle, without any clear advanatage.
> 
> The advantage is to have a cleaner API. Seems important to me.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCHv4 1/5] ethdev: add new API to retrieve RX/TX queue information
  2015-07-24 10:50  3%           ` Ananyev, Konstantin
@ 2015-07-24 12:40  3%             ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-24 12:40 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev

2015-07-24 10:50, Ananyev, Konstantin:
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > 2015-07-24 09:15, Ananyev, Konstantin:
> > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > 2015-07-22 19:28, Konstantin Ananyev:
> > > > > +struct rte_eth_rxq_info {
> > > > > +	struct rte_mempool *mp;     /**< mempool used by that queue. */
> > > > > +	struct rte_eth_rxconf conf; /**< queue config parameters. */
> > > > > +	uint8_t scattered_rx;       /**< scattered packets RX supported. */
> > > > > +	uint16_t nb_desc;           /**< configured number of RXDs. */
> > > >
> > > > Shouldn't we move nb_desc in rte_eth_rxconf?
> > > > So rte_eth_rx_queue_setup() would have less parameters.
> > >
> > > I thought about that too, but it seems more drawbacks then pluses with that idea:
> > > 1. Right now it is possible to call rte_eth_rx_queue_setup(..., rx_conf=NULL, ...);
> > > In that case rte_eth_rx_queue_setup()will use default for that device rx_conf.
> > > If we'll move mempool into rxconf, will break that ability.
> > > 2.  A bit unclear what mempool should be returned as default_rxconf by rte_eth_dev_info_get().
> > > Should it be just NULL.
> > 
> > I was only suggesting to move nb_desc, not mempool.
> 
> Ah, sorry didn't read it properly first time.
> Yes, I think it makes sense to move nb_desc into rxconf, though that means ABI breakage,
> and that patch would definitely not make into 2.1.

You can avoid ABI breakage by using the compat macros and/or NEXT_ABI.
But it shouldn't go into 2.1 as the API shouldn't be changed after RC1.
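
To make the trade-off concrete, a sketch of today's setup call that the suggestion
would reshape: nb_rx_desc is a separate argument and a NULL rx_conf asks the PMD for
its default_rxconf. Folding nb_desc into rte_eth_rxconf, plus a min/max descriptor
range in rte_eth_dev_info as discussed earlier in the thread, would remove the
hard-coded count; the value 128 below is only a placeholder for the sketch.

#include <rte_ethdev.h>

static int
setup_rx(uint8_t port, uint16_t queue, struct rte_mempool *mp)
{
	/* Current API: the descriptor count is its own parameter ... */
	uint16_t nb_desc = 128;

	/* ... and passing a NULL rte_eth_rxconf selects the PMD defaults. */
	return rte_eth_rx_queue_setup(port, queue, nb_desc,
				      rte_eth_dev_socket_id(port),
				      NULL, mp);
}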

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH 4/4] ethdev: check support for rx_queue_count and descriptor_done fns
  @ 2015-07-26 20:44  0%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-26 20:44 UTC (permalink / raw)
  To: nhorman, Bruce Richardson; +Cc: dev

Neil, Bruce,
Can we move forward?

2015-07-06 17:11, Thomas Monjalon:
> Neil, your ABI expertise is required for this patch.
> 
> 2015-06-15 11:14, Bruce Richardson:
> > On Fri, Jun 12, 2015 at 01:32:56PM -0400, Roger B. Melton wrote:
> > > Hi Bruce,  Comment in-line.  Regards, Roger
> > > 
> > > On 6/12/15 7:28 AM, Bruce Richardson wrote:
> > > >The functions rte_eth_rx_queue_count and rte_eth_descriptor_done are
> > > >supported by very few PMDs. Therefore, it is best to check for support
> > > >for the functions in the ethdev library, so as to avoid crashes
> > > >at run-time if the application goes to use those APIs. The performance
> > > >impact of this change should be very small as this is a predictable
> > > >branch in the function.
> > > >
> > > >Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > >---
> > > >  lib/librte_ether/rte_ethdev.h | 8 ++++++--
> > > >  1 file changed, 6 insertions(+), 2 deletions(-)
> > > >
> > > >diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > >index 827ca3e..9ad1b6a 100644
> > > >--- a/lib/librte_ether/rte_ethdev.h
> > > >+++ b/lib/librte_ether/rte_ethdev.h
> > > >@@ -2496,6 +2496,8 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
> > > >   *  The queue id on the specific port.
> > > >   * @return
> > > >   *  The number of used descriptors in the specific queue.
> > > >+ *  NOTE: if function is not supported by device this call
> > > >+ *        returns (uint32_t)-ENOTSUP
> > > >   */
> > > >  static inline uint32_t
> > > 
> > > Why not change the return type to int32_t?
> > > In this way, the caller isn't required to make the assumption that a large
> > > queue count indicates an error.  < 0 means error, otherwise it's a valid
> > > queue count.
> > > 
> > > This approach would be consistent with other APIs.
> > > 
> > 
> > Yes, good point, I should see about that. One thing I'm unsure of, though, is
> > does this count as ABI breakage? I don't see how it should break any older
> > apps, since the return type is the same size, but I'm not sure as we are 
> > changing the return type of the function.
> > 
> > Neil, can you perhaps comment here? Is changing uint32_t to int32_t ok, from
> > an ABI point of view?
> > 
> > Regards,
> > /Bruce
> 
> 
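
A caller-side sketch of the check proposed above; the (uint32_t)-ENOTSUP encoding is
taken from the patch note, and whether the return type becomes int32_t instead is
exactly what the thread leaves open, so treat this as an assumption rather than
settled API.

#include <errno.h>
#include <stdio.h>
#include <rte_ethdev.h>

static void
report_rx_backlog(uint8_t port, uint16_t queue)
{
	uint32_t used = rte_eth_rx_queue_count(port, queue);

	/* With the proposed guard, a PMD without the callback reports
	 * (uint32_t)-ENOTSUP instead of crashing on a NULL dev_op. */
	if (used == (uint32_t)-ENOTSUP) {
		printf("rx_queue_count not supported on port %u\n", port);
		return;
	}
	printf("port %u rxq %u: %u descriptors in use\n", port, queue, used);
}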

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD
  2015-07-23 14:18  3%       ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Liang, Cunming
@ 2015-07-27 21:34  0%         ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-27 21:34 UTC (permalink / raw)
  To: Liang, Cunming; +Cc: shemming, dev

2015-07-23 14:18, Liang, Cunming:
> Hi Thomas and all,
> 
> This patch set was postponed from v2.0 and has been widely reviewed during this release cycle.
> The packet I/O interrupt framework is the prerequisite for all PMDs to support packet I/O interrupts.
> There has been no significant change over the last three versions (v13~v15); the changes are about ABI and version map fixing.
> It missed rc1 at the last minute; I'm now asking for an exception to make it go into v2.1 rc2 if nobody rejects it.
> Again, thanks for all the comments from Stephen, David, Neil and Thomas.

As nobody rejected it, it has been applied with minor fixes (see thread comments).
BSD support is missing in this version.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [dpdk-announce] release candidate 2.1.0-rc2
@ 2015-07-27 22:55  4% Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-27 22:55 UTC (permalink / raw)
  To: announce

A new DPDK release candidate is ready for testing:
	http://dpdk.org/browse/dpdk/tag/?id=v2.1.0-rc2

Besides some fixes, there are more features in drivers, a new bnx2x driver
and the introduction of the interrupt mode, which had been pending for a long time.
Having such features in an RC2 is unusual and must not happen again.
From now on, only important fixes and cleanups will be accepted.

Please test it and check the documentation.

Changelog (main enhancements since 2.1.0-rc1)
	- enhancements:
		* interrupt mode
		* new bnx2x driver
		* cxgbe for FreeBSD
		* e1000 82583V jumbo
		* fm10k Tx checksum offload
		* bonding hotplug
		* ring PMD hotplug
	- fixes for:
		* build
		* ABI
		* virtio Rx
		* ixgbe scattered Rx
		* ixgbe stats
		* ixgbe stop
		* fm10k fault interrupt

You are welcome to review or rework the pending fixes for RC3 inclusion:
	http://dpdk.org/dev/patchwork
Some help is also required to check the ABI stability and to validate
the next ABI breakings.

Thank you

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for rte_eth_fdir_filter
  2015-07-20  7:03 13% [dpdk-dev] [PATCH] doc: announce ABI change for rte_eth_fdir_filter Jingjing Wu
@ 2015-07-28  8:22  4% ` Lu, Wenzhuo
  2015-07-30  3:38  4% ` Liang, Cunming
  1 sibling, 0 replies; 200+ results
From: Lu, Wenzhuo @ 2015-07-28  8:22 UTC (permalink / raw)
  To: Wu, Jingjing, dev

Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jingjing Wu
> Sent: Monday, July 20, 2015 3:04 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for rte_eth_fdir_filter
> 
> To fix FVL's flow director issue for SCTP flows, rte_eth_fdir_filter needs to be
> changed to support the SCTP flow key extension. Here we announce the ABI
> deprecation.
> 
> Signed-off-by: jingjing.wu <jingjing.wu@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH 1/4] bnx2x: fix build as shared library
  @ 2015-07-28 15:47  3% ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-28 15:47 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

Build log:
	Must Specify a librte_pmd_bnx2x.so..1 ABI version

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
 drivers/net/bnx2x/Makefile | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/bnx2x/Makefile b/drivers/net/bnx2x/Makefile
index 0de5db9..87f31b6 100644
--- a/drivers/net/bnx2x/Makefile
+++ b/drivers/net/bnx2x/Makefile
@@ -9,6 +9,10 @@ CFLAGS += -O3 -g
 CFLAGS += $(WERROR_FLAGS)
 CFLAGS += -DZLIB_CONST
 
+EXPORT_MAP := rte_pmd_bnx2x_version.map
+
+LIBABIVER := 1
+
 #
 # all source are stored in SRCS-y
 #
-- 
2.4.2

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] pci: make rte_pci_probe void
  @ 2015-07-30  0:34  3%   ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2015-07-30  0:34 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Mon, 20 Apr 2015 15:15:36 +0200
Thomas Monjalon <thomas.monjalon@6wind.com> wrote:

> 2015-04-14 10:55, Stephen Hemminger:
> > Since rte_pci_probe always returns 0 or exits via rte_exit()
> > there is no point in having it return a value.
> > 
> > Just make it void
> > 
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
> Seems partially superseded by this patch:
> 	http://dpdk.org/dev/patchwork/patch/4347/
> 

The patch could be redone, but it would break ABI for no good
reason. Just drop it.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH 1/2] log: rte_openlog_stream should be void
  @ 2015-07-30  0:35  3%     ` Stephen Hemminger
  0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2015-07-30  0:35 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Tue, 19 May 2015 11:24:03 +0100
Bruce Richardson <bruce.richardson@intel.com> wrote:

> On Fri, Apr 17, 2015 at 08:35:33AM -0700, Stephen Hemminger wrote:
> > Function always returned 0 and no one was checking anyway.
> > 
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> 
> > ---
> >  lib/librte_eal/common/eal_common_log.c  | 3 +--
> >  lib/librte_eal/common/include/rte_log.h | 5 +----
> >  2 files changed, 2 insertions(+), 6 deletions(-)
> > 
> > diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
> > index ff44d23..3802f9c 100644
> > --- a/lib/librte_eal/common/eal_common_log.c
> > +++ b/lib/librte_eal/common/eal_common_log.c
> > @@ -158,14 +158,13 @@ rte_log_set_history(int enable)
> >  }
> >  
> >  /* Change the stream that will be used by logging system */
> > -int
> > +void
> >  rte_openlog_stream(FILE *f)
> >  {
> >  	if (f == NULL)
> >  		rte_logs.file = default_log_stream;
> >  	else
> >  		rte_logs.file = f;
> > -	return 0;
> >  }
> >  
> >  /* Set global log level */
> > diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> > index f83a0d9..888ee19 100644
> > --- a/lib/librte_eal/common/include/rte_log.h
> > +++ b/lib/librte_eal/common/include/rte_log.h
> > @@ -110,11 +110,8 @@ extern FILE *eal_default_log_stream;
> >   *
> >   * @param f
> >   *   Pointer to the stream.
> > - * @return
> > - *   - 0 on success.
> > - *   - Negative on error.
> >   */
> > -int rte_openlog_stream(FILE *f);
> > +void rte_openlog_stream(FILE *f);
> >  
> >  /**
> >   * Set the global log level.
> > -- 
> > 2.1.4
> > 

Yes it should be void, but technically this is an ABI change.
Please drop the patch.
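
For reference, typical usage of the function under discussion; callers generally
ignore the current int return value, which is the point of the patch. The snippet
assumes nothing beyond the declaration shown above.

#include <stdio.h>
#include <rte_log.h>

/* Redirect all DPDK log output to a file; the return value is discarded,
 * as it usually is in practice. */
static void
redirect_logs(const char *path)
{
	FILE *f = fopen(path, "w");

	if (f != NULL)
		rte_openlog_stream(f);
}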

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2] doc: announce abi change for interrupt mode
  @ 2015-07-30  1:57 14% ` Cunming Liang
  2015-07-30  5:04 13%   ` [dpdk-dev] [PATCH v3] " Cunming Liang
  0 siblings, 1 reply; 200+ results
From: Cunming Liang @ 2015-07-30  1:57 UTC (permalink / raw)
  To: dev

The patch announces the planned ABI changes for interrupt mode on v2.2.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v2 change:
   - rebase to recent master

 doc/guides/rel_notes/deprecation.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 5330d3b..645ce32 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -35,3 +35,11 @@ Deprecation Notices
 * The following fields have been deprecated in rte_eth_stats:
   imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
+
+* The ABI changes are planned for struct rte_intr_handle, struct rte_eth_conf
+  and struct eth_dev_ops in order to support the interrupt mode feature.
+  The upcoming release 2.1 will not contain these ABI changes by default.
+  This change will be in release 2.2. There is no backwards compatibility planned
+  due to the additional interrupt mode feature enabling.
+  Binaries built against this library prior to version 2.2 will require updating
+  and recompilation.
-- 
1.8.1.4

^ permalink raw reply	[relevance 14%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for rte_eth_fdir_filter
  2015-07-20  7:03 13% [dpdk-dev] [PATCH] doc: announce ABI change for rte_eth_fdir_filter Jingjing Wu
  2015-07-28  8:22  4% ` Lu, Wenzhuo
@ 2015-07-30  3:38  4% ` Liang, Cunming
  1 sibling, 0 replies; 200+ results
From: Liang, Cunming @ 2015-07-30  3:38 UTC (permalink / raw)
  To: Wu, Jingjing, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jingjing Wu
> Sent: Monday, July 20, 2015 3:04 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: announce ABI change for rte_eth_fdir_filter
> 
> To fix FVL's flow director issue for SCTP flows, rte_eth_fdir_filter
> needs to be changed to support the SCTP flow key extension. Here we announce
> the ABI deprecation.
> 
> Signed-off-by: jingjing.wu <jingjing.wu@intel.com>
> ---
>  doc/guides/rel_notes/deprecation.rst | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index 5330d3b..63e19c7 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -35,3 +35,7 @@ Deprecation Notices
>  * The following fields have been deprecated in rte_eth_stats:
>    imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
>    tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
> +
> +* Significant ABI change is planned for struct rte_eth_fdir_filter to extend
> +  the SCTP flow's key input from release 2.1. The change may be enabled in
> +  the upcoming release 2.1 with CONFIG_RTE_NEXT_ABI.
> --
> 2.4.0

Acked-by: Cunming Liang <cunming.liang@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v3] doc: announce abi change for interrupt mode
  2015-07-30  1:57 14% ` [dpdk-dev] [PATCH v2] doc: announce abi change " Cunming Liang
@ 2015-07-30  5:04 13%   ` Cunming Liang
  2015-07-30  5:14  4%     ` Liu, Yong
                       ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Cunming Liang @ 2015-07-30  5:04 UTC (permalink / raw)
  To: dev

The patch announces the planned ABI changes for interrupt mode.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
---
 v3 change:
   - reword for CONFIG_RTE_NEXT_ABI

 v2 change:
   - rebase to recent master
 
 doc/guides/rel_notes/deprecation.rst | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 5330d3b..d36d267 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -35,3 +35,8 @@ Deprecation Notices
 * The following fields have been deprecated in rte_eth_stats:
   imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
   tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
+
+* The ABI changes are planned for struct rte_intr_handle, struct rte_eth_conf
+  and struct eth_dev_ops to support interrupt mode feature from release 2.1.
+  Those changes may be enabled in the upcoming release 2.1
+  with CONFIG_RTE_NEXT_ABI.
-- 
1.8.1.4

^ permalink raw reply	[relevance 13%]

* Re: [dpdk-dev] [PATCH v3] doc: announce abi change for interrupt mode
  2015-07-30  5:04 13%   ` [dpdk-dev] [PATCH v3] " Cunming Liang
@ 2015-07-30  5:14  4%     ` Liu, Yong
  2015-07-30 14:25  7%       ` O'Driscoll, Tim
  2015-07-30  8:31  4%     ` He, Shaopeng
  2015-07-31  1:00  4%     ` Zhang, Helin
  2 siblings, 1 reply; 200+ results
From: Liu, Yong @ 2015-07-30  5:14 UTC (permalink / raw)
  To: Liang, Cunming, dev

Acked-by: Marvin Liu <yong.liu@intel.com>

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> Sent: Thursday, July 30, 2015 1:05 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3] doc: announce abi change for interrupt mode
> 
> The patch announces the planned ABI changes for interrupt mode.
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> ---
>  v3 change:
>    - reword for CONFIG_RTE_NEXT_ABI
> 
>  v2 change:
>    - rebase to recent master
> 
>  doc/guides/rel_notes/deprecation.rst | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index 5330d3b..d36d267 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -35,3 +35,8 @@ Deprecation Notices
>  * The following fields have been deprecated in rte_eth_stats:
>    imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
>    tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
> +
> +* The ABI changes are planned for struct rte_intr_handle, struct
> rte_eth_conf
> +  and struct eth_dev_ops to support interrupt mode feature from release
> 2.1.
> +  Those changes may be enabled in the upcoming release 2.1
> +  with CONFIG_RTE_NEXT_ABI.
> --
> 1.8.1.4

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v3] doc: announce abi change for interrupt mode
  2015-07-30  5:04 13%   ` [dpdk-dev] [PATCH v3] " Cunming Liang
  2015-07-30  5:14  4%     ` Liu, Yong
@ 2015-07-30  8:31  4%     ` He, Shaopeng
  2015-07-31  1:00  4%     ` Zhang, Helin
  2 siblings, 0 replies; 200+ results
From: He, Shaopeng @ 2015-07-30  8:31 UTC (permalink / raw)
  To: Liang, Cunming, dev

Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> Sent: Thursday, July 30, 2015 1:05 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3] doc: announce abi change for interrupt
> mode
> 
> The patch announces the planned ABI changes for interrupt mode.
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] abi change announce
@ 2015-07-30  9:25  7% Xie, Huawei
  2015-07-30 10:18  4% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Xie, Huawei @ 2015-07-30  9:25 UTC (permalink / raw)
  To: Thomas Monjalon, dev

Hi Thomas:
I am doing virtio/vhost performance optimization, so there will possibly be
some changes, for example to the virtio or vhost virtqueue data structures.
Do I need to announce the ABI change even if the change hasn't been
determined?

/huawei


^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  @ 2015-07-30  9:43  3%           ` Ananyev, Konstantin
  2015-07-30 11:22  0%             ` Olivier MATZ
  0 siblings, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2015-07-30  9:43 UTC (permalink / raw)
  To: Olivier MATZ, Zhang, Helin, Martin Weiser; +Cc: dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Thursday, July 30, 2015 10:10 AM
> To: Ananyev, Konstantin; Zhang, Helin; Martin Weiser
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
> 
> Hi Konstantin,
> 
> On 07/30/2015 11:00 AM, Ananyev, Konstantin wrote:
> > Hi Olivier,
> >
> >> -----Original Message-----
> >> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> >> Sent: Thursday, July 30, 2015 9:12 AM
> >> To: Zhang, Helin; Ananyev, Konstantin; Martin Weiser
> >> Cc: dev@dpdk.org
> >> Subject: Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
> >>
> >> Hi,
> >>
> >> On 07/29/2015 10:24 PM, Zhang, Helin wrote:
> >>> Hi Martin
> >>>
> >>> Thank you very much for the good catch!
> >>>
> >>> The similar situation in i40e, as explained by Konstantin.
> >>> As header split hasn't been supported by DPDK till now. It would be better to put the header address in RX descriptor to 0.
> >>> But in the future, during header split enabling. We may need to pay extra attention to that. As at least x710 datasheet said
> >> specifically as below.
> >>> "The header address should be set by the software to an even number (word aligned address)". We may need to find a way to
> >> ensure that during mempool/mbuf allocation.
> >>
> >> Indeed it would be good to force the priv_size to be aligned.
> >>
> >> The priv_size could be aligned automatically in
> >> rte_pktmbuf_pool_create(). The only possible problem I could see
> >> is that it would break applications that access to the data buffer
> >> by doing (sizeof(mbuf) + sizeof(priv)), which is probably not the
> >> best thing to do (I didn't find any applications like this in dpdk).
> >
> >
> > Might be just make rte_pktmbuf_pool_create() fail if input priv_size % MIN_ALIGN != 0?
> 
> Hmm maybe it would break more applications: an odd priv_size is
> probably rare, but a priv_size that is not aligned to 8 bytes is
> maybe more common.

My thought was that rte_mempool_create() was just introduced in 2.1,
so if we add an extra requirement for the input parameter now -
there would be no ABI breakage, and not many people have started to use it already.
For me it just seems a bit easier and more straightforward than silent alignment -
the user would not have wrong assumptions here.
Though if you think that a silent alignment would be more convenient
for most users - I wouldn't insist.
Konstantin
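
To make the two options concrete, a sketch of an application-side wrapper; the 8-byte
constant is only the value suggested in this thread (no dedicated alignment macro
exists at this point), and neither branch should be read as the fix that will
actually be merged.

#include <errno.h>
#include <rte_errno.h>
#include <rte_mbuf.h>

#define MBUF_PRIV_ALIGN 8  /* assumed minimum alignment, per the discussion above */

static struct rte_mempool *
pool_create_checked(const char *name, unsigned n, unsigned cache,
		    uint16_t priv_size, uint16_t data_room, int socket)
{
	/* Option 1: reject a misaligned private area size outright ... */
	if (priv_size % MBUF_PRIV_ALIGN != 0) {
		rte_errno = EINVAL;
		return NULL;
	}
	/* ... Option 2 would instead round it up silently:
	 * priv_size = RTE_ALIGN_CEIL(priv_size, MBUF_PRIV_ALIGN); */

	return rte_pktmbuf_pool_create(name, n, cache, priv_size,
				       data_room, socket);
}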

> It's maybe safer to align the size transparently?
> 
> 
> Regards,
> Olivier
> 
> 
> 
> >
> >>
> >> For applications that directly use rte_mempool_create() instead of
> >> rte_pktmbuf_pool_create(), we could add a check using an assert in
> >> rte_pktmbuf_init() and some words in the documentation.
> >>
> >> The question is: what should be the proper alignment? I would say
> >> at least 8 bytes, but maybe cache_aligned is an option too.
> >
> > 8 bytes seems enough to me.
> >
> > Konstantin
> >
> >>
> >> Regards,
> >> Olivier
> >>
> >>
> >>>
> >>> Regards,
> >>> Helin
> >>>
> >>>> -----Original Message-----
> >>>> From: Ananyev, Konstantin
> >>>> Sent: Wednesday, July 29, 2015 11:12 AM
> >>>> To: Martin Weiser; Zhang, Helin; olivier.matz@6wind.com
> >>>> Cc: dev@dpdk.org
> >>>> Subject: RE: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf
> >>>> private area size is odd
> >>>>
> >>>> Hi Martin,
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Martin Weiser
> >>>>> Sent: Wednesday, July 29, 2015 4:07 PM
> >>>>> To: Zhang, Helin; olivier.matz@6wind.com
> >>>>> Cc: dev@dpdk.org
> >>>>> Subject: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when
> >>>>> mbuf private area size is odd
> >>>>>
> >>>>> Hi Helin, Hi Olivier,
> >>>>>
> >>>>> we are seeing an issue with the ixgbe and i40e drivers which we could
> >>>>> track down to our setting of the private area size of the mbufs.
> >>>>> The issue can be easily reproduced with the l2fwd example application
> >>>>> when a small modification is done: just set the priv_size parameter in
> >>>>> the call to the rte_pktmbuf_pool_create function to an odd number like
> >>>>> 1. In our tests this causes every call to rte_eth_rx_burst to return
> >>>>> 32 (which is the setting of nb_pkts) nonsense mbufs although no
> >>>>> packets are received on the interface and the hardware counters do not
> >>>>> report any received packets.
> >>>>
> >>>>   From Niantic datasheet:
> >>>>
> >>>> "7.1.6.1 Advanced Receive Descriptors — Read Format Table 7-15 lists the
> >>>> advanced receive descriptor programming by the software. The ...
> >>>> Packet Buffer Address (64)
> >>>> This is the physical address of the packet buffer. The lowest bit is A0 (LSB of the
> >>>> address).
> >>>> Header Buffer Address (64)
> >>>> The physical address of the header buffer with the lowest bit being Descriptor
> >>>> Done (DD).
> >>>> When a packet spans in multiple descriptors, the header buffer address is used
> >>>> only on the first descriptor. During the programming phase, software must set
> >>>> the DD bit to zero (see the description of the DD bit in this section). This means
> >>>> that header buffer addresses are always word aligned."
> >>>>
> >>>> Right now, in ixgbe PMD we always setup  Packet Buffer Address (PBA)and
> >>>> Header Buffer Address (HBA) to the same value:
> >>>> buf_physaddr + RTE_PKTMBUF_HEADROOM
> >>>> So when pirv_size==1, DD bit in RXD is always set to one by SW itself, and then
> >>>> SW considers that HW already done with it.
> >>>> In other words, right now for ixgbe you can't use RX buffer that is not aligned on
> >>>> word boundary.
> >>>>
> >>>> So the advice would be, right now - don't set priv_size to the odd value.
> >>>> As we don't support split header feature anyway, I think we can fix it just by
> >>>> always setting HBA in the RXD to zero.
> >>>> Could you try the fix for ixgbe below?
> >>>>
> >>>> Same story with FVL, I believe.
> >>>> Konstantin
> >>>>
> >>>>
> >>>>> Interestingly this does not happen if we force the scattered rx path.
> >>>>>
> >>>>> I assume the drivers have some expectations regarding the alignment of
> >>>>> the buf_addr in the mbuf and setting an odd private are size breaks
> >>>>> this alignment in the rte_pktmbuf_init function. If this is the case
> >>>>> then one possible fix might be to enforce an alignment on the private area size.
> >>>>>
> >>>>> Best regards,
> >>>>> Martin
> >>>>
> >>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> index a0c8847..94967c5 100644
> >>>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
> >>>> reset_mbuf)
> >>>>
> >>>>                   /* populate the descriptors */
> >>>>                   dma_addr =
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
> >>>> -               rxdp[i].read.hdr_addr = dma_addr;
> >>>> +               rxdp[i].read.hdr_addr = 0;
> >>>>                   rxdp[i].read.pkt_addr = dma_addr;
> >>>>           }
> >>>>
> >>>> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
> >>>> **rx_pkts,
> >>>>                   rxe->mbuf = nmb;
> >>>>                   dma_addr =
> >>>>
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
> >>>> -               rxdp->read.hdr_addr = dma_addr;
> >>>> +               rxdp->read.hdr_addr = 0;
> >>>>                   rxdp->read.pkt_addr = dma_addr;
> >>>>
> >>>>                   /*
> >>>> @@ -1741,7 +1741,7 @@ next_desc:
> >>>>                           rxe->mbuf = nmb;
> >>>>
> >>>>                           rxm->data_off = RTE_PKTMBUF_HEADROOM;
> >>>> -                       rxdp->read.hdr_addr = dma;
> >>>> +                       rxdp->read.hdr_addr = 0;
> >>>>                           rxdp->read.pkt_addr = dma;
> >>>>                   } else
> >>>>                           rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
> >>>> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
> >>>>                   dma_addr =
> >>>>
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
> >>>>                   rxd = &rxq->rx_ring[i];
> >>>> -               rxd->read.hdr_addr = dma_addr;
> >>>> +               rxd->read.hdr_addr = 0;
> >>>>                   rxd->read.pkt_addr = dma_addr;
> >>>>                   rxe[i].mbuf = mbuf;
> >>>>           }
> >>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> index 6c1647e..16a9c64 100644
> >>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>>>                           RTE_PKTMBUF_HEADROOM);
> >>>>           __m128i dma_addr0, dma_addr1;
> >>>>
> >>>> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
> >>>> +
> >>>>           rxdp = rxq->rx_ring + rxq->rxrearm_start;
> >>>>
> >>>>           /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
> >>>> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>>>                   dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
> >>>>                   dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
> >>>>
> >>>> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
> >>>> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
> >>>> +
> >>>>                   /* flush desc with pa dma_addr */
> >>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
> >>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
> >>>> bash-4.2$ cat patch1 diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> b/drivers/net/ixgbe/ixgbe_rxtx.c index a0c8847..94967c5 100644
> >>>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
> >>>> reset_mbuf)
> >>>>
> >>>>                   /* populate the descriptors */
> >>>>                   dma_addr =
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
> >>>> -               rxdp[i].read.hdr_addr = dma_addr;
> >>>> +               rxdp[i].read.hdr_addr = 0;
> >>>>                   rxdp[i].read.pkt_addr = dma_addr;
> >>>>           }
> >>>>
> >>>> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
> >>>> **rx_pkts,
> >>>>                   rxe->mbuf = nmb;
> >>>>                   dma_addr =
> >>>>
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
> >>>> -               rxdp->read.hdr_addr = dma_addr;
> >>>> +               rxdp->read.hdr_addr = 0;
> >>>>                   rxdp->read.pkt_addr = dma_addr;
> >>>>
> >>>>                   /*
> >>>> @@ -1741,7 +1741,7 @@ next_desc:
> >>>>                           rxe->mbuf = nmb;
> >>>>
> >>>>                           rxm->data_off = RTE_PKTMBUF_HEADROOM;
> >>>> -                       rxdp->read.hdr_addr = dma;
> >>>> +                       rxdp->read.hdr_addr = 0;
> >>>>                           rxdp->read.pkt_addr = dma;
> >>>>                   } else
> >>>>                           rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
> >>>> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
> >>>>                   dma_addr =
> >>>>
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
> >>>>                   rxd = &rxq->rx_ring[i];
> >>>> -               rxd->read.hdr_addr = dma_addr;
> >>>> +               rxd->read.hdr_addr = 0;
> >>>>                   rxd->read.pkt_addr = dma_addr;
> >>>>                   rxe[i].mbuf = mbuf;
> >>>>           }
> >>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> index 6c1647e..16a9c64 100644
> >>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>>>                           RTE_PKTMBUF_HEADROOM);
> >>>>           __m128i dma_addr0, dma_addr1;
> >>>>
> >>>> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
> >>>> +
> >>>>           rxdp = rxq->rx_ring + rxq->rxrearm_start;
> >>>>
> >>>>           /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
> >>>> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>>>                   dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
> >>>>                   dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
> >>>>
> >>>> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
> >>>> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
> >>>> +
> >>>>                   /* flush desc with pa dma_addr */
> >>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
> >>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
> >


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] abi change announce
  2015-07-30  9:25  7% [dpdk-dev] abi change announce Xie, Huawei
@ 2015-07-30 10:18  4% ` Thomas Monjalon
  2015-07-30 10:33  8%   ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2015-07-30 10:18 UTC (permalink / raw)
  To: Xie, Huawei, nhorman; +Cc: dev

2015-07-30 09:25, Xie, Huawei:
> Hi Thomas:
> I am doing virtio/vhost performance optimization, so there is possibly
> some change, for example to virtio or vhost virtqueue data structure.
> Do i need to announce the ABI change even if the change hasn't been
> determined?

I have no strong opinion.
It seems strange to announce something which is not known.
You may be able to introduce your change without previous notice by using
NEXT_ABI if not too invasive.

Neil, an opinion?

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] abi change announce
  2015-07-30 10:18  4% ` Thomas Monjalon
@ 2015-07-30 10:33  8%   ` Neil Horman
  0 siblings, 0 replies; 200+ results
From: Neil Horman @ 2015-07-30 10:33 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Thu, Jul 30, 2015 at 12:18:41PM +0200, Thomas Monjalon wrote:
> 2015-07-30 09:25, Xie, Huawei:
> > Hi Thomas:
> > I am doing virtio/vhost performance optimization, so there is possibly
> > some change, for example to virtio or vhost virtqueue data structure.
> > Do i need to announce the ABI change even if the change hasn't been
> > determined?
> 
> I have no strong opinion.
> It seems strange to announce something which is not known.
> You may be able to introduce your change without previous notice by using
> NEXT_ABI if not too invasive.
> 
> Neil, an opinion?
> 

Given the process, you can't announce the change until you know what it is,
since you need to detail in the announcement what the change is going to be.

We have no method to reserve an 'ABI break to be determined later', nor should
we.  Write the code, then we figure out if ABI needs to change and there is a
need to announce.

Neil

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  2015-07-30  9:43  3%           ` Ananyev, Konstantin
@ 2015-07-30 11:22  0%             ` Olivier MATZ
  2015-07-30 13:47  0%               ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Olivier MATZ @ 2015-07-30 11:22 UTC (permalink / raw)
  To: Ananyev, Konstantin, Zhang, Helin, Martin Weiser; +Cc: dev

Hi,

On 07/30/2015 11:43 AM, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
>> Sent: Thursday, July 30, 2015 10:10 AM
>> To: Ananyev, Konstantin; Zhang, Helin; Martin Weiser
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
>>
>> Hi Konstantin,
>>
>> On 07/30/2015 11:00 AM, Ananyev, Konstantin wrote:
>>> Hi Olivier,
>>>
>>>> -----Original Message-----
>>>> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
>>>> Sent: Thursday, July 30, 2015 9:12 AM
>>>> To: Zhang, Helin; Ananyev, Konstantin; Martin Weiser
>>>> Cc: dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
>>>>
>>>> Hi,
>>>>
>>>> On 07/29/2015 10:24 PM, Zhang, Helin wrote:
>>>>> Hi Martin
>>>>>
>>>>> Thank you very much for the good catch!
>>>>>
>>>>> The similar situation in i40e, as explained by Konstantin.
>>>>> As header split hasn't been supported by DPDK till now. It would be better to put the header address in RX descriptor to 0.
>>>>> But in the future, during header split enabling. We may need to pay extra attention to that. As at least x710 datasheet said
>>>> specifically as below.
>>>>> "The header address should be set by the software to an even number (word aligned address)". We may need to find a way to
>>>> ensure that during mempool/mbuf allocation.
>>>>
>>>> Indeed it would be good to force the priv_size to be aligned.
>>>>
>>>> The priv_size could be aligned automatically in
>>>> rte_pktmbuf_pool_create(). The only possible problem I could see
>>>> is that it would break applications that access to the data buffer
>>>> by doing (sizeof(mbuf) + sizeof(priv)), which is probably not the
>>>> best thing to do (I didn't find any applications like this in dpdk).
>>>
>>>
>>> Might be just make rte_pktmbuf_pool_create() fail if input priv_size % MIN_ALIGN != 0?
>>
>> Hmm maybe it would break more applications: an odd priv_size is
>> probably rare, but a priv_size that is not aligned to 8 bytes is
>> maybe more common.
>
> My thought was that rte_mempool_create() was just introduced in 2.1,
> so if we add extra requirement for the input parameter now -
> there would be no ABI breakage, and not many people started to use it already.
> For me just seems a bit easier and more straightforward then silent alignment -
> user would not have wrong assumptions here.
> Though if you think that a silent alignment would be more convenient
> for most users - I wouldn't insist.


Yes, I agree on the principle, but it depends whether this fix
is integrated for 2.1 or not.
I think it may already be a bit late for that, especially as it
is not a very critical bug.

Thomas, what do you think?


Olivier




> Konstantin
>
>> It's maybe safer to align the size transparently?
>>
>>
>> Regards,
>> Olivier
>>
>>
>>
>>>
>>>>
>>>> For applications that directly use rte_mempool_create() instead of
>>>> rte_pktmbuf_pool_create(), we could add a check using an assert in
>>>> rte_pktmbuf_init() and some words in the documentation.
>>>>
>>>> The question is: what should be the proper alignment? I would say
>>>> at least 8 bytes, but maybe cache_aligned is an option too.
>>>
>>> 8 bytes seems enough to me.
>>>
>>> Konstantin
>>>
>>>>
>>>> Regards,
>>>> Olivier
>>>>
>>>>
>>>>>
>>>>> Regards,
>>>>> Helin
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ananyev, Konstantin
>>>>>> Sent: Wednesday, July 29, 2015 11:12 AM
>>>>>> To: Martin Weiser; Zhang, Helin; olivier.matz@6wind.com
>>>>>> Cc: dev@dpdk.org
>>>>>> Subject: RE: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf
>>>>>> private area size is odd
>>>>>>
>>>>>> Hi Martin,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Martin Weiser
>>>>>>> Sent: Wednesday, July 29, 2015 4:07 PM
>>>>>>> To: Zhang, Helin; olivier.matz@6wind.com
>>>>>>> Cc: dev@dpdk.org
>>>>>>> Subject: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when
>>>>>>> mbuf private area size is odd
>>>>>>>
>>>>>>> Hi Helin, Hi Olivier,
>>>>>>>
>>>>>>> we are seeing an issue with the ixgbe and i40e drivers which we could
>>>>>>> track down to our setting of the private area size of the mbufs.
>>>>>>> The issue can be easily reproduced with the l2fwd example application
>>>>>>> when a small modification is done: just set the priv_size parameter in
>>>>>>> the call to the rte_pktmbuf_pool_create function to an odd number like
>>>>>>> 1. In our tests this causes every call to rte_eth_rx_burst to return
>>>>>>> 32 (which is the setting of nb_pkts) nonsense mbufs although no
>>>>>>> packets are received on the interface and the hardware counters do not
>>>>>>> report any received packets.
>>>>>>
>>>>>>    From Niantic datasheet:
>>>>>>
>>>>>> "7.1.6.1 Advanced Receive Descriptors — Read Format Table 7-15 lists the
>>>>>> advanced receive descriptor programming by the software. The ...
>>>>>> Packet Buffer Address (64)
>>>>>> This is the physical address of the packet buffer. The lowest bit is A0 (LSB of the
>>>>>> address).
>>>>>> Header Buffer Address (64)
>>>>>> The physical address of the header buffer with the lowest bit being Descriptor
>>>>>> Done (DD).
>>>>>> When a packet spans in multiple descriptors, the header buffer address is used
>>>>>> only on the first descriptor. During the programming phase, software must set
>>>>>> the DD bit to zero (see the description of the DD bit in this section). This means
>>>>>> that header buffer addresses are always word aligned."
>>>>>>
>>>>>> Right now, in ixgbe PMD we always setup  Packet Buffer Address (PBA)and
>>>>>> Header Buffer Address (HBA) to the same value:
>>>>>> buf_physaddr + RTE_PKTMBUF_HEADROOM
>>>>>> So when pirv_size==1, DD bit in RXD is always set to one by SW itself, and then
>>>>>> SW considers that HW already done with it.
>>>>>> In other words, right now for ixgbe you can't use RX buffer that is not aligned on
>>>>>> word boundary.
>>>>>>
>>>>>> So the advice would be, right now - don't set priv_size to the odd value.
>>>>>> As we don't support split header feature anyway, I think we can fix it just by
>>>>>> always setting HBA in the RXD to zero.
>>>>>> Could you try the fix for ixgbe below?
>>>>>>
>>>>>> Same story with FVL, I believe.
>>>>>> Konstantin
>>>>>>
>>>>>>
>>>>>>> Interestingly this does not happen if we force the scattered rx path.
>>>>>>>
>>>>>>> I assume the drivers have some expectations regarding the alignment of
>>>>>>> the buf_addr in the mbuf and setting an odd private are size breaks
>>>>>>> this alignment in the rte_pktmbuf_init function. If this is the case
>>>>>>> then one possible fix might be to enforce an alignment on the private area size.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Martin
>>>>>>
>>>>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>>> index a0c8847..94967c5 100644
>>>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>>> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool reset_mbuf)
>>>>>>
>>>>>>                 /* populate the descriptors */
>>>>>>                 dma_addr = rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
>>>>>> -               rxdp[i].read.hdr_addr = dma_addr;
>>>>>> +               rxdp[i].read.hdr_addr = 0;
>>>>>>                 rxdp[i].read.pkt_addr = dma_addr;
>>>>>>         }
>>>>>>
>>>>>> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>>>>>>                 rxe->mbuf = nmb;
>>>>>>                 dma_addr = rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
>>>>>> -               rxdp->read.hdr_addr = dma_addr;
>>>>>> +               rxdp->read.hdr_addr = 0;
>>>>>>                 rxdp->read.pkt_addr = dma_addr;
>>>>>>
>>>>>>                 /*
>>>>>> @@ -1741,7 +1741,7 @@ next_desc:
>>>>>>                         rxe->mbuf = nmb;
>>>>>>
>>>>>>                         rxm->data_off = RTE_PKTMBUF_HEADROOM;
>>>>>> -                       rxdp->read.hdr_addr = dma;
>>>>>> +                       rxdp->read.hdr_addr = 0;
>>>>>>                         rxdp->read.pkt_addr = dma;
>>>>>>                 } else
>>>>>>                         rxe->mbuf = NULL;
>>>>>> @@ -3633,7 +3633,7 @@ ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
>>>>>>                 dma_addr = rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
>>>>>>                 rxd = &rxq->rx_ring[i];
>>>>>> -               rxd->read.hdr_addr = dma_addr;
>>>>>> +               rxd->read.hdr_addr = 0;
>>>>>>                 rxd->read.pkt_addr = dma_addr;
>>>>>>                 rxe[i].mbuf = mbuf;
>>>>>>         }
>>>>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>>> index 6c1647e..16a9c64 100644
>>>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>>> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>>>>>                         RTE_PKTMBUF_HEADROOM);
>>>>>>         __m128i dma_addr0, dma_addr1;
>>>>>>
>>>>>> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
>>>>>> +
>>>>>>         rxdp = rxq->rx_ring + rxq->rxrearm_start;
>>>>>>
>>>>>>         /* Pull 'n' more MBUFs into the software ring */
>>>>>> @@ -108,6 +110,9 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>>>>>                 dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
>>>>>>                 dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
>>>>>>
>>>>>> +               dma_addr0 = _mm_and_si128(dma_addr0, hba_msk);
>>>>>> +               dma_addr1 = _mm_and_si128(dma_addr1, hba_msk);
>>>>>> +
>>>>>>                 /* flush desc with pa dma_addr */
>>>>>>                 _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
>>>>>>                 _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
>>>
>
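For readers following Martin's hypothesis above, here is a standalone sketch of the address arithmetic behind it: the data buffer (buf_addr/buf_physaddr) starts right after the mbuf structure and the application-private area, so an odd private area size makes the default RX DMA address odd as well, and that value was being written into hdr_addr. The struct size and headroom constants below are illustrative assumptions, not values taken from a particular DPDK build.

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

/* Standalone sketch of the pktmbuf address arithmetic; the struct size
 * and headroom below are illustrative assumptions, not values taken
 * from a specific DPDK build. */
#define MBUF_STRUCT_SIZE 128U   /* stand-in for sizeof(struct rte_mbuf) */
#define PKTMBUF_HEADROOM 128U   /* stand-in for RTE_PKTMBUF_HEADROOM    */

int main(void)
{
	uint64_t mbuf_pa   = 0x100000;  /* physical address of the mbuf      */
	uint16_t priv_size = 19;        /* odd application-private area size */

	/* The data buffer starts right after the mbuf header and the private
	 * area, so an odd priv_size gives the buffer an odd address ...     */
	uint64_t buf_pa = mbuf_pa + MBUF_STRUCT_SIZE + priv_size;

	/* ... and the default RX DMA address (buffer + headroom) inherits
	 * that odd alignment, which then ends up in hdr_addr unless the
	 * driver zeroes it. */
	uint64_t dma_addr = buf_pa + PKTMBUF_HEADROOM;

	printf("default RX DMA address = 0x%" PRIx64 " (%s)\n", dma_addr,
	       (dma_addr & 1) ? "odd" : "even");
	return 0;
}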

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  2015-07-30 11:22  0%             ` Olivier MATZ
@ 2015-07-30 13:47  0%               ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-07-30 13:47 UTC (permalink / raw)
  To: Olivier MATZ; +Cc: dev

2015-07-30 13:22, Olivier MATZ:
> On 07/30/2015 11:43 AM, Ananyev, Konstantin wrote:
> > From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> >> On 07/30/2015 11:00 AM, Ananyev, Konstantin wrote:
> >>> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> >>>> On 07/29/2015 10:24 PM, Zhang, Helin wrote:
> >>>>> The situation is similar in i40e, as explained by Konstantin.
> >>>>> As header split hasn't been supported by DPDK till now, it would be better to set the header address in the RX descriptor to 0.
> >>>>> But in the future, when enabling header split, we may need to pay extra attention to that, as at least the X710 datasheet says
> >>>>> specifically: "The header address should be set by the software to an even number (word aligned address)". We may need to find a way to
> >>>>> ensure that during mempool/mbuf allocation.
> >>>>
> >>>> Indeed it would be good to force the priv_size to be aligned.
> >>>>
> >>>> The priv_size could be aligned automatically in
> >>>> rte_pktmbuf_pool_create(). The only possible problem I could see
> >>>> is that it would break applications that access the data buffer
> >>>> by doing (sizeof(mbuf) + sizeof(priv)), which is probably not the
> >>>> best thing to do (I didn't find any applications like this in dpdk).
> >>>
> >>>
> >>> Maybe just make rte_pktmbuf_pool_create() fail if the input priv_size % MIN_ALIGN != 0?
> >>
> >> Hmm maybe it would break more applications: an odd priv_size is
> >> probably rare, but a priv_size that is not aligned to 8 bytes is
> >> maybe more common.
> >
> > My thought was that rte_pktmbuf_pool_create() was just introduced in 2.1,
> > so if we add an extra requirement on the input parameter now
> > there would be no ABI breakage, and not many people have started to use it yet.
> > To me it just seems a bit easier and more straightforward than a silent alignment -
> > the user would not have wrong assumptions here.
> > Though if you think that a silent alignment would be more convenient
> > for most users, I wouldn't insist.
> 
> 
> Yes, I agree on the principle, but it depends on whether this fix
> is integrated for 2.1 or not.
> I think it may already be a bit late for that, especially as it
> is not a very critical bug.
> 
> Thomas, what do you think?

It is a fix.
Adding a doc comment, an assert and an alignment constraint or a new automatic
alignment in the not yet released function shouldn't hurt.
A patch would be welcome for 2.1. Thanks
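As a rough sketch of the two options discussed in this thread for rte_pktmbuf_pool_create(), here is what either could look like; the macro name, the helper names and the 8-byte minimum alignment are assumptions made for illustration, not the actual DPDK implementation.

#include <stdint.h>
#include <errno.h>

/* Hypothetical sketch only, assuming an 8-byte minimum alignment. */
#define MBUF_PRIV_ALIGN 8U

/* Option 1: silently round the application-private area size up, so that
 * the data buffer placed after it stays aligned. */
static inline uint16_t
mbuf_priv_size_align(uint16_t priv_size)
{
	return (uint16_t)((priv_size + MBUF_PRIV_ALIGN - 1) & ~(MBUF_PRIV_ALIGN - 1));
}

/* Option 2: refuse unaligned sizes, so the constraint is explicit and the
 * caller gets an error instead of a silently modified layout. */
static inline int
mbuf_priv_size_check(uint16_t priv_size)
{
	return (priv_size % MBUF_PRIV_ALIGN) ? -EINVAL : 0;
}

Option 1 keeps existing callers building at the cost of a private area slightly larger than requested; option 2 keeps the layout exactly as requested but rejects sizes that would previously have been accepted.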

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v3] doc: announce abi change for interrupt mode
  2015-07-30  5:14  4%     ` Liu, Yong
@ 2015-07-30 14:25  7%       ` O'Driscoll, Tim
  0 siblings, 0 replies; 200+ results
From: O'Driscoll, Tim @ 2015-07-30 14:25 UTC (permalink / raw)
  To: Liu, Yong, Liang, Cunming, dev, Neil Horman

Hi Neil,

There have been a few deprecation notices like this one submitted. Since you drove the ABI policy, it would be good to get confirmation from you that these are compliant with the policy and that you don't see any issues. Ideally, it would be great if you could review and ack them. If you don't have the time, even just a general indication that you don't see any problems would be useful.


Thanks,
Tim

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Liu, Yong
> Sent: Thursday, July 30, 2015 6:15 AM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3] doc: announce abi change for
> interrupt mode
> 
> Acked-by: Marvin Liu <yong.liu@intel.com>
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> > Sent: Thursday, July 30, 2015 1:05 PM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH v3] doc: announce abi change for interrupt
> mode
> >
> > The patch announces the planned ABI changes for interrupt mode.
> >
> > Signed-off-by: Cunming Liang <cunming.liang@intel.com>
> > ---
> >  v3 change:
> >    - reword for CONFIG_RTE_NEXT_ABI
> >
> >  v2 change:
> >    - rebase to recent master
> >
> >  doc/guides/rel_notes/deprecation.rst | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index 5330d3b..d36d267 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -35,3 +35,8 @@ Deprecation Notices
> >  * The following fields have been deprecated in rte_eth_stats:
> >    imissed, ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
> >    tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff
> > +
> > +* The ABI changes are planned for struct rte_intr_handle, struct
> > rte_eth_conf
> > +  and struct eth_dev_ops to support interrupt mode feature from
> release
> > 2.1.
> > +  Those changes may be enabled in the upcoming release 2.1
> > +  with CONFIG_RTE_NEXT_ABI.
> > --
> > 1.8.1.4
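For readers unfamiliar with how such announced changes are usually staged, a simplified sketch of the CONFIG_RTE_NEXT_ABI pattern follows (assuming the usual mechanism where CONFIG_RTE_NEXT_ABI=y defines RTE_NEXT_ABI at build time); the struct and field names are placeholders, not the actual rte_intr_handle change.

#include <stdint.h>

/* Placeholder struct: when the build defines RTE_NEXT_ABI, the new
 * member becomes visible; the default build keeps the released layout,
 * so the stable ABI is unchanged until the next major release. */
struct example_intr_conf {
	uint32_t existing_field;   /* present in the released ABI          */
#ifdef RTE_NEXT_ABI
	uint32_t rxq_intr_vec;     /* illustrative new member, next-ABI only */
#endif
};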

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH v3] doc: announce abi change for interrupt mode
  2015-07-30  5:04 13%   ` [dpdk-dev] [PATCH v3] " Cunming Liang
  2015-07-30  5:14  4%     ` Liu, Yong
  2015-07-30  8:31  4%     ` He, Shaopeng
@ 2015-07-31  1:00  4%     ` Zhang, Helin
  2 siblings, 0 replies; 200+ results
From: Zhang, Helin @ 2015-07-31  1:00 UTC (permalink / raw)
  To: Liang, Cunming, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> Sent: Wednesday, July 29, 2015 10:05 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3] doc: announce abi change for interrupt mode
> 
> The patch announces the planned ABI changes for interrupt mode.
> 
> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-13 13:59  4%     ` Neil Horman
  2015-07-17 11:45  7%       ` Mcnamara, John
@ 2015-07-31  9:03  7%       ` Mcnamara, John
  2015-07-31 10:34  4%         ` Neil Horman
  1 sibling, 1 reply; 200+ results
From: Mcnamara, John @ 2015-07-31  9:03 UTC (permalink / raw)
  To: Neil Horman, Chao Zhu; +Cc: dev

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Monday, July 13, 2015 3:00 PM
> To: Mcnamara, John
> Cc: dev@dpdk.org; vladz@cloudius-systems.com
> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> 
> On Mon, Jul 13, 2015 at 10:47:03AM +0000, Mcnamara, John wrote:
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Monday, July 13, 2015 11:42 AM
> > > To: Mcnamara, John
> > > Cc: dev@dpdk.org; vladz@cloudius-systems.com
> > > Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> > >
> > > On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
> > > > Fix for ABI breakage introduced in LRO addition. Moves lro
> > > > bitfield to the end of the struct/member.
> > > >
> > > > Fixes: 8eecb3295aed (ixgbe: add LRO support)
> > > >
> > > > Signed-off-by: John McNamara <john.mcnamara@intel.com>
> > > > ---
> > > >  lib/librte_ether/rte_ethdev.h | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > > > index 79bde89..1c3ace1 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
> > > > >  	uint8_t port_id;           /**< Device [external] port identifier. */
> > > > >  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
> > > > >  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
> > > > > -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
> > > > >  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
> > > > > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> > > > >  };
> > > > >
> > > > >  /**
> > > > > --
> > > > > 1.8.1.4
> > > >
> > > >
> > > I presume the ABI checker stopped complaining about this with the
> > > patch, yes?
> >
> > Hi Neil,
> >
> > Yes, I replied about that in the previous thread.
> >
> Thank you, I'll ack as soon as Chao confirms it's not a problem on ppc Neil

Hi Chao,

Any reply on this?

Neil, if there is no reply to this from the PPC maintainer, do you have any objection to this going in as is?

It at least fixes the LRO ABI breakage on the platforms we can test on.

John
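As an aside, here is a small standalone illustration of why appending the new bit (rather than inserting it in the middle of the bitfield) restores the released layout. The structs are simplified stand-ins for rte_eth_dev_data, and the exact bit order within the byte is compiler-defined, which is precisely why the relative order of the existing flags must not change between releases.

#include <stdio.h>
#include <stdint.h>

/* Simplified stand-ins for the flag byte of struct rte_eth_dev_data.
 * Only the relative order of the bitfields matters here. */
struct flags_v20 {     /* layout shipped in DPDK 2.0 */
	uint8_t promiscuous : 1, scattered_rx : 1, all_multicast : 1, dev_started : 1;
};
struct flags_broken {  /* lro inserted in the middle: later flags shift */
	uint8_t promiscuous : 1, scattered_rx : 1, lro : 1, all_multicast : 1, dev_started : 1;
};
struct flags_fixed {   /* lro appended: existing flags keep their bits */
	uint8_t promiscuous : 1, scattered_rx : 1, all_multicast : 1, dev_started : 1, lro : 1;
};

int main(void)
{
	union { struct flags_v20 f;    uint8_t raw; } a = { .f = { .dev_started = 1 } };
	union { struct flags_broken f; uint8_t raw; } b = { .f = { .dev_started = 1 } };
	union { struct flags_fixed f;  uint8_t raw; } c = { .f = { .dev_started = 1 } };

	/* With GCC on x86 this prints 0x08 / 0x10 / 0x08: inserting lro in
	 * the middle moves dev_started to another bit, while appending it
	 * leaves dev_started where 2.0 applications expect it. */
	printf("dev_started bit: 2.0=0x%02x inserted=0x%02x appended=0x%02x\n",
	       a.raw, b.raw, c.raw);
	return 0;
}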

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-31  9:03  7%       ` Mcnamara, John
@ 2015-07-31 10:34  4%         ` Neil Horman
  2015-08-03  2:39  7%           ` Chao Zhu
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2015-07-31 10:34 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

On Fri, Jul 31, 2015 at 09:03:45AM +0000, Mcnamara, John wrote:
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Monday, July 13, 2015 3:00 PM
> > To: Mcnamara, John
> > Cc: dev@dpdk.org; vladz@cloudius-systems.com
> > Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> > 
> > On Mon, Jul 13, 2015 at 10:47:03AM +0000, Mcnamara, John wrote:
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > Sent: Monday, July 13, 2015 11:42 AM
> > > > To: Mcnamara, John
> > > > Cc: dev@dpdk.org; vladz@cloudius-systems.com
> > > > Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
> > > >
> > > > On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
> > > > > Fix for ABI breakage introduced in LRO addition. Moves lro
> > > > > bitfield to the end of the struct/member.
> > > > >
> > > > > Fixes: 8eecb3295aed (ixgbe: add LRO support)
> > > > >
> > > > > Signed-off-by: John McNamara <john.mcnamara@intel.com>
> > > > > ---
> > > > >  lib/librte_ether/rte_ethdev.h | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > > > index 79bde89..1c3ace1 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
> > > > >  	uint8_t port_id;           /**< Device [external] port identifier. */
> > > > >  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
> > > > >  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
> > > > > -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
> > > > >  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
> > > > > -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > > +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> > > > > +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> > > > >  };
> > > > >
> > > > >  /**
> > > > > --
> > > > > 1.8.1.4
> > > > >
> > > > >
> > > > I presume the ABI checker stopped complaining about this with the
> > > > patch, yes?
> > >
> > > Hi Neil,
> > >
> > > Yes, I replied about that in the previous thread.
> > >
> > Thank you, I'll ack as soon as Chao confirms it's not a problem on ppc Neil
> 
> Hi Chao,
> 
> Any reply on this.
> 
> Neil, if there is no reply to this from the PPC maintainer do you have any objection to this going in as is. 
> 
> It at least fixes the LRO ABI breakage on the platforms we can test on.
> 
> John
> 
Well, I suppose at this point the only thing it's hurting is ppc, so no, no
objections.  But it's pretty disheartening for an arch maintainer to disappear
so soon after adding arch support.

Neil

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-16 22:22  4% ` Vlad Zolotarov
@ 2015-08-02 21:06  7%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2015-08-02 21:06 UTC (permalink / raw)
  To: John McNamara; +Cc: dev

2015-07-17 01:22, Vlad Zolotarov:
> On 07/13/15 13:26, John McNamara wrote:
> > Fix for ABI breakage introduced in LRO addition. Moves
> > lro bitfield to the end of the struct/member.
> >
> > Fixes: 8eecb3295aed (ixgbe: add LRO support)
> >
> > Signed-off-by: John McNamara <john.mcnamara@intel.com>
> 
> Acked-by: Vlad Zolotarov <vladz@cloudius-systems.com>

Applied, thanks

We still don't know if POWER Big Endian ABI is broken.

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-07-31 10:34  4%         ` Neil Horman
@ 2015-08-03  2:39  7%           ` Chao Zhu
  2015-08-03  3:45  4%             ` Chao Zhu
  0 siblings, 1 reply; 200+ results
From: Chao Zhu @ 2015-08-03  2:39 UTC (permalink / raw)
  To: nhorman, thomas.monjalon, Mcnamara, John; +Cc: dev


Really sorry for the delay.
Originally, I thought the email was asking about the ABI checking tools on
Power, which I'm not so familiar with, so this took me some time to find a
solution. For Power little endian, the build is OK. I'll give feedback
when I have tried the Big Endian compilation.

On 2015/7/31 18:34, Neil Horman wrote:
> On Fri, Jul 31, 2015 at 09:03:45AM +0000, Mcnamara, John wrote:
>>> -----Original Message-----
>>> From: Neil Horman [mailto:nhorman@tuxdriver.com]
>>> Sent: Monday, July 13, 2015 3:00 PM
>>> To: Mcnamara, John
>>> Cc: dev@dpdk.org; vladz@cloudius-systems.com
>>> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
>>>
>>> On Mon, Jul 13, 2015 at 10:47:03AM +0000, Mcnamara, John wrote:
>>>>> -----Original Message-----
>>>>> From: Neil Horman [mailto:nhorman@tuxdriver.com]
>>>>> Sent: Monday, July 13, 2015 11:42 AM
>>>>> To: Mcnamara, John
>>>>> Cc: dev@dpdk.org; vladz@cloudius-systems.com
>>>>> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
>>>>>
>>>>> On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
>>>>>> Fix for ABI breakage introduced in LRO addition. Moves lro
>>>>>> bitfield to the end of the struct/member.
>>>>>>
>>>>>> Fixes: 8eecb3295aed (ixgbe: add LRO support)
>>>>>>
>>>>>> Signed-off-by: John McNamara <john.mcnamara@intel.com>
>>>>>> ---
>>>>>>   lib/librte_ether/rte_ethdev.h | 4 ++--
>>>>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
>>>>>> index 79bde89..1c3ace1 100644
>>>>>> --- a/lib/librte_ether/rte_ethdev.h
>>>>>> +++ b/lib/librte_ether/rte_ethdev.h
>>>>>> @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
>>>>>>  	uint8_t port_id;           /**< Device [external] port identifier. */
>>>>>>  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
>>>>>>  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
>>>>>> -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
>>>>>>  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
>>>>>> -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
>>>>>> +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
>>>>>> +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
>>>>>>  };
>>>>>>
>>>>>>  /**
>>>>>> --
>>>>>> 1.8.1.4
>>>>>>
>>>>>>
>>>>> I presume the ABI checker stopped complaining about this with the
>>>>> patch, yes?
>>>> Hi Neil,
>>>>
>>>> Yes, I replied about that in the previous thread.
>>>>
>>> Thank you, I'll ack as soon as Chao confirms it's not a problem on ppc Neil
>> Hi Chao,
>>
>> Any reply on this.
>>
>> Neil, if there is no reply to this from the PPC maintainer do you have any objection to this going in as is.
>>
>> It at least fixes the LRO ABI breakage on the platforms we can test on.
>>
>> John
>>
> Well, I suppose at this point the only thing it's hurting is ppc, so no, no
> objections.  But it's pretty disheartening for an arch maintainer to disappear
> so soon after adding arch support.
>
> Neil
>

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
  2015-08-03  2:39  7%           ` Chao Zhu
@ 2015-08-03  3:45  4%             ` Chao Zhu
  0 siblings, 0 replies; 200+ results
From: Chao Zhu @ 2015-08-03  3:45 UTC (permalink / raw)
  To: nhorman, thomas.monjalon, Mcnamara, John; +Cc: dev

Confirmed. It can compile on Power8 Big Endian.
Thank you!

On 2015/8/3 10:39, Chao Zhu wrote:
>
> Really sorry for the delay.
> Originally, I thought the email was asking about the ABI checking tools
> on Power, which I'm not so familiar with, so this took me some time to
> find a solution. For Power little endian, the build is OK. I'll give
> feedback when I have tried the Big Endian compilation.
>
> On 2015/7/31 18:34, Neil Horman wrote:
>> On Fri, Jul 31, 2015 at 09:03:45AM +0000, Mcnamara, John wrote:
>>>> -----Original Message-----
>>>> From: Neil Horman [mailto:nhorman@tuxdriver.com]
>>>> Sent: Monday, July 13, 2015 3:00 PM
>>>> To: Mcnamara, John
>>>> Cc: dev@dpdk.org; vladz@cloudius-systems.com
>>>> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
>>>>
>>>> On Mon, Jul 13, 2015 at 10:47:03AM +0000, Mcnamara, John wrote:
>>>>>> -----Original Message-----
>>>>>> From: Neil Horman [mailto:nhorman@tuxdriver.com]
>>>>>> Sent: Monday, July 13, 2015 11:42 AM
>>>>>> To: Mcnamara, John
>>>>>> Cc: dev@dpdk.org; vladz@cloudius-systems.com
>>>>>> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code
>>>>>>
>>>>>> On Mon, Jul 13, 2015 at 11:26:25AM +0100, John McNamara wrote:
>>>>>>> Fix for ABI breakage introduced in LRO addition. Moves lro
>>>>>>> bitfield to the end of the struct/member.
>>>>>>>
>>>>>>> Fixes: 8eecb3295aed (ixgbe: add LRO support)
>>>>>>>
>>>>>>> Signed-off-by: John McNamara <john.mcnamara@intel.com>
>>>>>>> ---
>>>>>>>   lib/librte_ether/rte_ethdev.h | 4 ++--
>>>>>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
>>>>>>> index 79bde89..1c3ace1 100644
>>>>>>> --- a/lib/librte_ether/rte_ethdev.h
>>>>>>> +++ b/lib/librte_ether/rte_ethdev.h
>>>>>>> @@ -1578,9 +1578,9 @@ struct rte_eth_dev_data {
>>>>>>>  	uint8_t port_id;           /**< Device [external] port identifier. */
>>>>>>>  	uint8_t promiscuous   : 1, /**< RX promiscuous mode ON(1) / OFF(0). */
>>>>>>>  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
>>>>>>> -		lro          : 1,  /**< RX LRO is ON(1) / OFF(0) */
>>>>>>>  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
>>>>>>> -		dev_started : 1;   /**< Device state: STARTED(1) / STOPPED(0). */
>>>>>>> +		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
>>>>>>> +		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
>>>>>>>  };
>>>>>>>
>>>>>>>  /**
>>>>>>> --
>>>>>>> 1.8.1.4
>>>>>>>
>>>>>>>
>>>>>> I presume the ABI checker stopped complaining about this with the
>>>>>> patch, yes?
>>>>> Hi Neil,
>>>>>
>>>>> Yes, I replied about that in the previous thread.
>>>>>
>>>> Thank you, I'll ack as soon as Chao confirms it's not a problem on 
>>>> ppc Neil
>>> Hi Chao,
>>>
>>> Any reply on this.
>>>
>>> Neil, if there is no reply to this from the PPC maintainer do you 
>>> have any objection to this going in as is.
>>>
>>> It at least fixes the LRO ABI breakage on the platforms we can test on.
>>>
>>> John
>>>
>> Well, I suppose at this point the only thing it's hurting is ppc, so no, no
>> objections.  But it's pretty disheartening for an arch maintainer to disappear
>> so soon after adding arch support.
>>
>> Neil
>>
>

^ permalink raw reply	[relevance 4%]

Results 801-1000 of ~18000   |  | reverse | sort options + mbox downloads above
-- links below jump to the message on this page --
2015-04-14 17:55     [dpdk-dev] [PATCH] pci: make rte_pci_probe void Stephen Hemminger
2015-04-20 13:15     ` Thomas Monjalon
2015-07-30  0:34  3%   ` Stephen Hemminger
2015-04-17 15:35     [dpdk-dev] [PATCH 0/2] functions with useless return Stephen Hemminger
2015-04-17 15:35     ` [dpdk-dev] [PATCH 1/2] log: rte_openlog_stream should be void Stephen Hemminger
2015-05-19 10:24       ` Bruce Richardson
2015-07-30  0:35  3%     ` Stephen Hemminger
2015-04-21 17:32     [dpdk-dev] [PATCH v4 0/7] Hyper-V Poll Mode driver Stephen Hemminger
2015-04-21 17:32     ` [dpdk-dev] [PATCH v4 3/7] hv: add basic vmbus support Stephen Hemminger
2015-07-08 23:51  0%   ` Thomas Monjalon
2015-06-05  7:40     [dpdk-dev] [PATCH v1] abi: announce abi changes plan for interrupt mode Cunming Liang
2015-07-30  1:57 14% ` [dpdk-dev] [PATCH v2] doc: announce abi change " Cunming Liang
2015-07-30  5:04 13%   ` [dpdk-dev] [PATCH v3] " Cunming Liang
2015-07-30  5:14  4%     ` Liu, Yong
2015-07-30 14:25  7%       ` O'Driscoll, Tim
2015-07-30  8:31  4%     ` He, Shaopeng
2015-07-31  1:00  4%     ` Zhang, Helin
2015-06-08  5:28     [dpdk-dev] [PATCH v12 00/14] Interrupt mode PMD Cunming Liang
2015-06-19  4:00     ` [dpdk-dev] [PATCH v13 " Cunming Liang
2015-07-09 13:58  3%   ` David Marchand
2015-07-17  6:04  0%     ` Liang, Cunming
2015-07-17  6:16  4%   ` [dpdk-dev] [PATCH v14 00/13] " Cunming Liang
2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
2015-07-19 23:31  0%       ` Thomas Monjalon
2015-07-20  2:02  0%         ` Liang, Cunming
2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 06/13] eal/linux: standalone intr event fd create support Cunming Liang
2015-07-17  6:16  8%     ` [dpdk-dev] [PATCH v14 08/13] eal/bsd: dummy for new intr definition Cunming Liang
2015-07-17  6:16  3%     ` [dpdk-dev] [PATCH v14 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
2015-07-17  6:16  1%     ` [dpdk-dev] [PATCH v14 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 12/13] igb: enable rx queue interrupts for PF Cunming Liang
2015-07-17  6:16  2%     ` [dpdk-dev] [PATCH v14 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
2015-07-20  3:02  4%     ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Cunming Liang
2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 01/13] eal/linux: add interrupt vectors support in intr_handle Cunming Liang
2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 02/13] eal/linux: add rte_epoll_wait/ctl support Cunming Liang
2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 03/13] eal/linux: add API to set rx interrupt event monitor Cunming Liang
2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 05/13] eal/linux: map eventfd to VFIO MSI-X intr vector Cunming Liang
2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 06/13] eal/linux: standalone intr event fd create support Cunming Liang
2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 08/13] eal/bsd: dummy for new intr definition Cunming Liang
2015-07-20  3:02  3%       ` [dpdk-dev] [PATCH v15 10/13] ethdev: add rx intr enable, disable and ctl functions Cunming Liang
2015-07-20  3:02  1%       ` [dpdk-dev] [PATCH v15 11/13] ixgbe: enable rx queue interrupts for both PF and VF Cunming Liang
2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 12/13] igb: enable rx queue interrupts for PF Cunming Liang
2015-07-20  3:02  2%       ` [dpdk-dev] [PATCH v15 13/13] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch Cunming Liang
2015-07-23 14:18  3%       ` [dpdk-dev] [PATCH v15 00/13] Interrupt mode PMD Liang, Cunming
2015-07-27 21:34  0%         ` Thomas Monjalon
2015-06-12 11:28     [dpdk-dev] [PATCH 0/4] ethdev: Add checks for function support in driver Bruce Richardson
2015-06-15 10:14     ` [dpdk-dev] [PATCH 4/4] ethdev: check support for rx_queue_count and descriptor_done fns Bruce Richardson
2015-07-06 15:11       ` Thomas Monjalon
2015-07-26 20:44  0%     ` Thomas Monjalon
2015-06-25 22:05     [dpdk-dev] [PATCH v2 00/11] Cuckoo hash Pablo de Lara
2015-06-28 22:25     ` [dpdk-dev] [PATCH v3 " Pablo de Lara
2015-07-08 23:23       ` Thomas Monjalon
2015-07-09  8:02  0%     ` Bruce Richardson
2015-07-10 17:24  4%   ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of " Pablo de Lara
2015-07-10 17:24 14%     ` [dpdk-dev] [PATCH v4 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-10 20:52  0%     ` [dpdk-dev] [PATCH v4 0/7] Cuckoo hash - part 3 of Cuckoo hash Bruce Richardson
2015-07-10 21:57  4%     ` [dpdk-dev] [PATCH v5 " Pablo de Lara
2015-07-10 21:57 14%       ` [dpdk-dev] [PATCH v5 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-10 23:30  4%       ` [dpdk-dev] [PATCH v6 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
2015-07-10 23:30 14%         ` [dpdk-dev] [PATCH v6 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-11  0:18  4%         ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Pablo de Lara
2015-07-11  0:18               ` [dpdk-dev] [PATCH v7 1/7] hash: replace existing hash library with cuckoo hash implementation Pablo de Lara
2015-07-12 22:29  3%             ` Thomas Monjalon
2015-07-13 16:11  3%               ` Bruce Richardson
2015-07-13 16:14  0%                 ` Bruce Richardson
2015-07-13 16:20  0%                   ` Thomas Monjalon
2015-07-13 16:26  0%                     ` Bruce Richardson
2015-07-11  0:18 14%           ` [dpdk-dev] [PATCH v7 6/7] doc: announce ABI change of librte_hash Pablo de Lara
2015-07-12 22:38  8%             ` Thomas Monjalon
2015-07-12 22:46  0%           ` [dpdk-dev] [PATCH v7 0/7] Cuckoo hash - part 3 of Cuckoo hash Thomas Monjalon
2015-06-29 13:42     [dpdk-dev] [PATCH v2 0/7] ethdev: add support for ieee1588 timestamping John McNamara
2015-07-06 13:16     ` [dpdk-dev] [PATCH v3 7/7] abi: announce mbuf addition for ieee1588 in DPDK 2.2 Thomas Monjalon
2015-07-08 13:10       ` Bruce Richardson
2015-07-09 15:51  7%     ` Thomas Monjalon
2015-07-09 16:01  4%       ` Bruce Richardson
2015-07-02 13:50     [dpdk-dev] [PATCH 0/3] doc: added guidelines on dpdk documentation John McNamara
2015-07-10 15:45     ` [dpdk-dev] [PATCH v2 " John McNamara
2015-07-10 15:45  3%   ` [dpdk-dev] [PATCH v2 2/3] " John McNamara
2015-07-02 22:05     [dpdk-dev] [PATCH] mk: enable next abi in static libs Thomas Monjalon
2015-07-06 21:44     ` Thomas Monjalon
2015-07-07 11:14       ` Neil Horman
2015-07-07 12:46         ` Thomas Monjalon
2015-07-07 13:44           ` Neil Horman
2015-07-10 16:07  4%         ` Mcnamara, John
2015-07-11 14:19  7%           ` Neil Horman
2015-07-13 10:14  8%             ` Mcnamara, John
2015-07-03  8:32     [dpdk-dev] [PATCH v9 00/19] unified packet type Helin Zhang
2015-07-09 16:31  4% ` [dpdk-dev] [PATCH v10 " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 01/19] mbuf: redefine packet_type in rte_mbuf Helin Zhang
2015-07-13 15:53  0%     ` Thomas Monjalon
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 02/19] mbuf: add definitions of unified packet types Helin Zhang
2015-07-15 10:19  0%     ` Olivier MATZ
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 03/19] e1000: replace bit mask based packet type with unified packet type Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 04/19] ixgbe: " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 05/19] i40e: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 06/19] enic: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 07/19] vmxnet3: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 08/19] fm10k: " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 09/19] cxgbe: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 10/19] app/test-pipeline: " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 11/19] app/testpmd: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 12/19] app/test: Remove useless code Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 13/19] examples/ip_fragmentation: replace bit mask based packet type with unified packet type Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 14/19] examples/ip_reassembly: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 15/19] examples/l3fwd-acl: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 16/19] examples/l3fwd-power: " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 17/19] examples/l3fwd: " Helin Zhang
2015-07-09 16:31  3%   ` [dpdk-dev] [PATCH v10 18/19] examples/tep_termination: " Helin Zhang
2015-07-09 16:31  4%   ` [dpdk-dev] [PATCH v10 19/19] mbuf: remove old packet type bit masks Helin Zhang
2015-07-15 23:00  0%   ` [dpdk-dev] [PATCH v10 00/19] unified packet type Thomas Monjalon
2015-07-15 23:51  0%     ` Zhang, Helin
2015-07-03  8:49     [dpdk-dev] [PATCH v2] doc: announce ABI changes planned for " Helin Zhang
2015-07-07 17:45     ` [dpdk-dev] [PATCH v3] " Helin Zhang
2015-07-09  0:56  4%   ` Wu, Jingjing
2015-07-15 23:37  4%     ` Thomas Monjalon
2015-07-03  9:55     [dpdk-dev] [PATCH v7 0/9] Dynamic memzones Sergio Gonzalez Monroy
2015-07-14  8:57  4% ` [dpdk-dev] [PATCH v8 " Sergio Gonzalez Monroy
2015-07-14  8:57  1%   ` [dpdk-dev] [PATCH v8 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
2015-07-14  8:57 19%   ` [dpdk-dev] [PATCH v8 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
2015-07-15  8:26  4%   ` [dpdk-dev] [PATCH v9 0/9] Dynamic memzones Sergio Gonzalez Monroy
2015-07-15  8:26  1%     ` [dpdk-dev] [PATCH v9 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
2015-07-15  8:26 19%     ` [dpdk-dev] [PATCH v9 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
2015-07-15 16:32  3%     ` [dpdk-dev] [PATCH v10 0/9] Dynamic memzones Sergio Gonzalez Monroy
2015-07-15 16:32  1%       ` [dpdk-dev] [PATCH v10 2/9] eal: memzone allocated by malloc Sergio Gonzalez Monroy
2015-07-15 16:32 19%       ` [dpdk-dev] [PATCH v10 8/9] doc: announce ABI change of librte_malloc Sergio Gonzalez Monroy
2015-07-07  7:58     [dpdk-dev] [PATCH 0/3] fix the issue sctp flow cannot be matched in FVL FDIR Jingjing Wu
2015-07-07  9:04     ` Liu, Yong
2015-07-19 22:54  3%   ` Thomas Monjalon
2015-07-08  1:24     [dpdk-dev] [PATCH] config: revert the CONFIG_RTE_MAX_QUEUES_PER_PORT to 256 Jijiang Liu
2015-07-10 21:58  0% ` Thomas Monjalon
2015-07-08  2:08     [dpdk-dev] [PATCH] doc:announce ABI changes planned for struct rte_eth_dev to support up to 1024 queues per port Jijiang Liu
2015-07-08 11:07     ` Neil Horman
2015-07-10 22:14  4%   ` Thomas Monjalon
2015-07-08 11:27     [dpdk-dev] [PATCH] Make rte_hash struct internal - Cuckoo hash part 1 Pablo de Lara
2015-07-08 11:27     ` [dpdk-dev] [PATCH] hash: move rte_hash structure to C file and make it internal Pablo de Lara
2015-07-08 13:21       ` Bruce Richardson
2015-07-08 16:57         ` Matthew Hall
2015-07-09  8:12  3%       ` Bruce Richardson
2015-07-09 20:42  3%         ` Matthew Hall
2015-07-10 10:27  0%     ` Thomas Monjalon
2015-07-08 14:55     [dpdk-dev] [PATCH v2 2/2] mk: enable next abi preview Thomas Monjalon
2015-07-08 16:44     ` [dpdk-dev] [PATCH v3] " Thomas Monjalon
2015-07-13  7:32  7%   ` Mcnamara, John
2015-07-13  8:48  7%     ` Thomas Monjalon
2015-07-13  9:02  8%       ` [dpdk-dev] [PATCH] mk: fix shared lib build with stable abi Thomas Monjalon
2015-07-13  9:24  4%         ` Mcnamara, John
2015-07-13  9:32  7%           ` Thomas Monjalon
2015-07-09  2:47 21% [dpdk-dev] [PATCH] doc: announce ABI change of rte_fdir_filter, rte_fdir_masks Wenzhuo Lu
2015-07-09  7:32  4% ` Thomas Monjalon
2015-07-09  8:39  4%   ` Lu, Wenzhuo
2015-07-10  2:24 21% ` [dpdk-dev] [PATCH v2] doc: announce ABI change of rte_eth_fdir_filter, rte_eth_fdir_masks Wenzhuo Lu
2015-07-09  4:58     [dpdk-dev] [PATCH v4 00/11] Introducing the TILE-Gx platform Zhigang Lu
2015-07-09  4:58  5% ` [dpdk-dev] [PATCH v4 04/11] config: remove RTE_LIBNAME definition Zhigang Lu
2015-07-09  8:25     [dpdk-dev] [PATCH v5 00/11] Introducing the TILE-Gx platform Zhigang Lu
2015-07-09  8:25  5% ` [dpdk-dev] [PATCH v5 04/11] config: remove RTE_LIBNAME definition Zhigang Lu
2015-07-09 13:30  3% [dpdk-dev] [PATCH v4 0/7] ethdev: add support for ieee1588 timestamping John McNamara
2015-07-10  0:43  0% ` Thomas Monjalon
2015-07-13 10:26  7% [dpdk-dev] [PATCH] ethdev: fix ABI breakage in lro code John McNamara
2015-07-13 10:42  7% ` Neil Horman
2015-07-13 10:46  4%   ` Thomas Monjalon
2015-07-13 10:47  4%   ` Mcnamara, John
2015-07-13 13:59  4%     ` Neil Horman
2015-07-17 11:45  7%       ` Mcnamara, John
2015-07-17 12:25  4%         ` Neil Horman
2015-07-31  9:03  7%       ` Mcnamara, John
2015-07-31 10:34  4%         ` Neil Horman
2015-08-03  2:39  7%           ` Chao Zhu
2015-08-03  3:45  4%             ` Chao Zhu
2015-07-16 22:22  4% ` Vlad Zolotarov
2015-08-02 21:06  7%   ` Thomas Monjalon
2015-07-13 14:17  3% [dpdk-dev] [PATCH v5 0/9] Expose IXGBE extended stats to DPDK apps Maryam Tahhan
2015-07-13 14:17  9% ` [dpdk-dev] [PATCH v5 4/9] ethdev: remove HW specific stats in stats structs Maryam Tahhan
2015-07-13 16:25  3% [dpdk-dev] [PATCH] hash: rename unused field to "reserved" Bruce Richardson
2015-07-13 16:28  0% ` Bruce Richardson
2015-07-13 16:38  3% ` [dpdk-dev] [PATCH v2] " Bruce Richardson
2015-07-13 17:29  0%   ` Thomas Monjalon
2015-07-15  8:08  3%   ` Olga Shern
2015-07-15 13:11  3% [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Maryam Tahhan
2015-07-15 13:11  9% ` [dpdk-dev] [PATCH v6 4/9] ethdev: remove HW specific stats in stats structs Maryam Tahhan
2015-07-16  7:54  0% ` [dpdk-dev] [PATCH v6 0/9] Expose IXGBE extended stats to DPDK apps Olivier MATZ
2015-07-16 11:36 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_cfgfile Cristian Dumitrescu
2015-07-16 11:50  4% ` Gajdzica, MaciejX T
2015-07-16 12:28  4% ` Mrzyglod, DanielX T
2015-07-16 12:49  4% ` Singh, Jasvinder
2015-07-16 12:19 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_port Cristian Dumitrescu
2015-07-16 12:25  4% ` Gajdzica, MaciejX T
2015-07-16 12:25  4% ` Thomas Monjalon
2015-07-16 15:09  4%   ` Dumitrescu, Cristian
2015-07-16 12:30  4% ` Mrzyglod, DanielX T
2015-07-16 12:49  4% ` Singh, Jasvinder
2015-07-16 15:27 14% [dpdk-dev] [PATCH v2] " Cristian Dumitrescu
2015-07-16 15:51  4% ` Singh, Jasvinder
2015-07-17  7:56  4% ` Gajdzica, MaciejX T
2015-07-17  8:08  4% ` Mrzyglod, DanielX T
2015-07-16 16:59 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_table Cristian Dumitrescu
2015-07-17  7:54  4% ` Gajdzica, MaciejX T
2015-07-17  8:09  4% ` Mrzyglod, DanielX T
2015-07-17 12:02  4% ` Singh, Jasvinder
2015-07-16 17:07 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_pipeline Cristian Dumitrescu
2015-07-17  7:54  4% ` Gajdzica, MaciejX T
2015-07-17  8:10  4% ` Mrzyglod, DanielX T
2015-07-17 12:03  4% ` Singh, Jasvinder
2015-07-16 21:21 14% [dpdk-dev] [PATCH] doc: announce ABI change for librte_sched Stephen Hemminger
2015-07-16 21:25  4% ` Dumitrescu, Cristian
2015-07-16 21:28  4% ` Neil Horman
2015-07-23 10:18  4% ` Dumitrescu, Cristian
2015-07-16 21:34     [dpdk-dev] [PATCH v5 0/4] rte_sched: cleanup and deprecation Stephen Hemminger
2015-07-16 21:34  3% ` [dpdk-dev] [PATCH v5 4/4] rte_sched: hide structure of port hierarchy Stephen Hemminger
2015-07-19 10:52  4% [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
2015-07-19 10:52 36% ` [dpdk-dev] [PATCH 1/4] doc: rename ABI chapter to deprecation Thomas Monjalon
2015-07-21 13:20  7%   ` Dumitrescu, Cristian
2015-07-21 14:03  7%     ` Thomas Monjalon
2015-07-19 21:32  0% ` [dpdk-dev] [PATCH 0/4] ethdev/eal API fixes Thomas Monjalon
2015-07-20 10:45  0% ` Neil Horman
2015-07-20  7:03 13% [dpdk-dev] [PATCH] doc: announce ABI change for rte_eth_fdir_filter Jingjing Wu
2015-07-28  8:22  4% ` Lu, Wenzhuo
2015-07-30  3:38  4% ` Liang, Cunming
2015-07-20 12:19     [dpdk-dev] [PATCHv3 0/5] ethdev: add new API to retrieve RX/TX queue information Konstantin Ananyev
2015-07-20 12:19  2% ` [dpdk-dev] [PATCHv3 1/5] " Konstantin Ananyev
2015-07-22 16:50  0%   ` Zhang, Helin
2015-07-22 17:00  0%     ` Ananyev, Konstantin
2015-07-22 18:28  2%   ` [dpdk-dev] [PATCHv4 " Konstantin Ananyev
2015-07-22 19:48  0%     ` Stephen Hemminger
2015-07-23 10:52  0%       ` Ananyev, Konstantin
2015-07-23 16:26         ` Thomas Monjalon
2015-07-24  9:15  3%       ` Ananyev, Konstantin
2015-07-24  9:24  0%         ` Thomas Monjalon
2015-07-24 10:50  3%           ` Ananyev, Konstantin
2015-07-24 12:40  3%             ` Thomas Monjalon
2015-07-20 14:12     [dpdk-dev] [PATCH] examples: new example: l2fwd-ethtool Liang-Min Larry Wang
2015-07-23 15:00  4% ` [dpdk-dev] [PATCH v2 0/2] Example: l2fwd-ethtool Liang-Min Larry Wang
2015-07-23 15:00  4%   ` [dpdk-dev] [PATCH v2 1/2] Remove ABI requierment for external library builds Liang-Min Larry Wang
2015-07-21 14:10  3% [dpdk-dev] [PATCH] hash: move field hash_func_init_val in rte_hash struct Pablo de Lara
2015-07-22  9:08  0% ` Thomas Monjalon
2015-07-22 18:28     [dpdk-dev] [PATCHv4 0/5] ethdev: add new API to retrieve RX/TX queue information Konstantin Ananyev
2015-07-23 10:59  4% [dpdk-dev] [PATCH v2] announce ABI change for librte_table Cristian Dumitrescu
2015-07-23 11:05  4% ` Singh, Jasvinder
2015-07-23 11:07  4% ` Mrzyglod, DanielX T
2015-07-23 11:34  4% ` Gajdzica, MaciejX T
2015-07-27 22:55  4% [dpdk-dev] [dpdk-announce] release candidate 2.1.0-rc2 Thomas Monjalon
2015-07-28 15:47     [dpdk-dev] [PATCH 0/4] some fixes for bnx2x Thomas Monjalon
2015-07-28 15:47  3% ` [dpdk-dev] [PATCH 1/4] bnx2x: fix build as shared library Thomas Monjalon
2015-07-29 15:07     [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd Martin Weiser
2015-07-29 18:12     ` Ananyev, Konstantin
2015-07-29 20:24       ` Zhang, Helin
2015-07-30  8:12         ` Olivier MATZ
2015-07-30  9:00           ` Ananyev, Konstantin
2015-07-30  9:10             ` Olivier MATZ
2015-07-30  9:43  3%           ` Ananyev, Konstantin
2015-07-30 11:22  0%             ` Olivier MATZ
2015-07-30 13:47  0%               ` Thomas Monjalon
2015-07-30  9:25  7% [dpdk-dev] abi change announce Xie, Huawei
2015-07-30 10:18  4% ` Thomas Monjalon
2015-07-30 10:33  8%   ` Neil Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).