ABI - search results

DPDK patches and discussions

* Re: [dpdk-dev] [RFC PATCH 0/6] mempool: add bucket mempool driver
  @ 2018-01-17 15:03  0%   ` Andrew Rybchenko
  0 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2018-01-17 15:03 UTC (permalink / raw)
  To: Olivier MATZ; +Cc: dev

Hi Olivier,

first of all many thanks for the review. See my replies/comments below.
Also I'll reply to the specific patch mails as well.

On 12/14/2017 04:36 PM, Olivier MATZ wrote:
> Hi Andrew,
>
> Please find some comments about this patchset below.
> I'll also send some comments as replies to the specific patch.
>
> On Fri, Nov 24, 2017 at 04:06:25PM +0000, Andrew Rybchenko wrote:
>> The patch series adds bucket mempool driver which allows to allocate
>> (both physically and virtually) contiguous blocks of objects and adds
>> mempool API to do it. It is still capable to provide separate objects,
>> but it is definitely more heavy-weight than ring/stack drivers.
>>
>> The target usecase is dequeue in blocks and enqueue separate objects
>> back (which are collected in buckets to be dequeued). So, the memory
>> pool with bucket driver is created by an application and provided to
>> networking PMD receive queue. The choice of bucket driver is done using
>> rte_eth_dev_pool_ops_supported(). A PMD that relies upon contiguous
>> block allocation should report the bucket driver as the only supported
>> and preferred one.
> So, you are planning to use this driver for a future/existing PMD?

Yes, we're going to use it in the sfc PMD in the case of dedicated FW
variant which utilizes the bucketing.

> Do you have numbers about the performance gain, in which conditions,
> etc... ? And are there conditions where there is a performance loss ?

Our idea here is to use it together with HW/FW which understands the bucketing.
It adds some load on the CPU to track buckets, but block/bucket dequeue allows
us to compensate for it. We'll try to prepare performance figures when we have
a solution close to final. Hopefully pretty soon.

>> The number of objects in the contiguous block is a function of bucket
>> memory size (.config option) and total element size.
> The size of the bucket memory is hardcoded to 32KB.
> Why this value ?

It is just an example. In fact we test mainly with 64K and 128K.

> Won't that be an issue if the user wants to use larger objects?

Ideally it should be start-time configurable, but it requires a way
to specify driver-specific parameters passed to mempool on allocation.
Right now we decided to keep the task for the future since there is
no clear understanding of what it should look like.
If you have ideas, please share; we would be thankful.

>> As I understand it breaks ABI so it requires 3 acks in accordance with
>> policy, deprecation notice and mempool shared library version bump.
>> If there is a way to avoid ABI breakage, please, let us know.
> If my understanding is correct, the ABI breakage is caused by the
> addition of the new block dequeue operation, right?

Yes and we'll have more ops to make population of objects customizable.

Thanks,
Andrew.
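
To illustrate the bucket sizing discussed above (objects per contiguous
block as a function of bucket memory size and total element size), here is
a minimal sketch. The helper name objs_per_bucket is hypothetical and not
part of the mempool API, and any per-bucket header the real driver may
reserve is ignored, so actual counts could be lower.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a DPDK function): number of whole objects
 * that fit into one bucket of bucket_mem_size bytes.
 * Assumption: no per-bucket header or alignment padding is accounted for.
 */
static size_t
objs_per_bucket(size_t bucket_mem_size, size_t total_elt_size)
{
	assert(total_elt_size > 0);
	return bucket_mem_size / total_elt_size;
}
```

With a 32KB bucket and 2KB elements this gives 16 objects per block, which
shows why larger objects call for the 64K or 128K buckets mentioned above.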

* Re: [dpdk-dev] [PATCH v3] ethdev: increase flow type limit from 32 to 64
  @ 2018-01-17 16:56  0%   ` Ferruh Yigit
  2018-01-18  9:24  0%     ` Rybalchenko, Kirill
  2018-01-18 12:25  0%     ` Ferruh Yigit
  0 siblings, 2 replies; 200+ results
From: Ferruh Yigit @ 2018-01-17 16:56 UTC (permalink / raw)
  To: Kirill Rybalchenko, dev; +Cc: andrey.chilikin, thomas

On 1/15/2018 5:33 PM, Kirill Rybalchenko wrote:
> Increase the internal limit for flow types from 32 to 64
> to support future flow type extensions.
> Change type of variables from uint32_t[] to uint64_t[]:
> rte_eth_fdir_info.flow_types_mask
> rte_eth_hash_global_conf.sym_hash_enable_mask
> rte_eth_hash_global_conf.valid_bit_mask
> 
> This modification affects the following components:
> net/i40e
> net/ixgbe
> app/testpmd
> 
> v2:
> implement versioning of rte_eth_dev_filter_ctrl function
> for ABI backward compatibility with version 17.11 and older
> 
> v3:
> fix code style warnings
> 
> Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>

Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>


I suggest keeping the deprecation notice and cleaning up the versioning in the
next release; does that make sense?
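
As a rough illustration of why the mask width limits the number of flow
types: one bit per flow type means a 32-bit word can describe 32 types and
a 64-bit word 64. The names below are hypothetical and only sketch the
idea; they are not the ethdev definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical limit mirroring the "32 to 64" change described above. */
#define SKETCH_FLOW_TYPE_MAX 64

static_assert(sizeof(uint64_t) * CHAR_BIT == SKETCH_FLOW_TYPE_MAX,
	      "one mask bit per flow type");

/* Bit for a given flow type, or 0 if the type is out of range. */
static inline uint64_t
sketch_flow_type_bit(unsigned int flow_type)
{
	return flow_type < SKETCH_FLOW_TYPE_MAX ?
		UINT64_C(1) << flow_type : 0;
}

/* Non-zero if flow_type is enabled in mask. */
static inline int
sketch_flow_type_is_set(uint64_t mask, unsigned int flow_type)
{
	return (mask & sketch_flow_type_bit(flow_type)) != 0;
}
```

A flow type numbered 32 or above sets a bit that is lost when the mask is
truncated to uint32_t, which is why the structure fields change type and
the function needs versioning for older binaries.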

* [dpdk-dev] [PATCH] doc: add deprecation notice for physmem layout function
@ 2018-01-17 17:17 10% Anatoly Burakov
  2018-01-18 10:32 13% ` [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes Anatoly Burakov
  0 siblings, 1 reply; 200+ results
From: Anatoly Burakov @ 2018-01-17 17:17 UTC (permalink / raw)
  To: dev; +Cc: Neil Horman, John McNamara, Marko Kovacevic

Due to coming changes outlined in memory hotplug RFC, that
function will no longer serve any meaningful purpose.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    Patch outlining future changes:
    http://dpdk.org/dev/patchwork/patch/32467/

 doc/guides/rel_notes/deprecation.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 13e8543..28f217d 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -8,6 +8,10 @@ API and ABI deprecation notices are to be posted here.
 Deprecation Notices
 -------------------
 
+* eal: due to internal data layout reorganization, function
+  ``rte_eal_get_physmem_layout`` will be deprecated in v18.05 and removed in
+  subsequent releases.
+
 * eal: several API and ABI changes are planned for ``rte_devargs`` in v18.02.
   The format of device command line parameters will change. The bus will need
   to be explicitly stated in the device declaration. The enum ``rte_devtype``
-- 
2.7.4

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for ring structure
  @ 2018-01-17 21:07  4%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-01-17 21:07 UTC (permalink / raw)
  To: Olivier MATZ
  Cc: dev, bruce.richardson, john.mcnamara, daniel.verkamp, konstantin.ananyev

08/12/2017 18:01, Thomas Monjalon:
> 08/12/2017 15:14, Olivier MATZ:
> > > +* ring: The alignment constraints on the ring structure will be relaxed
> > > +  to one cache line instead of two, and an empty cache line padding will
> > > +  be added between the producer and consumer structures. The size of the
> > > +  structure and the offset of the fields will remain the same on
> > > +  platforms with 64B cache line, but will change on other platforms.
> > 
> > It looks this patch was forgotten.
> > It has 3 acks but was not integrated in 17.11.
> > Or did I miss something?
> 
> It seems I missed something. Sorry about that.
> The release 18.02 should be ABI stable.
> While happy to experiment such stability on one release,
> it seems I forgot to notify you on this thread.
> Sorry again

Applied for change planned in 18.05.
Sorry again for the delay.
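
For readers unfamiliar with the layout being announced, the following is
a sketch only, assuming a 64B cache line: producer and consumer parts each
aligned to one cache line, with an empty cache line between them. The
struct and field names are hypothetical and do not reproduce the real
rte_ring definition.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64B cache line, as on the platforms named in the notice. */
#define SKETCH_CACHE_LINE 64

struct sketch_headtail {
	uint32_t head;
	uint32_t tail;
};

struct sketch_ring {
	/* producer state, aligned to one cache line */
	alignas(SKETCH_CACHE_LINE) struct sketch_headtail prod;
	/* empty cache line separating producer and consumer */
	alignas(SKETCH_CACHE_LINE) char pad[SKETCH_CACHE_LINE];
	/* consumer state, aligned to one cache line */
	alignas(SKETCH_CACHE_LINE) struct sketch_headtail cons;
};

static_assert(offsetof(struct sketch_ring, cons) -
	      offsetof(struct sketch_ring, prod) == 2 * SKETCH_CACHE_LINE,
	      "one empty cache line between prod and cons");
```

With a different cache line size the offsets in such a layout move, which
is the kind of platform-dependent change the notice warns about.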

* Re: [dpdk-dev] [PATCH 1/2] lib/cryptodev: add support to set session private data
  @ 2018-01-18  6:52  4%                               ` Gujjar, Abhinandan S
  2018-01-22  6:51  0%                                 ` Gujjar, Abhinandan S
  0 siblings, 1 reply; 200+ results
From: Gujjar, Abhinandan S @ 2018-01-18  6:52 UTC (permalink / raw)
  To: Akhil Goyal, De Lara Guarch, Pablo, Doherty, Declan, Jacob, Jerin
  Cc: dev, Vangati, Narender, Rao, Nikhil

Hi Akhil,

> -----Original Message-----
> From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> Sent: Wednesday, January 17, 2018 4:23 PM
> To: Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Doherty, Declan
> <declan.doherty@intel.com>; Jacob, Jerin
> <Jerin.JacobKollanukkaran@cavium.com>
> Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>; Rao,
> Nikhil <nikhil.rao@intel.com>
> Subject: Re: [PATCH 1/2] lib/cryptodev: add support to set session private data
> 
> Hi Abhinandan,
> On 1/17/2018 3:35 PM, Gujjar, Abhinandan S wrote:
> > Hi Akhil,
> >
> >> -----Original Message-----
> >> From: De Lara Guarch, Pablo
> >> Sent: Wednesday, January 17, 2018 3:16 PM
> >> To: Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Akhil Goyal
> >> <akhil.goyal@nxp.com>; Doherty, Declan <declan.doherty@intel.com>;
> >> Jacob, Jerin <Jerin.JacobKollanukkaran@cavium.com>
> >> Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>;
> >> Rao, Nikhil <nikhil.rao@intel.com>
> >> Subject: RE: [PATCH 1/2] lib/cryptodev: add support to set session
> >> private data
> >>
> >> Hi Abhinandan,
> >>
> >>> -----Original Message-----
> >>> From: Gujjar, Abhinandan S
> >>> Sent: Wednesday, January 17, 2018 6:35 AM
> >>> To: Akhil Goyal <akhil.goyal@nxp.com>; Doherty, Declan
> >>> <declan.doherty@intel.com>; De Lara Guarch, Pablo
> >>> <pablo.de.lara.guarch@intel.com>; Jacob, Jerin
> >>> <Jerin.JacobKollanukkaran@cavium.com>
> >>> Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>;
> >>> Rao, Nikhil <nikhil.rao@intel.com>
> >>> Subject: RE: [PATCH 1/2] lib/cryptodev: add support to set session
> >>> private data
> >>>
> >>> Hi Akhil,
> >>>
> >>
> >> ...
> >>
> >>> I guess, you are suggesting below changes:
> >>> diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> >>> b/lib/librte_cryptodev/rte_cryptodev.h
> >>> index 56958a6..057c39a 100644
> >>> --- a/lib/librte_cryptodev/rte_cryptodev.h
> >>> +++ b/lib/librte_cryptodev/rte_cryptodev.h
> >>> @@ -892,6 +892,8 @@ struct rte_cryptodev_data {
> >>>
> >>>   /** Cryptodev symmetric crypto session */
> >>>  struct rte_cryptodev_sym_session {
> >>> +       uint16_t private_data_offset;
> >>> +       /**< Private data offset */
> >>>          __extension__ void *sess_private_data[0];
> >>>          /**< Private session material */
> >>>  };
> >>> I am ok with this.
> >>>
> >>> Declan/Pablo,
> >>> Is this ok? Do you see any impact on performance or anything else
> >>> has to be considered?
> >>
> >> This is breaking ABI, and since there is a zero length array, this
> >> latter has to be at the end of the structure.
> >> Therefore, this is not a valid option unless ABI deprecation is
> >> announced and then it could be merged in the next release.
> > What is your opinion on this?
> > Should we consider retaining the enum rte_crypto_op_private_data_type?
> 
> As per our previous discussion we are anyway pushing crypto adapter to next
> release, then we do have time for the deprecation notice to be sent.
Not sure it is really worth breaking the ABI or having an enum.
> Or you can reserve the first byte of private data (internal to library) in the session
> to check whether the private data is valid or not.
Regarding reserving the first byte to validate the rest of the metadata:
this byte would also have to be accounted for in rte_cryptodev_sym_session_create(),
i.e.
memset(sess, 0, (sizeof(void *) * nb_drivers + private_data_flag));
and in
rte_cryptodev_get_header_session_size(void)
{
	/*
	 * Header contains pointers to the private data
	 * of all registered drivers
	 */
	return (sizeof(void *) * nb_drivers + private_data_flag);
}
Without the above changes, the flag content can't simply be trusted. Do you agree?

Pablo/Declan,
I hope these changes are OK. Does ABI breakage or anything else need to be considered again?
> 
> IMO, private data offset in session is a better approach instead of adding one
> more enum. Others can suggest.
@Others, please provide your inputs so that I can prepare the next patch.

-Abhinandan
> 
> -Akhil
> >>
> >> Pablo
> > Abhinandan
> >


* Re: [dpdk-dev] [PATCH v3] ethdev: increase flow type limit from 32 to 64
  2018-01-17 16:56  0%   ` Ferruh Yigit
@ 2018-01-18  9:24  0%     ` Rybalchenko, Kirill
  2018-01-18 12:25  0%     ` Ferruh Yigit
  1 sibling, 0 replies; 200+ results
From: Rybalchenko, Kirill @ 2018-01-18  9:24 UTC (permalink / raw)
  To: Yigit, Ferruh, dev; +Cc: Chilikin, Andrey, thomas



> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Wednesday 17 January 2018 16:57
> To: Rybalchenko, Kirill <kirill.rybalchenko@intel.com>; dev@dpdk.org
> Cc: Chilikin, Andrey <andrey.chilikin@intel.com>; thomas@monjalon.net
> Subject: Re: [dpdk-dev] [PATCH v3] ethdev: increase flow type limit from 32
> to 64
> 
> On 1/15/2018 5:33 PM, Kirill Rybalchenko wrote:
> > Increase the internal limit for flow types from 32 to 64 to support
> > future flow type extensions.
> > Change type of variables from uint32_t[] to uint64_t[]:
> > rte_eth_fdir_info.flow_types_mask
> > rte_eth_hash_global_conf.sym_hash_enable_mask
> > rte_eth_hash_global_conf.valid_bit_mask
> >
> > This modification affects the following components:
> > net/i40e
> > net/ixgbe
> > app/testpmd
> >
> > v2:
> > implement versioning of rte_eth_dev_filter_ctrl function for ABI
> > backward compatibility with version 17.11 and older
> >
> > v3:
> > fix code style warnings
> >
> > Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
> 
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 
> 
> I suggest keeping deprecation notice and clean versioning in next release,
> does it make sense?

Yes, I think it should be done this way, just to keep the source code tidy.

* [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes
  2018-01-17 17:17 10% [dpdk-dev] [PATCH] doc: add deprecation notice for physmem layout function Anatoly Burakov
@ 2018-01-18 10:32 13% ` Anatoly Burakov
  2018-01-23 10:36  0%   ` Mcnamara, John
                     ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Anatoly Burakov @ 2018-01-18 10:32 UTC (permalink / raw)
  To: dev; +Cc: Neil Horman, John McNamara, Marko Kovacevic

Due to coming changes outlined in memory hotplug RFC, there will
be several API/ABI changes.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    Patch outlining future changes:
    http://dpdk.org/dev/patchwork/patch/32467/
    
    v2: added rte_mem_config and rte_memzone changes to the announcement

 doc/guides/rel_notes/deprecation.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 13e8543..93cbeea 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -8,6 +8,15 @@ API and ABI deprecation notices are to be posted here.
 Deprecation Notices
 -------------------
 
+* eal: due to internal data layout reorganization, there will be changes to
+  several structures and functions as a result of coming changes to support
+  memory hotplug in v18.05.
+  ``rte_eal_get_physmem_layout`` will be deprecated and removed in subsequent
+  releases.
+  ``rte_mem_config`` contents will change due to switch to memseg lists.
+  ``rte_memzone`` member ``memseg_id`` will no longer serve any useful purpose
+  and will be removed.
+
 * eal: several API and ABI changes are planned for ``rte_devargs`` in v18.02.
   The format of device command line parameters will change. The bus will need
   to be explicitly stated in the device declaration. The enum ``rte_devtype``
-- 
2.7.4

* Re: [dpdk-dev] [PATCH v3] ethdev: increase flow type limit from 32 to 64
  2018-01-17 16:56  0%   ` Ferruh Yigit
  2018-01-18  9:24  0%     ` Rybalchenko, Kirill
@ 2018-01-18 12:25  0%     ` Ferruh Yigit
  1 sibling, 0 replies; 200+ results
From: Ferruh Yigit @ 2018-01-18 12:25 UTC (permalink / raw)
  To: Kirill Rybalchenko, dev; +Cc: andrey.chilikin, thomas

On 1/17/2018 4:56 PM, Ferruh Yigit wrote:
> On 1/15/2018 5:33 PM, Kirill Rybalchenko wrote:
>> Increase the internal limit for flow types from 32 to 64
>> to support future flow type extensions.
>> Change type of variables from uint32_t[] to uint64_t[]:
>> rte_eth_fdir_info.flow_types_mask
>> rte_eth_hash_global_conf.sym_hash_enable_mask
>> rte_eth_hash_global_conf.valid_bit_mask
>>
>> This modification affects the following components:
>> net/i40e
>> net/ixgbe
>> app/testpmd
>>
>> v2:
>> implement versioning of rte_eth_dev_filter_ctrl function
>> for ABI backward compatibility with version 17.11 and older
>>
>> v3:
>> fix code style warnings
>>
>> Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
> 
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

Applied to dpdk-next-net/master, thanks.

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  @ 2018-01-18 14:45  3%                                               ` Matan Azrad
  2018-01-18 14:51  0%                                                 ` Ananyev, Konstantin
  0 siblings, 1 reply; 200+ results
From: Matan Azrad @ 2018-01-18 14:45 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi,

From: Ananyev, Konstantin, Thursday, January 18, 2018 4:42 PM
> > Hi Konstantine
> >
> > > Hi Matan,
> > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Another thing - you'll
> > > > > > > > > > > > > > > > > > > > > > probably need to
> > > > > > > > grab/release
> > > > > > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > > > > > It is a public function used
> > > > > > > > > > > > > > > > > > > > > > by drivers, so need to be
> > > > > > > > > > > > > > > > > > > > > > protected
> > > > > > > > > > > > > > too.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Yes, I thought about it, but
> > > > > > > > > > > > > > > > > > > > > decided not to use lock in
> > > > > > > > > > next:
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > As I can see in patch #3 you
> > > > > > > > > > > > > > > > > > > > protect by lock access to
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name (which
> > > > > > > > > > > > > > > > > > > > seems like a good
> > > > > > > > > > thing).
> > > > > > > > > > > > > > > > > > > > So I think any other public
> > > > > > > > > > > > > > > > > > > > function that access
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name should be
> > > > > > > > > > > > > > > > > > > > protected by the
> > > > > > > > > > same
> > > > > > > > > > > > lock.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I don't think so, I can understand
> > > > > > > > > > > > > > > > > > > to use the ownership lock here(as in
> > > > > > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > > > > > Don't you think it is just
> > > > > > > > > > > > > > > > > > > timing?(ask in the next moment and
> > > > > > > > > > > > > > > > > > > you may get another
> > > > > > > > > > > > > > > > > > > answer) I don't see optional
> > > > > > > > crash.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > > > > > > As I understand
> > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name unique
> > > > > > > > identifies
> > > > > > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > > > > > allocation/release/find
> > > > > > > > functions.
> > > > > > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > > > > > "1. The port allocation and port
> > > > > > > > > > > > > > > > > > release synchronization will be managed by
> ethdev."
> > > > > > > > > > > > > > > > > > To me it means that ethdev layer has
> > > > > > > > > > > > > > > > > > to make sure that all accesses to
> > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name are
> > > > > > atomic.
> > > > > > > > > > > > > > > > > > Otherwise what would prevent the
> > > > > > > > > > > > > > > > > > situation when one
> > > > > > > > > > process
> > > > > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].
> > > > > > > > > > > > > > > > name,
> > > ...) ?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Under race condition - in the worst case
> > > > > > > > > > > > > > > > it might crash, though for that you'll
> > > > > > > > > > > > > > > > have to be really
> > > unlucky.
> > > > > > > > > > > > > > > > Though in most cases as you said it would
> > > > > > > > > > > > > > > > just not operate
> > > > > > > > > > correctly.
> > > > > > > > > > > > > > > > I think if we start to protect dev->name
> > > > > > > > > > > > > > > > by lock we need to do it for all instances
> > > > > > > > > > > > > > > > (both read and
> > > write).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Since under the ownership rules, the user
> > > > > > > > > > > > > > > must take ownership
> > > > > > > > of a
> > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > > > > > I am talking about dev->name.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > So? The user still should take ownership of a
> > > > > > > > > > > > > device before using it
> > > > > > > > (by
> > > > > > > > > > > > name or by port id).
> > > > > > > > > > > > > It can just read it without owning it, but no managing it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > Please, Can you describe specific crash
> > > > > > > > > > > > > > > scenario and explain how could the
> > > > > > > > > > > > > > locking fix it?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...),
> > > > > > > > > > > > > > >thread 1 doing
> > > > > > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()
> > > > > > > > > > > > > > -
> > > > > > >strcmp().
> > > > > > > > > > > > > > And because of race condition -
> > > > > > > > > > > > > > rte_eth_dev_allocated() will
> > > > > > > > return
> > > > > > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > > > > > Which wrong device do you mean? I guess it is
> > > > > > > > > > > > > the device which
> > > > > > > > > > currently is
> > > > > > > > > > > > being created by thread 0.
> > > > > > > > > > > > > > Then rte_pmd_ring_remove() will call
> > > > > > > > > > > > > > rte_free() for related resources, while It can
> > > > > > > > > > > > > > still be in use by someone
> > > > > else.
> > > > > > > > > > > > > The rte_pmd_ring_remove caller(some DPDK entity)
> > > > > > > > > > > > > must take
> > > > > > > > > > ownership
> > > > > > > > > > > > > (or validate that he is the owner) of a port
> > > > > > > > > > > > > before doing it(free,
> > > > > > > > > > release), so
> > > > > > > > > > > > no issue here.
> > > > > > > > > > > >
> > > > > > > > > > > > Forget about ownership for a second.
> > > > > > > > > > > > Suppose we have a process it created ring port for
> > > > > > > > > > > > itself (without
> > > > > > > > setting
> > > > > > > > > > any
> > > > > > > > > > > > ownership)  and used it for some time.
> > > > > > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > > > > > rte_pmd_ring_remove()
> > > > > > for it.
> > > > > > > > > > > > At the same time second process decides to call
> > > > > > > > rte_eth_dev_allocate()
> > > > > > > > > > (let
> > > > > > > > > > > > say for anither ring port).
> > > > > > > > > > > > They could collide trying to read (process 0) and
> > > > > > > > > > > > modify (process 1)
> > > > > > > > same
> > > > > > > > > > > > string rte_eth_dev_data[].name.
> > > > > > > > > > > >
> > > > > > > > > > > Do you mean that process 0 will compare successfully
> > > > > > > > > > > the process 1
> > > > > > > > new
> > > > > > > > > > port name?
> > > > > > > > > >
> > > > > > > > > > Yes.
> > > > > > > > > >
> > > > > > > > > > > The state are in local process memory - so process 0
> > > > > > > > > > > will not compare
> > > > > > > > the
> > > > > > > > > > process 1 port, from its point of view this port is in
> > > > > > > > > > UNUSED
> > > > > > > > > > > state.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Ok, and why it can't be in attached state in process 0 too?
> > > > > > > > >
> > > > > > > > > Someone in process 0 should attach it using protected
> > > > > > > > > attach_secondary
> > > > > > > > somewhere in your scenario.
> > > > > > > >
> > > > > > > > Yes, process 0 can have this port attached too, why not?
> > > > > > > See the function with inline comments:
> > > > > > >
> > > > > > > struct rte_eth_dev *
> > > > > > > rte_eth_dev_allocated(const char *name) {
> > > > > > > 	unsigned i;
> > > > > > >
> > > > > > > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > > > > >
> > > > > > > 		/*
> > > > > > > 		 * The state below is in local process memory.
> > > > > > > 		 * So, if process 1 allocates a new port here (the
> > > > > > > 		 * current i), updates its local state to ATTACHED and
> > > > > > > 		 * writes the name, the state is not visible to process 0
> > > > > > > 		 * until someone in process 0 attaches it with
> > > > > > > 		 * rte_eth_dev_attach_secondary.
> > > > > > > 		 * So, to use rte_eth_dev_attach_secondary, process 0 must
> > > > > > > 		 * take the lock, and it can't, because it is currently
> > > > > > > 		 * locked by process 1.
> > > > > > > 		 */
> > > > > > Ok I see.
> > > > > > Thanks for your patience.
> > > > > > BTW, that means that if, let's say, process 0 calls
> > > > > > rte_eth_dev_allocate("xxx") and process 1 calls
> > > > > > rte_eth_dev_allocate("yyy"), we can end up with the same port_id
> > > > > > being used for different devices, and the 2 processes will
> > > > > > overwrite the same rte_eth_dev_data[port_id]?
> > > > >
> > > > > No, contrary to the state, the lock itself is in shared memory,
> > > > > so 2 processes cannot allocate a port at the same time (you can
> > > > > see it in the next patch of this series).
> > >
> > > I am not talking about racing here.
> > > Let's say process 0 calls
> > > rte_pmd_ring_probe()->....->rte_eth_dev_allocate("xxx").
> > > rte_eth_dev_allocate() finds that port N is 'free', i.e.
> > > local rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED so it assigns
> > > new dev ("xxx") to port N.
> > > Then after some time process 1 calls
> > > rte_pmd_ring_probe()->....->rte_eth_dev_allocate("yyy").
> > > From its perspective port N is still free: rte_eth_devices[N].state
> > > == RTE_ETH_DEV_UNUSED, so it will assign new dev ("yyy") to the same
> > > port.
> > >
> >
> > Yes, you are right, this is a problem (not actually related to port
> > ownership)
> 
> Yep that's true - it was there before your patches.
> 
> > but look:
> > As I understand it, secondary processes are not allowed to create
> > ports and must use the attach_secondary API, but there is no
> > hardcoded check which prevents them from doing so.
> 
> Secondary processes have the ability to allocate their own vdevs and probably it
> should stay like that.
> I just thought it is a good opportunity to fix it while you are on these changes
> anyway, but ok we can leave it for now.
> 
Looks like the fix would break the ABI (moving the state to shared memory); let's try to fix it in the next version :)

> Konstantin
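
To make the discussion above concrete, here is a minimal sketch of the
direction mentioned: keep the slot state (here, the name itself) in the
same table as the lock, and take the lock for both allocation and lookup
by name. All names are hypothetical, a C11 atomic_flag spinlock stands in
for whatever lock ethdev uses, and the table is assumed to live in memory
shared by all processes. It is not the ethdev implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NAME_LEN 64

/* Assumed to live in shared memory; an empty name marks a free slot. */
struct sketch_port_table {
	atomic_flag lock;
	char name[SKETCH_MAX_PORTS][SKETCH_NAME_LEN];
};

static struct sketch_port_table sketch_ports = { .lock = ATOMIC_FLAG_INIT };

static void
sketch_lock(struct sketch_port_table *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock,
						 memory_order_acquire))
		; /* spin until the holder releases the lock */
}

static void
sketch_unlock(struct sketch_port_table *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Caller must hold the lock. Returns the port id or -1. */
static int
sketch_find_locked(const struct sketch_port_table *t, const char *name)
{
	for (int i = 0; i < SKETCH_MAX_PORTS; i++)
		if (t->name[i][0] != '\0' && strcmp(t->name[i], name) == 0)
			return i;
	return -1;
}

/* Lookup by name: readers take the same lock as writers. */
static int
sketch_port_find(struct sketch_port_table *t, const char *name)
{
	assert(name != NULL);
	sketch_lock(t);
	int id = sketch_find_locked(t, name);
	sketch_unlock(t);
	return id;
}

/* Allocate the first free slot; -1 if the name exists or the table is full. */
static int
sketch_port_allocate(struct sketch_port_table *t, const char *name)
{
	int id = -1;

	assert(name != NULL);
	if (name[0] == '\0')
		return -1;
	sketch_lock(t);
	if (sketch_find_locked(t, name) < 0) {
		for (int i = 0; i < SKETCH_MAX_PORTS; i++) {
			if (t->name[i][0] == '\0') {
				snprintf(t->name[i], SKETCH_NAME_LEN, "%s", name);
				id = i;
				break;
			}
		}
	}
	sketch_unlock(t);
	return id;
}
```

Because "free or used" is decided from the shared table rather than from
per-process state, two callers allocating "xxx" and "yyy" cannot both be
handed the same slot in this sketch.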

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 14:45  3%                                               ` Matan Azrad
@ 2018-01-18 14:51  0%                                                 ` Ananyev, Konstantin
  2018-01-18 15:00  0%                                                   ` Matan Azrad
  0 siblings, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2018-01-18 14:51 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce



> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Thursday, January 18, 2018 2:45 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [PATCH v2 2/6] ethdev: add port ownership
> 
> HI
> 
> From: Ananyev, Konstantin, Thursday, January 18, 2018 4:42 PM
> > > Hi Konstantine
> > >
> > > > Hi Matan,
> > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Another thing - you'll
> > > > > > > > > > > > > > > > > > > > > > > probably need to
> > > > > > > > > grab/release
> > > > > > > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > > > > > > It is a public function used
> > > > > > > > > > > > > > > > > > > > > > > by drivers, so need to be
> > > > > > > > > > > > > > > > > > > > > > > protected
> > > > > > > > > > > > > > > too.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Yes, I thought about it, but
> > > > > > > > > > > > > > > > > > > > > > decided not to use lock in
> > > > > > > > > > > next:
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > As I can see in patch #3 you
> > > > > > > > > > > > > > > > > > > > > protect by lock access to
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name (which
> > > > > > > > > > > > > > > > > > > > > seems like a good
> > > > > > > > > > > thing).
> > > > > > > > > > > > > > > > > > > > > So I think any other public
> > > > > > > > > > > > > > > > > > > > > function that access
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name should be
> > > > > > > > > > > > > > > > > > > > > protected by the
> > > > > > > > > > > same
> > > > > > > > > > > > > lock.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I don't think so, I can understand
> > > > > > > > > > > > > > > > > > > > to use the ownership lock here(as in
> > > > > > > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > > > > > > Don't you think it is just
> > > > > > > > > > > > > > > > > > > > timing?(ask in the next moment and
> > > > > > > > > > > > > > > > > > > > you may get another
> > > > > > > > > > > > > > > > > > > > answer) I don't see optional
> > > > > > > > > crash.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > > > > > > > As I understand
> > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name unique
> > > > > > > > > identifies
> > > > > > > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > > > > > > allocation/release/find
> > > > > > > > > functions.
> > > > > > > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > > > > > > "1. The port allocation and port
> > > > > > > > > > > > > > > > > > > release synchronization will be managed by
> > ethdev."
> > > > > > > > > > > > > > > > > > > To me it means that ethdev layer has
> > > > > > > > > > > > > > > > > > > to make sure that all accesses to
> > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name are
> > > > > > > atomic.
> > > > > > > > > > > > > > > > > > > Otherwise what would prevent the
> > > > > > > > > > > > > > > > > > > situation when one
> > > > > > > > > > > process
> > > > > > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > > > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].
> > > > > > > > > > > > > > > > > name,
> > > > ...) ?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Under race condition - in the worst case
> > > > > > > > > > > > > > > > > it might crash, though for that you'll
> > > > > > > > > > > > > > > > > have to be really
> > > > unlucky.
> > > > > > > > > > > > > > > > > Though in most cases as you said it would
> > > > > > > > > > > > > > > > > just not operate
> > > > > > > > > > > correctly.
> > > > > > > > > > > > > > > > > I think if we start to protect dev->name
> > > > > > > > > > > > > > > > > by lock we need to do it for all instances
> > > > > > > > > > > > > > > > > (both read and
> > > > write).
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Since under the ownership rules, the user
> > > > > > > > > > > > > > > > must take ownership
> > > > > > > > > of a
> > > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > > > > > > I am talking about dev->name.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > So? The user still should take ownership of a
> > > > > > > > > > > > > > device before using it
> > > > > > > > > (by
> > > > > > > > > > > > > name or by port id).
> > > > > > > > > > > > > > It can just read it without owning it, but no managing it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Please, Can you describe specific crash
> > > > > > > > > > > > > > > > scenario and explain how could the
> > > > > > > > > > > > > > > locking fix it?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...),
> > > > > > > > > > > > > > > >thread 1 doing
> > > > > > > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()
> > > > > > > > > > > > > > > -
> > > > > > > >strcmp().
> > > > > > > > > > > > > > > And because of race condition -
> > > > > > > > > > > > > > > rte_eth_dev_allocated() will
> > > > > > > > > return
> > > > > > > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > > > > > > Which wrong device do you mean? I guess it is
> > > > > > > > > > > > > > the device which
> > > > > > > > > > > currently is
> > > > > > > > > > > > > being created by thread 0.
> > > > > > > > > > > > > > > Then rte_pmd_ring_remove() will call
> > > > > > > > > > > > > > > rte_free() for related resources, while It can
> > > > > > > > > > > > > > > still be in use by someone
> > > > > > else.
> > > > > > > > > > > > > > The rte_pmd_ring_remove caller(some DPDK entity)
> > > > > > > > > > > > > > must take
> > > > > > > > > > > ownership
> > > > > > > > > > > > > > (or validate that he is the owner) of a port
> > > > > > > > > > > > > > before doing it(free,
> > > > > > > > > > > release), so
> > > > > > > > > > > > > no issue here.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Forget about ownership for a second.
> > > > > > > > > > > > > Suppose we have a process it created ring port for
> > > > > > > > > > > > > itself (without
> > > > > > > > > setting
> > > > > > > > > > > any
> > > > > > > > > > > > > ownership)  and used it for some time.
> > > > > > > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > > > > > > rte_pmd_ring_remove()
> > > > > > > for it.
> > > > > > > > > > > > > At the same time second process decides to call
> > > > > > > > > rte_eth_dev_allocate()
> > > > > > > > > > > (let
> > > > > > > > > > > > > > say for another ring port).
> > > > > > > > > > > > > They could collide trying to read (process 0) and
> > > > > > > > > > > > > modify (process 1)
> > > > > > > > > same
> > > > > > > > > > > > > string rte_eth_dev_data[].name.
> > > > > > > > > > > > >
> > > > > > > > > > > > Do you mean that process 0 will compare successfully
> > > > > > > > > > > > the process 1
> > > > > > > > > new
> > > > > > > > > > > port name?
> > > > > > > > > > >
> > > > > > > > > > > Yes.
> > > > > > > > > > >
> > > > > > > > > > > > The state are in local process memory - so process 0
> > > > > > > > > > > > will not compare
> > > > > > > > > the
> > > > > > > > > > > process 1 port, from its point of view this port is in
> > > > > > > > > > > UNUSED
> > > > > > > > > > > > state.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Ok, and why it can't be in attached state in process 0 too?
> > > > > > > > > >
> > > > > > > > > > Someone in process 0 should attach it using protected
> > > > > > > > > > attach_secondary
> > > > > > > > > somewhere in your scenario.
> > > > > > > > >
> > > > > > > > > Yes, process 0 can have this port attached too, why not?
> > > > > > > > See the function with inline comments:
> > > > > > > >
> > > > > > > > struct rte_eth_dev *
> > > > > > > > rte_eth_dev_allocated(const char *name) {
> > > > > > > > 	unsigned i;
> > > > > > > >
> > > > > > > > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > > > > > >
> > > > > > > > 	    	The below state are in local process memory,
> > > > > > > > 		So, if here process 1 will allocate a new port (the
> > > > > > > > current i),
> > > > > > > update its local state to ATTACHED and write the name,
> > > > > > > > 		the state is not visible by process 0 until someone in
> > > > > > > > process
> > > > > > > 0 will attach it by rte_eth_dev_attach_secondary.
> > > > > > > > 		So, to use rte_eth_dev_attach_secondary process 0
> > must
> > > > > > > take the lock
> > > > > > > > and it can't, because it is currently locked by process 1.
> > > > > > >
> > > > > > > Ok I see.
> > > > > > > Thanks for your patience.
> > > > > > > BTW, that means that if let say process 0 will call
> > > > > > > rte_eth_dev_allocate("xxx") and process 1 will call
> > > > > > > rte_eth_dev_allocate("yyy") we can endup with same port_id be
> > > > > > > used for different devices and 2 processes will overwrite the
> > > > > > > same
> > > > > > rte_eth_dev_data[port_id]?
> > > > > >
> > > > > > No, contrary to the state, the lock itself is in shared memory,
> > > > > > so 2 processes cannot allocate port in the same time.(you can
> > > > > > see it in the next patch of this series).
> > > >
> > > > I am not talking about racing here.
> > > > Let say process 0 calls rte_pmd_ring_probe()->....-
> > > > >rte_eth_dev_allocate("xxx")
> > > > rte_eth_dev_allocate() finds that port N is 'free', i.e.
> > > > local rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED so it assigns
> > > > new dev ("xxx") to port N.
> > > > Then after some time process 1 calls rte_pmd_ring_probe()->....-
> > > > >rte_eth_dev_allocate("yyy").
> > > > From its perspective port N is still free:  rte_eth_devices[N].state
> > > > == RTE_ETH_DEV_UNUSED, so it will assign new dev ("yyy") to the same
> > port.
> > > >
> > >
> > > Yes you right, this is a problem(not related actually to port
> > > ownership)
> >
> > Yep that's true - it was there before your patches.
> >
> > > but look:
> > > As I understand the secondary processes are not allowed to create a
> > > ports and they must to use attach_secondary API, but there is not
> > hardcoded check which prevent them to do it.
> >
> > Secondary processes have the ability to allocate their own vdevs and probably it
> > should stay like that.
> > I just thought it is a good opportunity to fix it while you are on these changes
> > anyway, but ok we can leave it for now.
> >
> Looks like the fix should break ABI(moving the state to the shared memory), let's try to fix it in the next version :)

Not necessarily - I think we can just add a check inside rte_eth_dev_find_free_port() that
rte_eth_dev_data[port_id].name is an empty string.
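
A rough sketch of that check, for illustration only. The structures below are
simplified stand-ins, not the real ethdev definitions: sketch_local[] plays the
role of the per-process rte_eth_devices[].state, sketch_shared[] plays the role
of the shared rte_eth_dev_data[].name, and every sketch_* / SKETCH_* name is
hypothetical. It assumes the caller already holds the shared allocation lock.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative sizes, not the DPDK build-time values. */
#define SKETCH_MAX_PORTS 4
#define SKETCH_NAME_LEN  64

enum sketch_state { SKETCH_UNUSED = 0, SKETCH_ATTACHED };

/* Per-process view of a port (stand-in for rte_eth_devices[]). */
struct sketch_local_dev {
	enum sketch_state state;
};

/* Data visible to all processes (stand-in for rte_eth_dev_data[]). */
struct sketch_shared_data {
	char name[SKETCH_NAME_LEN];
};

static struct sketch_local_dev sketch_local[SKETCH_MAX_PORTS];
static struct sketch_shared_data sketch_shared[SKETCH_MAX_PORTS];

/*
 * Return the first port that is unused locally AND has an empty name in
 * the shared data, or SKETCH_MAX_PORTS if there is none. The name check
 * is what stops a process from reusing a port that another process has
 * already allocated but that still looks UNUSED in the local state.
 */
static uint16_t
sketch_find_free_port(void)
{
	uint16_t i;

	for (i = 0; i < SKETCH_MAX_PORTS; i++) {
		if (sketch_local[i].state == SKETCH_UNUSED &&
		    sketch_shared[i].name[0] == '\0')
			return i;
	}
	return SKETCH_MAX_PORTS;
}
```

With this, the "xxx"/"yyy" case above would pick different ports: once process 0
has written "xxx" into the shared name of port N, process 1 skips port N even
though its local state for N is still UNUSED. This only holds if the name is
cleared again when the port is released.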
Konstantin


> 
> > Konstantin

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 14:51  0%                                                 ` Ananyev, Konstantin
@ 2018-01-18 15:00  0%                                                   ` Matan Azrad
  0 siblings, 0 replies; 200+ results
From: Matan Azrad @ 2018-01-18 15:00 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce



From: Ananyev, Konstantin, Thursday, January 18, 2018 4:52 PM
> 
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > Sent: Thursday, January 18, 2018 2:45 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>;
> > Wu, Jingjing <jingjing.wu@intel.com>
> > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>
> > Subject: RE: [PATCH v2 2/6] ethdev: add port ownership
> >
> > HI
> >
> > From: Ananyev, Konstantin, Thursday, January 18, 2018 4:42 PM
> > > > Hi Konstantine
> > > >
> > > > > Hi Matan,
> > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Another thing - you'll
> > > > > > > > > > > > > > > > > > > > > > > > probably need to
> > > > > > > > > > grab/release
> > > > > > > > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > > > > > > > It is a public function
> > > > > > > > > > > > > > > > > > > > > > > > used by drivers, so need
> > > > > > > > > > > > > > > > > > > > > > > > to be protected
> > > > > > > > > > > > > > > > too.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Yes, I thought about it, but
> > > > > > > > > > > > > > > > > > > > > > > decided not to use lock in
> > > > > > > > > > > > next:
> > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > As I can see in patch #3 you
> > > > > > > > > > > > > > > > > > > > > > protect by lock access to
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name (which
> > > > > > > > > > > > > > > > > > > > > > seems like a good
> > > > > > > > > > > > thing).
> > > > > > > > > > > > > > > > > > > > > > So I think any other public
> > > > > > > > > > > > > > > > > > > > > > function that access
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name should
> > > > > > > > > > > > > > > > > > > > > > be protected by the
> > > > > > > > > > > > same
> > > > > > > > > > > > > > lock.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I don't think so, I can
> > > > > > > > > > > > > > > > > > > > > understand to use the ownership
> > > > > > > > > > > > > > > > > > > > > lock here(as in port
> > > > > > > > > > > > > > > > > > > > creation) but I don't think it is necessary
> too.
> > > > > > > > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > > > > > > > Don't you think it is just
> > > > > > > > > > > > > > > > > > > > > timing?(ask in the next moment
> > > > > > > > > > > > > > > > > > > > > and you may get another
> > > > > > > > > > > > > > > > > > > > > answer) I don't see optional
> > > > > > > > > > crash.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > > > > > > > > As I understand
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name unique
> > > > > > > > > > identifies
> > > > > > > > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > > > > > > > allocation/release/find
> > > > > > > > > > functions.
> > > > > > > > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > > > > > > > "1. The port allocation and port
> > > > > > > > > > > > > > > > > > > > release synchronization will be
> > > > > > > > > > > > > > > > > > > > managed by
> > > ethdev."
> > > > > > > > > > > > > > > > > > > > To me it means that ethdev layer
> > > > > > > > > > > > > > > > > > > > has to make sure that all accesses
> > > > > > > > > > > > > > > > > > > > to rte_eth_dev_data[].name are
> > > > > > > > atomic.
> > > > > > > > > > > > > > > > > > > > Otherwise what would prevent the
> > > > > > > > > > > > > > > > > > > > situation when one
> > > > > > > > > > > > process
> > > > > > > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > > > > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].
> > > > > > > > > > > > > > > > > > name,
> > > > > ...) ?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > The second will get True or False and that is
> it.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Under race condition - in the worst
> > > > > > > > > > > > > > > > > > case it might crash, though for that
> > > > > > > > > > > > > > > > > > you'll have to be really
> > > > > unlucky.
> > > > > > > > > > > > > > > > > > Though in most cases as you said it
> > > > > > > > > > > > > > > > > > would just not operate
> > > > > > > > > > > > correctly.
> > > > > > > > > > > > > > > > > > I think if we start to protect
> > > > > > > > > > > > > > > > > > dev->name by lock we need to do it for
> > > > > > > > > > > > > > > > > > all instances (both read and
> > > > > write).
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Since under the ownership rules, the
> > > > > > > > > > > > > > > > > user must take ownership
> > > > > > > > > > of a
> > > > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > > > > > > > I am talking about dev->name.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > So? The user still should take ownership of
> > > > > > > > > > > > > > > a device before using it
> > > > > > > > > > (by
> > > > > > > > > > > > > > name or by port id).
> > > > > > > > > > > > > > > It can just read it without owning it, but no
> managing it.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Please, Can you describe specific crash
> > > > > > > > > > > > > > > > > scenario and explain how could the
> > > > > > > > > > > > > > > > locking fix it?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Let say thread 0 doing
> > > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > > > > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...),
> > > > > > > > > > > > > > > > >thread 1 doing
> > > > > > > > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocat
> > > > > > > > > > > > > > > > ed()
> > > > > > > > > > > > > > > > -
> > > > > > > > >strcmp().
> > > > > > > > > > > > > > > > And because of race condition -
> > > > > > > > > > > > > > > > rte_eth_dev_allocated() will
> > > > > > > > > > return
> > > > > > > > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > > > > > > > Which wrong device do you mean? I guess it
> > > > > > > > > > > > > > > is the device which
> > > > > > > > > > > > currently is
> > > > > > > > > > > > > > being created by thread 0.
> > > > > > > > > > > > > > > > Then rte_pmd_ring_remove() will call
> > > > > > > > > > > > > > > > rte_free() for related resources, while It
> > > > > > > > > > > > > > > > can still be in use by someone
> > > > > > > else.
> > > > > > > > > > > > > > > The rte_pmd_ring_remove caller(some DPDK
> > > > > > > > > > > > > > > entity) must take
> > > > > > > > > > > > ownership
> > > > > > > > > > > > > > > (or validate that he is the owner) of a port
> > > > > > > > > > > > > > > before doing it(free,
> > > > > > > > > > > > release), so
> > > > > > > > > > > > > > no issue here.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Forget about ownership for a second.
> > > > > > > > > > > > > > Suppose we have a process it created ring port
> > > > > > > > > > > > > > for itself (without
> > > > > > > > > > setting
> > > > > > > > > > > > any
> > > > > > > > > > > > > > ownership)  and used it for some time.
> > > > > > > > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > > > > > > > rte_pmd_ring_remove()
> > > > > > > > for it.
> > > > > > > > > > > > > > At the same time second process decides to
> > > > > > > > > > > > > > call
> > > > > > > > > > rte_eth_dev_allocate()
> > > > > > > > > > > > (let
> > > > > > > > > > > > > > > say for another ring port).
> > > > > > > > > > > > > > They could collide trying to read (process 0)
> > > > > > > > > > > > > > and modify (process 1)
> > > > > > > > > > same
> > > > > > > > > > > > > > string rte_eth_dev_data[].name.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > Do you mean that process 0 will compare
> > > > > > > > > > > > > successfully the process 1
> > > > > > > > > > new
> > > > > > > > > > > > port name?
> > > > > > > > > > > >
> > > > > > > > > > > > Yes.
> > > > > > > > > > > >
> > > > > > > > > > > > > The state are in local process memory - so
> > > > > > > > > > > > > process 0 will not compare
> > > > > > > > > > the
> > > > > > > > > > > > process 1 port, from its point of view this port
> > > > > > > > > > > > is in UNUSED
> > > > > > > > > > > > > state.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Ok, and why it can't be in attached state in process 0 too?
> > > > > > > > > > >
> > > > > > > > > > > Someone in process 0 should attach it using
> > > > > > > > > > > protected attach_secondary
> > > > > > > > > > somewhere in your scenario.
> > > > > > > > > >
> > > > > > > > > > Yes, process 0 can have this port attached too, why not?
> > > > > > > > > See the function with inline comments:
> > > > > > > > >
> > > > > > > > > struct rte_eth_dev *
> > > > > > > > > rte_eth_dev_allocated(const char *name) {
> > > > > > > > > 	unsigned i;
> > > > > > > > >
> > > > > > > > > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > > > > > > >
> > > > > > > > > 	    	The below state are in local process memory,
> > > > > > > > > 		So, if here process 1 will allocate a new port (the
> > > > > > > > > current i),
> > > > > > > > update its local state to ATTACHED and write the name,
> > > > > > > > > 		the state is not visible by process 0 until someone in
> > > > > > > > > process
> > > > > > > > 0 will attach it by rte_eth_dev_attach_secondary.
> > > > > > > > > 		So, to use rte_eth_dev_attach_secondary process 0
> > > must
> > > > > > > > take the lock
> > > > > > > > > and it can't, because it is currently locked by process 1.
> > > > > > > >
> > > > > > > > Ok I see.
> > > > > > > > Thanks for your patience.
> > > > > > > > BTW, that means that if let say process 0 will call
> > > > > > > > rte_eth_dev_allocate("xxx") and process 1 will call
> > > > > > > > rte_eth_dev_allocate("yyy") we can endup with same port_id
> > > > > > > > be used for different devices and 2 processes will
> > > > > > > > overwrite the same
> > > > > > > rte_eth_dev_data[port_id]?
> > > > > > >
> > > > > > > No, contrary to the state, the lock itself is in shared
> > > > > > > memory, so 2 processes cannot allocate port in the same
> > > > > > > time.(you can see it in the next patch of this series).
> > > > >
> > > > > I am not talking about racing here.
> > > > > Let say process 0 calls rte_pmd_ring_probe()->....-
> > > > > >rte_eth_dev_allocate("xxx")
> > > > > rte_eth_dev_allocate() finds that port N is 'free', i.e.
> > > > > local rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED so it
> > > > > assigns new dev ("xxx") to port N.
> > > > > Then after some time process 1 calls rte_pmd_ring_probe()->....-
> > > > > >rte_eth_dev_allocate("yyy").
> > > > > From its perspective port N is still free:
> > > > > rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED, so it will
> > > > > assign new dev ("yyy") to the same
> > > port.
> > > > >
> > > >
> > > > Yes you right, this is a problem(not related actually to port
> > > > ownership)
> > >
> > > Yep that's true - it was there before your patches.
> > >
> > > > but look:
> > > > As I understand the secondary processes are not allowed to create
> > > > a ports and they must to use attach_secondary API, but there is
> > > > not
> > > hardcoded check which prevent them to do it.
> > >
> > > Secondary processes have the ability to allocate their own vdevs and
> > > probably it should stay like that.
> > > I just thought it is a good opportunity to fix it while you are on
> > > these changes anyway, but ok we can leave it for now.
> > >
> > Looks like the fix should break ABI(moving the state to the shared
> > memory), let's try to fix it in the next version :)
> 
> Not necessarily - I think we can just add a check inside
> rte_eth_dev_find_free_port() that rte_eth_dev_data[port_id].name is an
> empty string.

Good idea, I will add it (actually the first patch in this series allows it).

Thanks.


> Konstantin
> 
> 
> >
> > > Konstantin

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [RFC 15/24] vhost: add virtio pci framework
  @ 2018-01-19 13:44  2% ` Stefan Hajnoczi
  0 siblings, 0 replies; 200+ results
From: Stefan Hajnoczi @ 2018-01-19 13:44 UTC (permalink / raw)
  To: dev
  Cc: maxime.coquelin, Yuanhan Liu, wei.w.wang, mst, zhiyong.yang,
	jasowang, Stefan Hajnoczi

The virtio-vhost-user transport will involve a virtio pci device driver.
There is currently no librte_virtio API that we can reuse.

This commit is a hack that duplicates the virtio pci code from
drivers/net/.  A better solution would be to extract the code cleanly
from drivers/net/ and share it.  Or perhaps we could backport SPDK's
lib/virtio.  I don't have time to do either right now so I've just
copied the code, removed virtio-net and ethdev parts, and renamed
symbols to avoid link errors.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 drivers/librte_vhost/Makefile     |   4 +-
 drivers/librte_vhost/virtio_pci.h | 267 ++++++++++++++++++++
 drivers/librte_vhost/virtqueue.h  | 181 ++++++++++++++
 drivers/librte_vhost/virtio_pci.c | 504 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 955 insertions(+), 1 deletion(-)
 create mode 100644 drivers/librte_vhost/virtio_pci.h
 create mode 100644 drivers/librte_vhost/virtqueue.h
 create mode 100644 drivers/librte_vhost/virtio_pci.c

diff --git a/drivers/librte_vhost/Makefile b/drivers/librte_vhost/Makefile
index ccbbce3af..8a56c32af 100644
--- a/drivers/librte_vhost/Makefile
+++ b/drivers/librte_vhost/Makefile
@@ -21,7 +21,9 @@ LDLIBS += -lrte_eal -lrte_mempool -lrte_mbuf -lrte_ethdev -lrte_net
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c iotlb.c socket.c vhost.c \
-					vhost_user.c virtio_net.c trans_af_unix.c
+					vhost_user.c virtio_net.c \
+					trans_af_unix.c \
+					virtio_pci.c
 
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h
diff --git a/drivers/librte_vhost/virtio_pci.h b/drivers/librte_vhost/virtio_pci.h
new file mode 100644
index 000000000..7afc24853
--- /dev/null
+++ b/drivers/librte_vhost/virtio_pci.h
@@ -0,0 +1,267 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+/* XXX This file is based on drivers/net/virtio/virtio_pci.h.  It would be
+ * better to create a shared rte_virtio library instead of duplicating this
+ * code.
+ */
+
+#ifndef _VIRTIO_PCI_H_
+#define _VIRTIO_PCI_H_
+
+#include <stdint.h>
+
+#include <rte_log.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+#include <rte_spinlock.h>
+
+/* Macros for printing using RTE_LOG */
+#define RTE_LOGTYPE_VIRTIO_PCI_CONFIG RTE_LOGTYPE_USER2
+
+struct virtqueue;
+
+/* VirtIO PCI vendor/device ID. */
+#define VIRTIO_PCI_VENDORID     0x1AF4
+#define VIRTIO_PCI_LEGACY_DEVICEID_VHOST_USER 0x1017
+#define VIRTIO_PCI_MODERN_DEVICEID_VHOST_USER 0x1058
+
+/* VirtIO ABI version, this must match exactly. */
+#define VIRTIO_PCI_ABI_VERSION 0
+
+/*
+ * VirtIO Header, located in BAR 0.
+ */
+#define VIRTIO_PCI_HOST_FEATURES  0  /* host's supported features (32bit, RO)*/
+#define VIRTIO_PCI_GUEST_FEATURES 4  /* guest's supported features (32, RW) */
+#define VIRTIO_PCI_QUEUE_PFN      8  /* physical address of VQ (32, RW) */
+#define VIRTIO_PCI_QUEUE_NUM      12 /* number of ring entries (16, RO) */
+#define VIRTIO_PCI_QUEUE_SEL      14 /* current VQ selection (16, RW) */
+#define VIRTIO_PCI_QUEUE_NOTIFY   16 /* notify host regarding VQ (16, RW) */
+#define VIRTIO_PCI_STATUS         18 /* device status register (8, RW) */
+#define VIRTIO_PCI_ISR		  19 /* interrupt status register, reading
+				      * also clears the register (8, RO) */
+/* Only if MSIX is enabled: */
+#define VIRTIO_MSI_CONFIG_VECTOR  20 /* configuration change vector (16, RW) */
+#define VIRTIO_MSI_QUEUE_VECTOR	  22 /* vector for selected VQ notifications
+				      (16, RW) */
+
+/* The bit of the ISR which indicates a device has an interrupt. */
+#define VIRTIO_PCI_ISR_INTR   0x1
+/* The bit of the ISR which indicates a device configuration change. */
+#define VIRTIO_PCI_ISR_CONFIG 0x2
+/* Vector value used to disable MSI for queue. */
+#define VIRTIO_MSI_NO_VECTOR 0xFFFF
+
+/* VirtIO device IDs. */
+#define VIRTIO_ID_VHOST_USER  0x18
+
+/* Status byte for guest to report progress. */
+#define VIRTIO_CONFIG_STATUS_RESET     0x00
+#define VIRTIO_CONFIG_STATUS_ACK       0x01
+#define VIRTIO_CONFIG_STATUS_DRIVER    0x02
+#define VIRTIO_CONFIG_STATUS_DRIVER_OK 0x04
+#define VIRTIO_CONFIG_STATUS_FEATURES_OK 0x08
+#define VIRTIO_CONFIG_STATUS_FAILED    0x80
+
+/*
+ * Each virtqueue indirect descriptor list must be physically contiguous.
+ * To allow us to malloc(9) each list individually, limit the number
+ * supported to what will fit in one page. With 4KB pages, this is a limit
+ * of 256 descriptors. If there is ever a need for more, we can switch to
+ * contigmalloc(9) for the larger allocations, similar to what
+ * bus_dmamem_alloc(9) does.
+ *
+ * Note the sizeof(struct vring_desc) is 16 bytes.
+ */
+#define VIRTIO_MAX_INDIRECT ((int) (PAGE_SIZE / 16))
+
+/* Do we get callbacks when the ring is completely used, even if we've
+ * suppressed them? */
+#define VIRTIO_F_NOTIFY_ON_EMPTY	24
+
+/* Can the device handle any descriptor layout? */
+#define VIRTIO_F_ANY_LAYOUT		27
+
+/* We support indirect buffer descriptors */
+#define VIRTIO_RING_F_INDIRECT_DESC	28
+
+#define VIRTIO_F_VERSION_1		32
+#define VIRTIO_F_IOMMU_PLATFORM	33
+
+/*
+ * Some VirtIO feature bits (currently bits 28 through 31) are
+ * reserved for the transport being used (eg. virtio_ring), the
+ * rest are per-device feature bits.
+ */
+#define VIRTIO_TRANSPORT_F_START 28
+#define VIRTIO_TRANSPORT_F_END   34
+
+/* The Guest publishes the used index for which it expects an interrupt
+ * at the end of the avail ring. Host should ignore the avail->flags field. */
+/* The Host publishes the avail index for which it expects a kick
+ * at the end of the used ring. Guest should ignore the used->flags field. */
+#define VIRTIO_RING_F_EVENT_IDX		29
+
+/* Common configuration */
+#define VIRTIO_PCI_CAP_COMMON_CFG	1
+/* Notifications */
+#define VIRTIO_PCI_CAP_NOTIFY_CFG	2
+/* ISR Status */
+#define VIRTIO_PCI_CAP_ISR_CFG		3
+/* Device specific configuration */
+#define VIRTIO_PCI_CAP_DEVICE_CFG	4
+/* PCI configuration access */
+#define VIRTIO_PCI_CAP_PCI_CFG		5
+
+/* This is the PCI capability header: */
+struct virtio_pci_cap {
+	uint8_t cap_vndr;		/* Generic PCI field: PCI_CAP_ID_VNDR */
+	uint8_t cap_next;		/* Generic PCI field: next ptr. */
+	uint8_t cap_len;		/* Generic PCI field: capability length */
+	uint8_t cfg_type;		/* Identifies the structure. */
+	uint8_t bar;			/* Where to find it. */
+	uint8_t padding[3];		/* Pad to full dword. */
+	uint32_t offset;		/* Offset within bar. */
+	uint32_t length;		/* Length of the structure, in bytes. */
+};
+
+struct virtio_pci_notify_cap {
+	struct virtio_pci_cap cap;
+	uint32_t notify_off_multiplier;	/* Multiplier for queue_notify_off. */
+};
+
+/* Fields in VIRTIO_PCI_CAP_COMMON_CFG: */
+struct virtio_pci_common_cfg {
+	/* About the whole device. */
+	uint32_t device_feature_select;	/* read-write */
+	uint32_t device_feature;	/* read-only */
+	uint32_t guest_feature_select;	/* read-write */
+	uint32_t guest_feature;		/* read-write */
+	uint16_t msix_config;		/* read-write */
+	uint16_t num_queues;		/* read-only */
+	uint8_t device_status;		/* read-write */
+	uint8_t config_generation;	/* read-only */
+
+	/* About a specific virtqueue. */
+	uint16_t queue_select;		/* read-write */
+	uint16_t queue_size;		/* read-write, power of 2. */
+	uint16_t queue_msix_vector;	/* read-write */
+	uint16_t queue_enable;		/* read-write */
+	uint16_t queue_notify_off;	/* read-only */
+	uint32_t queue_desc_lo;		/* read-write */
+	uint32_t queue_desc_hi;		/* read-write */
+	uint32_t queue_avail_lo;	/* read-write */
+	uint32_t queue_avail_hi;	/* read-write */
+	uint32_t queue_used_lo;		/* read-write */
+	uint32_t queue_used_hi;		/* read-write */
+};
+
+struct virtio_hw;
+
+struct virtio_pci_ops {
+	void (*read_dev_cfg)(struct virtio_hw *hw, size_t offset,
+			     void *dst, int len);
+	void (*write_dev_cfg)(struct virtio_hw *hw, size_t offset,
+			      const void *src, int len);
+	void (*reset)(struct virtio_hw *hw);
+
+	uint8_t (*get_status)(struct virtio_hw *hw);
+	void    (*set_status)(struct virtio_hw *hw, uint8_t status);
+
+	uint64_t (*get_features)(struct virtio_hw *hw);
+	void     (*set_features)(struct virtio_hw *hw, uint64_t features);
+
+	uint8_t (*get_isr)(struct virtio_hw *hw);
+
+	uint16_t (*set_config_irq)(struct virtio_hw *hw, uint16_t vec);
+
+	uint16_t (*set_queue_irq)(struct virtio_hw *hw, struct virtqueue *vq,
+			uint16_t vec);
+
+	uint16_t (*get_queue_num)(struct virtio_hw *hw, uint16_t queue_id);
+	int (*setup_queue)(struct virtio_hw *hw, struct virtqueue *vq);
+	void (*del_queue)(struct virtio_hw *hw, struct virtqueue *vq);
+	void (*notify_queue)(struct virtio_hw *hw, struct virtqueue *vq);
+};
+
+struct virtio_hw {
+	uint64_t    guest_features;
+	uint32_t    max_queue_pairs;
+	uint16_t    started;
+	uint8_t	    use_msix;
+	uint16_t    internal_id;
+	uint32_t    notify_off_multiplier;
+	uint8_t     *isr;
+	uint16_t    *notify_base;
+	struct virtio_pci_common_cfg *common_cfg;
+	void	    *dev_cfg;
+	/*
+	 * App management thread and virtio interrupt handler thread
+	 * both can change device state, this lock is meant to avoid
+	 * such a contention.
+	 */
+	rte_spinlock_t state_lock;
+
+	struct virtqueue **vqs;
+};
+
+/*
+ * While virtio_hw is stored in shared memory, this structure stores
+ * information that may differ per process in the multi-process model,
+ * for example the vtpci_ops pointer.
+ */
+struct virtio_hw_internal {
+	const struct virtio_pci_ops *vtpci_ops;
+};
+
+#define VTPCI_OPS(hw)	(virtio_pci_hw_internal[(hw)->internal_id].vtpci_ops)
+
+extern struct virtio_hw_internal virtio_pci_hw_internal[8];
+
+/*
+ * How many bits to shift physical queue address written to QUEUE_PFN.
+ * 12 is historical, and due to x86 page size.
+ */
+#define VIRTIO_PCI_QUEUE_ADDR_SHIFT 12
+
+/* The alignment to use between consumer and producer parts of vring. */
+#define VIRTIO_PCI_VRING_ALIGN 4096
+
+enum virtio_msix_status {
+	VIRTIO_MSIX_NONE = 0,
+	VIRTIO_MSIX_DISABLED = 1,
+	VIRTIO_MSIX_ENABLED = 2
+};
+
+static inline int
+virtio_pci_with_feature(struct virtio_hw *hw, uint64_t bit)
+{
+	return (hw->guest_features & (1ULL << bit)) != 0;
+}
+
+/*
+ * Function declarations from virtio_pci.c
+ */
+int virtio_pci_init(struct rte_pci_device *dev, struct virtio_hw *hw);
+void virtio_pci_reset(struct virtio_hw *);
+
+void virtio_pci_reinit_complete(struct virtio_hw *);
+
+uint8_t virtio_pci_get_status(struct virtio_hw *);
+void virtio_pci_set_status(struct virtio_hw *, uint8_t);
+
+uint64_t virtio_pci_negotiate_features(struct virtio_hw *, uint64_t);
+
+void virtio_pci_write_dev_config(struct virtio_hw *, size_t, const void *, int);
+
+void virtio_pci_read_dev_config(struct virtio_hw *, size_t, void *, int);
+
+uint8_t virtio_pci_isr(struct virtio_hw *);
+
+enum virtio_msix_status virtio_pci_msix_detect(struct rte_pci_device *dev);
+
+extern const struct virtio_pci_ops virtio_pci_modern_ops;
+
+#endif /* _VIRTIO_PCI_H_ */
diff --git a/drivers/librte_vhost/virtqueue.h b/drivers/librte_vhost/virtqueue.h
new file mode 100644
index 000000000..e2ac78eef
--- /dev/null
+++ b/drivers/librte_vhost/virtqueue.h
@@ -0,0 +1,181 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+/* XXX This file is based on drivers/net/virtio/virtqueue.h.  It would be
+ * better to create a shared rte_virtio library instead of duplicating this
+ * code.
+ */
+
+#ifndef _VIRTQUEUE_H_
+#define _VIRTQUEUE_H_
+
+#include <stdint.h>
+#include <linux/virtio_ring.h>
+
+#include <rte_atomic.h>
+#include <rte_memory.h>
+#include <rte_mempool.h>
+
+#include "virtio_pci.h"
+
+/*
+ * Per virtio_config.h in Linux.
+ *     For virtio_pci on SMP, we don't need to order with respect to MMIO
+ *     accesses through relaxed memory I/O windows, so smp_mb() et al are
+ *     sufficient.
+ *
+ */
+#define virtio_mb()	rte_smp_mb()
+#define virtio_rmb()	rte_smp_rmb()
+#define virtio_wmb()	rte_smp_wmb()
+
+#define VIRTQUEUE_MAX_NAME_SZ 32
+
+/**
+ * The maximum virtqueue size is 2^15. Use that value as the end of
+ * descriptor chain terminator since it will never be a valid index
+ * in the descriptor table. This is used to verify we are correctly
+ * handling vq_free_cnt.
+ */
+#define VQ_RING_DESC_CHAIN_END 32768
+
+struct vq_desc_extra {
+	void *cookie;
+	uint16_t ndescs;
+};
+
+struct virtqueue {
+	struct virtio_hw  *hw; /**< virtio_hw structure pointer. */
+	struct vring vq_ring;  /**< vring keeping desc, used and avail */
+	/**
+	 * Last consumed descriptor in the used table,
+	 * trails vq_ring.used->idx.
+	 */
+	uint16_t vq_used_cons_idx;
+	uint16_t vq_nentries;  /**< vring desc numbers */
+	uint16_t vq_free_cnt;  /**< num of desc available */
+	uint16_t vq_avail_idx; /**< sync until needed */
+	uint16_t vq_free_thresh; /**< free threshold */
+
+	void *vq_ring_virt_mem;  /**< linear address of vring*/
+	unsigned int vq_ring_size;
+
+	rte_iova_t vq_ring_mem; /**< physical address of vring */
+
+	const struct rte_memzone *mz; /**< memzone backing vring */
+
+	/**
+	 * Head of the free chain in the descriptor table. If
+	 * there are no free descriptors, this will be set to
+	 * VQ_RING_DESC_CHAIN_END.
+	 */
+	uint16_t  vq_desc_head_idx;
+	uint16_t  vq_desc_tail_idx;
+	uint16_t  vq_queue_index;   /**< PCI queue index */
+	uint16_t  *notify_addr;
+	struct vq_desc_extra vq_descx[0];
+};
+
+/* Chain all the descriptors in the ring with an END */
+static inline void
+vring_desc_init(struct vring_desc *dp, uint16_t n)
+{
+	uint16_t i;
+
+	for (i = 0; i < n - 1; i++)
+		dp[i].next = (uint16_t)(i + 1);
+	dp[i].next = VQ_RING_DESC_CHAIN_END;
+}
+
+/**
+ * Tell the backend not to interrupt us.
+ */
+static inline void
+virtqueue_disable_intr(struct virtqueue *vq)
+{
+	vq->vq_ring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
+}
+
+/**
+ * Tell the backend to interrupt us.
+ */
+static inline void
+virtqueue_enable_intr(struct virtqueue *vq)
+{
+	vq->vq_ring.avail->flags &= (~VRING_AVAIL_F_NO_INTERRUPT);
+}
+
+/**
+ *  Dump virtqueue internal structures, for debug purpose only.
+ */
+void virtqueue_dump(struct virtqueue *vq);
+
+static inline int
+virtqueue_full(const struct virtqueue *vq)
+{
+	return vq->vq_free_cnt == 0;
+}
+
+#define VIRTQUEUE_NUSED(vq) ((uint16_t)((vq)->vq_ring.used->idx - (vq)->vq_used_cons_idx))
+
+static inline void
+vq_update_avail_idx(struct virtqueue *vq)
+{
+	virtio_wmb();
+	vq->vq_ring.avail->idx = vq->vq_avail_idx;
+}
+
+static inline void
+vq_update_avail_ring(struct virtqueue *vq, uint16_t desc_idx)
+{
+	uint16_t avail_idx;
+	/*
+	 * Place the head of the descriptor chain into the next slot and make
+	 * it usable to the host. The chain is made available now rather than
+	 * deferring to virtqueue_notify() in the hopes that if the host is
+	 * currently running on another CPU, we can keep it processing the new
+	 * descriptor.
+	 */
+	avail_idx = (uint16_t)(vq->vq_avail_idx & (vq->vq_nentries - 1));
+	if (unlikely(vq->vq_ring.avail->ring[avail_idx] != desc_idx))
+		vq->vq_ring.avail->ring[avail_idx] = desc_idx;
+	vq->vq_avail_idx++;
+}
+
+static inline int
+virtqueue_kick_prepare(struct virtqueue *vq)
+{
+	return !(vq->vq_ring.used->flags & VRING_USED_F_NO_NOTIFY);
+}
+
+static inline void
+virtqueue_notify(struct virtqueue *vq)
+{
+	/*
+	 * Ensure updated avail->idx is visible to host.
+	 * For virtio on IA, the notification is through io port operation
+	 * which is a serializing instruction itself.
+	 */
+	VTPCI_OPS(vq->hw)->notify_queue(vq->hw, vq);
+}
+
+#ifdef RTE_LIBRTE_VIRTIO_DEBUG_DUMP
+#define VIRTQUEUE_DUMP(vq) do { \
+	uint16_t used_idx, nused; \
+	used_idx = (vq)->vq_ring.used->idx; \
+	nused = (uint16_t)(used_idx - (vq)->vq_used_cons_idx); \
+	RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, \
+	  "VQ: - size=%d; free=%d; used=%d; desc_head_idx=%d;" \
+	  " avail.idx=%d; used_cons_idx=%d; used.idx=%d;" \
+	  " avail.flags=0x%x; used.flags=0x%x\n", \
+	  (vq)->vq_nentries, (vq)->vq_free_cnt, nused, \
+	  (vq)->vq_desc_head_idx, (vq)->vq_ring.avail->idx, \
+	  (vq)->vq_used_cons_idx, (vq)->vq_ring.used->idx, \
+	  (vq)->vq_ring.avail->flags, (vq)->vq_ring.used->flags); \
+} while (0)
+#else
+#define VIRTQUEUE_DUMP(vq) do { } while (0)
+#endif
+
+#endif /* _VIRTQUEUE_H_ */
diff --git a/drivers/librte_vhost/virtio_pci.c b/drivers/librte_vhost/virtio_pci.c
new file mode 100644
index 000000000..f1a23bbbf
--- /dev/null
+++ b/drivers/librte_vhost/virtio_pci.c
@@ -0,0 +1,504 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+#include <stdint.h>
+
+/* XXX This file is based on drivers/net/virtio/virtio_pci.c.  It would be
+ * better to create a shared rte_virtio library instead of duplicating this
+ * code.
+ */
+
+#ifdef RTE_EXEC_ENV_LINUXAPP
+ #include <dirent.h>
+ #include <fcntl.h>
+#endif
+
+#include <rte_io.h>
+#include <rte_bus.h>
+
+#include "virtio_pci.h"
+#include "virtqueue.h"
+
+/*
+ * Following macros are derived from linux/pci_regs.h, however,
+ * we can't simply include that header here, as there is no such
+ * file for non-Linux platform.
+ */
+#define PCI_CAPABILITY_LIST	0x34
+#define PCI_CAP_ID_VNDR		0x09
+#define PCI_CAP_ID_MSIX		0x11
+
+/*
+ * The remaining space is defined by each driver as the per-driver
+ * configuration space.
+ */
+#define VIRTIO_PCI_CONFIG(hw) \
+		(((hw)->use_msix == VIRTIO_MSIX_ENABLED) ? 24 : 20)
+
+static inline int
+check_vq_phys_addr_ok(struct virtqueue *vq)
+{
+	/* The virtio PCI device VIRTIO_PCI_QUEUE_PFN register is 32 bits wide
+	 * and only accepts a 32-bit page frame number.
+	 * Check whether the allocated physical memory exceeds 16TB.
+	 */
+	if ((vq->vq_ring_mem + vq->vq_ring_size - 1) >>
+			(VIRTIO_PCI_QUEUE_ADDR_SHIFT + 32)) {
+		RTE_LOG(ERR, VIRTIO_PCI_CONFIG, "vring address shouldn't be above 16TB!\n");
+		return 0;
+	}
+
+	return 1;
+}
+
+static inline void
+io_write64_twopart(uint64_t val, uint32_t *lo, uint32_t *hi)
+{
+	rte_write32(val & ((1ULL << 32) - 1), lo);
+	rte_write32(val >> 32,		     hi);
+}
+
+static void
+modern_read_dev_config(struct virtio_hw *hw, size_t offset,
+		       void *dst, int length)
+{
+	int i;
+	uint8_t *p;
+	uint8_t old_gen, new_gen;
+
+	do {
+		old_gen = rte_read8(&hw->common_cfg->config_generation);
+
+		p = dst;
+		for (i = 0;  i < length; i++)
+			*p++ = rte_read8((uint8_t *)hw->dev_cfg + offset + i);
+
+		new_gen = rte_read8(&hw->common_cfg->config_generation);
+	} while (old_gen != new_gen);
+}
+
+static void
+modern_write_dev_config(struct virtio_hw *hw, size_t offset,
+			const void *src, int length)
+{
+	int i;
+	const uint8_t *p = src;
+
+	for (i = 0;  i < length; i++)
+		rte_write8((*p++), (((uint8_t *)hw->dev_cfg) + offset + i));
+}
+
+static uint64_t
+modern_get_features(struct virtio_hw *hw)
+{
+	uint32_t features_lo, features_hi;
+
+	rte_write32(0, &hw->common_cfg->device_feature_select);
+	features_lo = rte_read32(&hw->common_cfg->device_feature);
+
+	rte_write32(1, &hw->common_cfg->device_feature_select);
+	features_hi = rte_read32(&hw->common_cfg->device_feature);
+
+	return ((uint64_t)features_hi << 32) | features_lo;
+}
+
+static void
+modern_set_features(struct virtio_hw *hw, uint64_t features)
+{
+	rte_write32(0, &hw->common_cfg->guest_feature_select);
+	rte_write32(features & ((1ULL << 32) - 1),
+		    &hw->common_cfg->guest_feature);
+
+	rte_write32(1, &hw->common_cfg->guest_feature_select);
+	rte_write32(features >> 32,
+		    &hw->common_cfg->guest_feature);
+}
+
+static uint8_t
+modern_get_status(struct virtio_hw *hw)
+{
+	return rte_read8(&hw->common_cfg->device_status);
+}
+
+static void
+modern_set_status(struct virtio_hw *hw, uint8_t status)
+{
+	rte_write8(status, &hw->common_cfg->device_status);
+}
+
+static void
+modern_reset(struct virtio_hw *hw)
+{
+	modern_set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
+	modern_get_status(hw);
+}
+
+static uint8_t
+modern_get_isr(struct virtio_hw *hw)
+{
+	return rte_read8(hw->isr);
+}
+
+static uint16_t
+modern_set_config_irq(struct virtio_hw *hw, uint16_t vec)
+{
+	rte_write16(vec, &hw->common_cfg->msix_config);
+	return rte_read16(&hw->common_cfg->msix_config);
+}
+
+static uint16_t
+modern_set_queue_irq(struct virtio_hw *hw, struct virtqueue *vq, uint16_t vec)
+{
+	rte_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);
+	rte_write16(vec, &hw->common_cfg->queue_msix_vector);
+	return rte_read16(&hw->common_cfg->queue_msix_vector);
+}
+
+static uint16_t
+modern_get_queue_num(struct virtio_hw *hw, uint16_t queue_id)
+{
+	rte_write16(queue_id, &hw->common_cfg->queue_select);
+	return rte_read16(&hw->common_cfg->queue_size);
+}
+
+static int
+modern_setup_queue(struct virtio_hw *hw, struct virtqueue *vq)
+{
+	uint64_t desc_addr, avail_addr, used_addr;
+	uint16_t notify_off;
+
+	if (!check_vq_phys_addr_ok(vq))
+		return -1;
+
+	desc_addr = vq->vq_ring_mem;
+	avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
+							 ring[vq->vq_nentries]),
+				   VIRTIO_PCI_VRING_ALIGN);
+
+	rte_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);
+
+	io_write64_twopart(desc_addr, &hw->common_cfg->queue_desc_lo,
+				      &hw->common_cfg->queue_desc_hi);
+	io_write64_twopart(avail_addr, &hw->common_cfg->queue_avail_lo,
+				       &hw->common_cfg->queue_avail_hi);
+	io_write64_twopart(used_addr, &hw->common_cfg->queue_used_lo,
+				      &hw->common_cfg->queue_used_hi);
+
+	notify_off = rte_read16(&hw->common_cfg->queue_notify_off);
+	vq->notify_addr = (void *)((uint8_t *)hw->notify_base +
+				notify_off * hw->notify_off_multiplier);
+
+	rte_write16(1, &hw->common_cfg->queue_enable);
+
+	RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "queue %u addresses:\n", vq->vq_queue_index);
+	RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "\t desc_addr: %" PRIx64 "\n", desc_addr);
+	RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "\t avail_addr: %" PRIx64 "\n", avail_addr);
+	RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "\t used_addr: %" PRIx64 "\n", used_addr);
+	RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "\t notify addr: %p (notify offset: %u)\n",
+		vq->notify_addr, notify_off);
+
+	return 0;
+}
+
+static void
+modern_del_queue(struct virtio_hw *hw, struct virtqueue *vq)
+{
+	rte_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);
+
+	io_write64_twopart(0, &hw->common_cfg->queue_desc_lo,
+				  &hw->common_cfg->queue_desc_hi);
+	io_write64_twopart(0, &hw->common_cfg->queue_avail_lo,
+				  &hw->common_cfg->queue_avail_hi);
+	io_write64_twopart(0, &hw->common_cfg->queue_used_lo,
+				  &hw->common_cfg->queue_used_hi);
+
+	rte_write16(0, &hw->common_cfg->queue_enable);
+}
+
+static void
+modern_notify_queue(struct virtio_hw *hw __rte_unused, struct virtqueue *vq)
+{
+	rte_write16(vq->vq_queue_index, vq->notify_addr);
+}
+
+const struct virtio_pci_ops virtio_pci_modern_ops = {
+	.read_dev_cfg	= modern_read_dev_config,
+	.write_dev_cfg	= modern_write_dev_config,
+	.reset		= modern_reset,
+	.get_status	= modern_get_status,
+	.set_status	= modern_set_status,
+	.get_features	= modern_get_features,
+	.set_features	= modern_set_features,
+	.get_isr	= modern_get_isr,
+	.set_config_irq	= modern_set_config_irq,
+	.set_queue_irq  = modern_set_queue_irq,
+	.get_queue_num	= modern_get_queue_num,
+	.setup_queue	= modern_setup_queue,
+	.del_queue	= modern_del_queue,
+	.notify_queue	= modern_notify_queue,
+};
+
+
+void
+virtio_pci_read_dev_config(struct virtio_hw *hw, size_t offset,
+		      void *dst, int length)
+{
+	VTPCI_OPS(hw)->read_dev_cfg(hw, offset, dst, length);
+}
+
+void
+virtio_pci_write_dev_config(struct virtio_hw *hw, size_t offset,
+		       const void *src, int length)
+{
+	VTPCI_OPS(hw)->write_dev_cfg(hw, offset, src, length);
+}
+
+uint64_t
+virtio_pci_negotiate_features(struct virtio_hw *hw, uint64_t host_features)
+{
+	uint64_t features;
+
+	/*
+	 * Limit negotiated features to what the driver, virtqueue, and
+	 * host all support.
+	 */
+	features = host_features & hw->guest_features;
+	VTPCI_OPS(hw)->set_features(hw, features);
+
+	return features;
+}
+
+void
+virtio_pci_reset(struct virtio_hw *hw)
+{
+	VTPCI_OPS(hw)->set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
+	/* flush status write */
+	VTPCI_OPS(hw)->get_status(hw);
+}
+
+void
+virtio_pci_reinit_complete(struct virtio_hw *hw)
+{
+	virtio_pci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER_OK);
+}
+
+void
+virtio_pci_set_status(struct virtio_hw *hw, uint8_t status)
+{
+	if (status != VIRTIO_CONFIG_STATUS_RESET)
+		status |= VTPCI_OPS(hw)->get_status(hw);
+
+	VTPCI_OPS(hw)->set_status(hw, status);
+}
+
+uint8_t
+virtio_pci_get_status(struct virtio_hw *hw)
+{
+	return VTPCI_OPS(hw)->get_status(hw);
+}
+
+uint8_t
+virtio_pci_isr(struct virtio_hw *hw)
+{
+	return VTPCI_OPS(hw)->get_isr(hw);
+}
+
+static void *
+get_cfg_addr(struct rte_pci_device *dev, struct virtio_pci_cap *cap)
+{
+	uint8_t  bar    = cap->bar;
+	uint32_t length = cap->length;
+	uint32_t offset = cap->offset;
+	uint8_t *base;
+
+	if (bar >= PCI_MAX_RESOURCE) {
+		RTE_LOG(ERR, VIRTIO_PCI_CONFIG, "invalid bar: %u\n", bar);
+		return NULL;
+	}
+
+	if (offset + length < offset) {
+		RTE_LOG(ERR, VIRTIO_PCI_CONFIG, "offset(%u) + length(%u) overflows\n",
+			offset, length);
+		return NULL;
+	}
+
+	if (offset + length > dev->mem_resource[bar].len) {
+		RTE_LOG(ERR, VIRTIO_PCI_CONFIG,
+			"invalid cap: overflows bar space: %u > %" PRIu64 "\n",
+			offset + length, dev->mem_resource[bar].len);
+		return NULL;
+	}
+
+	base = dev->mem_resource[bar].addr;
+	if (base == NULL) {
+		RTE_LOG(ERR, VIRTIO_PCI_CONFIG, "bar %u base addr is NULL\n", bar);
+		return NULL;
+	}
+
+	return base + offset;
+}
+
+#define PCI_MSIX_ENABLE 0x8000
+
+static int
+virtio_read_caps(struct rte_pci_device *dev, struct virtio_hw *hw)
+{
+	uint8_t pos;
+	struct virtio_pci_cap cap;
+	int ret;
+
+	if (rte_pci_map_device(dev)) {
+		RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "failed to map pci device!\n");
+		return -1;
+	}
+
+	ret = rte_pci_read_config(dev, &pos, 1, PCI_CAPABILITY_LIST);
+	if (ret < 0) {
+		RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "failed to read pci capability list\n");
+		return -1;
+	}
+
+	while (pos) {
+		ret = rte_pci_read_config(dev, &cap, sizeof(cap), pos);
+		if (ret < 0) {
+			RTE_LOG(ERR, VIRTIO_PCI_CONFIG,
+				"failed to read pci cap at pos: %x\n", pos);
+			break;
+		}
+
+		if (cap.cap_vndr == PCI_CAP_ID_MSIX) {
+			/* Transitional devices would also have this capability,
+			 * that's why we also check if msix is enabled.
+			 * 1st byte is cap ID; 2nd byte is the position of next
+			 * cap; next two bytes are the flags.
+			 */
+			uint16_t flags = ((uint16_t *)&cap)[1];
+
+			if (flags & PCI_MSIX_ENABLE)
+				hw->use_msix = VIRTIO_MSIX_ENABLED;
+			else
+				hw->use_msix = VIRTIO_MSIX_DISABLED;
+		}
+
+		if (cap.cap_vndr != PCI_CAP_ID_VNDR) {
+			RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG,
+				"[%2x] skipping non VNDR cap id: %02x\n",
+				pos, cap.cap_vndr);
+			goto next;
+		}
+
+		RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG,
+			"[%2x] cfg type: %u, bar: %u, offset: %04x, len: %u\n",
+			pos, cap.cfg_type, cap.bar, cap.offset, cap.length);
+
+		switch (cap.cfg_type) {
+		case VIRTIO_PCI_CAP_COMMON_CFG:
+			hw->common_cfg = get_cfg_addr(dev, &cap);
+			break;
+		case VIRTIO_PCI_CAP_NOTIFY_CFG:
+			rte_pci_read_config(dev, &hw->notify_off_multiplier,
+					4, pos + sizeof(cap));
+			hw->notify_base = get_cfg_addr(dev, &cap);
+			break;
+		case VIRTIO_PCI_CAP_DEVICE_CFG:
+			hw->dev_cfg = get_cfg_addr(dev, &cap);
+			break;
+		case VIRTIO_PCI_CAP_ISR_CFG:
+			hw->isr = get_cfg_addr(dev, &cap);
+			break;
+		}
+
+next:
+		pos = cap.cap_next;
+	}
+
+	if (hw->common_cfg == NULL || hw->notify_base == NULL ||
+	    hw->dev_cfg == NULL    || hw->isr == NULL) {
+		RTE_LOG(INFO, VIRTIO_PCI_CONFIG, "no modern virtio pci device found.\n");
+		return -1;
+	}
+
+	RTE_LOG(INFO, VIRTIO_PCI_CONFIG, "found modern virtio pci device.\n");
+
+	RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "common cfg mapped at: %p\n", hw->common_cfg);
+	RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "device cfg mapped at: %p\n", hw->dev_cfg);
+	RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "isr cfg mapped at: %p\n", hw->isr);
+	RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "notify base: %p, notify off multiplier: %u\n",
+		hw->notify_base, hw->notify_off_multiplier);
+
+	return 0;
+}
+
+struct virtio_hw_internal virtio_pci_hw_internal[8];
+
+/*
+ * Return -1:
+ *   if there is error mapping with VFIO/UIO.
+ *   if port map error when driver type is KDRV_NONE.
+ *   if whitelisted but driver type is KDRV_UNKNOWN.
+ * Return 1 if kernel driver is managing the device.
+ * Return 0 on success.
+ */
+int
+virtio_pci_init(struct rte_pci_device *dev, struct virtio_hw *hw)
+{
+	static size_t internal_id;
+
+	if (internal_id >=
+	    sizeof(virtio_pci_hw_internal) / sizeof(*virtio_pci_hw_internal)) {
+		RTE_LOG(INFO, VIRTIO_PCI_CONFIG, "too many virtio pci devices.\n");
+		return -1;
+	}
+
+	/*
+	 * Try to read the virtio PCI capabilities, which exist
+	 * only on modern PCI devices.
+	 */
+	if (virtio_read_caps(dev, hw) != 0) {
+		RTE_LOG(INFO, VIRTIO_PCI_CONFIG, "legacy virtio pci is not supported.\n");
+		return -1;
+	}
+
+	RTE_LOG(INFO, VIRTIO_PCI_CONFIG, "modern virtio pci detected.\n");
+	hw->internal_id = internal_id++;
+	virtio_pci_hw_internal[hw->internal_id].vtpci_ops =
+		&virtio_pci_modern_ops;
+	return 0;
+}
+
+enum virtio_msix_status
+virtio_pci_msix_detect(struct rte_pci_device *dev)
+{
+	uint8_t pos;
+	struct virtio_pci_cap cap;
+	int ret;
+
+	ret = rte_pci_read_config(dev, &pos, 1, PCI_CAPABILITY_LIST);
+	if (ret < 0) {
+		RTE_LOG(DEBUG, VIRTIO_PCI_CONFIG, "failed to read pci capability list\n");
+		return VIRTIO_MSIX_NONE;
+	}
+
+	while (pos) {
+		ret = rte_pci_read_config(dev, &cap, sizeof(cap), pos);
+		if (ret < 0) {
+			RTE_LOG(ERR, VIRTIO_PCI_CONFIG,
+				"failed to read pci cap at pos: %x\n", pos);
+			break;
+		}
+
+		if (cap.cap_vndr == PCI_CAP_ID_MSIX) {
+			uint16_t flags = ((uint16_t *)&cap)[1];
+
+			if (flags & PCI_MSIX_ENABLE)
+				return VIRTIO_MSIX_ENABLED;
+			else
+				return VIRTIO_MSIX_DISABLED;
+		}
+
+		pos = cap.cap_next;
+	}
+
+	return VIRTIO_MSIX_NONE;
+}
-- 
2.14.3


* Re: [dpdk-dev] [PATCHv4 1/5] buildtools: Add tool to check EXPERIMENTAL api exports
  @ 2018-01-21 18:31  3%     ` Thomas Monjalon
  2018-01-21 22:07  0%       ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-01-21 18:31 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev, john.mcnamara, bruce.richardson

Hi Neil,

Sorry for the very late review.
I thought review had been done by others but it seems not.
Please find my comments below.

13/12/2017 16:17, Neil Horman:
>  create mode 100755 buildtools/experimentalsyms.sh

When adding a new file, you must reference it in MAINTAINERS.
Please add it in the section "ABI versioning".

> --- /dev/null
> +++ b/buildtools/experimentalsyms.sh

I think the file name should include the word "check".
What about check-experimental-syms.sh ?

> @@ -0,0 +1,35 @@
> +#!/bin/sh

You must insert a SPDX license and copyright here.

> +if [ -d $MAPFILE ]
> +then
> +	exit 0
> +fi
> +
> +if [ -d $OBJFILE ]
> +then
> +	exit 0
> +fi

Why check that these are not directories?
I guess you could check for a readable file (-r) instead?
Should it return an error?
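
A minimal sketch of the readable-file check suggested above. It assumes the
script receives the map file and object file paths as in the quoted patch;
the helper name check_inputs and the choice to return an error (rather than
exit 0) are illustrative assumptions, not part of the patch.

```shell
#!/bin/sh
# Hypothetical helper (not from the patch): succeed only if both
# arguments name readable regular files.
check_inputs() {
	for f in "$1" "$2"; do
		# -f rejects directories and empty paths, -r rejects unreadable files
		if [ ! -f "$f" ] || [ ! -r "$f" ]; then
			echo "cannot read '$f'" >&2
			return 1
		fi
	done
	return 0
}
```

Whether an unset or missing map file should be an error or a silent skip
depends on how the makefiles invoke the script.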

> +for i in `awk 'BEGIN {found=0}
> +		/.*EXPERIMENTAL.*/ {found=1}
> +		/.*}.*;/ {found=0}
> +		/.*;/ {if (found == 1) print $1}' $MAPFILE`
> +do
> +	SYM=`echo $i | sed -e"s/;//"`
> +	objdump -t $OBJFILE | grep -q "\.text.*$SYM"
> +	IN_TEXT=$?
> +	objdump -t $OBJFILE | grep -q "\.text\.experimental.*$SYM"
> +	IN_EXP=$?
> +	if [ $IN_TEXT -eq 0 -a $IN_EXP -ne 0 ]
> +	then
> +		echo "$SYM is not flagged as experimental"
> +		echo "but is listed in version map"
> +		echo "Please add __experimental to the definition of $SYM"
> +		exit 1
> +	fi
> +done
> +exit 0

exit 0 is useless at the end of a script.


* Re: [dpdk-dev] [PATCHv4 2/5] compat: Add __experimental macro
  @ 2018-01-21 18:37  4%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-01-21 18:37 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev, john.mcnamara, bruce.richardson

Hi,

I know I should have spotted these comments earlier,
I'm sorry to be late on this review.

13/12/2017 16:17, Neil Horman:
> +#ifndef ALLOW_EXPERIMENTAL_APIS
>  
> +#define __experimental \

These macros should be in the DPDK namespace:
	RTE_ALLOW_EXPERIMENTAL_API (no need of S)
	__rte_experimental

> +__attribute__((deprecated("Symbol is not yet part of stable abi"), \

Nit: s/abi/ABI/


* Re: [dpdk-dev] [PATCHv4 5/5] doc: Add ABI __experimental tag documentation
  @ 2018-01-21 20:14  4%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-01-21 20:14 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev, john.mcnamara, bruce.richardson

13/12/2017 16:17, Neil Horman:
> --- a/doc/guides/contributing/versioning.rst
> +++ b/doc/guides/contributing/versioning.rst
> +Note that marking an API as experimental is a two step process.  To mark an API
> +as experimental, the symbols which are desired to be exported must be placed in
> +an EXPERIMENTAL version block in the corresponding libraries' version map
> +script. Secondly, the corresponding definitions of those exported functions, and
> +their forward declarations (in the development header files), must be marked
> +with the __experimental tag (see rte_compat.h).  The DPDK build makefiles
> +preform a check to ensure that the map file and the C code reflect the same
> +list of symbols.

Thanks for this text.

Bruce already commented about the typo "preform".

Ferruh already commented about adding a string in doxygen header.

Ferruh already commented about adding sentences for new API.

I would add that this is the right place to explain the effect of the
attribute, and how it can be disabled at compilation.


* Re: [dpdk-dev] [PATCH v2] checkpatches.sh: Add checks for ABI symbol addition
  @ 2018-01-21 20:29  9%   ` Thomas Monjalon
  2018-01-22  1:54  4%     ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-01-21 20:29 UTC (permalink / raw)
  To: Neil Horman
  Cc: dev, john.mcnamara, bruce.richardson, Ferruh Yigit, Stephen Hemminger

Hi,

16/01/2018 19:22, Neil Horman:
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
>  Developers and Maintainers Tools
>  M: Thomas Monjalon <thomas@monjalon.net>
> +M: Neil Horman <nhorman@tuxdriver.com>
>  F: MAINTAINERS
>  F: devtools/check-dup-includes.sh
>  F: devtools/check-maintainers.sh
> @@ -52,6 +53,7 @@ F: devtools/get-maintainer.sh
>  F: devtools/git-log-fixes.sh
>  F: devtools/load-devel-config
>  F: devtools/test-build.sh
> +F: devtools/validate-new-api.sh
>  F: license/

I really think it should be in the section "ABI versioning".

> --- a/devtools/checkpatches.sh
> +++ b/devtools/checkpatches.sh
> +export VALIDATE_NEW_API=$(dirname $(readlink -e $0))/validate-new-api.sh

Why export?

>  print_usage () {
>  	cat <<- END_OF_HELP
> +	$(dirname $0)
>  	usage: $(basename $0) [-q] [-v] [-nX|patch1 [patch2] ...]]

This dirname is a debug leftover?

> @@ -96,9 +100,25 @@ check () { # <patch> <commit> <title>
>  	else
>  		report=$($DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
>  	fi
> -	[ $? -ne 0 ] || return 0
> +	reta=$?

What does reta mean?

> +
>  	$verbose || printf '\n### %s\n\n' "$3"
>  	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
> +
> +	echo
> +	echo "Checking API additions/removals:"

You should respect $verbose before printing such a header.

> +	if [ -n "$1" ] ; then
> +		report=$($VALIDATE_NEW_API $1)
> +	elif [ -n "$2" ] ; then
> +		report=$(git format-patch \
> +			 --find-renames --no-stat --stdout -1 $commit |
> +			$VALIDATE_NEW_API -)
> +	else
> +		report=$($VALIDATE_NEW_API -)
> +	fi
> +	[ $? -ne 0 -o $reta -ne 0 ] || return 0
> +	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
> +
>  	status=$(($status + 1))
>  }

> --- /dev/null
> +++ b/devtools/validate-new-api.sh

About the file name, is it only for new API?
Would you not prefer check-symbol-change.sh?
It may be stupid, but I think "validate" relates more to a full test,
as validate-abi.sh does for the ABI, while "check" can be a partial
test like the ones done in checkpatches.sh.

> +		}' > ./$mapdb
> +
> +		sort -u $mapdb > ./$mapdb.2
> +		mv -f $mapdb.2 $mapdb
[...]
> +mapfile=`mktemp mapdb.XXXXXX`
[...]
> +rm -f $mapfile

If you create a temporary file, you should remove it in a trap cleanup,
in case processing is interrupted.
The best option is to avoid temporary files and use variables instead.
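
A minimal sketch of both suggestions, assuming a POSIX shell. The function
names are illustrative and not taken from the patch; each one reads lines on
stdin and prints them sorted and de-duplicated, as the quoted "sort -u" step
does.

```shell
#!/bin/sh
# Option 1 (hypothetical): keep the temporary file, but remove it from
# a trap so that it does not survive an interrupted run.
sorted_with_tempfile() {
	(
		tmp=$(mktemp) || exit 1
		trap 'rm -f "$tmp"' EXIT   # runs when this subshell exits
		trap 'exit 1' INT TERM     # turn signals into a normal exit
		cat > "$tmp"
		sort -u "$tmp"
	)
}

# Option 2 (hypothetical): no temporary file, the data stays in a variable.
sorted_with_variable() {
	data=$(sort -u)
	printf '%s\n' "$data"
}
```

The second form has nothing to clean up, which is why it is preferable when
the data fits comfortably in a shell variable.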


* Re: [dpdk-dev] [PATCHv4 1/5] buildtools: Add tool to check EXPERIMENTAL api exports
  2018-01-21 18:31  3%     ` Thomas Monjalon
@ 2018-01-21 22:07  0%       ` Neil Horman
  0 siblings, 0 replies; 200+ results
From: Neil Horman @ 2018-01-21 22:07 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, john.mcnamara, bruce.richardson

On Sun, Jan 21, 2018 at 07:31:31PM +0100, Thomas Monjalon wrote:
> Hi Neil,
> 
> Sorry for the very late review.
> I thought review had been done by others but it seems not.
> Please find my comments below.
> 
> 13/12/2017 16:17, Neil Horman:
> >  create mode 100755 buildtools/experimentalsyms.sh
> 
> When adding a new file, you must reference it in MAINTAINERS.
> Please add it in the section "ABI versioning".
> 
yup

> > --- /dev/null
> > +++ b/buildtools/experimentalsyms.sh
> 
> I think the file name should include the word "check".
> What about check-experimental-syms.sh ?
> 
> > @@ -0,0 +1,35 @@
> > +#!/bin/sh
> 
> You must insert a SPDX license and copyright here.
> 
Will do.

> > +if [ -d $MAPFILE ]
> > +then
> > +	exit 0
> > +fi
> > +
> > +if [ -d $OBJFILE ]
> > +then
> > +	exit 0
> > +fi
> 
> Why check that these are not directories?
> I guess you could check for a readable file (-r) instead?
> Should it return an error?
> 
The objfile check is out of date (I had it in place initially, and it is no
longer needed).  The mapfile check is there because dpdk apps use the same
C_TO_O rule and have no mapfile variable set.

> > +for i in `awk 'BEGIN {found=0}
> > +		/.*EXPERIMENTAL.*/ {found=1}
> > +		/.*}.*;/ {found=0}
> > +		/.*;/ {if (found == 1) print $1}' $MAPFILE`
> > +do
> > +	SYM=`echo $i | sed -e"s/;//"`
> > +	objdump -t $OBJFILE | grep -q "\.text.*$SYM"
> > +	IN_TEXT=$?
> > +	objdump -t $OBJFILE | grep -q "\.text\.experimental.*$SYM"
> > +	IN_EXP=$?
> > +	if [ $IN_TEXT -eq 0 -a $IN_EXP -ne 0 ]
> > +	then
> > +		echo "$SYM is not flagged as experimental"
> > +		echo "but is listed in version map"
> > +		echo "Please add __experimental to the definition of $SYM"
> > +		exit 1
> > +	fi
> > +done
> > +exit 0
> 
> exit 0 is useless at the end of a script.
It's there for clarity of exit value.  I prefer to be explicit in that.
Neil


* Re: [dpdk-dev] [PATCHv4 3/5] Makefiles: Add experimental tag check and warnings to trigger on use
  @ 2018-01-22  1:34  3%       ` Neil Horman
  2018-01-22  1:37  0%         ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2018-01-22  1:34 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, Ferruh Yigit, john.mcnamara, bruce.richardson

On Sun, Jan 21, 2018 at 07:54:44PM +0100, Thomas Monjalon wrote:
> 12/01/2018 13:44, Neil Horman:
> > On Fri, Jan 12, 2018 at 11:49:57AM +0000, Ferruh Yigit wrote:
> > > On 1/11/2018 8:50 PM, Neil Horman wrote:
> > > > On Thu, Jan 11, 2018 at 08:06:43PM +0000, Ferruh Yigit wrote:
> > > >> On 12/13/2017 3:17 PM, Neil Horman wrote:
> > > >>> --- a/app/test-eventdev/Makefile
> > > >>> +++ b/app/test-eventdev/Makefile
> > > >>> @@ -32,6 +32,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > >>>  
> > > >>>  APP = dpdk-test-eventdev
> > > >>>  
> > > >>> +CFLAGS += -DALLOW_EXPERIMENTAL_APIS
> > > >>
> > > > >> Do we need this internally in DPDK? For application developers this is great,
> > > > >> they will get a warning unless they explicitly state that they are OK with it.
> > > >>
> > > > I'm not sure what you're asking here.  As I mentioned in the initial post, I
> > > > think we should give blanket permission to any in-tree dpdk library to allow the
> > > > use of experimental API's, but that doesn't really imply that all developers
> > > > would want it disabled all the time.  That is to say, I could envision a
> > > > library author who, early in development would want to get a warning issued if
> > > > they used an unstable API, and, only once they were happy with their design and
> > > > choice of API usage, turn the warning off.
> > > 
> > > I got your point, but I think whoever using an experimental API in another
> > > component in DPDK is almost always the author of the that experimental API, so
> > > not sure this check is ever really needed within dpdk.
> > > 
> > I would have thought so too, but it doesn't really bear up.  The example I used
> > to convince myself of a more granular approach was commit
> > 9c38b704d280ac128003238d7d80bf07fa556a7d where the rte_service API was
> > introduced as experimental by Nikhil Rao, and then later used in the eal library
> > as part of commit a894d4815f79b0d76527d9c42b23327de1501711 by Harry van Haaren.
> > Its no big deal because, as we agree, internal usage should be considered safe,
> > but it seemed clear that differing authors were using each others code
> > (potentially oblivious to the experimental nature of those APIs)
> > 
> > > But OK, I guess it won't hurt to have more granular approach.
> > > 
> > > > 
> > > >> Do we have any option than allowing them in DPDK library? And when experimental
> > > >> API modified the users in the DPDK library internally guaranteed to be updated.
> > > >> Why not globally allow this for all DPDK internally?
> > > >>
> > > > For the reason I gave above.  We certainly could enable this in a more top-level
> > > > makefile so that for in-library systems it was opt-in rather than opt-out, but I
> > > > chose a more granular approach because I could envision newer libraries wanting
> > > > it on.  I also felt, generally speaking, that where warning flags were
> > > > concerned, it generally desireous to have them on by default, and make people
> > > > explicitly choose to turn them off.
> 
> I think DPDK developpers look at the EXPERIMENTAL warning in the doxygen
> above the prototype.
I'm not sure I agree with that, but regardless, my initial reasoning for writing
this tag was to call attention to experimental APIs during review, rather than
their use during development, so that I didn't gripe about ABI changes on
exempted code.  Additionally, whether they look at the docs or not, they can
pre-emptively turn off the warning if they choose.

> And when API will be switched to stable, we probably won't remove the flag
> in the makefile to disable allowing experimental.
Well, that remains to be seen I suppose.

> So at the end, we could just allow experimental API for all internal libs.
That's a rather bootstrapping argument.  You are effectively saying that no one
developing will ever want to be warned of using experimental APIs in DPDK, so
let's just turn it off for everyone.  I would really rather let individual
developers make that call at the time they author something new.

> 

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCHv4 3/5] Makefiles: Add experimental tag check and warnings to trigger on use
  2018-01-22  1:34  3%       ` Neil Horman
@ 2018-01-22  1:37  0%         ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-01-22  1:37 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev, Ferruh Yigit, john.mcnamara, bruce.richardson

22/01/2018 02:34, Neil Horman:
> On Sun, Jan 21, 2018 at 07:54:44PM +0100, Thomas Monjalon wrote:
> > 12/01/2018 13:44, Neil Horman:
> > > On Fri, Jan 12, 2018 at 11:49:57AM +0000, Ferruh Yigit wrote:
> > > > On 1/11/2018 8:50 PM, Neil Horman wrote:
> > > > > On Thu, Jan 11, 2018 at 08:06:43PM +0000, Ferruh Yigit wrote:
> > > > >> On 12/13/2017 3:17 PM, Neil Horman wrote:
> > > > >>> --- a/app/test-eventdev/Makefile
> > > > >>> +++ b/app/test-eventdev/Makefile
> > > > >>> @@ -32,6 +32,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > >>>  
> > > > >>>  APP = dpdk-test-eventdev
> > > > >>>  
> > > > >>> +CFLAGS += -DALLOW_EXPERIMENTAL_APIS
> > > > >>
> > > > >> Do we need this internally in DPDK? For application developers this is great,
> > > > >> they will get warning unless explicitly stated that they are OK with it.
> > > > >>
> > > > > I'm not sure what you're asking here.  As I mentioned in the initial post, I
> > > > > think we should give blanket permission to any in-tree dpdk library to allow the
> > > > > use of experimental API's, but that doesn't really imply that all developers
> > > > > would wan't it disabled all the time.  That is to say, I could envision a
> > > > > library author who, early in development would want to get a warning issued if
> > > > > they used an unstable API, and, only once they were happy with their design and
> > > > > choice of API usage, turn the warning off.
> > > > 
> > > > I got your point, but I think whoever using an experimental API in another
> > > > component in DPDK is almost always the author of the that experimental API, so
> > > > not sure this check is ever really needed within dpdk.
> > > > 
> > > I would have thought so too, but it doesn't really bear up.  The example I used
> > > to convince myself of a more granular approach was commit
> > > 9c38b704d280ac128003238d7d80bf07fa556a7d where the rte_service API was
> > > introduced as experimental by Nikhil Rao, and then later used in the eal library
> > > as part of commit a894d4815f79b0d76527d9c42b23327de1501711 by Harry van Haaren.
> > > Its no big deal because, as we agree, internal usage should be considered safe,
> > > but it seemed clear that differing authors were using each others code
> > > (potentially oblivious to the experimental nature of those APIs)
> > > 
> > > > But OK, I guess it won't hurt to have more granular approach.
> > > > 
> > > > > 
> > > > >> Do we have any option than allowing them in DPDK library? And when experimental
> > > > >> API modified the users in the DPDK library internally guaranteed to be updated.
> > > > >> Why not globally allow this for all DPDK internally?
> > > > >>
> > > > > For the reason I gave above.  We certainly could enable this in a more top-level
> > > > > makefile so that for in-library systems it was opt-in rather than opt-out, but I
> > > > > chose a more granular approach because I could envision newer libraries wanting
> > > > > it on.  I also felt, generally speaking, that where warning flags were
> > > > > concerned, it generally desireous to have them on by default, and make people
> > > > > explicitly choose to turn them off.
> > 
> > I think DPDK developpers look at the EXPERIMENTAL warning in the doxygen
> > above the prototype.
> I'm not sure I agree with that, but regardless, my initial reasoning for writing
> this tag was to call attention to experimental API's during review, rather than
> their use during development, so I didn't gripe about ABI changes on expemted
> code.   Additionally, weather they look at the docs or not, they can
> pre-emptively turn off the warning if they choose.
> 
> > And when API will be switched to stable, we probably won't remove the flag
> > in the makefile to disable allowing experimental.
> Well, that remains to be seen I suppose.
> 
> > So at the end, we could just allow experimental API for all internal libs.
> Thats a rather bootstrapping argument.  You are effecitvely saying that no one
> developing will ever want to be warned of using experimental APIs in DPDK, so
> lets just turn it off everyone.  I would really rather let individual developers
> make that call at the time they author something new.

I don't see the benefit,
but I am OK to keep it like this.

^ permalink raw reply	[relevance 0%]

* [dpdk-dev]  [PATCH 0/5] dpdk: enhance EXPERIMENTAL api tagging
      @ 2018-01-22  1:48  4% ` Neil Horman
  2018-01-22  1:48  4%   ` [dpdk-dev] [[PATCH v5] 1/5] buildtools: Add tool to check EXPERIMENTAL api exports Neil Horman
                     ` (2 more replies)
  2 siblings, 3 replies; 200+ results
From: Neil Horman @ 2018-01-22  1:48 UTC (permalink / raw)
  To: dev

Hey all-
        A few days ago, I was lamenting the fact that, when reviewing patches I
would frequently complain about ABI changes that were actually considered safe
because they were part of the EXPERIMENTAL api set.  John M. asked me then what
I might do to improve the situation, and the following patch set is a proposal
that I've come up with.

        In thinking about the problem I identified two issues that I think we
can improve on in this area:

1) Make experimental api calls more visible in the source code.  That is to say,
when reviewing patches, it would be nice to have some sort of visual reference
that indicates that the changes being made are part of an experimental API and
therefore ABI concerns need not be addressed.

2) Make experimental api usage more visible to consumers of the DPDK, so that
they can make a more informed decision about the APIs they consume in their
application.  We make an effort to document all the experimental APIs, but
there is no guarantee that a user will check the documentation before making use
of a new library.

This patch set attempts to achieve both of the above goals.  To do this I've
added an __experimental macro tag, suitable for inserting into api forward
declarations and definitions.

The presence of the tag in the header and c files where the api code resides
increases the likelihood that any patch submitted against them will include the
tag in the context, making it clear to reviewers that ABI stability isn't a
concern here.

Also, this tag marks each function it is used on with an attribute causing any
use of the function to emit a warning during the build, with a message
indicating that the API call in question is not yet part of the stable
interface.  Developers can then make an informed decision to suppress that
warning or not.

Because there is internal use of several experimental APIs, this set also
includes a new override macro ALLOW_EXPERIMENTAL_API to automatically
suppress these warnings.  I think it's fair to assume that, for internal use, we
almost always want to suppress these warnings, as by definition any change to
the apis (even their removal) must be done in parallel with an appropriate
change in the calling locations, lest the dpdk build itself break.

Neil

---
v5 Changes
* Clean ups suggested by Thomas

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [[PATCH v5] 5/5] doc: Add ABI __experimental tag documentation
  2018-01-22  1:48  4% ` [dpdk-dev] [PATCH 0/5] dpdk: enhance EXPERIMENTAL api tagging Neil Horman
  2018-01-22  1:48  4%   ` [dpdk-dev] [[PATCH v5] 1/5] buildtools: Add tool to check EXPERIMENTAL api exports Neil Horman
  2018-01-22  1:48  5%   ` [dpdk-dev] [[PATCH v5] 2/5] compat: Add __rte_experimental macro Neil Horman
@ 2018-01-22  1:48 10%   ` Neil Horman
  2018-01-23 10:35  4%     ` Mcnamara, John
  2 siblings, 1 reply; 200+ results
From: Neil Horman @ 2018-01-22  1:48 UTC (permalink / raw)
  To: dev; +Cc: Neil Horman, Thomas Monjalon, Mcnamara, John, Bruce Richardson

Document the need to add the __experimental tag to appropriate functions

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Thomas Monjalon <thomas@monjalon.net>
CC: "Mcnamara, John" <john.mcnamara@intel.com>
CC: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/contributing/versioning.rst | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/doc/guides/contributing/versioning.rst b/doc/guides/contributing/versioning.rst
index 400090628..b4de9ed04 100644
--- a/doc/guides/contributing/versioning.rst
+++ b/doc/guides/contributing/versioning.rst
@@ -50,6 +50,20 @@ those new APIs and start finding issues with them, new DPDK APIs will be
 automatically marked as ``experimental`` to allow for a period of stabilization
 before they become part of a tracked ABI.
 
+Note that marking an API as experimental is a multi step process.  To mark an API
+as experimental, the symbols which are desired to be exported must be placed in
+an EXPERIMENTAL version block in the corresponding libraries' version map
+script. Secondly, the corresponding definitions of those exported functions, and
+their forward declarations (in the development header files), must be marked
+with the __rte_experimental tag (see rte_compat.h).  The DPDK build makefiles
+perform a check to ensure that the map file and the C code reflect the same
+list of symbols.  This check can be circumvented by defining
+ALLOW_EXPERIMENTAL_API during compilation in the corresponding library Makefile.
+
+In addition to tagging the code with __rte_experimental, the
+doxygen markup must also contain the EXPERIMENTAL string, and the MAINTAINER
+file should note that the library contains EXPERIMENTAL APIs.
+
 ABI versions, once released, are available until such time as their
 deprecation has been noted in the Release Notes for at least one major release
 cycle. For example consider the case where the ABI for DPDK 2.0 has been
-- 
2.14.3

^ permalink raw reply	[relevance 10%]

* [dpdk-dev] [[PATCH v5] 1/5] buildtools: Add tool to check EXPERIMENTAL api exports
  2018-01-22  1:48  4% ` [dpdk-dev] [PATCH 0/5] dpdk: enhance EXPERIMENTAL api tagging Neil Horman
@ 2018-01-22  1:48  4%   ` Neil Horman
  2018-01-22  1:48  5%   ` [dpdk-dev] [[PATCH v5] 2/5] compat: Add __rte_experimental macro Neil Horman
  2018-01-22  1:48 10%   ` [dpdk-dev] [[PATCH v5] 5/5] doc: Add ABI __experimental tag documentation Neil Horman
  2 siblings, 0 replies; 200+ results
From: Neil Horman @ 2018-01-22  1:48 UTC (permalink / raw)
  To: dev; +Cc: Neil Horman, Thomas Monjalon, Mcnamara, John, Bruce Richardson

This tool reads the given version map for a directory, and checks to
ensure that, for each symbol listed in the export list, the corresponding
definition is tagged as __rte_experimental, erroring out if it's not.  In this
way, we can ensure that the EXPERIMENTAL api is kept in sync with the tags.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Thomas Monjalon <thomas@monjalon.net>
CC: "Mcnamara, John" <john.mcnamara@intel.com>
CC: Bruce Richardson <bruce.richardson@intel.com>
---
 MAINTAINERS                           |  1 +
 buildtools/check-experimental-syms.sh | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)
 create mode 100755 buildtools/check-experimental-syms.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index b51c2d096..446d2545d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -77,6 +77,7 @@ M: Neil Horman <nhorman@tuxdriver.com>
 F: lib/librte_compat/
 F: doc/guides/rel_notes/deprecation.rst
 F: devtools/validate-abi.sh
+F: buildtools/check-experimental-syms.sh
 
 Driver information
 F: buildtools/pmdinfogen/
diff --git a/buildtools/check-experimental-syms.sh b/buildtools/check-experimental-syms.sh
new file mode 100755
index 000000000..7d21de35c
--- /dev/null
+++ b/buildtools/check-experimental-syms.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+# SPDX-License-Identifier: BSD-3-Clause
+
+MAPFILE=$1
+OBJFILE=$2
+
+if [ -d $MAPFILE ]
+then
+	exit 0
+fi
+
+for i in `awk 'BEGIN {found=0}
+		/.*EXPERIMENTAL.*/ {found=1}
+		/.*}.*;/ {found=0}
+		/.*;/ {if (found == 1) print $1}' $MAPFILE`
+do
+	SYM=`echo $i | sed -e"s/;//"`
+	objdump -t $OBJFILE | grep -q "\.text.*$SYM"
+	IN_TEXT=$?
+	objdump -t $OBJFILE | grep -q "\.text\.experimental.*$SYM"
+	IN_EXP=$?
+	if [ $IN_TEXT -eq 0 -a $IN_EXP -ne 0 ]
+	then
+		echo "$SYM is not flagged as experimental"
+		echo "but is listed in version map"
+		echo "Please add __rte_experimental to the definition of $SYM"
+		exit 1
+	fi
+done
+exit 0
+
-- 
2.14.3
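
The symbol-extraction step of the script above can be tried on its own.
The following is a minimal sketch, not part of the patch: the helper name
list_experimental_syms and the sample map contents are made up for
illustration, and the awk patterns are the patch's patterns with the
redundant leading and trailing ".*" dropped.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): print the symbols listed in the
# EXPERIMENTAL block of a linker version map, one per line.
list_experimental_syms() {
	awk 'BEGIN {found=0}
		/EXPERIMENTAL/ {found=1}
		/}.*;/ {found=0}
		/;/ {if (found == 1) print $1}' "$1" | sed -e 's/;//'
}

# Made-up sample map, written to a temporary file that is removed on exit.
map=$(mktemp "${TMPDIR:-/tmp}/sample_map.XXXXXX") || exit 1
trap 'rm -f "$map"' EXIT

cat > "$map" <<'EOF'
DPDK_17.11 {
	global:

	rte_stable_fn;

	local: *;
};

EXPERIMENTAL {
	global:

	rte_exp_one;
	rte_exp_two;
};
EOF

syms=$(list_experimental_syms "$map")
printf '%s\n' "$syms"
```

Note that any line ending in ";" inside the EXPERIMENTAL block is reported,
so a "local: *;" entry placed in that block would show up as a symbol named
"local:".  The sample map avoids that case on purpose.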

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [[PATCH v5] 2/5] compat: Add __rte_experimental macro
  2018-01-22  1:48  4% ` [dpdk-dev] [PATCH 0/5] dpdk: enhance EXPERIMENTAL api tagging Neil Horman
  2018-01-22  1:48  4%   ` [dpdk-dev] [[PATCH v5] 1/5] buildtools: Add tool to check EXPERIMENTAL api exports Neil Horman
@ 2018-01-22  1:48  5%   ` Neil Horman
  2018-01-22  1:48 10%   ` [dpdk-dev] [[PATCH v5] 5/5] doc: Add ABI __experimental tag documentation Neil Horman
  2 siblings, 0 replies; 200+ results
From: Neil Horman @ 2018-01-22  1:48 UTC (permalink / raw)
  To: dev; +Cc: Neil Horman, Thomas Monjalon, Mcnamara, John, Bruce Richardson

The __rte_experimental macro tags a given exported function as being part of
the EXPERIMENTAL api.  Use of this tag will cause any caller of the
function (that isn't removed by dead code elimination) to emit a warning
that the user is making use of an API whose stability isn't guaranteed.
It also places the function in the .text.experimental section, which is
used to validate the tag against the corresponding library version map.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Thomas Monjalon <thomas@monjalon.net>
CC: "Mcnamara, John" <john.mcnamara@intel.com>
CC: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_compat/rte_compat.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/lib/librte_compat/rte_compat.h b/lib/librte_compat/rte_compat.h
index 41e8032ba..ad8f81ffe 100644
--- a/lib/librte_compat/rte_compat.h
+++ b/lib/librte_compat/rte_compat.h
@@ -101,5 +101,16 @@
  */
 #endif
 
+#ifndef ALLOW_EXPERIMENTAL_API
 
+#define __rte_experimental \
+__attribute__((deprecated("Symbol is not yet part of stable ABI"), \
+section(".text.experimental")))
+
+#else
+
+#define __rte_experimental \
+__attribute__((section(".text.experimental")))
+
+#endif
 #endif /* _RTE_COMPAT_H_ */
-- 
2.14.3

^ permalink raw reply	[relevance 5%]

* Re: [dpdk-dev] [PATCH v2] checkpatches.sh: Add checks for ABI symbol addition
  2018-01-21 20:29  9%   ` Thomas Monjalon
@ 2018-01-22  1:54  4%     ` Neil Horman
  2018-01-22  2:05  4%       ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2018-01-22  1:54 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, john.mcnamara, bruce.richardson, Ferruh Yigit, Stephen Hemminger

On Sun, Jan 21, 2018 at 09:29:18PM +0100, Thomas Monjalon wrote:
> Hi,
> 
> 16/01/2018 19:22, Neil Horman:
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> >  Developers and Maintainers Tools
> >  M: Thomas Monjalon <thomas@monjalon.net>
> > +M: Neil Horman <nhorman@tuxdriver.com>
> >  F: MAINTAINERS
> >  F: devtools/check-dup-includes.sh
> >  F: devtools/check-maintainers.sh
> > @@ -52,6 +53,7 @@ F: devtools/get-maintainer.sh
> >  F: devtools/git-log-fixes.sh
> >  F: devtools/load-devel-config
> >  F: devtools/test-build.sh
> > +F: devtools/validate-new-api.sh
> >  F: license/
> 
> I really think it should be in the section "ABI versioning""
> 
I can do that.

> > --- a/devtools/checkpatches.sh
> > +++ b/devtools/checkpatches.sh
> > +export VALIDATE_NEW_API=$(dirname $(readlink -e $0))/validate-new-api.sh
> 
> Why export?
> 
As I recall I had needed that in an earlier incarnation of this script, but it's
likely not needed any longer.

> >  print_usage () {
> >  	cat <<- END_OF_HELP
> > +	$(dirname $0)
> >  	usage: $(basename $0) [-q] [-v] [-nX|patch1 [patch2] ...]]
> 
> This dirname is a debug leftover?
Yes.

> 
> > @@ -96,9 +100,25 @@ check () { # <patch> <commit> <title>
> >  	else
> >  		report=$($DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
> >  	fi
> > -	[ $? -ne 0 ] || return 0
> > +	reta=$?
> 
> What means reta?
> 
just a subindex on a return code.

> > +
> >  	$verbose || printf '\n### %s\n\n' "$3"
> >  	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
> > +
> > +	echo
> > +	echo "Checking API additions/removals:"
> 
> You should respect $verbose before printing such header.
> 
I can add a quiet/verbose mode option, but I didn't think it was needed here
since it's being run automatically from within checkpatches.

> > +	if [ -n "$1" ] ; then
> > +		report=$($VALIDATE_NEW_API $1)
> > +	elif [ -n "$2" ] ; then
> > +		report=$(git format-patch \
> > +			 --find-renames --no-stat --stdout -1 $commit |
> > +			$VALIDATE_NEW_API -)
> > +	else
> > +		report=$($VALIDATE_NEW_API -)
> > +	fi
> > +	[ $? -ne 0 -o $reta -ne 0 ] || return 0
> > +	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
> > +
> >  	status=$(($status + 1))
> >  }
> 
> > --- /dev/null
> > +++ b/devtools/validate-new-api.sh
> 
> About the file name, is it only for new API?
> You don't like check-symbol-change.sh ?
> It may be stupid, but I thought "validate" is more related to full test,
> like validate-abi.sh does for the ABI, and "check" can be a partial
> test like done in checkpatches.sh.
> 
I can change the name, but to answer your question, it's really meant to validate
any changes to a version map, so a change.sh suffix might be more appropriate.

> > +		}' > ./$mapdb
> > +
> > +		sort -u $mapdb > ./$mapdb.2
> > +		mv -f $mapdb.2 $mapdb
> [...]
> > +mapfile=`mktemp mapdb.XXXXXX`
> [...]
> > +rm -f $mapfile
> 
> If you create temporary file, you should remove it in a trap cleanup,
> in case of interrupted processing.
> The best is to avoid temp file, but use variables instead.
I'm not going to be able to avoid a temp file, since the number of changes in a
map is indeterminate.  I can trap and clean up the temp files though.

I'm still in transit, so it will likely be a few days before I can get to this.

Neil
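
For reference, the trap-based cleanup discussed above could look roughly like
this.  It is only a sketch under assumptions: the mapdb name follows the
quoted patch, while the cleanup function, the use of TMPDIR and the sample
records are invented for the example and are not taken from
validate-new-api.sh.

```shell
#!/bin/sh
# Create the temporary database outside the source tree (assumed location).
mapdb=$(mktemp "${TMPDIR:-/tmp}/mapdb.XXXXXX") || exit 1

# Remove the temp file and its sorted sibling however the script ends.
cleanup() {
	rm -f "$mapdb" "$mapdb.2"
}
trap cleanup EXIT
# Turn an interrupt into a normal exit so that the EXIT trap above runs.
trap 'exit 1' HUP INT TERM

# Placeholder records standing in for the parsed version map changes.
printf '%s\n' 'rte_b' 'rte_a' 'rte_b' > "$mapdb"

# Same de-duplication step as in the quoted script.
sort -u "$mapdb" > "$mapdb.2"
mv -f "$mapdb.2" "$mapdb"

count=$(wc -l < "$mapdb" | tr -d ' ')
echo "unique records: $count"
```

Routing HUP, INT and TERM through "exit" keeps the cleanup logic in one
place, since the EXIT trap then covers both normal and interrupted runs.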

> 

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [dpdk-announce] release candidate 18.02-rc1
@ 2018-01-22  2:02  3% Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-01-22  2:02 UTC (permalink / raw)
  To: announce

A new DPDK release candidate is ready for testing:
	http://dpdk.org/browse/dpdk/tag/?id=v18.02-rc1

Special attention was paid to not break the ABI in this release.
It means 18.02 could replace 17.11 without rebuilding the applications.
However it is advised to keep using 17.11 LTS for long term deployments.

The release notes are not complete yet:
	http://dpdk.org/doc/guides/rel_notes/release_18_02.html
Some highlights:
	- new license header (SPDX tag)
	- bbdev (Wireless Base Band) device class
	- ethdev probe notifications and port ownership
	- Hyper-V platform driver
	- AVF (Adaptive Virtual Function) ethdev driver
	- IPsec offload in DPAA
	- DPAA eventdev driver
	- OPDL (Ordered Packet Distribution Library) eventdev driver

Some features may be integrated in 18.02-rc2:
	- rawdev
	- hotplug events API
	- AMD drivers
	- meson build system

Some planned features are postponed to 18.05:
	- meter profile
	- new devargs syntax
	- multi-process channel
	- dynamic memory management
	- VF representor
	- vhost-pci
	- virtio-crypto

The planned release date for 18.02 is in two weeks.
We are very short on time. It will probably be a bit late,
but most changes must be done before the Chinese New Year holidays.

Please start testing and fixing bugs now.
When finding a new bug, please report it on bugzilla:
	http://dpdk.org/tracker/

Thank you everyone

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2] checkpatches.sh: Add checks for ABI symbol addition
  2018-01-22  1:54  4%     ` Neil Horman
@ 2018-01-22  2:05  4%       ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-01-22  2:05 UTC (permalink / raw)
  To: Neil Horman
  Cc: dev, john.mcnamara, bruce.richardson, Ferruh Yigit, Stephen Hemminger

22/01/2018 02:54, Neil Horman:
> On Sun, Jan 21, 2018 at 09:29:18PM +0100, Thomas Monjalon wrote:
> > >  	$verbose || printf '\n### %s\n\n' "$3"
> > >  	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
> > > +
> > > +	echo
> > > +	echo "Checking API additions/removals:"
> > 
> > You should respect $verbose before printing such header.
> > 
> I can add a quiet/verbose mode option, but I didn't think it was needed here
> since its being run automatically from within checkpatches.

I mean there is a verbose option already.
So you just have to take it into account when printing.
Thanks
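
A rough sketch of what taking the existing option into account could mean
for the new header.  It assumes that $verbose holds the command name "true"
or "false", as the quoted `$verbose || printf ...` line suggests, and that
the header is only wanted in verbose mode.  The function name
print_api_header is hypothetical.

```shell
#!/bin/sh
# Assumed convention: verbose is "true" or "false" and is run as a command.
verbose=false

# Hypothetical helper: print the section header only in verbose mode.
print_api_header() {
	if $verbose; then
		printf '%s\n' 'Checking API additions/removals:'
	fi
}

# Run the helper in both modes; each subshell gets its own value of verbose.
quiet_out=$(verbose=false; print_api_header)
loud_out=$(verbose=true; print_api_header)

printf 'default: [%s]\n' "$quiet_out"
printf 'verbose: [%s]\n' "$loud_out"
```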

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH 1/2] lib/cryptodev: add support to set session private data
  2018-01-18  6:52  4%                               ` Gujjar, Abhinandan S
@ 2018-01-22  6:51  0%                                 ` Gujjar, Abhinandan S
  0 siblings, 0 replies; 200+ results
From: Gujjar, Abhinandan S @ 2018-01-22  6:51 UTC (permalink / raw)
  To: 'Akhil Goyal',
	De Lara Guarch, Pablo, Doherty, Declan, 'Jacob, Jerin'
  Cc: 'dev@dpdk.org', Vangati, Narender, Rao, Nikhil

Hi All,

> -----Original Message-----
> From: Gujjar, Abhinandan S
> Sent: Thursday, January 18, 2018 12:22 PM
> To: Akhil Goyal <akhil.goyal@nxp.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Doherty, Declan
> <declan.doherty@intel.com>; Jacob, Jerin
> <Jerin.JacobKollanukkaran@cavium.com>
> Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>; Rao,
> Nikhil <nikhil.rao@intel.com>
> Subject: RE: [PATCH 1/2] lib/cryptodev: add support to set session private data
> 
> Hi Akhil,
> 
> > -----Original Message-----
> > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > Sent: Wednesday, January 17, 2018 4:23 PM
> > To: Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; De Lara
> > Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Doherty, Declan
> > <declan.doherty@intel.com>; Jacob, Jerin
> > <Jerin.JacobKollanukkaran@cavium.com>
> > Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>; Rao,
> > Nikhil <nikhil.rao@intel.com>
> > Subject: Re: [PATCH 1/2] lib/cryptodev: add support to set session
> > private data
> >
> > Hi Abhinandan,
> > On 1/17/2018 3:35 PM, Gujjar, Abhinandan S wrote:
> > > Hi Akhil,
> > >
> > >> -----Original Message-----
> > >> From: De Lara Guarch, Pablo
> > >> Sent: Wednesday, January 17, 2018 3:16 PM
> > >> To: Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Akhil Goyal
> > >> <akhil.goyal@nxp.com>; Doherty, Declan <declan.doherty@intel.com>;
> > >> Jacob, Jerin <Jerin.JacobKollanukkaran@cavium.com>
> > >> Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>;
> > >> Rao, Nikhil <nikhil.rao@intel.com>
> > >> Subject: RE: [PATCH 1/2] lib/cryptodev: add support to set session
> > >> private data
> > >>
> > >> Hi Abhinandan,
> > >>
> > >>> -----Original Message-----
> > >>> From: Gujjar, Abhinandan S
> > >>> Sent: Wednesday, January 17, 2018 6:35 AM
> > >>> To: Akhil Goyal <akhil.goyal@nxp.com>; Doherty, Declan
> > >>> <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > >>> <pablo.de.lara.guarch@intel.com>; Jacob, Jerin
> > >>> <Jerin.JacobKollanukkaran@cavium.com>
> > >>> Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>;
> > >>> Rao, Nikhil <nikhil.rao@intel.com>
> > >>> Subject: RE: [PATCH 1/2] lib/cryptodev: add support to set session
> > >>> private data
> > >>>
> > >>> Hi Akhil,
> > >>>
> > >>
> > >> ...
> > >>
> > >>> I guess, you are suggesting below changes:
> > >>> diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> > >>> b/lib/librte_cryptodev/rte_cryptodev.h
> > >>> index 56958a6..057c39a 100644
> > >>> --- a/lib/librte_cryptodev/rte_cryptodev.h
> > >>> +++ b/lib/librte_cryptodev/rte_cryptodev.h
> > >>> @@ -892,6 +892,8 @@ struct rte_cryptodev_data {
> > >>>
> > >>>   /** Cryptodev symmetric crypto session */  struct
> > >>> rte_cryptodev_sym_session {
> > >>> +       uint16_t private_data_offset;
> > >>> +       /**< Private data offset */
> > >>>          __extension__ void *sess_private_data[0];
> > >>>          /**< Private session material */  }; I am ok with this.
> > >>>
> > >>> Declan/Pablo,
> > >>> Is this ok? Do you see any impact on performance or anything else
> > >>> has to be considered?
> > >>
> > >> This is breaking ABI, and since there is a zero length array, this
> > >> latter has to be at the end of the structure.
> > >> Therefore, this is not a valid option unless ABI deprecation is
> > >> announced and then it could be merged in the next release.
> > > What is your opinion on this?
> > > Should we consider retaining the enum rte_crypto_op_private_data_type?
> >
> > As per our previous discussion we are anyway pushing crypto adapter to
> > next release, then we do have time for the deprecation notice to be sent.
> Not sure, it is really worth breaking ABI or have an enum.
> > Or you can reserve the first byte of private data (internal to
> > library) in the session to check whether the private data is valid or not.
> Regarding reserving the first byte which validates the rest of the metadata data,
> unless this byte is also included part of rte_cryptodev_sym_session_create()
> i.e.
> memset(sess, 0, (sizeof(void *) * nb_drivers + private_data_flag)); and in
> rte_cryptodev_get_header_session_size(void)
> {
> 	/*
> 	 * Header contains pointers to the private data
> 	 * of all registered drivers
> 	 */
> 	return (sizeof(void *) * nb_drivers + private_data_flag); } Without above
> changes, the flag content can't be just trusted. Do you agree?
I will send the next patch based on the above approach.
> 
> Pablo/Declan,
> Hope the changes are ok? ABI breakage or anything has to be considered again?
> >
> > IMO, private data offset in session is a better approach instead of
> > adding one more enum. Others can suggest.
> @Others, please provide your inputs so that I can prepare the next patch.
> 
> -Abhinandan
> >
> > -Akhil
> > >>
> > >> Pablo
> > > Abhinandan
> > >


^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] build: make compat a universal dependency
@ 2018-01-22 15:42  3% Bruce Richardson
  2018-01-22 17:43  0% ` Luca Boccassi
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2018-01-22 15:42 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson

By making "compat" lib (which consists of a header only) a dependency of
the EAL, we make the header file available to all other libs, drivers and
apps, and thereby make it less work to do ABI versioning.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 drivers/net/bonding/meson.build    | 2 +-
 lib/librte_distributor/meson.build | 2 +-
 lib/librte_eal/meson.build         | 1 +
 lib/librte_ether/meson.build       | 2 +-
 lib/librte_hash/meson.build        | 2 +-
 lib/librte_lpm/meson.build         | 1 -
 6 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/meson.build b/drivers/net/bonding/meson.build
index 4dc5a5f67..b90abc6de 100644
--- a/drivers/net/bonding/meson.build
+++ b/drivers/net/bonding/meson.build
@@ -6,6 +6,6 @@ sources = files('rte_eth_bond_api.c', 'rte_eth_bond_pmd.c',
 	'rte_eth_bond_args.c', 'rte_eth_bond_8023ad.c', 'rte_eth_bond_alb.c')
 
 deps += 'sched' # needed for rte_bitmap.h
-deps += ['compat', 'ip_frag', 'cmdline']
+deps += ['ip_frag', 'cmdline']
 
 install_headers('rte_eth_bond.h', 'rte_eth_bond_8023ad.h')
diff --git a/lib/librte_distributor/meson.build b/lib/librte_distributor/meson.build
index e9caf8675..dba7e3b2a 100644
--- a/lib/librte_distributor/meson.build
+++ b/lib/librte_distributor/meson.build
@@ -8,4 +8,4 @@ else
 	sources += files('rte_distributor_match_generic.c')
 endif
 headers = files('rte_distributor.h')
-deps += ['mbuf', 'compat']
+deps += ['mbuf']
diff --git a/lib/librte_eal/meson.build b/lib/librte_eal/meson.build
index 9c141b3c3..9f5f0f3ed 100644
--- a/lib/librte_eal/meson.build
+++ b/lib/librte_eal/meson.build
@@ -45,6 +45,7 @@ else
 endif
 
 version = 6  # the version of the EAL API
+deps += 'compat'
 cflags += '-D_GNU_SOURCE'
 sources = common_sources + env_sources
 objs = common_objs + env_objs
diff --git a/lib/librte_ether/meson.build b/lib/librte_ether/meson.build
index f83991268..18f94bf96 100644
--- a/lib/librte_ether/meson.build
+++ b/lib/librte_ether/meson.build
@@ -21,4 +21,4 @@ headers = files('rte_ethdev.h',
 	'rte_tm.h',
 	'rte_tm_driver.h')
 
-deps += ['net', 'compat']
+deps += ['net']
diff --git a/lib/librte_hash/meson.build b/lib/librte_hash/meson.build
index 8e1113789..e139e1d76 100644
--- a/lib/librte_hash/meson.build
+++ b/lib/librte_hash/meson.build
@@ -14,4 +14,4 @@ headers = files('rte_cmp_arm64.h',
 	'rte_thash.h')
 
 sources = files('rte_cuckoo_hash.c', 'rte_fbk_hash.c')
-deps += ['ring', 'compat']
+deps += ['ring']
diff --git a/lib/librte_lpm/meson.build b/lib/librte_lpm/meson.build
index a7c7fa7ae..067849427 100644
--- a/lib/librte_lpm/meson.build
+++ b/lib/librte_lpm/meson.build
@@ -7,4 +7,3 @@ headers = files('rte_lpm.h', 'rte_lpm6.h')
 # since header files have different names, we can install all vector headers
 # without worrying about which architecture we actually need
 headers += files('rte_lpm_altivec.h', 'rte_lpm_neon.h', 'rte_lpm_sse.h')
-deps += ['compat']
-- 
2.14.3

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] build: make compat a universal dependency
  2018-01-22 15:42  3% [dpdk-dev] [PATCH] build: make compat a universal dependency Bruce Richardson
@ 2018-01-22 17:43  0% ` Luca Boccassi
  2018-01-23  9:26  0%   ` Bruce Richardson
  2018-01-23 10:00  0%   ` Bruce Richardson
  0 siblings, 2 replies; 200+ results
From: Luca Boccassi @ 2018-01-22 17:43 UTC (permalink / raw)
  To: Bruce Richardson, dev

On Mon, 2018-01-22 at 15:42 +0000, Bruce Richardson wrote:
> By making "compat" lib (which consists of a header only) a dependency
> of
> the EAL, we make the header file available to all other libs, drivers
> and
> apps, and thereby make it less work to do ABI versioning.
> 
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  drivers/net/bonding/meson.build    | 2 +-
>  lib/librte_distributor/meson.build | 2 +-
>  lib/librte_eal/meson.build         | 1 +
>  lib/librte_ether/meson.build       | 2 +-
>  lib/librte_hash/meson.build        | 2 +-
>  lib/librte_lpm/meson.build         | 1 -
>  6 files changed, 5 insertions(+), 5 deletions(-)

Acked-by: Luca Boccassi <bluca@debian.org>

How's the Meson patchset looking for 18.02? What's on the TODO list?

-- 
Kind regards,
Luca Boccassi

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] build: make compat a universal dependency
  2018-01-22 17:43  0% ` Luca Boccassi
@ 2018-01-23  9:26  0%   ` Bruce Richardson
  2018-01-23 10:00  0%   ` Bruce Richardson
  1 sibling, 0 replies; 200+ results
From: Bruce Richardson @ 2018-01-23  9:26 UTC (permalink / raw)
  To: Luca Boccassi; +Cc: dev

On Mon, Jan 22, 2018 at 05:43:06PM +0000, Luca Boccassi wrote:
> On Mon, 2018-01-22 at 15:42 +0000, Bruce Richardson wrote:
> > By making "compat" lib (which consists of a header only) a dependency
> > of
> > the EAL, we make the header file available to all other libs, drivers
> > and
> > apps, and thereby make it less work to do ABI versioning.
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> >  drivers/net/bonding/meson.build    | 2 +-
> >  lib/librte_distributor/meson.build | 2 +-
> >  lib/librte_eal/meson.build         | 1 +
> >  lib/librte_ether/meson.build       | 2 +-
> >  lib/librte_hash/meson.build        | 2 +-
> >  lib/librte_lpm/meson.build         | 1 -
> >  6 files changed, 5 insertions(+), 5 deletions(-)
> 
> Acked-by: Luca Boccassi <bluca@debian.org>
> 
> How's the Meson patchset looking for 18.02? What's on the TODO list?
> 
Since it's going in as experimental, I think the requirements for
completeness are not too strict. I'm not aware of any blocking gaps at
this stage apart from the release note update which has the patch
already submitted. I plan to submit the pull request very soon.

For 18.05, the main objective I think is to complete the build,
especially all drivers, improve the autotests, [e.g. Harry has some
ideas of splitting out the performance tests into a benchmarking
target], get the docs building [I see Kevin sent out a patch for that
already], and a few cleanups. I'm hoping having the build merged in as
experimental will help encourage maintainers to port their components
over, if not already done, and it will certainly help with maintenance -
the amount of files added, renamed or moved in a release astounded me!

/Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] build: make compat a universal dependency
  2018-01-22 17:43  0% ` Luca Boccassi
  2018-01-23  9:26  0%   ` Bruce Richardson
@ 2018-01-23 10:00  0%   ` Bruce Richardson
  1 sibling, 0 replies; 200+ results
From: Bruce Richardson @ 2018-01-23 10:00 UTC (permalink / raw)
  To: Luca Boccassi; +Cc: dev

On Mon, Jan 22, 2018 at 05:43:06PM +0000, Luca Boccassi wrote:
> On Mon, 2018-01-22 at 15:42 +0000, Bruce Richardson wrote:
> > By making "compat" lib (which consists of a header only) a dependency
> > of
> > the EAL, we make the header file available to all other libs, drivers
> > and
> > apps, and thereby make it less work to do ABI versioning.
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> >  drivers/net/bonding/meson.build    | 2 +-
> >  lib/librte_distributor/meson.build | 2 +-
> >  lib/librte_eal/meson.build         | 1 +
> >  lib/librte_ether/meson.build       | 2 +-
> >  lib/librte_hash/meson.build        | 2 +-
> >  lib/librte_lpm/meson.build         | 1 -
> >  6 files changed, 5 insertions(+), 5 deletions(-)
> 
> Acked-by: Luca Boccassi <bluca@debian.org>
> 
Applied to dpdk-next-build

/Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [[PATCH v5] 5/5] doc: Add ABI __experimental tag documentation
  2018-01-22  1:48 10%   ` [dpdk-dev] [[PATCH v5] 5/5] doc: Add ABI __experimental tag documentation Neil Horman
@ 2018-01-23 10:35  4%     ` Mcnamara, John
  2018-01-29 21:42  4%       ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Mcnamara, John @ 2018-01-23 10:35 UTC (permalink / raw)
  To: Neil Horman, dev; +Cc: Thomas Monjalon, Richardson, Bruce



> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Monday, January 22, 2018 1:48 AM
> To: dev@dpdk.org
> Cc: Neil Horman <nhorman@tuxdriver.com>; Thomas Monjalon
> <thomas@monjalon.net>; Mcnamara, John <john.mcnamara@intel.com>;
> Richardson, Bruce <bruce.richardson@intel.com>
> Subject: [[PATCH v5] 5/5] doc: Add ABI __experimental tag documentation
> 
> Document the need to add the __experimental tag to appropriate functions
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Thomas Monjalon <thomas@monjalon.net>
> CC: "Mcnamara, John" <john.mcnamara@intel.com>
> CC: Bruce Richardson <bruce.richardson@intel.com>
> ...
> +Note that marking an API as experimental is a multi step process.  To
> +mark an API as experimental, the symbols which are desired to be
> +exported must be placed in an EXPERIMENTAL version block in the
> +corresponding libraries' version map script. Secondly, the
> +corresponding definitions of those exported functions, and their
> +forward declarations (in the development header files), must be marked
> +with the __rte_experimental tag (see rte_compat.h).  The DPDK build
> +makefiles perform a check to ensure that the map file and the C code
> +reflect the same list of symbols.  This check can be circumvented by
> defining ALLOW_EXPERIMENTAL_API during compilation in the corresponding
> library Makefile.
> +
> +In addition to tagging the code with __rte_experimental, the doxygen
> +markup must also contain the EXPERIMENTAL string, and the MAINTAINER
> +file should note that the library contains EXPERIMENTAL APIs.
> +
>  ABI versions, once released, are available until such time as their
> deprecation has been noted in the Release Notes for at least one major
> release  cycle. For example consider the case where the ABI for DPDK 2.0
> has been
> --
> 2.14.3

Thanks for the update, and this work in general.

The rendered docs would probably look a bit better with __rte_experimental
and ALLOW_EXPERIMENTAL_API in fixed-width backticks, ``var``, but that is only
a "nice to have" so no need for a respin.

Acked-by: John McNamara <john.mcnamara@intel.com>
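
As a rough illustration of the two-step tagging described in the quoted
patch, here is a minimal sketch. The names demo_experimental,
DEMO_ALLOW_EXPERIMENTAL_API and demo_new_feature are hypothetical stand-ins,
not the real definitions from rte_compat.h, and the attribute syntax assumes
a GCC- or Clang-compatible compiler.

```c
#include <assert.h>

/*
 * Hypothetical opt-in switch, standing in for ALLOW_EXPERIMENTAL_API.
 * Remove this define to make every use of a tagged function emit a
 * deprecation warning at compile time.
 */
#define DEMO_ALLOW_EXPERIMENTAL_API

#ifdef DEMO_ALLOW_EXPERIMENTAL_API
#define demo_experimental
#else
#define demo_experimental \
	__attribute__((deprecated("Symbol is not yet part of stable ABI")))
#endif

/* Forward declaration, as it would appear in a development header. */
demo_experimental int demo_new_feature(int value);

/* Definition: carries the same tag as the declaration. */
demo_experimental int
demo_new_feature(int value)
{
	assert(value >= 0);	/* illustrative precondition only */
	return value * 2;
}
```

The other half of the process, listing the symbol in the EXPERIMENTAL
version block of the library's version map, is linker-script syntax and is
therefore not shown here.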

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes
  2018-01-18 10:32 13% ` [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes Anatoly Burakov
@ 2018-01-23 10:36  0%   ` Mcnamara, John
  2018-02-05 11:47  0%   ` Bruce Richardson
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Mcnamara, John @ 2018-01-23 10:36 UTC (permalink / raw)
  To: Burakov, Anatoly, dev; +Cc: Neil Horman, Kovacevic, Marko



> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Thursday, January 18, 2018 10:32 AM
> To: dev@dpdk.org
> Cc: Neil Horman <nhorman@tuxdriver.com>; Mcnamara, John
> <john.mcnamara@intel.com>; Kovacevic, Marko <marko.kovacevic@intel.com>
> Subject: [PATCH v2] doc: add deprecation notice for memory hotplug changes
> 
> Due to coming changes outlined in memory hotplug RFC, there will be
> several API/ABI changes.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>

Acked-by: John McNamara <john.mcnamara@intel.com>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: add ABI change notice for numa_node_count in eal
  @ 2018-01-23 10:39  4%   ` Mcnamara, John
  2018-02-07 10:10  4%     ` Jerin Jacob
  2018-02-12 16:00  4%   ` Jonas Pfefferle
  1 sibling, 1 reply; 200+ results
From: Mcnamara, John @ 2018-01-23 10:39 UTC (permalink / raw)
  To: Burakov, Anatoly, dev; +Cc: Neil Horman, Kovacevic, Marko



> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Tuesday, January 16, 2018 5:54 PM
> To: dev@dpdk.org
> Cc: Neil Horman <nhorman@tuxdriver.com>; Mcnamara, John
> <john.mcnamara@intel.com>; Kovacevic, Marko <marko.kovacevic@intel.com>
> Subject: [PATCH] doc: add ABI change notice for numa_node_count in eal
> 
> There will be a new function added in v18.05 that will return number of
> detected sockets, which will change the ABI.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>  doc/guides/rel_notes/deprecation.rst | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index 13e8543..9662150 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -8,6 +8,8 @@ API and ABI deprecation notices are to be posted here.
>  Deprecation Notices
>  -------------------
> 
> +* eal: new ``numa_node_count`` member will be added to ``rte_config``
> +structure in v18.05.
>  * eal: several API and ABI changes are planned for ``rte_devargs`` in  v18.02.

In general it is best to leave a blank line between the bullet points. But that
doesn't affect the rendering so:

Acked-by: John McNamara <john.mcnamara@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [RFC v2 00/17] mempool: add bucket mempool driver
    @ 2018-01-23 13:15  2% ` Andrew Rybchenko
  2018-01-31 16:44  0%   ` Olivier Matz
  2018-03-10 15:39  3%   ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Andrew Rybchenko
  1 sibling, 2 replies; 200+ results
From: Andrew Rybchenko @ 2018-01-23 13:15 UTC (permalink / raw)
  To: dev
  Cc: Olivier Matz, Santosh Shukla, Jerin Jacob, Hemant Agrawal,
	Shreyansh Jain

The patch series starts from generic enhancements suggested by Olivier.
Basically it adds driver callbacks to calculate required memory size and
to populate objects using a provided memory area. This makes it possible to
remove the so-called capability flags used before to tell generic code how to
allocate and slice allocated memory into mempool objects.
The cleanup which removes get_capabilities and register_memory_area is
not strictly required, but I think it is the right thing to do.
Existing mempool drivers are updated.
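
To make the idea of the two callbacks concrete, here is a minimal sketch
under stated assumptions. All names (demo_pool, demo_pool_ops,
demo_pool_fill) and signatures are hypothetical and do not reproduce the
rte_mempool_ops layout from the patches; the default callbacks simply treat
the memory area as a dense array of elements.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_pool;

/* Hypothetical driver callbacks replacing the old capability flags. */
struct demo_pool_ops {
	/* Bytes required to store obj_num objects for this driver. */
	size_t (*calc_mem_size)(const struct demo_pool *mp, uint32_t obj_num);
	/* Slice [vaddr, vaddr + len) into objects; returns objects added. */
	uint32_t (*populate)(struct demo_pool *mp, uint32_t max_objs,
			     char *vaddr, size_t len);
};

struct demo_pool {
	size_t elt_size;	/* total size of one element */
	uint32_t populated;	/* objects added so far */
	const struct demo_pool_ops *ops;
};

static size_t
demo_default_calc_mem_size(const struct demo_pool *mp, uint32_t obj_num)
{
	return (size_t)obj_num * mp->elt_size;
}

static uint32_t
demo_default_populate(struct demo_pool *mp, uint32_t max_objs,
		      char *vaddr, size_t len)
{
	uint32_t n = 0;
	size_t off = 0;

	while (n < max_objs && off + mp->elt_size <= len) {
		/* A real driver would enqueue (vaddr + off) here. */
		(void)(vaddr + off);
		off += mp->elt_size;
		n++;
	}
	mp->populated += n;
	return n;
}

static const struct demo_pool_ops demo_default_ops = {
	.calc_mem_size = demo_default_calc_mem_size,
	.populate = demo_default_populate,
};

/* Generic code: asks the driver for its needs instead of testing flags. */
static uint32_t
demo_pool_fill(struct demo_pool *mp, uint32_t obj_num,
	       char *mem, size_t mem_len)
{
	if (mp->ops->calc_mem_size(mp, obj_num) > mem_len)
		return 0;	/* provided area is too small */
	return mp->ops->populate(mp, obj_num, mem, mem_len);
}
```

The point of the design is that the generic layer no longer needs to know
how a driver lays out its objects; a driver with alignment or block
constraints would only override the two callbacks.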

I've kept rte_mempool_populate_iova_tab() intact since it does not seem to
be directly related to the XMEM API functions.

The patch series adds a bucket mempool driver which makes it possible to
allocate (both physically and virtually) contiguous blocks of objects and adds
a mempool API to do it. It is still capable of providing separate objects,
but it is definitely more heavyweight than the ring/stack drivers.
The driver will be used by future Solarflare driver enhancements
which make use of physically contiguous blocks in the NIC
hardware/firmware.

The target usecase is dequeue in blocks and enqueue separate objects
back (which are collected in buckets to be dequeued). So, the memory
pool with bucket driver is created by an application and provided to
networking PMD receive queue. The choice of bucket driver is done using
rte_eth_dev_pool_ops_supported(). A PMD that relies upon contiguous
block allocation should report the bucket driver as the only supported
and preferred one.

Introduction of the contiguous block dequeue operation is supported by
performance measurements using autotest with minor enhancements:
 - in the original test bulks are powers of two, which is unacceptable
   for us, so they are changed to multiples of contig_block_size;
 - the test code is duplicated to support plain dequeue and
   dequeue_contig_blocks;
 - all the extra test variations (with/without cache etc) are eliminated;
 - a fake read from the dequeued buffer is added (in both cases) to
   simulate mbufs access.

start performance test for bucket (without cache)
mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Srate_persec=   111935488
mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Srate_persec=   115290931
mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Srate_persec=   353055539
mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Srate_persec=   353330790
mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Srate_persec=   224657407
mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Srate_persec=   230411468
mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Srate_persec=   706700902
mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Srate_persec=   703673139
mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Srate_persec=   425236887
mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Srate_persec=   437295512
mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Srate_persec=  1343409356
mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Srate_persec=  1336567397
start performance test for bucket (without cache + contiguous dequeue)
mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Crate_persec=   122945536
mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Crate_persec=   126458265
mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Crate_persec=   374262988
mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Crate_persec=   377316966
mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Crate_persec=   244842496
mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Crate_persec=   251618917
mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Crate_persec=   751226060
mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Crate_persec=   756233010
mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Crate_persec=   462068120
mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Crate_persec=   476997221
mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Crate_persec=  1432171313
mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Crate_persec=  1438829771

The number of objects in the contiguous block is a function of bucket
memory size (.config option) and total element size. In the future, an
additional API which allows passing parameters on mempool allocation
may be added.
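
As a worked illustration of that relationship, here is a minimal sketch
which assumes the simplest possible rule: the block holds as many whole
elements as fit in the bucket memory. The helper name and the sample sizes
are hypothetical, and any per-bucket header the real driver may reserve is
ignored.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of whole objects that fit in one bucket.
 * Not the driver's actual formula; per-bucket overhead is ignored.
 */
static unsigned int
demo_objs_per_block(size_t bucket_mem_size, size_t total_elt_size)
{
	if (total_elt_size == 0)
		return 0;	/* avoid division by zero */
	return (unsigned int)(bucket_mem_size / total_elt_size);
}
```

For example, with an illustrative 64 KiB bucket and 2304-byte elements,
demo_objs_per_block(65536, 2304) gives 28, leaving 1024 bytes unused.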

It breaks the ABI since it changes rte_mempool_ops. It also removes
rte_mempool_ops_register_memory_area() and
rte_mempool_ops_get_capabilities() since corresponding callbacks are
removed.

The target DPDK release is 18.05.

v2:
  - add driver ops to calculate required memory size and populate
    mempool objects, remove extra flags which were required before
    to control it
  - transition of octeontx and dpaa drivers to the new callbacks
  - change info API so that the API user can get the contiguous block
    size from the driver
  - remove get_capabilities (not required any more and may be
    substituted with more information in the info get API)
  - remove register_memory_area since it is substituted with
    populate callback which can do more
  - use SPDX tags
  - avoid all objects affinity to single lcore
  - fix bucket get_count
  - deprecate XMEM API
  - avoid introduction of a new function to flush cache
  - fix NO_CACHE_ALIGN case in bucket mempool

Andrew Rybchenko (10):
  mempool: fix phys contig check if populate default skipped
  mempool: add op to calculate memory size to be allocated
  mempool/octeontx: add callback to calculate memory size
  mempool: add op to populate objects using provided memory
  mempool/octeontx: implement callback to populate objects
  mempool: remove callback to get capabilities
  mempool: deprecate xmem functions
  mempool/octeontx: prepare to remove register memory area op
  mempool/dpaa: convert to use populate driver op
  mempool: remove callback to register memory area

Artem V. Andreev (7):
  mempool: ensure the mempool is initialized before populating
  mempool/bucket: implement bucket mempool manager
  mempool: support flushing the default cache of the mempool
  mempool: implement abstract mempool info API
  mempool: support block dequeue operation
  mempool/bucket: implement block dequeue operation
  mempool/bucket: do not allow one lcore to grab all buckets

 MAINTAINERS                                        |   9 +
 config/common_base                                 |   2 +
 drivers/mempool/Makefile                           |   1 +
 drivers/mempool/bucket/Makefile                    |  27 +
 drivers/mempool/bucket/rte_mempool_bucket.c        | 626 +++++++++++++++++++++
 .../mempool/bucket/rte_mempool_bucket_version.map  |   4 +
 drivers/mempool/dpaa/dpaa_mempool.c                |  13 +-
 drivers/mempool/octeontx/rte_mempool_octeontx.c    |  63 ++-
 lib/librte_mempool/rte_mempool.c                   | 192 ++++---
 lib/librte_mempool/rte_mempool.h                   | 366 +++++++++---
 lib/librte_mempool/rte_mempool_ops.c               |  48 +-
 lib/librte_mempool/rte_mempool_version.map         |  11 +-
 mk/rte.app.mk                                      |   1 +
 13 files changed, 1184 insertions(+), 179 deletions(-)
 create mode 100644 drivers/mempool/bucket/Makefile
 create mode 100644 drivers/mempool/bucket/rte_mempool_bucket.c
 create mode 100644 drivers/mempool/bucket/rte_mempool_bucket_version.map

-- 
2.7.4

^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool
@ 2018-01-23 13:23 13% Andrew Rybchenko
  2018-01-31 16:46  4% ` Olivier Matz
  0 siblings, 1 reply; 200+ results
From: Andrew Rybchenko @ 2018-01-23 13:23 UTC (permalink / raw)
  To: dev; +Cc: Olivier Matz

API/ABI changes are planned for 18.05 [1]:

 * Allow to customize how mempool objects are stored in memory.
 * Deprecate mempool XMEM API.
 * Add mempool driver ops to get information from mempool driver and
   dequeue contiguous blocks of objects if driver supports it.

[1] http://dpdk.org/ml/archives/dev/2018-January/088698.html

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
 doc/guides/rel_notes/deprecation.rst | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad59..9db80da 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -59,3 +59,20 @@ Deprecation Notices
   be added between the producer and consumer structures. The size of the
   structure and the offset of the fields will remain the same on
   platforms with 64B cache line, but will change on other platforms.
+
+* mempool: several API and ABI changes are planned in v18.05.
+  The following functions, introduced for Xen, which is not supported
+  anymore since v17.11, are hard to use, not used anywhere else in DPDK.
+  Therefore they will be deprecated in v18.05 and removed in v18.08:
+
+  - ``rte_mempool_xmem_create``
+  - ``rte_mempool_xmem_size``
+  - ``rte_mempool_xmem_usage``
+
+  The following changes are planned:
+
+  - removal of ``get_capabilities`` mempool ops and related flags.
+  - substitute ``register_memory_area`` with ``populate`` ops.
+  - addition of new ops to customize required memory chunk calculation,
+    customize objects population and allocate contiguous
+    block of objects if underlying driver supports it.
-- 
2.7.4

^ permalink raw reply	[relevance 13%]

* Re: [dpdk-dev] [PATCH 1/3] app/testpmd: Moved cmdline_flow to librte_cmdline
  @ 2018-01-24 11:57  0%               ` george.dit
  0 siblings, 0 replies; 200+ results
From: george.dit @ 2018-01-24 11:57 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Olivier Matz, Lu, Wenzhuo, dev

Hi again,

I decided to simplify things for now, hence sent a new minimal patch
only for testpmd (not librte_cmdline), which allows external
applications to compile against it without errors.

Thanks again for your time,
Georgios

On Tue, Jan 16, 2018 at 6:54 PM, Adrien Mazarguil
<adrien.mazarguil@6wind.com> wrote:
> On Tue, Jan 16, 2018 at 03:54:57PM +0100, george.dit@gmail.com wrote:
>> Hi Adrien,
>>
>> Thanks for your insights and sorry for omitting the cover letter (this is
>> my first patch in DPDK).
>
> No problem, don't worry about that. Remember to put as much context as close
> as possible to the related changes. The commit log of a patch is in fact a
> more appropriate place since a cover letter is simply a summary of a
> subsequent series and is discarded once applied.
>
>> I understand your concerns. The reason I proposed this patch is to
>> facilitate a more high level vendor agnostic API for configuring and
>> monitoring DPDK-based NICs.
>>
>> To avoid just copying thousands of lines of code into my application, do
>> you think it is feasible to move at least the components (struct context,
>> struct arg, struct token and the parse_* helpers) you mentioned and
>> restructure testpmd in a way that allows applications to re-use all of its
>> parsing features?
>
> Yes I think it's feasible, although at the cost of great effort because
> you'd need to untangle engine code from parser code to expose the former,
> the flow command being a mix of both. Proper APIs must be devised for that,
> hence my question: is it really worth the trouble?
>
> Other contributors already asked me how they could reuse the flow command
> parser to implement similar testpmd commands without copy/pasting the entire
> file, so I'm already convinced separating at least the engine part makes
> sense at the testpmd level. Moving it to librte_cmdline for external
> applications seems more complex though.
>
>> I could give it a try and come back with a new patch, otherwise I am
>> perfectly ok if you want to do it instead.
>
> While I'd certainly like to do it (at least at the testpmd level), it's
> unlikely to happen anytime soon due to other priorities.
>
> Feel free to take care of it if you're motivated enough, just keep in mind
> right now I don't think this should be exposed as a public API. I can change
> my mind if you manage to make it generic enough.
>
>> On Tue, Jan 16, 2018 at 3:31 PM, Adrien Mazarguil <
>> adrien.mazarguil@6wind.com> wrote:
>>
>> > George,
>> >
>> > I missed the original RFC thread [1][2] (you should have used it as a cover
>> > letter for this series BTW) please see below.
>> >
>> > On Tue, Jan 16, 2018 at 10:24:25AM +0100, Olivier Matz wrote:
>> > > Hi,
>> > >
>> > > > On Tue, Jan 16, 2018 at 9:40 AM Olivier Matz <olivier.matz@6wind.com>
>> > wrote:
>> > > >
>> > > > On Tue, Jan 16, 2018 at 08:45:32AM +0000, george.dit@gmail.com wrote:
>> > > > > Hi Georgios,
>> > > > >
>> > > > > On Mon, Jan 15, 2018 at 01:30:35AM +0000, Lu, Wenzhuo wrote:
>> > > > > > Hi,
>> > > > > >
>> > > > > > > -----Original Message-----
>> > > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georgios
>> > Katsikas
>> > > > > > > Sent: Saturday, January 13, 2018 5:01 AM
>> > > > > > > To: olivier.matz@6wind.com
>> > > > > > > Cc: dev@dpdk.org; Georgios Katsikas <george.dit@gmail.com>
>> > > > > > > Subject: [dpdk-dev] [PATCH 1/3] app/testpmd: Moved cmdline_flow
>> > to
>> > > > > > > librte_cmdline
>> > > > > > >
>> > > > > > > Signed-off-by: Georgios Katsikas <george.dit@gmail.com>
>> > > > > > Looks like a good idea to move this code to the lib.
>> > > > > > cc Adrien the author of this file, app/test-pmd/cmdline_flow.c.
>> > > > >
>> > > > > If the command line parsing of rte_flow is something that has some
>> > > > > chances to be shared among multiple applications, I agree it makes
>> > sense
>> > > > > to move it in a library.
>> > > > >
>> > > > > However, my opinion is that it would be better to have a specific
>> > > > > library for it, like librte_flow_cmdline, because I'm not sure that
>> > > > > people linking with librte_cmdline always want to pull the rte_flow
>> > > > > parsing code.
>> >
>> > In my opinion the entire flow command parser has nothing to do outside
>> > testpmd, it's way too specific, even if another application finds it
>> > useful.
>> >
>> > Code duplication being a bad thing, your application could as well compile
>> > or even #include app/test-pmd/cmdline_flow.c directly (not pretty, I know)
>> > since it would have the same internal layout as testpmd. Testpmd's Makefile
>> > could be modified to spit it out as a separate library if needed.
>> >
>> > What could make more sense would be to extract the parser engine for
>> > librte_cmdline's dynamic tokens (e.g. struct context, struct arg, struct
>> > token) and possibly various generic helpers (e.g. parse_string,
>> > parse_mac_addr), but not enum index nor token_list[] and friends which
>> > define the layout of testpmd's flow command.
>> >
>> > This would enable other flow-like commands without duplicating the engine
>> > every time, however I'm still not sure if making it available outside
>> > testpmd is a good idea. Extracting and making it fully generic will require
>> > a considerable amount of work.
>> >
>> > > > Hi Lu, Oliver,
>> > > >
>> > > > Thanks for your feedback!
>> > > > You have a point here, flow commands are only a subset of the parser.
>> > > > Do you want me to create this new library and send another patch?
>> > >
>> > > Let's first wait for Adrien's feedback, he can have counter-arguments.
>> > >
>> > > > I guess I have to use librte_cmdline as a template/example for
>> > creating the
>> > > > librte_flow_cmdline library.
>> > >
>> > > It can be used as an example for Makefile and .map file.
>> >
>> > I'm not opposed to the idea of exporting the parser engine after making it
>> > properly generic, but please reconsider. Testpmd is an application made to
>> > validate PMD functionality. The flow command implementation, as neat as it
>> > is, remains a complex wrapper on top of the cmdline_parse API which wasn't
>> > originally designed for dynamic tokens. Its syntax may evolve without
>> > warning if deemed necessary. Making it public will subject it to exported
>> > API/ABI constraints and likely impede future evolution.
>> >
>> > Assuming your application is not dragging testpmd's legacy, I think it
>> > would
>> > be wiser to re-implement a simpler flow command look-alike on top of a more
>> > suitable command-line handling library if you like its syntax.
>> >
>> > [1] http://dpdk.org/ml/archives/dev/2018-January/086872.html
>> > [2] Message-ID: CAN9HtFDz+imqbCKfs6a0NE0W7iF8C+-KiNB0nCRywimspjfEDg@mail.
>> > gmail.com
>> >
>> > --
>> > Adrien Mazarguil
>> > 6WIND
>> >
>>
>>
>>
>> --
>> Georgios Katsikas
>> Industrial Ph.D. Student
>> Network Intelligence Group
>> Decision, Networks, and Analytics (DNA) Lab
>> RISE SICS
>> E-Mail:  georgios.katsikas@ri.se
>
> --
> Adrien Mazarguil
> 6WIND



-- 
Georgios Katsikas
Industrial Ph.D. Student
Network Intelligence Group
Decision, Networks, and Analytics (DNA) Lab
RISE SICS
E-Mail:  georgios.katsikas@ri.se

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [RFC v2, 1/2] cryptodev: add support to set session private data
  @ 2018-01-24 19:46  4% ` De Lara Guarch, Pablo
  2018-01-25  6:42  0%   ` Akhil Goyal
  2018-01-25 15:37  0%   ` Gujjar, Abhinandan S
  0 siblings, 2 replies; 200+ results
From: De Lara Guarch, Pablo @ 2018-01-24 19:46 UTC (permalink / raw)
  To: Gujjar, Abhinandan S, Doherty, Declan, akhil.goyal,
	Jerin.JacobKollanukkaran
  Cc: dev, Vangati, Narender, Rao, Nikhil



> -----Original Message-----
> From: Gujjar, Abhinandan S
> Sent: Tuesday, January 23, 2018 8:54 AM
> To: Doherty, Declan <declan.doherty@intel.com>; akhil.goyal@nxp.com; De
> Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>;
> Jerin.JacobKollanukkaran@cavium.com
> Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>;
> Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Rao, Nikhil
> <nikhil.rao@intel.com>
> Subject: [RFC v2, 1/2] cryptodev: add support to set session private data
> 
> Update rte_crypto_op to indicate private data offset.
> 
> The application may want to store private data along with the
> rte_cryptodev that is transparent to the rte_cryptodev layer.
> For e.g., If an eventdev based application is submitting a
> rte_cryptodev_sym_session operation and wants to indicate event
> information required to construct a new event that will be enqueued to
> eventdev after completion of the rte_cryptodev_sym_session operation.
> This patch provides a mechanism for the application to associate this
> information with the rte_cryptodev_sym_session session.
> The application can set the private data using
> rte_cryptodev_sym_session_set_private_data() and retrieve it using
> rte_cryptodev_sym_session_get_private_data().

Hi Abhinandan,

> 
> Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
> Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
> ---
> Notes:
>         V2:
> 	1. Removed enum rte_crypto_op_private_data_type
> 	2. Corrected formatting
> 
>  lib/librte_cryptodev/rte_crypto.h    |  8 ++++++--
>  lib/librte_cryptodev/rte_cryptodev.h | 32
> ++++++++++++++++++++++++++++++++
>  2 files changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_cryptodev/rte_crypto.h
> b/lib/librte_cryptodev/rte_crypto.h
> index 95cf861..14c87c8 100644
> --- a/lib/librte_cryptodev/rte_crypto.h
> +++ b/lib/librte_cryptodev/rte_crypto.h
> @@ -84,8 +84,12 @@ struct rte_crypto_op {
>  	 */
>  	uint8_t sess_type;
>  	/**< operation session type */
> -
> -	uint8_t reserved[5];
> +	uint16_t private_data_offset;
> +	/**< Offset to indicate start of private data (if any). The private
> +	 * data may be used by the application to store information which
> +	 * should remain untouched in the library/driver

Is this the offset for the private data after the crypto operation?
From your title, it looks like it is for the session private data, but then, this shouldn't be here.
If it is for the crypto operation, I suggest you to separate it in another patch.
Also, you should indicate where the offset starts from. For the IV, the offset is counted
from the start of the rte_crypto_op, so I think it should be the same, to keep consistency.
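
To illustrate what "counted from the start of the rte_crypto_op" would mean
in practice, here is a minimal sketch. The structures and the helper are
hypothetical simplifications, not the real rte_crypto_op layout, and
treating an offset of 0 as "no private data" is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the operation header. */
struct demo_crypto_op {
	uint8_t type;
	uint8_t status;
	uint8_t sess_type;
	uint16_t private_data_offset;	/* 0 means "no private data" */
};

/* Example layout: application data stored right behind the operation. */
struct demo_op_with_priv {
	struct demo_crypto_op op;
	uint32_t user_tag;
};

/* Resolve the offset relative to the start of the operation itself. */
static inline void *
demo_op_private_data(struct demo_crypto_op *op)
{
	if (op->private_data_offset == 0)
		return NULL;
	return (uint8_t *)op + op->private_data_offset;
}
```

With this convention the application would set private_data_offset to
offsetof(struct demo_op_with_priv, user_tag), mirroring how the IV offset
is described above.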

For the session private data, we see two options:

1 - Add a  "valid" private data field in the rte_cryptodev_sym_session structure,
so when it is set, it indicates that the session contains private data
(a single bit would be enough, 1 to indicate there is, and 0 to indicate there is not).
This would go into the beginning of the structure, so this would require an ABI deprecation notice.
This also assumes that the private data starts just after the session header

2 -  Do not add an extra "valid" private data field in rte_cryptodev_sym_session structure,
and add a small header in the private data, which contains the "valid" bit.
Then, when calling sym_session_get_private_data, this bit should be checked.
Note that the object that holds the session structure needs to be big enough to hold this value.
If the object has only space for the sess_private_data array, then the session has no private data.
Therefore, this approach might be less performant, but with no ABI deprecation required.
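
A rough sketch of option 2, to make the check concrete. All names are hypothetical, and the layout (driver pointer array, then a one-byte header, then the application data) is only an assumption for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical small header placed in front of the private data. */
struct demo_priv_hdr {
	uint8_t valid; /* 1 if private data follows, 0 otherwise */
};

/* Size of the session header: one pointer per registered driver. */
static inline size_t
demo_sess_header_size(unsigned int nb_drivers)
{
	return sizeof(void *) * nb_drivers;
}

/*
 * Return the private data, or NULL if the object is too small to hold
 * the header or the "valid" bit is not set.
 */
static inline void *
demo_sess_get_private_data(void *sess, size_t obj_size,
		unsigned int nb_drivers)
{
	size_t hdr_off = demo_sess_header_size(nb_drivers);
	struct demo_priv_hdr *hdr;

	if (obj_size < hdr_off + sizeof(*hdr))
		return NULL; /* room for the pointer array only */

	hdr = (struct demo_priv_hdr *)((uint8_t *)sess + hdr_off);
	return hdr->valid ? (void *)(hdr + 1) : NULL;
}
```

The extra size comparison and byte load on every lookup are where the possible performance cost would come from.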

I would recommend that you send a deprecation notice for option 1, then check the performance of both options,
and, if needed, make the change in the structure in 18.05.

Regards,
Pablo

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [RFC v2, 1/2] cryptodev: add support to set session private data
  2018-01-24 19:46  4% ` De Lara Guarch, Pablo
@ 2018-01-25  6:42  0%   ` Akhil Goyal
  2018-01-25 15:37  0%   ` Gujjar, Abhinandan S
  1 sibling, 0 replies; 200+ results
From: Akhil Goyal @ 2018-01-25  6:42 UTC (permalink / raw)
  To: De Lara Guarch, Pablo, Gujjar, Abhinandan S, Doherty, Declan,
	Jerin.JacobKollanukkaran
  Cc: dev, Vangati, Narender, Rao, Nikhil

On 1/25/2018 1:16 AM, De Lara Guarch, Pablo wrote:
> 
> 
>> -----Original Message-----
>> From: Gujjar, Abhinandan S
>> Sent: Tuesday, January 23, 2018 8:54 AM
>> To: Doherty, Declan <declan.doherty@intel.com>; akhil.goyal@nxp.com; De
>> Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>;
>> Jerin.JacobKollanukkaran@cavium.com
>> Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>;
>> Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Rao, Nikhil
>> <nikhil.rao@intel.com>
>> Subject: [RFC v2, 1/2] cryptodev: add support to set session private data
>>
>> Update rte_crypto_op to indicate private data offset.
>>
>> The application may want to store private data along with the
>> rte_cryptodev that is transparent to the rte_cryptodev layer.
>> For e.g., If an eventdev based application is submitting a
>> rte_cryptodev_sym_session operation and wants to indicate event
>> information required to construct a new event that will be enqueued to
>> eventdev after completion of the rte_cryptodev_sym_session operation.
>> This patch provides a mechanism for the application to associate this
>> information with the rte_cryptodev_sym_session session.
>> The application can set the private data using
>> rte_cryptodev_sym_session_set_private_data() and retrieve it using
>> rte_cryptodev_sym_session_get_private_data().
> 
> Hi Abhinandan,
> 
>>
>> Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
>> Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
>> ---
>> Notes:
>>          V2:
>> 	1. Removed enum rte_crypto_op_private_data_type
>> 	2. Corrected formatting
>>
>>   lib/librte_cryptodev/rte_crypto.h    |  8 ++++++--
>>   lib/librte_cryptodev/rte_cryptodev.h | 32
>> ++++++++++++++++++++++++++++++++
>>   2 files changed, 38 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/librte_cryptodev/rte_crypto.h
>> b/lib/librte_cryptodev/rte_crypto.h
>> index 95cf861..14c87c8 100644
>> --- a/lib/librte_cryptodev/rte_crypto.h
>> +++ b/lib/librte_cryptodev/rte_crypto.h
>> @@ -84,8 +84,12 @@ struct rte_crypto_op {
>>   	 */
>>   	uint8_t sess_type;
>>   	/**< operation session type */
>> -
>> -	uint8_t reserved[5];
>> +	uint16_t private_data_offset;
>> +	/**< Offset to indicate start of private data (if any). The private
>> +	 * data may be used by the application to store information which
>> +	 * should remain untouched in the library/driver
> 
> Is this the offset for the private data after the crypto operation?
>  From your title, it looks like it is for the session private data, but then, this shouldn't be here.
> If it is for the crypto operation, I suggest you to separate it in another patch.
> Also, you should indicate where the offset starts from. For the IV, the offset is counted
> from the start of the rte_crypto_op, so I think it should be the same, to keep consistency.
> 
> For the session private data, we see two options:
> 
> 1 - Add a  "valid" private data field in the rte_cryptodev_sym_session structure,
> so when it is set, it indicates that the session contains private data
> (a single bit would be enough, 1 to indicate there is, and 0 to indicate there is not).
> This would go into the beginning of the structure, so this would require an ABI deprecation notice.
> This also assumes that the private data starts just after the session header
> 
> 2 -  Do not add an extra "valid" private data field in rte_cryptodev_sym_session structure,
> and add a small header in the private data, which contains the "valid" bit.
> Then, when calling sym_session_get_private_data, this bit should be checked.
> Note that the object that holds the session structure needs to be big enough to hold this value.
> If the object has only space for the sess_private_data array, then the session has no private data.
> Therefore, this approach might be less performant, but with no ABI deprecation required.
> 
> I would recommend you to send a deprecation notice for option 1, then check the performance of both option,
> and if needed, make the change in the structure, in 18.05.
> 
> Regards,
> Pablo
> 

My thoughts are also in line with Pablo's.

-Akhil

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [RFC v2, 1/2] cryptodev: add support to set session private data
  2018-01-24 19:46  4% ` De Lara Guarch, Pablo
  2018-01-25  6:42  0%   ` Akhil Goyal
@ 2018-01-25 15:37  0%   ` Gujjar, Abhinandan S
  2018-01-31 13:40  0%     ` De Lara Guarch, Pablo
  1 sibling, 1 reply; 200+ results
From: Gujjar, Abhinandan S @ 2018-01-25 15:37 UTC (permalink / raw)
  To: De Lara Guarch, Pablo, Doherty, Declan, akhil.goyal,
	Jerin.JacobKollanukkaran
  Cc: dev, Vangati, Narender, Rao, Nikhil

Hi Pablo & Declan,

> -----Original Message-----
> From: De Lara Guarch, Pablo
> Sent: Thursday, January 25, 2018 1:17 AM
> To: Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Doherty, Declan
> <declan.doherty@intel.com>; akhil.goyal@nxp.com;
> Jerin.JacobKollanukkaran@cavium.com
> Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>; Rao,
> Nikhil <nikhil.rao@intel.com>
> Subject: RE: [RFC v2, 1/2] cryptodev: add support to set session private data
> 
> 
> 
> > -----Original Message-----
> > From: Gujjar, Abhinandan S
> > Sent: Tuesday, January 23, 2018 8:54 AM
> > To: Doherty, Declan <declan.doherty@intel.com>; akhil.goyal@nxp.com;
> > De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>;
> > Jerin.JacobKollanukkaran@cavium.com
> > Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>;
> > Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Rao, Nikhil
> > <nikhil.rao@intel.com>
> > Subject: [RFC v2, 1/2] cryptodev: add support to set session private
> > data
> >
> > Update rte_crypto_op to indicate private data offset.
> >
> > The application may want to store private data along with the
> > rte_cryptodev that is transparent to the rte_cryptodev layer.
> > For e.g., If an eventdev based application is submitting a
> > rte_cryptodev_sym_session operation and wants to indicate event
> > information required to construct a new event that will be enqueued to
> > eventdev after completion of the rte_cryptodev_sym_session operation.
> > This patch provides a mechanism for the application to associate this
> > information with the rte_cryptodev_sym_session session.
> > The application can set the private data using
> > rte_cryptodev_sym_session_set_private_data() and retrieve it using
> > rte_cryptodev_sym_session_get_private_data().
> 
> Hi Abhinandan,
> 
> >
> > Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
> > Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
> > ---
> > Notes:
> >         V2:
> > 	1. Removed enum rte_crypto_op_private_data_type
> > 	2. Corrected formatting
> >
> >  lib/librte_cryptodev/rte_crypto.h    |  8 ++++++--
> >  lib/librte_cryptodev/rte_cryptodev.h | 32
> > ++++++++++++++++++++++++++++++++
> >  2 files changed, 38 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_cryptodev/rte_crypto.h
> > b/lib/librte_cryptodev/rte_crypto.h
> > index 95cf861..14c87c8 100644
> > --- a/lib/librte_cryptodev/rte_crypto.h
> > +++ b/lib/librte_cryptodev/rte_crypto.h
> > @@ -84,8 +84,12 @@ struct rte_crypto_op {
> >  	 */
> >  	uint8_t sess_type;
> >  	/**< operation session type */
> > -
> > -	uint8_t reserved[5];
> > +	uint16_t private_data_offset;
> > +	/**< Offset to indicate start of private data (if any). The private
> > +	 * data may be used by the application to store information which
> > +	 * should remain untouched in the library/driver
> 
> Is this the offset for the private data after the crypto operation?
Yes. This private data is meant for the sessionless case.
> From your title, it looks like it is for the session private data, but then, this
> shouldn't be here.
Agree.
> If it is for the crypto operation, I suggest you to separate it in another patch.
> Also, you should indicate where the offset starts from. For the IV, the offset is
> counted from the start of the rte_crypto_op, so I think it should be the same, to
> keep consistency.
Sure. I will make a separate patch for these changes and add some more information to make it clear.
> 
> For the session private data, we see two options:
> 
> 1 - Add a  "valid" private data field in the rte_cryptodev_sym_session structure,
> so when it is set, it indicates that the session contains private data (a single bit
> would be enough, 1 to indicate there is, and 0 to indicate there is not).
> This would go into the beginning of the structure, so this would require an ABI
> deprecation notice.
> This also assumes that the private data starts just after the session header
> 
> 2 -  Do not add an extra "valid" private data field in rte_cryptodev_sym_session
> structure, and add a small header in the private data, which contains the "valid"
> bit.
> Then, when calling sym_session_get_private_data, this bit should be checked.
> Note that the object that holds the session structure needs to be big enough to
> hold this value.
> If the object has only space for the sess_private_data array, then the session has
> no private data.
> Therefore, this approach might be less performant, but with no ABI deprecation
> required.
I prefer option 2, with slight changes as below:
rte_cryptodev_sym_session_create() will have a flag, as below,
indicating whether private data exists or not.
{
- memset(sess, 0, (sizeof(void *) * nb_drivers));
+ memset(sess, 0, (sizeof(void *) * nb_drivers) + sizeof(private_data_flag));
}
and in
rte_cryptodev_get_header_session_size(void)
{
  /*
   * Header contains pointers to the private data
   * of all registered drivers
   */
- return (sizeof(void *) * nb_drivers);
+ return ((sizeof(void *) * nb_drivers) + sizeof(private_data_flag));
}
With this, the flag indicating whether private data exists will always have a valid value.
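
Put together, the idea could look roughly like the sketch below. The demo_* names and the one-byte flag type are placeholders (the actual type of private_data_flag is not decided here), and nb_drivers is passed as a parameter only to keep the example self-contained:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder type for the proposed private_data_flag. */
typedef uint8_t demo_private_data_flag_t;

/* Header: one pointer per registered driver, followed by the flag. */
static inline size_t
demo_header_session_size(unsigned int nb_drivers)
{
	return sizeof(void *) * nb_drivers + sizeof(demo_private_data_flag_t);
}

/* Zero the whole header at session creation, flag included. */
static inline void
demo_session_init(void *sess, unsigned int nb_drivers)
{
	memset(sess, 0, demo_header_session_size(nb_drivers));
}

/* The flag sits right after the driver pointer array. */
static inline demo_private_data_flag_t *
demo_session_flag(void *sess, unsigned int nb_drivers)
{
	return (demo_private_data_flag_t *)
		((uint8_t *)sess + sizeof(void *) * nb_drivers);
}
```

Because the flag is part of the header size, it is cleared on every session create and can be read without checking the object size first.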

> 
> I would recommend you to send a deprecation notice for option 1, then check
> the performance of both option, and if needed, make the change in the
> structure, in 18.05.
> 
> Regards,
> Pablo

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
@ 2018-01-26  9:03  4% Pablo de Lara
  2018-01-26 10:44  4% ` Trahe, Fiona
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Pablo de Lara @ 2018-01-26  9:03 UTC (permalink / raw)
  To: akhil.goyal, hemant.agrawal, declan.doherty, jerin.jacob,
	fiona.trahe, john.griffin, deepak.k.jain, jck, tdu, dima,
	nsamsono, jianbo.liu
  Cc: dev, Pablo de Lara

Since the API changes made in 17.08, the session mempool
is not created anymore in each crypto device.
Therefore, there is no need to have, in the cryptodev info
structure, the maximum number of sessions supported per device
and per queue pair.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 doc/guides/rel_notes/deprecation.rst | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad5988..5588ba7c1 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -59,3 +59,8 @@ Deprecation Notices
   be added between the producer and consumer structures. The size of the
   structure and the offset of the fields will remain the same on
   platforms with 64B cache line, but will change on other platforms.
+
+* cryptodev: The structure ``sym``, including its fields ``max_nb_sessions``
+  and ``max_nb_sessions_per_qp``, in structure ``rte_cryptodev_info``,
+  will be removed in 18.05, as these fields are not relevant anymore
+  since the session mempool is not internal in the crypto device anymore.
-- 
2.14.3

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
  2018-01-26  9:03  4% [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct Pablo de Lara
@ 2018-01-26 10:44  4% ` Trahe, Fiona
  2018-01-26 11:08  4%   ` De Lara Guarch, Pablo
  2018-01-30 11:37  4% ` Akhil Goyal
  2018-01-30 12:14  7% ` [dpdk-dev] [PATCH v2 0/3] Cryptodev API/ABI deprecation notices Pablo de Lara
  2 siblings, 1 reply; 200+ results
From: Trahe, Fiona @ 2018-01-26 10:44 UTC (permalink / raw)
  To: De Lara Guarch, Pablo, akhil.goyal, hemant.agrawal, Doherty,
	Declan, jerin.jacob, Griffin, John, Jain, Deepak K, jck, tdu,
	dima, nsamsono, jianbo.liu
  Cc: dev, Trahe, Fiona

Hi Pablo,

> -----Original Message-----
> From: De Lara Guarch, Pablo
> Sent: Friday, January 26, 2018 9:04 AM
> To: akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty, Declan <declan.doherty@intel.com>;
> jerin.jacob@intel.com; Trahe, Fiona <fiona.trahe@intel.com>; Griffin, John <john.griffin@intel.com>; Jain,
> Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com; tdu@semihalf.com; dima@marvell.com;
> nsamsono@marvell.com; jianbo.liu@arm.com
> Cc: dev@dpdk.org; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Subject: [PATCH] doc: announce ABI change for crypto info struct
> 
> Since the API changes made in 17.08, the session mempool
> is not created anymore in each crypto device.
> Therefore, there is no need to have, in the cryptodev info
> structure, the maximum number of sessions supported per device
> and per queue pair.
> 
> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> ---
>  doc/guides/rel_notes/deprecation.rst | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index d59ad5988..5588ba7c1 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -59,3 +59,8 @@ Deprecation Notices
>    be added between the producer and consumer structures. The size of the
>    structure and the offset of the fields will remain the same on
>    platforms with 64B cache line, but will change on other platforms.
> +
> +* cryptodev: The structure ``sym``, including its fields ``max_nb_sessions``
> +  and ``max_nb_sessions_per_qp``, in structure ``rte_cryptodev_info``,
> +  will be removed in 18.05, as these fields are not relevant anymore
> +  since the session mempool is not internal in the crypto device anymore.
> --
[Fiona] max_nb_sessions must also be removed from
struct rte_cryptodev_pmd_init_params.
Regarding deprecation of max_nb_sessions from both structs:
Acked-by: Fiona Trahe <fiona.trahe@intel.com>

If max_nb_sessions_per_qp is removed, then the following functions should also be deprecated:
rte_cryptodev_queue_pair_attach_sym_session
rte_cryptodev_queue_pair_detach_sym_session
These and max_nb_sessions_per_qp were added here at the request of NXP:
http://dpdk.org/ml/archives/dev/2017-March/060740.html

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
  2018-01-26 10:44  4% ` Trahe, Fiona
@ 2018-01-26 11:08  4%   ` De Lara Guarch, Pablo
  2018-01-29  9:26  4%     ` Akhil Goyal
  0 siblings, 1 reply; 200+ results
From: De Lara Guarch, Pablo @ 2018-01-26 11:08 UTC (permalink / raw)
  To: Trahe, Fiona, akhil.goyal, hemant.agrawal, Doherty, Declan,
	Griffin, John, Jain, Deepak K, jck, tdu, dima, nsamsono,
	jianbo.liu, jerin.jacob
  Cc: dev



> -----Original Message-----
> From: Trahe, Fiona
> Sent: Friday, January 26, 2018 10:45 AM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>;
> akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty, Declan
> <declan.doherty@intel.com>; jerin.jacob@intel.com; Griffin, John
> <john.griffin@intel.com>; Jain, Deepak K <deepak.k.jain@intel.com>;
> jck@semihalf.com; tdu@semihalf.com; dima@marvell.com;
> nsamsono@marvell.com; jianbo.liu@arm.com
> Cc: dev@dpdk.org; Trahe, Fiona <fiona.trahe@intel.com>
> Subject: RE: [PATCH] doc: announce ABI change for crypto info struct
> 
> Hi Pablo,
> 
> > -----Original Message-----
> > From: De Lara Guarch, Pablo
> > Sent: Friday, January 26, 2018 9:04 AM
> > To: akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty, Declan
> > <declan.doherty@intel.com>; jerin.jacob@intel.com; Trahe, Fiona
> > <fiona.trahe@intel.com>; Griffin, John <john.griffin@intel.com>; Jain,
> > Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com;
> > tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
> > jianbo.liu@arm.com
> > Cc: dev@dpdk.org; De Lara Guarch, Pablo
> > <pablo.de.lara.guarch@intel.com>
> > Subject: [PATCH] doc: announce ABI change for crypto info struct
> >
> > Since the API changes made in 17.08, the session mempool is not
> > created anymore in each crypto device.
> > Therefore, there is no need to have, in the cryptodev info structure,
> > the maximum number of sessions supported per device and per queue
> > pair.
> >
> > Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index d59ad5988..5588ba7c1 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -59,3 +59,8 @@ Deprecation Notices
> >    be added between the producer and consumer structures. The size of
> the
> >    structure and the offset of the fields will remain the same on
> >    platforms with 64B cache line, but will change on other platforms.
> > +
> > +* cryptodev: The structure ``sym``, including its fields
> > +``max_nb_sessions``
> > +  and ``max_nb_sessions_per_qp``, in structure
> > +``rte_cryptodev_info``,
> > +  will be removed in 18.05, as these fields are not relevant anymore
> > +  since the session mempool is not internal in the crypto device
> anymore.
> > --
> [Fiona] max_nb_sessions must be also removed from struct
> rte_cryptodev_pmd_init_params 

Good point. Since this structure is internal, I guess we don't need a deprecation notice for it, 
but I will remove it in the patch for 18.05.

> Regards deprecation of max_nb_sessions from both structs:
> Acked-by: Fiona Trahe <fiona.trahe@intel.com>
> 
> If removing the max_nb_sessions_per_qp, then the following functions
> should also be deprecated.
> rte_cryptodev_queue_pair_attach_sym_session
> rte_cryptodev_queue_pair_detach_sym_session
> These and the max_nb_session_per_qp were added here at request of NXP:
> http://dpdk.org/ml/archives/dev/2017-March/060740.html

Akhil, do you agree on this change?

Thanks,
Pablo

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
  2018-01-26 11:08  4%   ` De Lara Guarch, Pablo
@ 2018-01-29  9:26  4%     ` Akhil Goyal
  2018-01-30  7:55  4%       ` Verma, Shally
  0 siblings, 1 reply; 200+ results
From: Akhil Goyal @ 2018-01-29  9:26 UTC (permalink / raw)
  To: De Lara Guarch, Pablo, Trahe, Fiona, hemant.agrawal, Doherty,
	Declan, Griffin, John, Jain, Deepak K, jck, tdu, dima, nsamsono,
	jianbo.liu, jerin.jacob
  Cc: dev

Hi Pablo/Fiona,

On 1/26/2018 4:38 PM, De Lara Guarch, Pablo wrote:
> 
> 
>> -----Original Message-----
>> From: Trahe, Fiona
>> Sent: Friday, January 26, 2018 10:45 AM
>> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>;
>> akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty, Declan
>> <declan.doherty@intel.com>; jerin.jacob@intel.com; Griffin, John
>> <john.griffin@intel.com>; Jain, Deepak K <deepak.k.jain@intel.com>;
>> jck@semihalf.com; tdu@semihalf.com; dima@marvell.com;
>> nsamsono@marvell.com; jianbo.liu@arm.com
>> Cc: dev@dpdk.org; Trahe, Fiona <fiona.trahe@intel.com>
>> Subject: RE: [PATCH] doc: announce ABI change for crypto info struct
>>
>> Hi Pablo,
>>
>>> -----Original Message-----
>>> From: De Lara Guarch, Pablo
>>> Sent: Friday, January 26, 2018 9:04 AM
>>> To: akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty, Declan
>>> <declan.doherty@intel.com>; jerin.jacob@intel.com; Trahe, Fiona
>>> <fiona.trahe@intel.com>; Griffin, John <john.griffin@intel.com>; Jain,
>>> Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com;
>>> tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
>>> jianbo.liu@arm.com
>>> Cc: dev@dpdk.org; De Lara Guarch, Pablo
>>> <pablo.de.lara.guarch@intel.com>
>>> Subject: [PATCH] doc: announce ABI change for crypto info struct
>>>
>>> Since the API changes made in 17.08, the session mempool is not
>>> created anymore in each crypto device.
>>> Therefore, there is no need to have, in the cryptodev info structure,
>>> the maximum number of sessions supported per device and per queue
>>> pair.
>>>
>>> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
>>> ---
>>>   doc/guides/rel_notes/deprecation.rst | 5 +++++
>>>   1 file changed, 5 insertions(+)
>>>
>>> diff --git a/doc/guides/rel_notes/deprecation.rst
>>> b/doc/guides/rel_notes/deprecation.rst
>>> index d59ad5988..5588ba7c1 100644
>>> --- a/doc/guides/rel_notes/deprecation.rst
>>> +++ b/doc/guides/rel_notes/deprecation.rst
>>> @@ -59,3 +59,8 @@ Deprecation Notices
>>>     be added between the producer and consumer structures. The size of
>> the
>>>     structure and the offset of the fields will remain the same on
>>>     platforms with 64B cache line, but will change on other platforms.
>>> +
>>> +* cryptodev: The structure ``sym``, including its fields
>>> +``max_nb_sessions``
>>> +  and ``max_nb_sessions_per_qp``, in structure
>>> +``rte_cryptodev_info``,
>>> +  will be removed in 18.05, as these fields are not relevant anymore
>>> +  since the session mempool is not internal in the crypto device
>> anymore.
>>> --
>> [Fiona] max_nb_sessions must be also removed from struct
>> rte_cryptodev_pmd_init_params
> 
> Good point. Since this structure is internal, I guess we don't need a deprecation notice for it,
> but I will remove it in the patch for 18.05.
> 
>> Regards deprecation of max_nb_sessions from both structs:
>> Acked-by: Fiona Trahe <fiona.trahe@intel.com>
>>
>> If removing the max_nb_sessions_per_qp, then the following functions
>> should also be deprecated.
>> rte_cryptodev_queue_pair_attach_sym_session
>> rte_cryptodev_queue_pair_detach_sym_session
>> These and the max_nb_session_per_qp were added here at request of NXP:
>> http://dpdk.org/ml/archives/dev/2017-March/060740.html
> 
> Akhil, do you agree on this change?
> 

We recently made some changes in the driver to reduce its dependency
on the max_nb_sessions_per_qp limit, but the dependency is not removed
completely. We will need to look into this, but sending the deprecation
notice at this moment is fine. If something comes up, we will let you know
later.

-Akhil

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [[PATCH v5] 5/5] doc: Add ABI __experimental tag documentation
  2018-01-23 10:35  4%     ` Mcnamara, John
@ 2018-01-29 21:42  4%       ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-01-29 21:42 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev, Mcnamara, John, Richardson, Bruce

23/01/2018 11:35, Mcnamara, John:
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Monday, January 22, 2018 1:48 AM
> > To: dev@dpdk.org
> > Cc: Neil Horman <nhorman@tuxdriver.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Mcnamara, John <john.mcnamara@intel.com>;
> > Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: [[PATCH v5] 5/5] doc: Add ABI __experimental tag documentation
> > 
> > Document the need to add the __experimental tag to appropriate functions
> > 
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > CC: Thomas Monjalon <thomas@monjalon.net>
> > CC: "Mcnamara, John" <john.mcnamara@intel.com>
> > CC: Bruce Richardson <bruce.richardson@intel.com>
> > ...
> > +Note that marking an API as experimental is a multi step process.  To
> > +mark an API as experimental, the symbols which are desired to be
> > +exported must be placed in an EXPERIMENTAL version block in the
> > +corresponding libraries' version map script. Secondly, the
> > +corresponding definitions of those exported functions, and their
> > +forward declarations (in the development header files), must be marked
> > +with the __rte_experimental tag (see rte_compat.h).  The DPDK build
> > +makefiles perform a check to ensure that the map file and the C code
> > +reflect the same list of symbols.  This check can be circumvented by
> > defining ALLOW_EXPERIMENTAL_API during compilation in the corresponding
> > library Makefile.
> > +
> > +In addition to tagging the code with __rte_experimental, the doxygen
> > +markup must also contain the EXPERIMENTAL string, and the MAINTAINER
> > +file should note that the library contains EXPERIMENTAL APIs.
> > +
> >  ABI versions, once released, are available until such time as their
> > deprecation has been noted in the Release Notes for at least one major
> > release  cycle. For example consider the case where the ABI for DPDK 2.0
> > has been
> > --
> > 2.14.3
> 
> Thanks for the update, and this work in general.
> 
> The rendered docs would probably look a better better with __rte_experimental
> and ALLOW_EXPERIMENTAL_API is fixed width backticks ``var`` but that is only
> a "nice to have" so no need for a respin.

Backticks added on apply.

Also changed the last sentence from
	the MAINTAINER file should note that the library contains EXPERIMENTAL APIs.
to
	the MAINTAINERS file should note the EXPERIMENTAL libraries.
Indeed, the practice is to note only new libraries as experimental in
the MAINTAINERS files.
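
For readers who have not seen the tag yet, here is a simplified sketch of how such a marker can work. The demo_* names are hypothetical and this is not the definition from rte_compat.h, which may differ; here the opt-in define is set up front so the example compiles without warnings:

```c
/* Hypothetical opt-in, playing the role of ALLOW_EXPERIMENTAL_API. */
#define DEMO_ALLOW_EXPERIMENTAL_API

#ifndef DEMO_ALLOW_EXPERIMENTAL_API
/* Without the opt-in, any use of a tagged symbol triggers a warning. */
#define demo_experimental \
	__attribute__((deprecated("Symbol is not yet part of stable ABI")))
#else
#define demo_experimental
#endif

/* Forward declaration, as it would appear in the development header. */
demo_experimental int demo_new_feature(int x);

/* The definition carries the same tag as the declaration. */
demo_experimental int
demo_new_feature(int x)
{
	return x + 1;
}
```

The symbol would additionally have to be listed in the EXPERIMENTAL block of the library's version map, which is what the build check compares against.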

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
  2018-01-29  9:26  4%     ` Akhil Goyal
@ 2018-01-30  7:55  4%       ` Verma, Shally
  2018-01-30 11:21  4%         ` De Lara Guarch, Pablo
  0 siblings, 1 reply; 200+ results
From: Verma, Shally @ 2018-01-30  7:55 UTC (permalink / raw)
  To: Akhil Goyal, De Lara Guarch, Pablo, Trahe, Fiona, hemant.agrawal,
	Doherty, Declan, Griffin, John, Jain, Deepak K, jck, tdu, dima,
	nsamsono, jianbo.liu, Jacob,  Jerin, Athreya, Narayana Prasad,
	Murthy, Nidadavolu
  Cc: dev

I see that the current cryptodev unit tests (inside the test directory) use the info.sym.max_nb_sessions parameter for session mempool creation. So, are changes to such test cases also part of this proposal?

Another point: we recently submitted an RFC patch on lib/cryptodev with asymmetric crypto support (https://dpdk.org/dev/patchwork/patch/34308/), which is awaiting review, and these fields have a role to play there.
So, could this change please be viewed in conjunction with the asym RFC?

Thanks
Shally

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Akhil Goyal
> Sent: 29 January 2018 14:57
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Trahe, Fiona
> <fiona.trahe@intel.com>; hemant.agrawal@nxp.com; Doherty, Declan
> <declan.doherty@intel.com>; Griffin, John <john.griffin@intel.com>; Jain,
> Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com;
> tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
> jianbo.liu@arm.com; Jacob, Jerin <Jerin.JacobKollanukkaran@cavium.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info
> struct
> 
> Hi Pablo/Fiona,
> 
> On 1/26/2018 4:38 PM, De Lara Guarch, Pablo wrote:
> >
> >
> >> -----Original Message-----
> >> From: Trahe, Fiona
> >> Sent: Friday, January 26, 2018 10:45 AM
> >> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>;
> >> akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty, Declan
> >> <declan.doherty@intel.com>; jerin.jacob@intel.com; Griffin, John
> >> <john.griffin@intel.com>; Jain, Deepak K <deepak.k.jain@intel.com>;
> >> jck@semihalf.com; tdu@semihalf.com; dima@marvell.com;
> >> nsamsono@marvell.com; jianbo.liu@arm.com
> >> Cc: dev@dpdk.org; Trahe, Fiona <fiona.trahe@intel.com>
> >> Subject: RE: [PATCH] doc: announce ABI change for crypto info struct
> >>
> >> Hi Pablo,
> >>
> >>> -----Original Message-----
> >>> From: De Lara Guarch, Pablo
> >>> Sent: Friday, January 26, 2018 9:04 AM
> >>> To: akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty, Declan
> >>> <declan.doherty@intel.com>; jerin.jacob@intel.com; Trahe, Fiona
> >>> <fiona.trahe@intel.com>; Griffin, John <john.griffin@intel.com>; Jain,
> >>> Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com;
> >>> tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
> >>> jianbo.liu@arm.com
> >>> Cc: dev@dpdk.org; De Lara Guarch, Pablo
> >>> <pablo.de.lara.guarch@intel.com>
> >>> Subject: [PATCH] doc: announce ABI change for crypto info struct
> >>>
> >>> Since the API changes made in 17.08, the session mempool is not
> >>> created anymore in each crypto device.
> >>> Therefore, there is no need to have, in the cryptodev info structure,
> >>> the maximum number of sessions supported per device and per queue
> >>> pair.
> >>>
> >>> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> >>> ---
> >>>   doc/guides/rel_notes/deprecation.rst | 5 +++++
> >>>   1 file changed, 5 insertions(+)
> >>>
> >>> diff --git a/doc/guides/rel_notes/deprecation.rst
> >>> b/doc/guides/rel_notes/deprecation.rst
> >>> index d59ad5988..5588ba7c1 100644
> >>> --- a/doc/guides/rel_notes/deprecation.rst
> >>> +++ b/doc/guides/rel_notes/deprecation.rst
> >>> @@ -59,3 +59,8 @@ Deprecation Notices
> >>>     be added between the producer and consumer structures. The size of
> >> the
> >>>     structure and the offset of the fields will remain the same on
> >>>     platforms with 64B cache line, but will change on other platforms.
> >>> +
> >>> +* cryptodev: The structure ``sym``, including its fields
> >>> +``max_nb_sessions``
> >>> +  and ``max_nb_sessions_per_qp``, in structure
> >>> +``rte_cryptodev_info``,
> >>> +  will be removed in 18.05, as these fields are not relevant anymore
> >>> +  since the session mempool is not internal in the crypto device
> >> anymore.
> >>> --
> >> [Fiona] max_nb_sessions must be also removed from struct
> >> rte_cryptodev_pmd_init_params
> >
> > Good point. Since this structure is internal, I guess we don't need a
> deprecation notice for it,
> > but I will remove it in the patch for 18.05.
> >
> >> Regards deprecation of max_nb_sessions from both structs:
> >> Acked-by: Fiona Trahe <fiona.trahe@intel.com>
> >>
> >> If removing the max_nb_sessions_per_qp, then the following functions
> >> should also be deprecated.
> >> rte_cryptodev_queue_pair_attach_sym_session
> >> rte_cryptodev_queue_pair_detach_sym_session
> >> These and the max_nb_session_per_qp were added here at request of
> NXP:
> >> http://dpdk.org/ml/archives/dev/2017-March/060740.html
> >
> > Akhil, do you agree on this change?
> >
> 
> We recently did some changes in the driver to take care of the
> dependency for limit on max_nb_sessions_per_qp, but it is not removed
> completely. We will need to look into this. But sending the deprecation
> notice at this moment is fine. If something comes up, will let you know
> later.
> 
> -Akhil


^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
  2018-01-30  7:55  4%       ` Verma, Shally
@ 2018-01-30 11:21  4%         ` De Lara Guarch, Pablo
  2018-01-30 11:53  4%           ` Verma, Shally
  0 siblings, 1 reply; 200+ results
From: De Lara Guarch, Pablo @ 2018-01-30 11:21 UTC (permalink / raw)
  To: Verma, Shally, Akhil Goyal, Trahe, Fiona, hemant.agrawal,
	Doherty, Declan, Griffin, John, Jain, Deepak K, jck, tdu, dima,
	nsamsono, jianbo.liu, Jacob,  Jerin, Athreya, Narayana Prasad,
	Murthy, Nidadavolu
  Cc: dev

Hi Shally/Akhil,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Verma, Shally
> Sent: Tuesday, January 30, 2018 7:56 AM
> To: Akhil Goyal <akhil.goyal@nxp.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Trahe, Fiona <fiona.trahe@intel.com>;
> hemant.agrawal@nxp.com; Doherty, Declan <declan.doherty@intel.com>;
> Griffin, John <john.griffin@intel.com>; Jain, Deepak K
> <deepak.k.jain@intel.com>; jck@semihalf.com; tdu@semihalf.com;
> dima@marvell.com; nsamsono@marvell.com; jianbo.liu@arm.com; Jacob,
> Jerin <Jerin.JacobKollanukkaran@cavium.com>; Athreya, Narayana Prasad
> <NarayanaPrasad.Athreya@cavium.com>; Murthy, Nidadavolu
> <Nidadavolu.Murthy@cavium.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info
> struct
> 
> I do see current cryptodev unit testcase (inside \test dir) uses
> info.sym.max_nb_sessions param for session mempool_create. So, such
> testcases change are also in proposal?

Yes, for these tests, we can just define a macro in the tests, instead of using the info structure.
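
A minimal sketch of that idea, assuming a test-local macro (the name TEST_MAX_NB_SESSIONS is hypothetical) and a stand-in pool structure instead of the real rte_mempool API. It only shows the pool size coming from the test itself rather than from info.sym.max_nb_sessions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical test-local constant. It replaces reading
 * info.sym.max_nb_sessions from the device info structure. */
#define TEST_MAX_NB_SESSIONS 4

/* Stand-in for a session mempool. This is NOT the real rte_mempool. */
struct fake_session_pool {
	unsigned int nb_elts; /* number of session objects in the pool */
	size_t elt_size;      /* size of one session object, in bytes */
	void *mem;            /* backing storage */
};

/* Allocate zeroed storage for nb_elts objects of elt_size bytes each. */
static int
fake_session_pool_create(struct fake_session_pool *p,
			 unsigned int nb_elts, size_t elt_size)
{
	assert(p != NULL);
	p->mem = calloc(nb_elts, elt_size);
	if (p->mem == NULL)
		return -1;
	p->nb_elts = nb_elts;
	p->elt_size = elt_size;
	return 0;
}

/* Release the backing storage and reset the pool. */
static void
fake_session_pool_free(struct fake_session_pool *p)
{
	free(p->mem);
	p->mem = NULL;
	p->nb_elts = 0;
}

/* Test setup: the pool size is chosen by the test, not by the device. */
static int
testsuite_setup_sketch(struct fake_session_pool *pool, size_t session_size)
{
	return fake_session_pool_create(pool, TEST_MAX_NB_SESSIONS,
					session_size);
}
```

The application (here, the test) owns the pool and therefore decides how many sessions it needs; the device no longer has to report a limit for this purpose.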
> 
> Another point, we recently submitted an RFC patch on lib/cryptodev with
> asymmetric crypto support
> (https://dpdk.org/dev/patchwork/patch/34308/) which is awaiting review
> and these fields have role to play there.
> So, could this change be please viewed in conjunction with asym RFC?

Do you need it for asymmetric crypto? In any case, this change only removes the symmetric functions and structures, so it should not affect you.
> 
> Thanks
> Shally
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Akhil Goyal
> > Sent: 29 January 2018 14:57
> > To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Trahe,
> > Fiona <fiona.trahe@intel.com>; hemant.agrawal@nxp.com; Doherty,
> Declan
> > <declan.doherty@intel.com>; Griffin, John <john.griffin@intel.com>;
> > Jain, Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com;
> > tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
> > jianbo.liu@arm.com; Jacob, Jerin
> <Jerin.JacobKollanukkaran@cavium.com>
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto
> > info struct
> >
> > Hi Pablo/Fiona,
> >
> > On 1/26/2018 4:38 PM, De Lara Guarch, Pablo wrote:
> > >
> > >
> > >> -----Original Message-----
> > >> From: Trahe, Fiona
> > >> Sent: Friday, January 26, 2018 10:45 AM
> > >> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>;
> > >> akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty, Declan
> > >> <declan.doherty@intel.com>; jerin.jacob@intel.com; Griffin, John
> > >> <john.griffin@intel.com>; Jain, Deepak K <deepak.k.jain@intel.com>;
> > >> jck@semihalf.com; tdu@semihalf.com; dima@marvell.com;
> > >> nsamsono@marvell.com; jianbo.liu@arm.com
> > >> Cc: dev@dpdk.org; Trahe, Fiona <fiona.trahe@intel.com>
> > >> Subject: RE: [PATCH] doc: announce ABI change for crypto info
> > >> struct
> > >>
> > >> Hi Pablo,
> > >>
> > >>> -----Original Message-----
> > >>> From: De Lara Guarch, Pablo
> > >>> Sent: Friday, January 26, 2018 9:04 AM
> > >>> To: akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty,
> Declan
> > >>> <declan.doherty@intel.com>; jerin.jacob@intel.com; Trahe, Fiona
> > >>> <fiona.trahe@intel.com>; Griffin, John <john.griffin@intel.com>;
> > >>> Jain, Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com;
> > >>> tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
> > >>> jianbo.liu@arm.com
> > >>> Cc: dev@dpdk.org; De Lara Guarch, Pablo
> > >>> <pablo.de.lara.guarch@intel.com>
> > >>> Subject: [PATCH] doc: announce ABI change for crypto info struct
> > >>>
> > >>> Since the API changes made in 17.08, the session mempool is not
> > >>> created anymore in each crypto device.
> > >>> Therefore, there is no need to have, in the cryptodev info
> > >>> structure, the maximum number of sessions supported per device
> and
> > >>> per queue pair.
> > >>>
> > >>> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> > >>> ---
> > >>>   doc/guides/rel_notes/deprecation.rst | 5 +++++
> > >>>   1 file changed, 5 insertions(+)
> > >>>
> > >>> diff --git a/doc/guides/rel_notes/deprecation.rst
> > >>> b/doc/guides/rel_notes/deprecation.rst
> > >>> index d59ad5988..5588ba7c1 100644
> > >>> --- a/doc/guides/rel_notes/deprecation.rst
> > >>> +++ b/doc/guides/rel_notes/deprecation.rst
> > >>> @@ -59,3 +59,8 @@ Deprecation Notices
> > >>>     be added between the producer and consumer structures. The
> > >>> size of
> > >> the
> > >>>     structure and the offset of the fields will remain the same on
> > >>>     platforms with 64B cache line, but will change on other platforms.
> > >>> +
> > >>> +* cryptodev: The structure ``sym``, including its fields
> > >>> +``max_nb_sessions``
> > >>> +  and ``max_nb_sessions_per_qp``, in structure
> > >>> +``rte_cryptodev_info``,
> > >>> +  will be removed in 18.05, as these fields are not relevant
> > >>> +anymore
> > >>> +  since the session mempool is not internal in the crypto device
> > >> anymore.
> > >>> --
> > >> [Fiona] max_nb_sessions must be also removed from struct
> > >> rte_cryptodev_pmd_init_params
> > >
> > > Good point. Since this structure is internal, I guess we don't need
> > > a
> > deprecation notice for it,
> > > but I will remove it in the patch for 18.05.
> > >
> > >> Regards deprecation of max_nb_sessions from both structs:
> > >> Acked-by: Fiona Trahe <fiona.trahe@intel.com>
> > >>
> > >> If removing the max_nb_sessions_per_qp, then the following
> > >> functions should also be deprecated.
> > >> rte_cryptodev_queue_pair_attach_sym_session
> > >> rte_cryptodev_queue_pair_detach_sym_session
> > >> These and the max_nb_session_per_qp were added here at request of
> > NXP:
> > >> http://dpdk.org/ml/archives/dev/2017-March/060740.html
> > >
> > > Akhil, do you agree on this change?
> > >
> >
> > We recently did some changes in the driver to take care of the
> > dependency for limit on max_nb_sessions_per_qp, but it is not removed
> > completely. We will need to look into this. But sending the
> > deprecation notice at this moment is fine. If something comes up, will
> > let you know later.

Looks good to me. Could you ack this if you agree with it?

Thanks,
Pablo

> >
> > -Akhil


^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
  2018-01-26  9:03  4% [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct Pablo de Lara
  2018-01-26 10:44  4% ` Trahe, Fiona
@ 2018-01-30 11:37  4% ` Akhil Goyal
  2018-01-30 12:14  7% ` [dpdk-dev] [PATCH v2 0/3] Cryptodev API/ABI deprecation notices Pablo de Lara
  2 siblings, 0 replies; 200+ results
From: Akhil Goyal @ 2018-01-30 11:37 UTC (permalink / raw)
  To: Pablo de Lara, hemant.agrawal, declan.doherty, jerin.jacob,
	fiona.trahe, john.griffin, deepak.k.jain, jck, tdu, dima,
	nsamsono, jianbo.liu
  Cc: dev

On 1/26/2018 2:33 PM, Pablo de Lara wrote:
> Since the API changes made in 17.08, the session mempool
> is not created anymore in each crypto device.
> Therefore, there is no need to have, in the cryptodev info
> structure, the maximum number of sessions supported per device
> and per queue pair.
> 
> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> ---
>   doc/guides/rel_notes/deprecation.rst | 5 +++++
>   1 file changed, 5 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index d59ad5988..5588ba7c1 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -59,3 +59,8 @@ Deprecation Notices
>     be added between the producer and consumer structures. The size of the
>     structure and the offset of the fields will remain the same on
>     platforms with 64B cache line, but will change on other platforms.
> +
> +* cryptodev: The structure ``sym``, including its fields ``max_nb_sessions``
> +  and ``max_nb_sessions_per_qp``, in structure ``rte_cryptodev_info``,
> +  will be removed in 18.05, as these fields are not relevant anymore
> +  since the session mempool is not internal in the crypto device anymore.
> 
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
  2018-01-30 11:21  4%         ` De Lara Guarch, Pablo
@ 2018-01-30 11:53  4%           ` Verma, Shally
  2018-02-02  9:07  4%             ` De Lara Guarch, Pablo
  0 siblings, 1 reply; 200+ results
From: Verma, Shally @ 2018-01-30 11:53 UTC (permalink / raw)
  To: De Lara Guarch, Pablo, Akhil Goyal, Trahe, Fiona, hemant.agrawal,
	Doherty, Declan, Griffin, John, Jain, Deepak K, jck, tdu, dima,
	nsamsono, jianbo.liu, Jacob,  Jerin, Athreya, Narayana Prasad,
	Murthy, Nidadavolu
  Cc: dev



>-----Original Message-----
>From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch@intel.com]
>Sent: 30 January 2018 16:51
>To: Verma, Shally <Shally.Verma@cavium.com>; Akhil Goyal <akhil.goyal@nxp.com>; Trahe, Fiona <fiona.trahe@intel.com>;
>hemant.agrawal@nxp.com; Doherty, Declan <declan.doherty@intel.com>; Griffin, John <john.griffin@intel.com>; Jain, Deepak K
><deepak.k.jain@intel.com>; jck@semihalf.com; tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
>jianbo.liu@arm.com; Jacob, Jerin <Jerin.JacobKollanukkaran@cavium.com>; Athreya, Narayana Prasad
><NarayanaPrasad.Athreya@cavium.com>; Murthy, Nidadavolu <Nidadavolu.Murthy@cavium.com>
>Cc: dev@dpdk.org
>Subject: RE: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
>
>Hi Shally/Akhil,
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Verma, Shally
>> Sent: Tuesday, January 30, 2018 7:56 AM
>> To: Akhil Goyal <akhil.goyal@nxp.com>; De Lara Guarch, Pablo
>> <pablo.de.lara.guarch@intel.com>; Trahe, Fiona <fiona.trahe@intel.com>;
>> hemant.agrawal@nxp.com; Doherty, Declan <declan.doherty@intel.com>;
>> Griffin, John <john.griffin@intel.com>; Jain, Deepak K
>> <deepak.k.jain@intel.com>; jck@semihalf.com; tdu@semihalf.com;
>> dima@marvell.com; nsamsono@marvell.com; jianbo.liu@arm.com; Jacob,
>> Jerin <Jerin.JacobKollanukkaran@cavium.com>; Athreya, Narayana Prasad
>> <NarayanaPrasad.Athreya@cavium.com>; Murthy, Nidadavolu
>> <Nidadavolu.Murthy@cavium.com>
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info
>> struct
>>
>> I do see current cryptodev unit testcase (inside \test dir) uses
>> info.sym.max_nb_sessions param for session mempool_create. So, such
>> testcases change are also in proposal?
>
>Yes, for these tests, we can just define a macro in the tests, instead of using the info structure.

[Shally] OK, so you mean applications will pick an arbitrary number at mempool_create time, independent of the device's max_nb_sessions?

>>
>> Another point, we recently submitted an RFC patch on lib/cryptodev with
>> asymmetric crypto support
>> (https://dpdk.org/dev/patchwork/patch/34308/) which is awaiting review
>> and these fields have role to play there.
>> So, could this change be please viewed in conjunction with asym RFC?
>
>Do you need it for asymmetric crypto? In any case, this change only removes the symmetric functions and structures, so it should not affect you.

[Shally] I would say the addition of asym in lib/cryptodev is not entirely standalone, specifically for PMDs that can support both.
My key concerns are max_nb_sessions_per_qp and the related qp_attach_sym/asym APIs, which enable management of queue distribution between sym and asym in the current proposal, specifically for PMDs that support both but have a dedicated qp for each. The proposal is currently open for feedback, and I would prefer this point to be covered before the sym-related changes are applied.

>>
>> Thanks
>> Shally
>>
>> > -----Original Message-----
>> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Akhil Goyal
>> > Sent: 29 January 2018 14:57
>> > To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Trahe,
>> > Fiona <fiona.trahe@intel.com>; hemant.agrawal@nxp.com; Doherty,
>> Declan
>> > <declan.doherty@intel.com>; Griffin, John <john.griffin@intel.com>;
>> > Jain, Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com;
>> > tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
>> > jianbo.liu@arm.com; Jacob, Jerin
>> <Jerin.JacobKollanukkaran@cavium.com>
>> > Cc: dev@dpdk.org
>> > Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto
>> > info struct
>> >
>> > Hi Pablo/Fiona,
>> >
>> > On 1/26/2018 4:38 PM, De Lara Guarch, Pablo wrote:
>> > >
>> > >
>> > >> -----Original Message-----
>> > >> From: Trahe, Fiona
>> > >> Sent: Friday, January 26, 2018 10:45 AM
>> > >> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>;
>> > >> akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty, Declan
>> > >> <declan.doherty@intel.com>; jerin.jacob@intel.com; Griffin, John
>> > >> <john.griffin@intel.com>; Jain, Deepak K <deepak.k.jain@intel.com>;
>> > >> jck@semihalf.com; tdu@semihalf.com; dima@marvell.com;
>> > >> nsamsono@marvell.com; jianbo.liu@arm.com
>> > >> Cc: dev@dpdk.org; Trahe, Fiona <fiona.trahe@intel.com>
>> > >> Subject: RE: [PATCH] doc: announce ABI change for crypto info
>> > >> struct
>> > >>
>> > >> Hi Pablo,
>> > >>
>> > >>> -----Original Message-----
>> > >>> From: De Lara Guarch, Pablo
>> > >>> Sent: Friday, January 26, 2018 9:04 AM
>> > >>> To: akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty,
>> Declan
>> > >>> <declan.doherty@intel.com>; jerin.jacob@intel.com; Trahe, Fiona
>> > >>> <fiona.trahe@intel.com>; Griffin, John <john.griffin@intel.com>;
>> > >>> Jain, Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com;
>> > >>> tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
>> > >>> jianbo.liu@arm.com
>> > >>> Cc: dev@dpdk.org; De Lara Guarch, Pablo
>> > >>> <pablo.de.lara.guarch@intel.com>
>> > >>> Subject: [PATCH] doc: announce ABI change for crypto info struct
>> > >>>
>> > >>> Since the API changes made in 17.08, the session mempool is not
>> > >>> created anymore in each crypto device.
>> > >>> Therefore, there is no need to have, in the cryptodev info
>> > >>> structure, the maximum number of sessions supported per device
>> and
>> > >>> per queue pair.
>> > >>>
>> > >>> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
>> > >>> ---
>> > >>>   doc/guides/rel_notes/deprecation.rst | 5 +++++
>> > >>>   1 file changed, 5 insertions(+)
>> > >>>
>> > >>> diff --git a/doc/guides/rel_notes/deprecation.rst
>> > >>> b/doc/guides/rel_notes/deprecation.rst
>> > >>> index d59ad5988..5588ba7c1 100644
>> > >>> --- a/doc/guides/rel_notes/deprecation.rst
>> > >>> +++ b/doc/guides/rel_notes/deprecation.rst
>> > >>> @@ -59,3 +59,8 @@ Deprecation Notices
>> > >>>     be added between the producer and consumer structures. The
>> > >>> size of
>> > >> the
>> > >>>     structure and the offset of the fields will remain the same on
>> > >>>     platforms with 64B cache line, but will change on other platforms.
>> > >>> +
>> > >>> +* cryptodev: The structure ``sym``, including its fields
>> > >>> +``max_nb_sessions``
>> > >>> +  and ``max_nb_sessions_per_qp``, in structure
>> > >>> +``rte_cryptodev_info``,
>> > >>> +  will be removed in 18.05, as these fields are not relevant
>> > >>> +anymore
>> > >>> +  since the session mempool is not internal in the crypto device
>> > >> anymore.
>> > >>> --
>> > >> [Fiona] max_nb_sessions must be also removed from struct
>> > >> rte_cryptodev_pmd_init_params
>> > >
>> > > Good point. Since this structure is internal, I guess we don't need
>> > > a
>> > deprecation notice for it,
>> > > but I will remove it in the patch for 18.05.
>> > >
>> > >> Regards deprecation of max_nb_sessions from both structs:
>> > >> Acked-by: Fiona Trahe <fiona.trahe@intel.com>
>> > >>
>> > >> If removing the max_nb_sessions_per_qp, then the following
>> > >> functions should also be deprecated.
>> > >> rte_cryptodev_queue_pair_attach_sym_session
>> > >> rte_cryptodev_queue_pair_detach_sym_session
>> > >> These and the max_nb_session_per_qp were added here at request of
>> > NXP:
>> > >> http://dpdk.org/ml/archives/dev/2017-March/060740.html
>> > >
>> > > Akhil, do you agree on this change?
>> > >
>> >
>> > We recently did some changes in the driver to take care of the
>> > dependency for limit on max_nb_sessions_per_qp, but it is not removed
>> > completely. We will need to look into this. But sending the
>> > deprecation notice at this moment is fine. If something comes up, will
>> > let you know later.
>
>Looks good to me. Could you ack this if you agree with it?
>
>Thanks,
>Pablo
>
>> >
>> > -Akhil


^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 0/3] Cryptodev API/ABI deprecation notices
  2018-01-26  9:03  4% [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct Pablo de Lara
  2018-01-26 10:44  4% ` Trahe, Fiona
  2018-01-30 11:37  4% ` Akhil Goyal
@ 2018-01-30 12:14  7% ` Pablo de Lara
  2018-01-30 12:14  4%   ` [dpdk-dev] [PATCH v2 1/3] doc: announce ABI change for crypto info struct Pablo de Lara
  2018-02-13 11:45  4%   ` [dpdk-dev] [PATCH v2 0/3] Cryptodev API/ABI deprecation notices De Lara Guarch, Pablo
  2 siblings, 2 replies; 200+ results
From: Pablo de Lara @ 2018-01-30 12:14 UTC (permalink / raw)
  To: akhil.goyal, hemant.agrawal, declan.doherty, jerin.jacob,
	fiona.trahe, john.griffin, deepak.k.jain, jck, tdu, dima,
	nsamsono, jianbo.liu
  Cc: dev, Pablo de Lara

v2:
- Added an extra deprecation announcement
- Bundled the other two deprecation notices with the new one in a
  patchset

Pablo de Lara (3):
  doc: announce ABI change for crypto info struct
  doc: announce deprecation for attach/detach crypto session
  doc: announce deprecation in crypto queue pair start/stop

 doc/guides/rel_notes/deprecation.rst | 15 +++++++++++++++
 lib/librte_cryptodev/rte_cryptodev.h |  4 ++++
 2 files changed, 19 insertions(+)

-- 
2.14.3

^ permalink raw reply	[relevance 7%]

* [dpdk-dev] [PATCH v2 1/3] doc: announce ABI change for crypto info struct
  2018-01-30 12:14  7% ` [dpdk-dev] [PATCH v2 0/3] Cryptodev API/ABI deprecation notices Pablo de Lara
@ 2018-01-30 12:14  4%   ` Pablo de Lara
  2018-02-13 11:45  4%   ` [dpdk-dev] [PATCH v2 0/3] Cryptodev API/ABI deprecation notices De Lara Guarch, Pablo
  1 sibling, 0 replies; 200+ results
From: Pablo de Lara @ 2018-01-30 12:14 UTC (permalink / raw)
  To: akhil.goyal, hemant.agrawal, declan.doherty, jerin.jacob,
	fiona.trahe, john.griffin, deepak.k.jain, jck, tdu, dima,
	nsamsono, jianbo.liu
  Cc: dev, Pablo de Lara

Since the API changes made in 17.08, the session mempool
is not created anymore in each crypto device.
Therefore, there is no need to have, in the cryptodev info
structure, the maximum number of sessions supported per device
and per queue pair.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
---
 doc/guides/rel_notes/deprecation.rst | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad5988..5588ba7c1 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -59,3 +59,8 @@ Deprecation Notices
   be added between the producer and consumer structures. The size of the
   structure and the offset of the fields will remain the same on
   platforms with 64B cache line, but will change on other platforms.
+
+* cryptodev: The structure ``sym``, including its fields ``max_nb_sessions``
+  and ``max_nb_sessions_per_qp``, in structure ``rte_cryptodev_info``,
+  will be removed in 18.05, as these fields are not relevant anymore
+  since the session mempool is not internal in the crypto device anymore.
-- 
2.14.3

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [RFC v2, 1/2] cryptodev: add support to set session private data
  2018-01-25 15:37  0%   ` Gujjar, Abhinandan S
@ 2018-01-31 13:40  0%     ` De Lara Guarch, Pablo
  0 siblings, 0 replies; 200+ results
From: De Lara Guarch, Pablo @ 2018-01-31 13:40 UTC (permalink / raw)
  To: Gujjar, Abhinandan S, Doherty, Declan, akhil.goyal,
	Jerin.JacobKollanukkaran
  Cc: dev, Vangati, Narender, Rao, Nikhil

Hi Abhinandan,

> -----Original Message-----
> From: Gujjar, Abhinandan S
> Sent: Thursday, January 25, 2018 3:38 PM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Doherty,
> Declan <declan.doherty@intel.com>; akhil.goyal@nxp.com;
> Jerin.JacobKollanukkaran@cavium.com
> Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>; Rao,
> Nikhil <nikhil.rao@intel.com>
> Subject: RE: [RFC v2, 1/2] cryptodev: add support to set session private
> data
> 
> Hi Pablo & Declan,
> 
> > -----Original Message-----
> > From: De Lara Guarch, Pablo
> > Sent: Thursday, January 25, 2018 1:17 AM
> > To: Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Doherty,
> > Declan <declan.doherty@intel.com>; akhil.goyal@nxp.com;
> > Jerin.JacobKollanukkaran@cavium.com
> > Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>;
> Rao,
> > Nikhil <nikhil.rao@intel.com>
> > Subject: RE: [RFC v2, 1/2] cryptodev: add support to set session
> > private data
> >
> >
> >
> > > -----Original Message-----
> > > From: Gujjar, Abhinandan S
> > > Sent: Tuesday, January 23, 2018 8:54 AM
> > > To: Doherty, Declan <declan.doherty@intel.com>;
> akhil.goyal@nxp.com;
> > > De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>;
> > > Jerin.JacobKollanukkaran@cavium.com
> > > Cc: dev@dpdk.org; Vangati, Narender <narender.vangati@intel.com>;
> > > Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Rao, Nikhil
> > > <nikhil.rao@intel.com>
> > > Subject: [RFC v2, 1/2] cryptodev: add support to set session private
> > > data
> > >
> > > Update rte_crypto_op to indicate private data offset.
> > >
> > > The application may want to store private data along with the
> > > rte_cryptodev that is transparent to the rte_cryptodev layer.
> > > > For example, an eventdev-based application submitting a
> > > > rte_cryptodev_sym_session operation may want to indicate the event
> > > > information required to construct a new event that will be enqueued
> > > > to eventdev after completion of the rte_cryptodev_sym_session
> > > > operation.
> > > This patch provides a mechanism for the application to associate
> > > this information with the rte_cryptodev_sym_session session.
> > > The application can set the private data using
> > > rte_cryptodev_sym_session_set_private_data() and retrieve it using
> > > rte_cryptodev_sym_session_get_private_data().
> >
> > Hi Abhinandan,
> >
> > >
> > > Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
> > > Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
> > > ---
> > > Notes:
> > >         V2:
> > > 	1. Removed enum rte_crypto_op_private_data_type
> > > 	2. Corrected formatting
> > >
> > >  lib/librte_cryptodev/rte_crypto.h    |  8 ++++++--
> > >  lib/librte_cryptodev/rte_cryptodev.h | 32
> > > ++++++++++++++++++++++++++++++++
> > >  2 files changed, 38 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/librte_cryptodev/rte_crypto.h
> > > b/lib/librte_cryptodev/rte_crypto.h
> > > index 95cf861..14c87c8 100644
> > > --- a/lib/librte_cryptodev/rte_crypto.h
> > > +++ b/lib/librte_cryptodev/rte_crypto.h
> > > @@ -84,8 +84,12 @@ struct rte_crypto_op {
> > >  	 */
> > >  	uint8_t sess_type;
> > >  	/**< operation session type */
> > > -
> > > -	uint8_t reserved[5];
> > > +	uint16_t private_data_offset;
> > > +	/**< Offset to indicate start of private data (if any). The private
> > > +	 * data may be used by the application to store information which
> > > +	 * should remain untouched in the library/driver
> >
> > Is this the offset for the private data after the crypto operation?
> Yes. This private data is meant for the sessionless case.
> > From your title, it looks like it is for the session private data, but
> > then, this shouldn't be here.
> Agree.
> > If it is for the crypto operation, I suggest you to separate it in another
> patch.
> > Also, you should indicate where the offset starts from. For the IV,
> > the offset is counted from the start of the rte_crypto_op, so I think
> > it should be the same, to keep consistency.
> Sure. I will make a separate patch for this changes. Add some more
> information to make it clear.
> >
> > For the session private data, we see two options:
> >
> > 1 - Add a  "valid" private data field in the rte_cryptodev_sym_session
> > structure, so when it is set, it indicates that the session contains
> > private data (a single bit would be enough, 1 to indicate there is, and 0 to
> indicate there is not).
> > This would go into the beginning of the structure, so this would
> > require an ABI deprecation notice.
> > This also assumes that the private data starts just after the session
> > header
> >
> > 2 -  Do not add an extra "valid" private data field in
> > rte_cryptodev_sym_session structure, and add a small header in the
> private data, which contains the "valid"
> > bit.
> > Then, when calling sym_session_get_private_data, this bit should be
> checked.
> > Note that the object that holds the session structure needs to be big
> > enough to hold this value.
> > If the object has only space for the sess_private_data array, then the
> > session has no private data.
> > Therefore, this approach might be less performant, but with no ABI
> > deprecation required.
> I am with option 2 with slight changes as below:
> rte_cryptodev_sym_session_create() will have a flag as below indicating
> private data exists or not.
> {
> - memset(sess, 0, (sizeof(void *) * nb_drivers));
> +memset(sess, 0, (sizeof(void *) * nb_drivers ) +
> +sizeof(private_data_flag));
> }
> and in
> rte_cryptodev_get_header_session_size(void)
> {
>   /*
>    * Header contains pointers to the private data
>    * of all registered drivers
>    */
>   -return (sizeof(void *) * nb_drivers);
>   +return ((sizeof(void *) * nb_drivers) + sizeof(private_data_flag));
> }
> With this, a flag indicating whether private data exists will always have a valid value.

Sure, this should work. Go ahead and send a v3 with this change, separating the changes
made in the session from the changes made in the crypto operation (so you will have 3 patches in total).
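
For reference, a minimal standalone sketch of the header layout discussed above. It assumes the flag is a single byte placed right after the per-driver pointer array; the type name private_data_flag_t and all helper names are hypothetical, and none of this is the actual cryptodev code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag type; the thread only names "private_data_flag". */
typedef uint8_t private_data_flag_t;

/* Header size: one private-data pointer per registered driver,
 * followed by the flag. */
static size_t
session_header_size(unsigned int nb_drivers)
{
	return sizeof(void *) * nb_drivers + sizeof(private_data_flag_t);
}

/* Assumed layout: the flag sits immediately after the pointer array. */
static private_data_flag_t *
session_flag(void *sess, unsigned int nb_drivers)
{
	return (private_data_flag_t *)
		((uint8_t *)sess + sizeof(void *) * nb_drivers);
}

/* Zero the whole header, flag included, so the flag always holds a
 * valid value (0 means "no private data"). */
static void
session_header_init(void *sess, unsigned int nb_drivers)
{
	assert(sess != NULL);
	memset(sess, 0, session_header_size(nb_drivers));
}

/* Returns 1 if the session was marked as carrying private data. */
static int
session_has_private_data(void *sess, unsigned int nb_drivers)
{
	return *session_flag(sess, nb_drivers) != 0;
}
```

A getter for the private data would check session_has_private_data() first and return NULL when the flag is clear, which is the check mentioned for option 2.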

Pablo

> 
> >
> > I would recommend you to send a deprecation notice for option 1, then
> > check the performance of both option, and if needed, make the change
> > in the structure, in 18.05.
> >
> > Regards,
> > Pablo

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [RFC v2 00/17] mempool: add bucket mempool driver
  2018-01-23 13:15  2% ` [dpdk-dev] [RFC v2 00/17] " Andrew Rybchenko
@ 2018-01-31 16:44  0%   ` Olivier Matz
  2018-03-10 15:39  3%   ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Andrew Rybchenko
  1 sibling, 0 replies; 200+ results
From: Olivier Matz @ 2018-01-31 16:44 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, Santosh Shukla, Jerin Jacob, Hemant Agrawal, Shreyansh Jain

Hi,

On Tue, Jan 23, 2018 at 01:15:55PM +0000, Andrew Rybchenko wrote:
> The patch series starts from generic enhancements suggested by Olivier.
> Basically it adds driver callbacks to calculate required memory size and
> to populate objects using provided memory area. It allows to remove
> so-called capability flags used before to tell generic code how to
> allocate and slice allocated memory into mempool objects.
> Clean up which removes get_capabilities and register_memory_area is
> not strictly required, but I think right thing to do.
> Existing mempool drivers are updated.
> 
> I've kept rte_mempool_populate_iova_tab() intact since it seems to
> be not directly related XMEM API functions.
> 
> The patch series adds bucket mempool driver which allows to allocate
> (both physically and virtually) contiguous blocks of objects and adds
> mempool API to do it. It is still capable to provide separate objects,
> but it is definitely more heavy-weight than ring/stack drivers.
> The driver will be used by the future Solarflare driver enhancements
> which allow to utilize physical contiguous blocks in the NIC
> hardware/firmware.
> 
> The target usecase is dequeue in blocks and enqueue separate objects
> back (which are collected in buckets to be dequeued). So, the memory
> pool with bucket driver is created by an application and provided to
> networking PMD receive queue. The choice of bucket driver is done using
> rte_eth_dev_pool_ops_supported(). A PMD that relies upon contiguous
> block allocation should report the bucket driver as the only supported
> and preferred one.
> 
> Introduction of the contiguous block dequeue operation is proven by
> performance measurements using autotest with minor enhancements:
>  - in the original test bulks are powers of two, which is unacceptable
>    for us, so they are changed to multiple of contig_block_size;
>  - the test code is duplicated to support plain dequeue and
>    dequeue_contig_blocks;
>  - all the extra test variations (with/without cache etc) are eliminated;
>  - a fake read from the dequeued buffer is added (in both cases) to
>    simulate mbufs access.
> 
> start performance test for bucket (without cache)
> mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Srate_persec=   111935488
> mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Srate_persec=   115290931
> mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Srate_persec=   353055539
> mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Srate_persec=   353330790
> mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Srate_persec=   224657407
> mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Srate_persec=   230411468
> mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Srate_persec=   706700902
> mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Srate_persec=   703673139
> mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Srate_persec=   425236887
> mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Srate_persec=   437295512
> mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Srate_persec=  1343409356
> mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Srate_persec=  1336567397
> start performance test for bucket (without cache + contiguous dequeue)
> mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Crate_persec=   122945536
> mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Crate_persec=   126458265
> mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Crate_persec=   374262988
> mempool_autotest cache=   0 cores= 1 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Crate_persec=   377316966
> mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Crate_persec=   244842496
> mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Crate_persec=   251618917
> mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Crate_persec=   751226060
> mempool_autotest cache=   0 cores= 2 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Crate_persec=   756233010
> mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=   1 n_keep=  30 Crate_persec=   462068120
> mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=   1 n_keep=  60 Crate_persec=   476997221
> mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=  15 n_keep=  30 Crate_persec=  1432171313
> mempool_autotest cache=   0 cores= 4 n_get_bulk=  15 n_put_bulk=  15 n_keep=  60 Crate_persec=  1438829771
> 
> The number of objects in the contiguous block is a function of bucket
> memory size (.config option) and total element size. In the future
> additional API with possibility to pass parameters on mempool allocation
> may be added.
> 
> It breaks ABI since changes rte_mempool_ops. Also it removes
> rte_mempool_ops_register_memory_area() and
> rte_mempool_ops_get_capabilities() since corresponding callbacks are
> removed.
> 
> The target DPDK release is 18.05.
> 
> v2:
>   - add driver ops to calculate required memory size and populate
>     mempool objects, remove extra flags which were required before
>     to control it
>   - transition of octeontx and dpaa drivers to the new callbacks
>   - change info API to get information from driver required to
>     API user to know contiguous block size
>   - remove get_capabilities (not required any more and may be
>     substituted with more in info get API)
>   - remove register_memory_area since it is substituted with
>     populate callback which can do more
>   - use SPDX tags
>   - avoid all objects affinity to single lcore
>   - fix bucket get_count
>   - deprecate XMEM API
>   - avoid introduction of a new function to flush cache
>   - fix NO_CACHE_ALIGN case in bucket mempool
> 
> Andrew Rybchenko (10):
>   mempool: fix phys contig check if populate default skipped
>   mempool: add op to calculate memory size to be allocated
>   mempool/octeontx: add callback to calculate memory size
>   mempool: add op to populate objects using provided memory
>   mempool/octeontx: implement callback to populate objects
>   mempool: remove callback to get capabilities
>   mempool: deprecate xmem functions
>   mempool/octeontx: prepare to remove register memory area op
>   mempool/dpaa: convert to use populate driver op
>   mempool: remove callback to register memory area
> 
> Artem V. Andreev (7):
>   mempool: ensure the mempool is initialized before populating
>   mempool/bucket: implement bucket mempool manager
>   mempool: support flushing the default cache of the mempool
>   mempool: implement abstract mempool info API
>   mempool: support block dequeue operation
>   mempool/bucket: implement block dequeue operation
>   mempool/bucket: do not allow one lcore to grab all buckets
> 
>  MAINTAINERS                                        |   9 +
>  config/common_base                                 |   2 +
>  drivers/mempool/Makefile                           |   1 +
>  drivers/mempool/bucket/Makefile                    |  27 +
>  drivers/mempool/bucket/rte_mempool_bucket.c        | 626 +++++++++++++++++++++
>  .../mempool/bucket/rte_mempool_bucket_version.map  |   4 +
>  drivers/mempool/dpaa/dpaa_mempool.c                |  13 +-
>  drivers/mempool/octeontx/rte_mempool_octeontx.c    |  63 ++-
>  lib/librte_mempool/rte_mempool.c                   | 192 ++++---
>  lib/librte_mempool/rte_mempool.h                   | 366 +++++++++---
>  lib/librte_mempool/rte_mempool_ops.c               |  48 +-
>  lib/librte_mempool/rte_mempool_version.map         |  11 +-
>  mk/rte.app.mk                                      |   1 +
>  13 files changed, 1184 insertions(+), 179 deletions(-)
>  create mode 100644 drivers/mempool/bucket/Makefile
>  create mode 100644 drivers/mempool/bucket/rte_mempool_bucket.c
>  create mode 100644 drivers/mempool/bucket/rte_mempool_bucket_version.map

Globally, the RFC looks fine to me. Thanks for this good work.

I didn't review the mempool/bucket part like I did last time. About the
changes to the mempool API, I think it's a good enhancement: it makes
things more flexible and removes complexity in the common code. Some
points may still need some discussions, for instance how the PMDs and
applications take advantage of block dequeue operations and get_info().

I have some specific comments that are sent directly as replies to the
patches.

Since it changes dpaa and octeontx, having feedback from people from NXP
and Cavium Networks would be good.

Thanks,
Olivier

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool
  2018-01-23 13:23 13% [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool Andrew Rybchenko
@ 2018-01-31 16:46  4% ` Olivier Matz
  2018-02-01  6:40  4%   ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2018-01-31 16:46 UTC (permalink / raw)
  To: Andrew Rybchenko; +Cc: dev

On Tue, Jan 23, 2018 at 01:23:04PM +0000, Andrew Rybchenko wrote:
> An API/ABI changes are planned for 18.05 [1]:
> 
>  * Allow to customize how mempool objects are stored in memory.
>  * Deprecate mempool XMEM API.
>  * Add mempool driver ops to get information from mempool driver and
>    dequeue contiguous blocks of objects if driver supports it.
> 
> [1] http://dpdk.org/ml/archives/dev/2018-January/088698.html
> 
> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v3] checkpatches.sh: Add checks for ABI symbol addition
    @ 2018-01-31 17:27  6% ` Neil Horman
  2018-02-04 14:44  7%   ` Thomas Monjalon
  2018-02-05 17:29  6% ` [dpdk-dev] [PATCH v4] " Neil Horman
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 200+ results
From: Neil Horman @ 2018-01-31 17:27 UTC (permalink / raw)
  To: dev
  Cc: Neil Horman, thomas, john.mcnamara, bruce.richardson,
	Ferruh Yigit, Stephen Hemminger

Recently, some additional patches were added to allow for programmatic
marking of C symbols as experimental.  The addition of these markers is
dependent on the manual addition of exported symbols to the EXPERIMENTAL
section of the corresponding library's version map file.  The consensus
on review is that, in addition to mandating the addition of symbols to
the EXPERIMENTAL version in the map, we need a mechanism to enforce our
documented process of mandating that addition when they are introduced.
To that end, I am proposing this change.  It is an addition to the
checkpatches script, which scans incoming patches for additions and
removals of symbols to the map file, and warns the user appropriately.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: thomas@monjalon.net
CC: john.mcnamara@intel.com
CC: bruce.richardson@intel.com
CC: Ferruh Yigit <ferruh.yigit@intel.com>
CC: Stephen Hemminger <stephen@networkplumber.org>

---
Change notes

v2)
 * Cleaned up and documented awk script (shemminger)
 * fixed sort/uniq usage (shemminger)
 * moved checking to new script (tmonjalon)
 * added maintainer entry (tmonjalon)
 * added license (tmonjalon)

v3)
 * Changed symbol check script name (tmonjalon)
 * Trapped exit to clean temp file (tmonjalon)
 * Honored verbose command (tmonjalon)
 * Cleaned left over debug bits (tmonjalon)
 * Updated location in MAINTAINERS file (tmonjalon)
---
 MAINTAINERS                     |   2 +
 devtools/check-symbol-change.sh | 146 ++++++++++++++++++++++++++++++++++++++++
 devtools/checkpatches.sh        |  23 ++++++-
 3 files changed, 170 insertions(+), 1 deletion(-)
 create mode 100755 devtools/check-symbol-change.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index acd056134..417115f97 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -42,6 +42,7 @@ F: doc/
 
 Developers and Maintainers Tools
 M: Thomas Monjalon <thomas@monjalon.net>
+M: Neil Horman <nhorman@tuxdriver.com>
 F: MAINTAINERS
 F: devtools/check-dup-includes.sh
 F: devtools/check-maintainers.sh
@@ -86,6 +87,7 @@ M: Neil Horman <nhorman@tuxdriver.com>
 F: lib/librte_compat/
 F: doc/guides/rel_notes/deprecation.rst
 F: devtools/validate-abi.sh
+F: devtools/check-symbol-change.sh
 F: buildtools/check-experimental-syms.sh
 
 Driver information
diff --git a/devtools/check-symbol-change.sh b/devtools/check-symbol-change.sh
new file mode 100755
index 000000000..22b17e6f2
--- /dev/null
+++ b/devtools/check-symbol-change.sh
@@ -0,0 +1,146 @@
+#!/bin/sh
+
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Neil Horman <nhorman@tuxdriver.com>
+
+build_map_changes()
+{
+	local fname=$1
+	local mapdb=$2
+
+	cat $fname | filterdiff -i *.map | awk '
+		# Initialize our variables
+		BEGIN {map="";sym="";ar="";sec=""; in_sec=0}
+
+		# Anything that starts with + or -, followed by an a
+		# and ends in the string .map is the name of our map file
+		# This may appear multiple times in a patch if multiple
+		# map files are altered, and all section/symbol names
+		# appearing between a triggering of this rule and the
+		# next trigger of this rule are associated with this file
+		/[-+] a\/.*\.map/ {map=$2}
+
+		# Triggering this rule, which starts a line with a + and ends it
+		# with a { identifies a versioned section.  The section name is
+		# the rest of the line with the + and { symbols removed.
+		# Triggering this rule sets in_sec to 1, which activates the
+		# symbol rule below
+		/+.*{/ {gsub("+","");sec=$1; in_sec=1}
+
+		# This rule identifies the end of a section, and disables the
+		# symbol rule
+		/.*}/ {in_sec=0}
+
+		# This rule matches on a + followed by any characters except a :
+		# (which denotes a global vs local segment), and ends with a ;.
+		# The semicolon is removed and the symbol is printed with its
+		# association file name and version section, along with an
+		# indicator that the symbol is a new addition.  Note this rule
+		# only works if we have found a version section in the rule
+		# above (hence the in_sec check).  Otherwise we flag it as an
+		# unknown section
+		/^+[^}].*[^:*];/ {gsub(";","");sym=$2;
+			if (in_sec == 1) {
+				print map " " sym " " sec " add"
+			} else {
+				print map " " sym " unknown add"
+			}
+		}
+
+		# This is the same rule as above, but the rule matches on a
+		# leading - rather than a +, denoting that the symbol is being
+		# removed.
+		/^-[^}].*[^:*];/ {gsub(";","");sym=$2;
+			if (in_sec == 1) {
+				print map " " sym " " sec " del"
+			} else {
+				print map " " sym " unknown del"
+			}
+		}' > ./$mapdb
+
+		sort -u $mapdb > ./$mapdb.2
+		mv -f $mapdb.2 $mapdb
+
+}
+
+check_for_rule_violations()
+{
+	local mapdb=$1
+	local mname
+	local symname
+	local secname
+	local ar
+	local ret=0
+
+	while read mname symname secname ar
+	do
+		if [ "$ar" = "add" ]
+		then
+
+			if [ "$secname" = "unknown" ]
+			then
+				# Just inform the user of this occurrence, but
+				# don't flag it as an error
+				echo -n "INFO: symbol $symname is added but "
+				echo -n "patch has insufficient context "
+				echo -n "to determine the section name "
+				echo -n "please ensure the version is "
+				echo "EXPERIMENTAL"
+				continue
+			fi
+
+			if [ "$secname" != "EXPERIMENTAL" ]
+			then
+				# Symbols that are getting added in a section
+				# other than the experimental section need
+				# to be moving from an already supported
+				# section or it's a violation
+				grep -q \
+				"$mname $symname [^EXPERIMENTAL] del" $mapdb
+				if [ $? -ne 0 ]
+				then
+					echo -n "ERROR: symbol $symname "
+					echo -n "is added in a section "
+					echo -n "other than the EXPERIMENTAL "
+					echo "section of the version map"
+					ret=1
+				fi
+			fi
+		else
+
+			if [ "$secname" != "EXPERIMENTAL" ]
+			then
+				# Just inform users that non-experimental
+				# symbols need to go through a deprecation
+				# process
+				echo -n "INFO: symbol $symname is being "
+				echo -n "removed, ensure that it has "
+				echo "gone through the deprecation process"
+			fi
+		fi
+	done < $mapdb
+
+	return $ret
+}
+
+trap clean_and_exit_on_sig EXIT
+
+mapfile=`mktemp mapdb.XXXXXX`
+patch=$1
+exit_code=1
+
+clean_and_exit_on_sig()
+{
+	rm -f $mapfile
+	exit $exit_code
+}
+
+build_map_changes $patch $mapfile
+check_for_rule_violations $mapfile
+exit_code=$?
+
+rm -f $mapfile
+
+exit $exit_code
+
+
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 7676a6b50..0b2b5f039 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -35,6 +35,8 @@
 # - DPDK_CHECKPATCH_LINE_LENGTH
 . $(dirname $(readlink -e $0))/load-devel-config
 
+VALIDATE_NEW_API=$(dirname $(readlink -e $0))/check-symbol-change.sh
+
 length=${DPDK_CHECKPATCH_LINE_LENGTH:-80}
 
 # override default Linux options
@@ -61,6 +63,7 @@ print_usage () {
 	END_OF_HELP
 }
 
+
 number=0
 quiet=false
 verbose=false
@@ -86,6 +89,7 @@ total=0
 status=0
 
 check () { # <patch> <commit> <title>
+	local reta
 	total=$(($total + 1))
 	! $verbose || printf '\n### %s\n\n' "$3"
 	if [ -n "$1" ] ; then
@@ -96,9 +100,26 @@ check () { # <patch> <commit> <title>
 	else
 		report=$($DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
 	fi
-	[ $? -ne 0 ] || return 0
+	reta=$?
+
 	$verbose || printf '\n### %s\n\n' "$3"
 	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
+
+	! $verbose || echo
+	! $verbose || echo "Checking API additions/removals:"
+
+	if [ -n "$1" ] ; then
+		report=$($VALIDATE_NEW_API $1)
+	elif [ -n "$2" ] ; then
+		report=$(git format-patch \
+			 --find-renames --no-stat --stdout -1 $commit |
+			$VALIDATE_NEW_API -)
+	else
+		report=$($VALIDATE_NEW_API -)
+	fi
+	[ $? -ne 0 -o $reta -ne 0 ] || return 0
+	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
+
 	status=$(($status + 1))
 }
 
-- 
2.14.3
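
The rules described in the awk comments above can be tried in isolation.
What follows is a minimal sketch under stated assumptions, not the script
from the patch: it assumes a unified diff on stdin, the function names
(classify_map_changes, flag_non_experimental_adds) and the rte_foo_* symbols
are hypothetical, and it deliberately leaves out the exemption for symbols
that move from one section to another.

```shell
#!/bin/sh
# Sketch only: list symbols that a unified diff adds to or removes from a
# linker version map. Reads the diff on stdin and prints one line per
# change: "<symbol> <section> <add|del>".
classify_map_changes()
{
    awk '
        # Skip the "---" / "+++" file header lines of the diff.
        /^(---|[+][+][+]) / { next }

        # "NAME {" opens a version section, whether the line is diff
        # context (leading space) or newly added (leading +).
        /^[ +-][ \t]*[A-Za-z0-9_.]+[ \t]*[{]/ {
            sec = substr($0, 2)
            gsub(/[ \t{]/, "", sec)
            next
        }

        # A closing brace ends the current section.
        /^[ +-][ \t]*[}]/ { sec = ""; next }

        # "+sym;" or "-sym;" is a symbol being added or removed.
        /^[+-][ \t]*[A-Za-z0-9_]+;/ {
            sym = substr($0, 2)
            gsub(/[ \t;]/, "", sym)
            op = (substr($0, 1, 1) == "+") ? "add" : "del"
            print sym, (sec == "" ? "unknown" : sec), op
        }
    '
}

# Report additions that land in a known section other than EXPERIMENTAL.
flag_non_experimental_adds()
{
    classify_map_changes | awk '
        $3 == "add" && $2 != "EXPERIMENTAL" && $2 != "unknown" {
            print "ERROR: symbol " $1 " is added in section " $2
        }
    '
}

# Hypothetical map-file change used only to exercise the sketch.
flag_non_experimental_adds <<'EOF'
--- a/lib/librte_foo/rte_foo_version.map
+++ b/lib/librte_foo/rte_foo_version.map
@@ -1,7 +1,9 @@
 DPDK_18.02 {
     global:
     rte_foo_old;
+    rte_foo_stable_new;
 };
 EXPERIMENTAL {
     global:
+    rte_foo_exp_new;
 };
EOF
```

With the sample input, only rte_foo_stable_new should be reported, since
rte_foo_exp_new lands in EXPERIMENTAL. Tracking the section from context
lines as well as added lines means a symbol is classified correctly whenever
the hunk still shows the section header; when it does not, the sketch falls
back to "unknown" and stays silent, mirroring the INFO-only handling in the
patch.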

^ permalink raw reply	[relevance 6%]

* Re: [dpdk-dev] [PATCH] net/octeontx: register fpa as platform HW mempool
  @ 2018-01-31 19:51  4% ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2018-01-31 19:51 UTC (permalink / raw)
  To: Pavan Nikhilesh, jerin.jacob, santosh.shukla, olivier.matz,
	hemant.agrawal
  Cc: dev, Neil Horman

On 1/22/2018 3:45 PM, Pavan Nikhilesh wrote:
> Register octeontx-fpavf as platform HW mempool when net/octeontx pmd is
> used.
> 
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
> ---
> 
>  This patch depends on http://dpdk.org/dev/patchwork/patch/34239 patchset.

This patch was waiting on a dependent patch, which seems to be merged now.

But now, because of the "__rte_experimental" tag on
rte_mbuf_set_platform_mempool_ops(), which this patch uses, I am getting
build errors [1].

The PMD makefile needs a special note to allow experimental API usage:
CFLAGS += -DALLOW_EXPERIMENTAL_API



[1]
...dpdk/drivers/net/octeontx/octeontx_ethdev.c:1330:2: error:
'rte_mbuf_set_platform_mempool_ops' is deprecated: Symbol is not yet part of
stable ABI [-Werror,-Wdeprecated-declarations]
        rte_mbuf_set_platform_mempool_ops("octeontx_fpavf");
        ^
...dpdk/x86_64-native-linuxapp-clang/include/rte_mbuf_pool_ops.h:37:5: note:
'rte_mbuf_set_platform_mempool_ops' has been explicitly marked deprecated here
int __rte_experimental
    ^
...dpdk/x86_64-native-linuxapp-clang/include/rte_compat.h:107:16: note: expanded
from macro '__rte_experimental'
__attribute__((deprecated("Symbol is not yet part of stable ABI"), \
               ^
1 error generated.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool
  2018-01-31 16:46  4% ` Olivier Matz
@ 2018-02-01  6:40  4%   ` Jerin Jacob
  2018-02-01 12:53  4%     ` Hemant Agrawal
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2018-02-01  6:40 UTC (permalink / raw)
  To: Olivier Matz; +Cc: Andrew Rybchenko, dev

-----Original Message-----
> Date: Wed, 31 Jan 2018 17:46:51 +0100
> From: Olivier Matz <olivier.matz@6wind.com>
> To: Andrew Rybchenko <arybchenko@solarflare.com>
> CC: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool
> User-Agent: NeoMutt/20170113 (1.7.2)
> 
> On Tue, Jan 23, 2018 at 01:23:04PM +0000, Andrew Rybchenko wrote:
> > An API/ABI changes are planned for 18.05 [1]:
> > 
> >  * Allow to customize how mempool objects are stored in memory.
> >  * Deprecate mempool XMEM API.
> >  * Add mempool driver ops to get information from mempool driver and
> >    dequeue contiguous blocks of objects if driver supports it.
> > 
> > [1] http://dpdk.org/ml/archives/dev/2018-January/088698.html
> > 
> > Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
> 
> Acked-by: Olivier Matz <olivier.matz@6wind.com>

Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2] relicense various bits of the dpdk
  @ 2018-02-01 12:19  8% ` Neil Horman
  2018-02-01 12:49  0%   ` Hemant Agrawal
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2018-02-01 12:19 UTC (permalink / raw)
  To: dev; +Cc: Neil Horman, Hemant Agrawal, Thomas Monjalon

Received a note the other day from the Linux Foundation governance board
for DPDK indicating that several files I have copyright on need to be
relicensed to be compliant with the DPDK licensing guidelines.  I have
some concerns with some parts of the request, but am not opposed to
other parts.  So, for those pieces that we are in consensus on, I'm
proposing that we change their license from BSD 2 clause to 3 clause.
I'm also updating the files to use the SPDX licensing scheme.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Hemant Agrawal <hemant.agrawal@nxp.com>
CC: Thomas Monjalon <thomas@monjalon.net>

---
Change notes
V2) Cleaned up formatting (tmonjalon)
---
 devtools/validate-abi.sh       | 32 ++++----------------------------
 lib/librte_compat/rte_compat.h | 31 +++----------------------------
 2 files changed, 7 insertions(+), 56 deletions(-)

diff --git a/devtools/validate-abi.sh b/devtools/validate-abi.sh
index 8caf43e83..ee64b08fa 100755
--- a/devtools/validate-abi.sh
+++ b/devtools/validate-abi.sh
@@ -1,32 +1,8 @@
 #!/usr/bin/env bash
-#   BSD LICENSE
-#
-#   Copyright(c) 2015 Neil Horman. All rights reserved.
-#   Copyright(c) 2017 6WIND S.A.
-#   All rights reserved.
-#
-#   Redistribution and use in source and binary forms, with or without
-#   modification, are permitted provided that the following conditions
-#   are met:
-#
-#     * Redistributions of source code must retain the above copyright
-#       notice, this list of conditions and the following disclaimer.
-#     * Redistributions in binary form must reproduce the above copyright
-#       notice, this list of conditions and the following disclaimer in
-#       the documentation and/or other materials provided with the
-#       distribution.
-#
-#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+# SPDX-License-Identifier:	BSD-3-Clause
+# Copyright(c) 2015 Neil Horman. All rights reserved.
+# Copyright(c) 2017 6WIND S.A.
+# All rights reserved
 
 set -e
 
diff --git a/lib/librte_compat/rte_compat.h b/lib/librte_compat/rte_compat.h
index d6e79f3fc..2cdc37214 100644
--- a/lib/librte_compat/rte_compat.h
+++ b/lib/librte_compat/rte_compat.h
@@ -1,31 +1,6 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2015 Neil Horman <nhorman@tuxdriver.com>.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+/* SPDX-License-Identifier: BSD-3-Clause 
+ * Copyright(c) 2015 Neil Horman <nhorman@tuxdriver.com>.
+ * All rights reserved.
  */
 
 #ifndef _RTE_COMPAT_H_
-- 
2.14.3

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH v2] relicense various bits of the dpdk
  2018-02-01 12:19  8% ` [dpdk-dev] [PATCH v2] " Neil Horman
@ 2018-02-01 12:49  0%   ` Hemant Agrawal
  0 siblings, 0 replies; 200+ results
From: Hemant Agrawal @ 2018-02-01 12:49 UTC (permalink / raw)
  To: Neil Horman, dev; +Cc: Thomas Monjalon

On 2/1/2018 5:49 PM, Neil Horman wrote:
> Received a note the other day from the Linux Foundation governance board
> for DPDK indicating that several files I have copyright on need to be
> relicensed to be compliant with the DPDK licensing guidelines.  I have
> some concerns with some parts of the request, but am not opposed to
> other parts.  So, for those pieces that we are in consensus on, I'm
> proposing that we change their license from BSD 2 clause to 3 clause.
> I'm also updating the files to use the SPDX licensing scheme
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Hemant Agrawal <hemant.agrawal@nxp.com>
> CC: Thomas Monjalon <thomas@monjalon.net>
> 
> ---
> Change notes
> V2) Cleaned up formatting (tmonjalon)
> ---
>   devtools/validate-abi.sh       | 32 ++++----------------------------
>   lib/librte_compat/rte_compat.h | 31 +++----------------------------
>   2 files changed, 7 insertions(+), 56 deletions(-)
> 

Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool
  2018-02-01  6:40  4%   ` Jerin Jacob
@ 2018-02-01 12:53  4%     ` Hemant Agrawal
  2018-02-14 15:23  4%       ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Hemant Agrawal @ 2018-02-01 12:53 UTC (permalink / raw)
  To: Jerin Jacob, Olivier Matz; +Cc: Andrew Rybchenko, dev

On 2/1/2018 12:10 PM, Jerin Jacob wrote:
> -----Original Message-----
>> Date: Wed, 31 Jan 2018 17:46:51 +0100
>> From: Olivier Matz <olivier.matz@6wind.com>
>> To: Andrew Rybchenko <arybchenko@solarflare.com>
>> CC: dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool
>> User-Agent: NeoMutt/20170113 (1.7.2)
>>
>> On Tue, Jan 23, 2018 at 01:23:04PM +0000, Andrew Rybchenko wrote:
>>> An API/ABI changes are planned for 18.05 [1]:
>>>
>>>   * Allow to customize how mempool objects are stored in memory.
>>>   * Deprecate mempool XMEM API.
>>>   * Add mempool driver ops to get information from mempool driver and
>>>     dequeue contiguous blocks of objects if driver supports it.
>>>
>>> [1] http://dpdk.org/ml/archives/dev/2018-January/088698.html
>>>
>>> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
>>
>> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> 
> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> 
> 
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH 1/2] Revert "eal: fix default mempool ops"
  @ 2018-02-01 20:40  3%   ` Pavan Nikhilesh
  2018-02-02  5:43  0%     ` Hemant Agrawal
  0 siblings, 1 reply; 200+ results
From: Pavan Nikhilesh @ 2018-02-01 20:40 UTC (permalink / raw)
  To: Hemant Agrawal, olivier.matz, thomas, jerin.jacob; +Cc: dev

Hi Hemant,

	Currently, best_mempool_ops is broken because when
rte_mbuf_user_mempool_ops is invoked it is expected to return 'NULL' through
internal_config.user_mbuf_pool_ops_name. IMO it is best to create a named
memzone ('mbuf_user_pool_ops') at the end of eal_init and copy the
mbuf-pool-ops value passed to eal into it.

`rte_eal_mbuf_default_mempool_ops` is not expected to return 'NULL'. Would
doing so break the ABI?

---
/**
 * Get default pool ops name for mbuf
 *
 * @return
 *   returns default pool ops name.
 */
const char *
rte_eal_mbuf_default_mempool_ops(void);
---

IMO creating a named memzone at the end of eal_init and changing
`rte_mbuf_user_mempool_ops` as below would be a better solution.

rte_mbuf_user_mempool_ops(void)
{
...
        mz = rte_memzone_lookup("mbuf_user_pool_ops");
        if (mz == NULL)
                return NULL;
	...
}

Thoughts?

Pavan.

On Thu, Feb 01, 2018 at 07:56:47PM +0000, Hemant Agrawal wrote:
> Hi Pavan,
> 	Your patch was breaking the design of the best_mempool_ops and the whole purpose of selection was getting lost.
> I guess you were trying to fix  test_mempool.  I have sent another patch, which fixes that and start using the best mempool ops API
> instead of default mempool ops API.
>
> Regards,
> Hemant
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Hemant Agrawal
> > Sent: Friday, February 02, 2018 1:17 AM
> > To: olivier.matz@6wind.com; pbhagavatula@caviumnetworks.com
> > Cc: thomas@monjalon.net; dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH 1/2] Revert "eal: fix default mempool ops"
> >
> > This reverts commit fe06cb6c54fe5ada299ebba40a382bee37c919f2.
> > ---

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH 1/2] Revert "eal: fix default mempool ops"
  2018-02-01 20:40  3%   ` Pavan Nikhilesh
@ 2018-02-02  5:43  0%     ` Hemant Agrawal
  0 siblings, 0 replies; 200+ results
From: Hemant Agrawal @ 2018-02-02  5:43 UTC (permalink / raw)
  To: Pavan Nikhilesh, olivier.matz, thomas, jerin.jacob; +Cc: dev

Hi Pavan,
> 	Currently, best_mempool_ops is broken because when
> rte_mbuf_user_mempool_ops is invoked it is expected to returns 'NULL' through
> internal_config.user_mbuf_pool_ops_name. IMO it is best to create a named
> memzone ('mbuf_user_pool_ops') at the end of eal_init and copy mbuf-pool-ops
> passed to eal.
> 
> `rte_eal_mbuf_default_mempool_ops` is not expected to return 'NULL' would
> doing so break the ABI?.
> 
> ---
> /**
>  * Get default pool ops name for mbuf
>  *
>  * @return
>  *   returns default pool ops name.
>  */
> const char *
> rte_eal_mbuf_default_mempool_ops(void);
> ---
> 
> IMO creating named mempool at the end of eal_init and changing
> `rte_mbuf_user_mempool_ops` as below would be a better solution.
> 
> rte_mbuf_user_mempool_ops(void)
> {
> ...
>         mz = rte_memzone_lookup("mbuf_user_pool_ops");
>         if (mz == NULL)
>                 return NULL;
> 	...
> }
> 
> Thoughts?


[Hemant] That seems reasonable. We can also deprecate the eal default mempool ops API. I will send a patch shortly.

Unfortunately, all NXP platforms are broken at the moment, so we need to get this fixed fast.

Hemant

> 
> Pavan.
> 
> On Thu, Feb 01, 2018 at 07:56:47PM +0000, Hemant Agrawal wrote:
> > Hi Pavan,
> > 	Your patch was breaking the design of the best_mempool_ops and the
> whole purpose of selection was getting lost.
> > I guess you were trying to fix  test_mempool.  I have sent another
> > patch, which fixes that and start using the best mempool ops API instead of
> default mempool ops API.
> >
> > Regards,
> > Hemant
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Hemant Agrawal
> > > Sent: Friday, February 02, 2018 1:17 AM
> > > To: olivier.matz@6wind.com; pbhagavatula@caviumnetworks.com
> > > Cc: thomas@monjalon.net; dev@dpdk.org
> > > Subject: [dpdk-dev] [PATCH 1/2] Revert "eal: fix default mempool ops"
> > >
> > > This reverts commit fe06cb6c54fe5ada299ebba40a382bee37c919f2.
> > > ---

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] doc: remove eal API for default mempool ops name
    @ 2018-02-02  8:03 10% ` Hemant Agrawal
  2018-02-02  8:31 10%   ` [dpdk-dev] [PATCH v2] " Hemant Agrawal
  1 sibling, 1 reply; 200+ results
From: Hemant Agrawal @ 2018-02-02  8:03 UTC (permalink / raw)
  To: olivier.matz, thomas, pbhagavatula
  Cc: nipun.gupta, jerin.jacob, santosh.shukla, dev, Hemant Agrawal

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 doc/guides/rel_notes/deprecation.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad59..a2b391c 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -8,6 +8,15 @@ API and ABI deprecation notices are to be posted here.
 Deprecation Notices
 -------------------
 
+* eal: a new set of mbuf mempool ops name APIs for user, platform and best
+  mempool names have been defined in ``rte_mbuf`` in v18.02. The uses of
+  ``rte_eal_mbuf_default_mempool_ops`` shall be replaced by
+  ``rte_mbuf_best_mempool_ops``.
+  The following function is now redundant and it is target to be deprecated in 
+  18.05:
+
+  - ``rte_eal_mbuf_default_mempool_ops``
+  
 * eal: several API and ABI changes are planned for ``rte_devargs`` in v18.02.
   The format of device command line parameters will change. The bus will need
   to be explicitly stated in the device declaration. The enum ``rte_devtype``
-- 
2.7.4

^ permalink raw reply	[relevance 10%]

* [dpdk-dev] [PATCH v2] doc: remove eal API for default mempool ops name
  2018-02-02  8:03 10% ` [dpdk-dev] [PATCH] doc: remove eal API for default mempool ops name Hemant Agrawal
@ 2018-02-02  8:31 10%   ` Hemant Agrawal
  2018-02-02 14:01  0%     ` Olivier Matz
  0 siblings, 1 reply; 200+ results
From: Hemant Agrawal @ 2018-02-02  8:31 UTC (permalink / raw)
  To: olivier.matz, thomas, pbhagavatula
  Cc: nipun.gupta, jerin.jacob, santosh.shukla, dev, Hemant Agrawal

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
v2: fix checkpatch errors

 doc/guides/rel_notes/deprecation.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad59..c7d8f25 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -8,6 +8,15 @@ API and ABI deprecation notices are to be posted here.
 Deprecation Notices
 -------------------
 
+* eal: a new set of mbuf mempool ops name APIs for user, platform and best
+  mempool names have been defined in ``rte_mbuf`` in v18.02. The uses of
+  ``rte_eal_mbuf_default_mempool_ops`` shall be replaced by
+  ``rte_mbuf_best_mempool_ops``.
+  The following function is now redundant and it is target to be deprecated in
+  18.05:
+
+  - ``rte_eal_mbuf_default_mempool_ops``
+
 * eal: several API and ABI changes are planned for ``rte_devargs`` in v18.02.
   The format of device command line parameters will change. The bus will need
   to be explicitly stated in the device declaration. The enum ``rte_devtype``
-- 
2.7.4

^ permalink raw reply	[relevance 10%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
  2018-01-30 11:53  4%           ` Verma, Shally
@ 2018-02-02  9:07  4%             ` De Lara Guarch, Pablo
  2018-02-02 10:52  4%               ` Verma, Shally
  0 siblings, 1 reply; 200+ results
From: De Lara Guarch, Pablo @ 2018-02-02  9:07 UTC (permalink / raw)
  To: Verma, Shally, Akhil Goyal, Trahe, Fiona, hemant.agrawal,
	Doherty, Declan, Griffin, John, Jain, Deepak K, jck, tdu, dima,
	nsamsono, jianbo.liu, Jacob,  Jerin, Athreya, Narayana Prasad,
	Murthy, Nidadavolu
  Cc: dev

Hi Shally,

> -----Original Message-----
> From: Verma, Shally [mailto:Shally.Verma@cavium.com]
> Sent: Tuesday, January 30, 2018 11:54 AM
> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Akhil Goyal
> <akhil.goyal@nxp.com>; Trahe, Fiona <fiona.trahe@intel.com>;
> hemant.agrawal@nxp.com; Doherty, Declan <declan.doherty@intel.com>;
> Griffin, John <john.griffin@intel.com>; Jain, Deepak K
> <deepak.k.jain@intel.com>; jck@semihalf.com; tdu@semihalf.com;
> dima@marvell.com; nsamsono@marvell.com; jianbo.liu@arm.com; Jacob,
> Jerin <Jerin.JacobKollanukkaran@cavium.com>; Athreya, Narayana Prasad
> <NarayanaPrasad.Athreya@cavium.com>; Murthy, Nidadavolu
> <Nidadavolu.Murthy@cavium.com>
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info
> struct
> 
> 
> 
> >-----Original Message-----
> >From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch@intel.com]
> >Sent: 30 January 2018 16:51
> >To: Verma, Shally <Shally.Verma@cavium.com>; Akhil Goyal
> ><akhil.goyal@nxp.com>; Trahe, Fiona <fiona.trahe@intel.com>;
> >hemant.agrawal@nxp.com; Doherty, Declan <declan.doherty@intel.com>;
> >Griffin, John <john.griffin@intel.com>; Jain, Deepak K
> ><deepak.k.jain@intel.com>; jck@semihalf.com; tdu@semihalf.com;
> >dima@marvell.com; nsamsono@marvell.com; jianbo.liu@arm.com; Jacob,
> >Jerin <Jerin.JacobKollanukkaran@cavium.com>; Athreya, Narayana Prasad
> ><NarayanaPrasad.Athreya@cavium.com>; Murthy, Nidadavolu
> ><Nidadavolu.Murthy@cavium.com>
> >Cc: dev@dpdk.org
> >Subject: RE: [dpdk-dev] [PATCH] doc: announce ABI change for crypto
> >info struct
> >
> >Hi Shally/Akhil,
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Verma, Shally
> >> Sent: Tuesday, January 30, 2018 7:56 AM
> >> To: Akhil Goyal <akhil.goyal@nxp.com>; De Lara Guarch, Pablo
> >> <pablo.de.lara.guarch@intel.com>; Trahe, Fiona
> >> <fiona.trahe@intel.com>; hemant.agrawal@nxp.com; Doherty, Declan
> >> <declan.doherty@intel.com>; Griffin, John <john.griffin@intel.com>;
> >> Jain, Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com;
> >> tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
> >> jianbo.liu@arm.com; Jacob, Jerin
> >> <Jerin.JacobKollanukkaran@cavium.com>; Athreya, Narayana Prasad
> >> <NarayanaPrasad.Athreya@cavium.com>; Murthy, Nidadavolu
> >> <Nidadavolu.Murthy@cavium.com>
> >> Cc: dev@dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto
> >> info struct
> >>
> >> I do see that the current cryptodev unit test cases (inside the test dir)
> >> use the info.sym.max_nb_sessions param for session mempool_create. So,
> >> are changes to such test cases also part of the proposal?
> >
> >Yes, for these tests, we can just define a macro in the tests, instead of
> >using the info structure.
> 
> [Shally] Ok, then you mean applications will choose an arbitrary number
> during mempool_create, independent of the device max_nb_sessions?

Yes, actually for the unit tests, even one session is enough.

> 
> >>
> >> Another point, we recently submitted an RFC patch on lib/cryptodev
> >> with asymmetric crypto support
> >> (https://dpdk.org/dev/patchwork/patch/34308/) which is awaiting
> >> review and these fields have role to play there.
> >> So, could this change be please viewed in conjunction with asym RFC?
> >
> >Do you need it for asymmetric? Anyway, this would remove the
> >symmetric function and structures, not applicable for you.
> 
> [Shally] I would say the addition of asym in lib/cryptodev is not entirely
> standalone, specifically for PMDs that can support both.
> My key concerns are max_nb_sessions_per_qp and the related
> qp_attach_sym/asym APIs, which enable management of queue distribution
> among sym and asym in the current proposal, specifically for PMDs that can
> support both but have a dedicated qp for each. Right now the proposal is open
> for feedback, and I would prefer it to be covered before sym related changes
> are applied.

Actually, I have been thinking about this. Given the time we have until 18.02 is out,
and that this is not urgent to apply (this is just code cleanup),
I am postponing this until the next release.

My other reason is that the info structure has a rte_pci_device pointer which should be removed.
However, I believe it is better to leave it for the next release and discuss it with other libraries that have this, like ethdev.

Thanks,
Pablo


^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
  2018-02-02  9:07  4%             ` De Lara Guarch, Pablo
@ 2018-02-02 10:52  4%               ` Verma, Shally
  0 siblings, 0 replies; 200+ results
From: Verma, Shally @ 2018-02-02 10:52 UTC (permalink / raw)
  To: De Lara Guarch, Pablo, Akhil Goyal, Trahe, Fiona, hemant.agrawal,
	Doherty, Declan, Griffin, John, Jain, Deepak K, jck, tdu, dima,
	nsamsono, jianbo.liu, Jacob,  Jerin, Athreya, Narayana Prasad,
	Murthy, Nidadavolu
  Cc: dev



>-----Original Message-----
>From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch@intel.com]
>Sent: 02 February 2018 14:38
>To: Verma, Shally <Shally.Verma@cavium.com>; Akhil Goyal <akhil.goyal@nxp.com>; Trahe, Fiona <fiona.trahe@intel.com>;
>hemant.agrawal@nxp.com; Doherty, Declan <declan.doherty@intel.com>; Griffin, John <john.griffin@intel.com>; Jain, Deepak K
><deepak.k.jain@intel.com>; jck@semihalf.com; tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
>jianbo.liu@arm.com; Jacob, Jerin <Jerin.JacobKollanukkaran@cavium.com>; Athreya, Narayana Prasad
><NarayanaPrasad.Athreya@cavium.com>; Murthy, Nidadavolu <Nidadavolu.Murthy@cavium.com>
>Cc: dev@dpdk.org
>Subject: RE: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct
>
>Hi Shally,
>
>> -----Original Message-----
>> From: Verma, Shally [mailto:Shally.Verma@cavium.com]
>> Sent: Tuesday, January 30, 2018 11:54 AM
>> To: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Akhil Goyal
>> <akhil.goyal@nxp.com>; Trahe, Fiona <fiona.trahe@intel.com>;
>> hemant.agrawal@nxp.com; Doherty, Declan <declan.doherty@intel.com>;
>> Griffin, John <john.griffin@intel.com>; Jain, Deepak K
>> <deepak.k.jain@intel.com>; jck@semihalf.com; tdu@semihalf.com;
>> dima@marvell.com; nsamsono@marvell.com; jianbo.liu@arm.com; Jacob,
>> Jerin <Jerin.JacobKollanukkaran@cavium.com>; Athreya, Narayana Prasad
>> <NarayanaPrasad.Athreya@cavium.com>; Murthy, Nidadavolu
>> <Nidadavolu.Murthy@cavium.com>
>> Cc: dev@dpdk.org
>> Subject: RE: [dpdk-dev] [PATCH] doc: announce ABI change for crypto info
>> struct
>>
>>
>>
>> >-----Original Message-----
>> >From: De Lara Guarch, Pablo [mailto:pablo.de.lara.guarch@intel.com]
>> >Sent: 30 January 2018 16:51
>> >To: Verma, Shally <Shally.Verma@cavium.com>; Akhil Goyal
>> ><akhil.goyal@nxp.com>; Trahe, Fiona <fiona.trahe@intel.com>;
>> >hemant.agrawal@nxp.com; Doherty, Declan <declan.doherty@intel.com>;
>> >Griffin, John <john.griffin@intel.com>; Jain, Deepak K
>> ><deepak.k.jain@intel.com>; jck@semihalf.com; tdu@semihalf.com;
>> >dima@marvell.com; nsamsono@marvell.com; jianbo.liu@arm.com; Jacob,
>> >Jerin <Jerin.JacobKollanukkaran@cavium.com>; Athreya, Narayana Prasad
>> ><NarayanaPrasad.Athreya@cavium.com>; Murthy, Nidadavolu
>> ><Nidadavolu.Murthy@cavium.com>
>> >Cc: dev@dpdk.org
>> >Subject: RE: [dpdk-dev] [PATCH] doc: announce ABI change for crypto
>> >info struct
>> >
>> >Hi Shally/Akhil,
>> >
>> >> -----Original Message-----
>> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Verma, Shally
>> >> Sent: Tuesday, January 30, 2018 7:56 AM
>> >> To: Akhil Goyal <akhil.goyal@nxp.com>; De Lara Guarch, Pablo
>> >> <pablo.de.lara.guarch@intel.com>; Trahe, Fiona
>> >> <fiona.trahe@intel.com>; hemant.agrawal@nxp.com; Doherty, Declan
>> >> <declan.doherty@intel.com>; Griffin, John <john.griffin@intel.com>;
>> >> Jain, Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com;
>> >> tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
>> >> jianbo.liu@arm.com; Jacob, Jerin
>> >> <Jerin.JacobKollanukkaran@cavium.com>; Athreya, Narayana Prasad
>> >> <NarayanaPrasad.Athreya@cavium.com>; Murthy, Nidadavolu
>> >> <Nidadavolu.Murthy@cavium.com>
>> >> Cc: dev@dpdk.org
>> >> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change for crypto
>> >> info struct
>> >>
>> >> I do see that the current cryptodev unit test cases (inside the test dir)
>> >> use the info.sym.max_nb_sessions param for session mempool_create. So,
>> >> are changes to such test cases also part of the proposal?
>> >
>> >Yes, for these tests, we can just define a macro in the tests, instead of
>> >using the info structure.
>>
>> [Shally] Ok, then you mean applications will choose an arbitrary number
>> during mempool_create, independent of the device max_nb_sessions?
>
>Yes, actually for the unit tests, even one session is enough.
>
>>
>> >>
>> >> Another point, we recently submitted an RFC patch on lib/cryptodev
>> >> with asymmetric crypto support
>> >> (https://dpdk.org/dev/patchwork/patch/34308/) which is awaiting
>> >> review and these fields have role to play there.
>> >> So, could this change be please viewed in conjunction with asym RFC?
>> >
>> >Do you need it for asymmetric? Anyway, this would remove the
>> >symmetric function and structures, not applicable for you.
>>
>> [Shally] I would say the addition of asym in lib/cryptodev is not entirely
>> standalone, specifically for PMDs that can support both.
>> My key concerns are max_nb_sessions_per_qp and the related
>> qp_attach_sym/asym APIs, which enable management of queue distribution
>> among sym and asym in the current proposal, specifically for PMDs that can
>> support both but have a dedicated qp for each. Right now the proposal is open
>> for feedback, and I would prefer it to be covered before sym related changes
>> are applied.
>
>Actually, I have been thinking about this. Given the time we have until 18.02 is out,
>and that this is not urgent to apply (this is just code cleanup),
>I am postponing this until the next release.
>
[Shally] Ok. Thanks for acknowledging this.

>My other reason is that the info structure has a rte_pci_device pointer which should be removed.
>However, I believe it is better to leave it for the next release and discuss it with other libraries that have this, like ethdev.
>
>Thanks,
>Pablo


^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: remove eal API for default mempool ops name
  2018-02-02  8:31 10%   ` [dpdk-dev] [PATCH v2] " Hemant Agrawal
@ 2018-02-02 14:01  0%     ` Olivier Matz
  2018-02-13 11:28  0%       ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2018-02-02 14:01 UTC (permalink / raw)
  To: Hemant Agrawal
  Cc: thomas, pbhagavatula, nipun.gupta, jerin.jacob, santosh.shukla, dev

On Fri, Feb 02, 2018 at 02:01:42PM +0530, Hemant Agrawal wrote:
> Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> ---
> v2: fix checkpatch errors
> 
>  doc/guides/rel_notes/deprecation.rst | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index d59ad59..c7d8f25 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -8,6 +8,15 @@ API and ABI deprecation notices are to be posted here.
>  Deprecation Notices
>  -------------------
>  
> +* eal: a new set of mbuf mempool ops name APIs for user, platform and best
> +  mempool names have been defined in ``rte_mbuf`` in v18.02. The uses of
> +  ``rte_eal_mbuf_default_mempool_ops`` shall be replaced by
> +  ``rte_mbuf_best_mempool_ops``.
> +  The following function is now redundant and is targeted to be deprecated in
> +  18.05:
> +
> +  - ``rte_eal_mbuf_default_mempool_ops``
> +
>  * eal: several API and ABI changes are planned for ``rte_devargs`` in v18.02.
>    The format of device command line parameters will change. The bus will need
>    to be explicitly stated in the device declaration. The enum ``rte_devtype``

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v1 0/4] net/mlx: enhance rdma-core glue configuration
@ 2018-02-02 15:16  3% Adrien Mazarguil
  2018-02-02 15:16  3% ` [dpdk-dev] [PATCH v1 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
  2018-02-02 16:46  3% ` [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
  0 siblings, 2 replies; 200+ results
From: Adrien Mazarguil @ 2018-02-02 15:16 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

The decision to deliver mlx4/mlx5 rdma-core glue plug-ins separately instead
of generating them at run time due to security concerns [1] led to a few
issues:

- They must be present on the file system before running DPDK.
- Their location must be known to the dynamic linker.
- Their names overlap and ABI compatibility is not guaranteed, which may
  lead to crashes.

This series addresses the above by adding version information to plug-ins
and taking CONFIG_RTE_EAL_PMD_PATH into account to locate them on the file
system.

[1] http://dpdk.org/ml/archives/dev/2018-January/089617.html

Adrien Mazarguil (4):
  net/mlx: add debug checks to glue structure
  net/mlx: fix missing includes for rdma-core glue
  net/mlx: version rdma-core glue libraries
  net/mlx: make rdma-core glue path configurable

 doc/guides/nics/mlx4.rst     | 17 ++++++++++++
 doc/guides/nics/mlx5.rst     | 14 ++++++++++
 drivers/net/mlx4/Makefile    |  8 ++++--
 drivers/net/mlx4/mlx4.c      | 57 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx4/mlx4_glue.c |  4 +++
 drivers/net/mlx4/mlx4_glue.h |  9 +++++++
 drivers/net/mlx5/Makefile    |  8 ++++--
 drivers/net/mlx5/mlx5.c      | 57 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_glue.c |  1 +
 drivers/net/mlx5/mlx5_glue.h |  7 +++++
 10 files changed, 176 insertions(+), 6 deletions(-)

-- 
2.11.0

^ permalink raw reply	[relevance 3%]
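As a rough illustration of the cover letter's point about locating versioned plug-ins on the file system, the sketch below builds a candidate file name from a directory, a base name and a version suffix. It is a minimal sketch under stated assumptions: glue_path() and its arguments are hypothetical and are not taken from the series, whose path-handling patch (4/4) is not shown here.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build "<dir>/<base>.<version>" into buf.
 * Returns 0 on success, -1 if the result does not fit in len bytes.
 * glue_path() and its arguments are illustrative, not the driver's code.
 */
static int
glue_path(char *buf, size_t len, const char *dir, const char *base,
	  const char *version)
{
	int ret = snprintf(buf, len, "%s/%s.%s", dir, base, version);

	return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

A caller could hand the resulting string to dlopen() and fall back to the dynamic linker's default search when the file is absent. Treating truncation as an error avoids silently loading a differently named file.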

* [dpdk-dev] [PATCH v1 3/4] net/mlx: version rdma-core glue libraries
  2018-02-02 15:16  3% [dpdk-dev] [PATCH v1 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
@ 2018-02-02 15:16  3% ` Adrien Mazarguil
  2018-02-02 16:46  3% ` [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
  1 sibling, 0 replies; 200+ results
From: Adrien Mazarguil @ 2018-02-02 15:16 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

When built as separate objects, these libraries do not have unique names.
Since they do not maintain a stable ABI, loading an incompatible library
may result in a crash (e.g. in case multiple versions are installed).

This patch addresses the above by versioning glue libraries, both on the
file system (version suffix) and by comparing a dedicated version field
member in glue structures.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx4/Makefile    | 8 ++++++--
 drivers/net/mlx4/mlx4.c      | 5 +++++
 drivers/net/mlx4/mlx4_glue.c | 1 +
 drivers/net/mlx4/mlx4_glue.h | 6 ++++++
 drivers/net/mlx5/Makefile    | 8 ++++++--
 drivers/net/mlx5/mlx5.c      | 5 +++++
 drivers/net/mlx5/mlx5_glue.c | 1 +
 drivers/net/mlx5/mlx5_glue.h | 6 ++++++
 8 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index c004ac71c..cc9db9977 100644
--- a/drivers/net/mlx4/Makefile
+++ b/drivers/net/mlx4/Makefile
@@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 # Library name.
 LIB = librte_pmd_mlx4.a
-LIB_GLUE = librte_pmd_mlx4_glue.so
+LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
+LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
+LIB_GLUE_VERSION = 18.02.1
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4.c
@@ -64,6 +66,7 @@ CFLAGS += -D_XOPEN_SOURCE=600
 CFLAGS += $(WERROR_FLAGS)
 ifeq ($(CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS),y)
 CFLAGS += -DMLX4_GLUE='"$(LIB_GLUE)"'
+CFLAGS += -DMLX4_GLUE_VERSION='"$(LIB_GLUE_VERSION)"'
 CFLAGS_mlx4_glue.o += -fPIC
 LDLIBS += -ldl
 else
@@ -131,6 +134,7 @@ $(LIB): $(LIB_GLUE)
 
 $(LIB_GLUE): mlx4_glue.o
 	$Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+		-Wl,-h,$(LIB_GLUE) \
 		-s -shared -o $@ $< -libverbs -lmlx4
 
 mlx4_glue.o: mlx4_autoconf.h
@@ -139,6 +143,6 @@ endif
 
 clean_mlx4: FORCE
 	$Q rm -f -- mlx4_autoconf.h mlx4_autoconf.h.new
-	$Q rm -f -- mlx4_glue.o $(LIB_GLUE)
+	$Q rm -f -- mlx4_glue.o $(LIB_GLUE_BASE)*
 
 clean: clean_mlx4
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 201d39b6e..61a852fb9 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -808,6 +808,11 @@ rte_mlx4_pmd_init(void)
 			assert(((const void *const *)mlx4_glue)[i]);
 	}
 #endif
+	if (strcmp(mlx4_glue->version, MLX4_GLUE_VERSION)) {
+		ERROR("rdma-core glue \"%s\" mismatch: \"%s\" is required",
+		      mlx4_glue->version, MLX4_GLUE_VERSION);
+		return;
+	}
 	mlx4_glue->fork_init();
 	rte_pci_register(&mlx4_driver);
 }
diff --git a/drivers/net/mlx4/mlx4_glue.c b/drivers/net/mlx4/mlx4_glue.c
index 47ae7ad0f..3b79d320e 100644
--- a/drivers/net/mlx4/mlx4_glue.c
+++ b/drivers/net/mlx4/mlx4_glue.c
@@ -240,6 +240,7 @@ mlx4_glue_dv_set_context_attr(struct ibv_context *context,
 }
 
 const struct mlx4_glue *mlx4_glue = &(const struct mlx4_glue){
+	.version = MLX4_GLUE_VERSION,
 	.fork_init = mlx4_glue_fork_init,
 	.get_async_event = mlx4_glue_get_async_event,
 	.ack_async_event = mlx4_glue_ack_async_event,
diff --git a/drivers/net/mlx4/mlx4_glue.h b/drivers/net/mlx4/mlx4_glue.h
index de251c622..368f906bf 100644
--- a/drivers/net/mlx4/mlx4_glue.h
+++ b/drivers/net/mlx4/mlx4_glue.h
@@ -19,7 +19,13 @@
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#ifndef MLX4_GLUE_VERSION
+#define MLX4_GLUE_VERSION ""
+#endif
+
+/* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx4_glue {
+	const char *version;
 	int (*fork_init)(void);
 	int (*get_async_event)(struct ibv_context *context,
 			       struct ibv_async_event *event);
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 4b20d718b..4086f2039 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 # Library name.
 LIB = librte_pmd_mlx5.a
-LIB_GLUE = librte_pmd_mlx5_glue.so
+LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
+LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
+LIB_GLUE_VERSION = 18.02.1
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
@@ -74,6 +76,7 @@ CFLAGS += $(WERROR_FLAGS)
 CFLAGS += -Wno-strict-prototypes
 ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y)
 CFLAGS += -DMLX5_GLUE='"$(LIB_GLUE)"'
+CFLAGS += -DMLX5_GLUE_VERSION='"$(LIB_GLUE_VERSION)"'
 CFLAGS_mlx5_glue.o += -fPIC
 LDLIBS += -ldl
 else
@@ -180,6 +183,7 @@ $(LIB): $(LIB_GLUE)
 
 $(LIB_GLUE): mlx5_glue.o
 	$Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+		-Wl,-h,$(LIB_GLUE) \
 		-s -shared -o $@ $< -libverbs -lmlx5
 
 mlx5_glue.o: mlx5_autoconf.h
@@ -188,6 +192,6 @@ endif
 
 clean_mlx5: FORCE
 	$Q rm -f -- mlx5_autoconf.h mlx5_autoconf.h.new
-	$Q rm -f -- mlx5_glue.o $(LIB_GLUE)
+	$Q rm -f -- mlx5_glue.o $(LIB_GLUE_BASE)*
 
 clean: clean_mlx5
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 050cfac0d..341230d2b 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1151,6 +1151,11 @@ rte_mlx5_pmd_init(void)
 			assert(((const void *const *)mlx5_glue)[i]);
 	}
 #endif
+	if (strcmp(mlx5_glue->version, MLX5_GLUE_VERSION)) {
+		ERROR("rdma-core glue \"%s\" mismatch: \"%s\" is required",
+		      mlx5_glue->version, MLX5_GLUE_VERSION);
+		return;
+	}
 	mlx5_glue->fork_init();
 	rte_pci_register(&mlx5_driver);
 }
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index 8f500be6e..1c4396ada 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -308,6 +308,7 @@ mlx5_glue_dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type)
 }
 
 const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
+	.version = MLX5_GLUE_VERSION,
 	.fork_init = mlx5_glue_fork_init,
 	.alloc_pd = mlx5_glue_alloc_pd,
 	.dealloc_pd = mlx5_glue_dealloc_pd,
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index 7fed302ba..b5efee3b6 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -19,6 +19,10 @@
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#ifndef MLX5_GLUE_VERSION
+#define MLX5_GLUE_VERSION ""
+#endif
+
 #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
 struct ibv_counter_set;
 struct ibv_counter_set_data;
@@ -27,7 +31,9 @@ struct ibv_counter_set_init_attr;
 struct ibv_query_counter_set_attr;
 #endif
 
+/* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx5_glue {
+	const char *version;
 	int (*fork_init)(void);
 	struct ibv_pd *(*alloc_pd)(struct ibv_context *context);
 	int (*dealloc_pd)(struct ibv_pd *pd);
-- 
2.11.0

^ permalink raw reply	[relevance 3%]
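The run-time half of the versioning scheme in the patch above (comparing a version string carried by the glue structure against the one the PMD was built with) can be modelled in isolation as follows. The structure and helper names here are hypothetical stand-ins; only the strcmp()-based comparison mirrors what the patch does, and the NULL checks are an addition of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for the glue structure: only the version field. */
struct demo_glue {
	const char *version;
};

/*
 * Return 1 when the loaded glue reports the version the PMD was built
 * against, 0 otherwise. demo_glue_compatible() is a hypothetical helper.
 */
static int
demo_glue_compatible(const struct demo_glue *glue, const char *expected)
{
	return glue != NULL && glue->version != NULL &&
	       strcmp(glue->version, expected) == 0;
}
```

An exact string match is deliberately strict: since the glue libraries do not promise a stable ABI, any difference is treated as incompatible rather than trying to order versions.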

* [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration
  2018-02-02 15:16  3% [dpdk-dev] [PATCH v1 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
  2018-02-02 15:16  3% ` [dpdk-dev] [PATCH v1 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
@ 2018-02-02 16:46  3% ` Adrien Mazarguil
  2018-02-02 16:46  3%   ` [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
                     ` (2 more replies)
  1 sibling, 3 replies; 200+ results
From: Adrien Mazarguil @ 2018-02-02 16:46 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

The decision to deliver mlx4/mlx5 rdma-core glue plug-ins separately instead
of generating them at run time due to security concerns [1] led to a few
issues:

- They must be present on the file system before running DPDK.
- Their location must be known to the dynamic linker.
- Their names overlap and ABI compatibility is not guaranteed, which may
  lead to crashes.

This series addresses the above by adding version information to plug-ins
and taking CONFIG_RTE_EAL_PMD_PATH into account to locate them on the file
system.

[1] http://dpdk.org/ml/archives/dev/2018-January/089617.html

v2 changes:

- Fixed extra "\n" in glue file name generation (although it didn't break
  functionality).

Adrien Mazarguil (4):
  net/mlx: add debug checks to glue structure
  net/mlx: fix missing includes for rdma-core glue
  net/mlx: version rdma-core glue libraries
  net/mlx: make rdma-core glue path configurable

 doc/guides/nics/mlx4.rst     | 17 ++++++++++++
 doc/guides/nics/mlx5.rst     | 14 ++++++++++
 drivers/net/mlx4/Makefile    |  8 ++++--
 drivers/net/mlx4/mlx4.c      | 57 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx4/mlx4_glue.c |  4 +++
 drivers/net/mlx4/mlx4_glue.h |  9 +++++++
 drivers/net/mlx5/Makefile    |  8 ++++--
 drivers/net/mlx5/mlx5.c      | 57 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_glue.c |  1 +
 drivers/net/mlx5/mlx5_glue.h |  7 +++++
 10 files changed, 176 insertions(+), 6 deletions(-)

-- 
2.11.0

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-02 16:46  3% ` [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
@ 2018-02-02 16:46  3%   ` Adrien Mazarguil
    2018-02-02 16:52  0%   ` [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Nélio Laranjeiro
  2018-02-06 11:31  0%   ` Shahaf Shuler
  2 siblings, 1 reply; 200+ results
From: Adrien Mazarguil @ 2018-02-02 16:46 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

When built as separate objects, these libraries do not have unique names.
Since they do not maintain a stable ABI, loading an incompatible library
may result in a crash (e.g. in case multiple versions are installed).

This patch addresses the above by versioning glue libraries, both on the
file system (version suffix) and by comparing a dedicated version field
member in glue structures.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx4/Makefile    | 8 ++++++--
 drivers/net/mlx4/mlx4.c      | 5 +++++
 drivers/net/mlx4/mlx4_glue.c | 1 +
 drivers/net/mlx4/mlx4_glue.h | 6 ++++++
 drivers/net/mlx5/Makefile    | 8 ++++++--
 drivers/net/mlx5/mlx5.c      | 5 +++++
 drivers/net/mlx5/mlx5_glue.c | 1 +
 drivers/net/mlx5/mlx5_glue.h | 6 ++++++
 8 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index c004ac71c..cc9db9977 100644
--- a/drivers/net/mlx4/Makefile
+++ b/drivers/net/mlx4/Makefile
@@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 # Library name.
 LIB = librte_pmd_mlx4.a
-LIB_GLUE = librte_pmd_mlx4_glue.so
+LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
+LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
+LIB_GLUE_VERSION = 18.02.1
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4.c
@@ -64,6 +66,7 @@ CFLAGS += -D_XOPEN_SOURCE=600
 CFLAGS += $(WERROR_FLAGS)
 ifeq ($(CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS),y)
 CFLAGS += -DMLX4_GLUE='"$(LIB_GLUE)"'
+CFLAGS += -DMLX4_GLUE_VERSION='"$(LIB_GLUE_VERSION)"'
 CFLAGS_mlx4_glue.o += -fPIC
 LDLIBS += -ldl
 else
@@ -131,6 +134,7 @@ $(LIB): $(LIB_GLUE)
 
 $(LIB_GLUE): mlx4_glue.o
 	$Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+		-Wl,-h,$(LIB_GLUE) \
 		-s -shared -o $@ $< -libverbs -lmlx4
 
 mlx4_glue.o: mlx4_autoconf.h
@@ -139,6 +143,6 @@ endif
 
 clean_mlx4: FORCE
 	$Q rm -f -- mlx4_autoconf.h mlx4_autoconf.h.new
-	$Q rm -f -- mlx4_glue.o $(LIB_GLUE)
+	$Q rm -f -- mlx4_glue.o $(LIB_GLUE_BASE)*
 
 clean: clean_mlx4
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 201d39b6e..61a852fb9 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -808,6 +808,11 @@ rte_mlx4_pmd_init(void)
 			assert(((const void *const *)mlx4_glue)[i]);
 	}
 #endif
+	if (strcmp(mlx4_glue->version, MLX4_GLUE_VERSION)) {
+		ERROR("rdma-core glue \"%s\" mismatch: \"%s\" is required",
+		      mlx4_glue->version, MLX4_GLUE_VERSION);
+		return;
+	}
 	mlx4_glue->fork_init();
 	rte_pci_register(&mlx4_driver);
 }
diff --git a/drivers/net/mlx4/mlx4_glue.c b/drivers/net/mlx4/mlx4_glue.c
index 47ae7ad0f..3b79d320e 100644
--- a/drivers/net/mlx4/mlx4_glue.c
+++ b/drivers/net/mlx4/mlx4_glue.c
@@ -240,6 +240,7 @@ mlx4_glue_dv_set_context_attr(struct ibv_context *context,
 }
 
 const struct mlx4_glue *mlx4_glue = &(const struct mlx4_glue){
+	.version = MLX4_GLUE_VERSION,
 	.fork_init = mlx4_glue_fork_init,
 	.get_async_event = mlx4_glue_get_async_event,
 	.ack_async_event = mlx4_glue_ack_async_event,
diff --git a/drivers/net/mlx4/mlx4_glue.h b/drivers/net/mlx4/mlx4_glue.h
index de251c622..368f906bf 100644
--- a/drivers/net/mlx4/mlx4_glue.h
+++ b/drivers/net/mlx4/mlx4_glue.h
@@ -19,7 +19,13 @@
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#ifndef MLX4_GLUE_VERSION
+#define MLX4_GLUE_VERSION ""
+#endif
+
+/* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx4_glue {
+	const char *version;
 	int (*fork_init)(void);
 	int (*get_async_event)(struct ibv_context *context,
 			       struct ibv_async_event *event);
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 4b20d718b..4086f2039 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 # Library name.
 LIB = librte_pmd_mlx5.a
-LIB_GLUE = librte_pmd_mlx5_glue.so
+LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
+LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
+LIB_GLUE_VERSION = 18.02.1
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
@@ -74,6 +76,7 @@ CFLAGS += $(WERROR_FLAGS)
 CFLAGS += -Wno-strict-prototypes
 ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y)
 CFLAGS += -DMLX5_GLUE='"$(LIB_GLUE)"'
+CFLAGS += -DMLX5_GLUE_VERSION='"$(LIB_GLUE_VERSION)"'
 CFLAGS_mlx5_glue.o += -fPIC
 LDLIBS += -ldl
 else
@@ -180,6 +183,7 @@ $(LIB): $(LIB_GLUE)
 
 $(LIB_GLUE): mlx5_glue.o
 	$Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+		-Wl,-h,$(LIB_GLUE) \
 		-s -shared -o $@ $< -libverbs -lmlx5
 
 mlx5_glue.o: mlx5_autoconf.h
@@ -188,6 +192,6 @@ endif
 
 clean_mlx5: FORCE
 	$Q rm -f -- mlx5_autoconf.h mlx5_autoconf.h.new
-	$Q rm -f -- mlx5_glue.o $(LIB_GLUE)
+	$Q rm -f -- mlx5_glue.o $(LIB_GLUE_BASE)*
 
 clean: clean_mlx5
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 050cfac0d..341230d2b 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1151,6 +1151,11 @@ rte_mlx5_pmd_init(void)
 			assert(((const void *const *)mlx5_glue)[i]);
 	}
 #endif
+	if (strcmp(mlx5_glue->version, MLX5_GLUE_VERSION)) {
+		ERROR("rdma-core glue \"%s\" mismatch: \"%s\" is required",
+		      mlx5_glue->version, MLX5_GLUE_VERSION);
+		return;
+	}
 	mlx5_glue->fork_init();
 	rte_pci_register(&mlx5_driver);
 }
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index 8f500be6e..1c4396ada 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -308,6 +308,7 @@ mlx5_glue_dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type)
 }
 
 const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
+	.version = MLX5_GLUE_VERSION,
 	.fork_init = mlx5_glue_fork_init,
 	.alloc_pd = mlx5_glue_alloc_pd,
 	.dealloc_pd = mlx5_glue_dealloc_pd,
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index 7fed302ba..b5efee3b6 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -19,6 +19,10 @@
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#ifndef MLX5_GLUE_VERSION
+#define MLX5_GLUE_VERSION ""
+#endif
+
 #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
 struct ibv_counter_set;
 struct ibv_counter_set_data;
@@ -27,7 +31,9 @@ struct ibv_counter_set_init_attr;
 struct ibv_query_counter_set_attr;
 #endif
 
+/* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx5_glue {
+	const char *version;
 	int (*fork_init)(void);
 	struct ibv_pd *(*alloc_pd)(struct ibv_context *context);
 	int (*dealloc_pd)(struct ibv_pd *pd);
-- 
2.11.0

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration
  2018-02-02 16:46  3% ` [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
  2018-02-02 16:46  3%   ` [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
@ 2018-02-02 16:52  0%   ` Nélio Laranjeiro
  2018-02-06 11:31  0%   ` Shahaf Shuler
  2 siblings, 0 replies; 200+ results
From: Nélio Laranjeiro @ 2018-02-02 16:52 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Shahaf Shuler, dev, Marcelo Ricardo Leitner

On Fri, Feb 02, 2018 at 05:46:10PM +0100, Adrien Mazarguil wrote:
> The decision to deliver mlx4/mlx5 rdma-core glue plug-ins separately instead
> of generating them at run time due to security concerns [1] led to a few
> issues:
> 
> - They must be present on the file system before running DPDK.
> - Their location must be known to the dynamic linker.
> - Their names overlap and ABI compatibility is not guaranteed, which may
>   lead to crashes.
> 
> This series addresses the above by adding version information to plug-ins
> and taking CONFIG_RTE_EAL_PMD_PATH into account to locate them on the file
> system.
> 
> [1] http://dpdk.org/ml/archives/dev/2018-January/089617.html
> 
> v2 changes:
> 
> - Fixed extra "\n" in glue file name generation (although it didn't break
>   functionality).
> 
> Adrien Mazarguil (4):
>   net/mlx: add debug checks to glue structure
>   net/mlx: fix missing includes for rdma-core glue
>   net/mlx: version rdma-core glue libraries
>   net/mlx: make rdma-core glue path configurable
> 
>  doc/guides/nics/mlx4.rst     | 17 ++++++++++++
>  doc/guides/nics/mlx5.rst     | 14 ++++++++++
>  drivers/net/mlx4/Makefile    |  8 ++++--
>  drivers/net/mlx4/mlx4.c      | 57 ++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx4/mlx4_glue.c |  4 +++
>  drivers/net/mlx4/mlx4_glue.h |  9 +++++++
>  drivers/net/mlx5/Makefile    |  8 ++++--
>  drivers/net/mlx5/mlx5.c      | 57 ++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx5/mlx5_glue.c |  1 +
>  drivers/net/mlx5/mlx5_glue.h |  7 +++++
>  10 files changed, 176 insertions(+), 6 deletions(-)
> 
> -- 
> 2.11.0

For the series,

Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] doc: announce ABI change for RSS configuration structure
@ 2018-02-04  7:24  4% Xueming Li
  2018-02-06  7:38  4% ` [dpdk-dev] [PATCH v2] doc: announce ABI change for RSS configuration structure Xueming Li
  0 siblings, 1 reply; 200+ results
From: Xueming Li @ 2018-02-04  7:24 UTC (permalink / raw)
  To: Thomas Monjalon, Neil Horman; +Cc: Xueming Li, dev, Shahaf Shuler

Update deprecation notice for the new level field of rte_eth_rss_conf.

Link: http://www.dpdk.org/dev/patchwork/patch/31891

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 doc/guides/rel_notes/deprecation.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad5988..cdb7f6ba2 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -59,3 +59,7 @@ Deprecation Notices
   be added between the producer and consumer structures. The size of the
   structure and the offset of the fields will remain the same on
   platforms with 64B cache line, but will change on other platforms.
+  
+* ethdev: A new RSS level field is planned for 18.05.
+  The new API adds a level field to ``rte_eth_rss_conf`` to enable a choice
+  of RSS hash calculation on the outer or inner header of a tunneled packet.
-- 
2.13.3

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v3] checkpatches.sh: Add checks for ABI symbol addition
  2018-01-31 17:27  6% ` [dpdk-dev] [PATCH v3] " Neil Horman
@ 2018-02-04 14:44  7%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-04 14:44 UTC (permalink / raw)
  To: Neil Horman
  Cc: dev, john.mcnamara, bruce.richardson, Ferruh Yigit, Stephen Hemminger

31/01/2018 18:27, Neil Horman:
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -42,6 +42,7 @@ F: doc/
>  
>  Developers and Maintainers Tools
>  M: Thomas Monjalon <thomas@monjalon.net>
> +M: Neil Horman <nhorman@tuxdriver.com>
>  F: MAINTAINERS
>  F: devtools/check-dup-includes.sh
>  F: devtools/check-maintainers.sh

You don't need to add your name in this general section.
The new file is now in "ABI versioning" section that you already maintain.

Talking about maintenance, you are welcome to put your name
into "Driver information" section :)

> @@ -86,6 +87,7 @@ M: Neil Horman <nhorman@tuxdriver.com>
>  F: lib/librte_compat/
>  F: doc/guides/rel_notes/deprecation.rst
>  F: devtools/validate-abi.sh
> +F: devtools/check-symbol-change.sh
>  F: buildtools/check-experimental-syms.sh

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes
  2018-01-18 10:32 13% ` [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes Anatoly Burakov
  2018-01-23 10:36  0%   ` Mcnamara, John
@ 2018-02-05 11:47  0%   ` Bruce Richardson
  2018-02-07 10:11  0%     ` Jerin Jacob
  2018-02-12 15:58  0%   ` Jonas Pfefferle
  2018-02-13  0:24  0%   ` Yongseok Koh
  3 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2018-02-05 11:47 UTC (permalink / raw)
  To: Anatoly Burakov; +Cc: dev, Neil Horman, John McNamara, Marko Kovacevic

On Thu, Jan 18, 2018 at 10:32:28AM +0000, Anatoly Burakov wrote:
> Due to coming changes outlined in memory hotplug RFC, there will
> be several API/ABI changes.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH 2/8] vhost: avoid enum fields in VhostUserMsg
  @ 2018-02-05 12:16  3% ` Stefan Hajnoczi
  2018-02-06  9:47  0%   ` Maxime Coquelin
  0 siblings, 1 reply; 200+ results
From: Stefan Hajnoczi @ 2018-02-05 12:16 UTC (permalink / raw)
  To: dev; +Cc: Maxime Coquelin, Yuanhan Liu, Stefan Hajnoczi

The VhostUserMsg struct binary representation must match the vhost-user
protocol specification since this struct is read from and written to the
socket.

The VhostUserMsg.request union contains enum fields.  Enum binary
representation is implementation-defined according to the C standard and
it is unportable to make assumptions about the representation:

  6.7.2.2 Enumeration specifiers
  ...
  Each enumerated type shall be compatible with char, a signed integer
  type, or an unsigned integer type. The choice of type is
  implementation-defined, but shall be capable of representing the
  values of all the members of the enumeration.

Additionally, librte_vhost relies on the enum type being unsigned when
validating untrusted inputs:

  if (ret <= 0 || msg.request.master >= VHOST_USER_MAX) {

If msg.request.master is signed then negative values pass this check!

Even if we assume gcc on x86_64 (SysV amd64 ABI) and don't care about
portability, the actual enum constants still affect the final type.  For
example, if we add a negative constant then its type changes to signed
int:

  typedef enum VhostUserRequest {
      ...
      VHOST_USER_INVALID = -1,
  } VhostUserRequest;

This is very fragile and it's unlikely that anyone changing the code
would remember this.  A security hole can be introduced accidentally.

This patch switches VhostUserMsg.request fields to uint32_t to avoid the
portability and potential security issues.
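
To make the failure mode concrete, here is a minimal sketch of the same
upper-bound check applied to a signed and an unsigned request value. The
names demo_accepts_signed(), demo_accepts_unsigned() and DEMO_REQUEST_MAX
are hypothetical stand-ins for illustration; they are not part of
librte_vhost, and DEMO_REQUEST_MAX is not the real VHOST_USER_MAX value.

```c
#include <stdint.h>

/* Hypothetical bound standing in for VHOST_USER_MAX (assumption). */
#define DEMO_REQUEST_MAX 20

/*
 * Upper-bound check on a signed request value. A negative value is
 * never >= DEMO_REQUEST_MAX, so it is accepted by mistake.
 */
int demo_accepts_signed(int32_t request)
{
	return !(request >= DEMO_REQUEST_MAX);
}

/*
 * The same check on a uint32_t. A value that was negative on the wire
 * is seen as a very large unsigned number and is rejected.
 */
int demo_accepts_unsigned(uint32_t request)
{
	return !(request >= DEMO_REQUEST_MAX);
}
```

Calling demo_accepts_signed(-1) accepts the value, whereas
demo_accepts_unsigned((uint32_t)-1) rejects it, which is the behaviour the
uint32_t fields are meant to guarantee regardless of how the compiler
represents the enum.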

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 lib/librte_vhost/vhost_user.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index d4bd604b9..0fafbe6e0 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -81,8 +81,8 @@ typedef struct VhostUserLog {

 typedef struct VhostUserMsg {
 	union {
-		VhostUserRequest master;
-		VhostUserSlaveRequest slave;
+		uint32_t master; /* a VhostUserRequest value */
+		uint32_t slave;  /* a VhostUserSlaveRequest value */
 	} request;

 #define VHOST_USER_VERSION_MASK     0x3
-- 
2.14.3

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  @ 2018-02-05 13:44  3%               ` Adrien Mazarguil
  2018-02-05 14:16  0%                 ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Adrien Mazarguil @ 2018-02-05 13:44 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Van Haaren, Harry, Thomas Monjalon, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > Sent: Monday, February 5, 2018 12:14 PM
> > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > libraries
> > > 
> > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > >
> > > > > >  # Library name.
> > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > >
> > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > Ideally, you should retrieve it from rte_version.h.
> > > >
> > > > Keep in mind this only needs to be updated when the glue API gets
> > > modified,
> > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > >
> > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > different sets of Verbs calls and thus a different version, hence the
> > > added
> > > > "18.02" as a starting point. The last digit may have to be modified
> > > possibly
> > > > several times between official DPDK releases while work is being done on
> > > the
> > > > PMD (i.e. per commit).
> > > >
> > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > really think it should, doing so will make things more complex in the
> > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > 
> > > What about appending date +%s output to it? It would be stricter and
> > > automated.
> > 
> > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > generally not recommended.
> 
> Then the sha1sum of mlx4_glue.h.
> With this the size check I mentioned on the other patch would become
> redundant and unnecessary.

Using a strong hash algorithm to version a library/symbol, while possible,
seems a bit overkill and results in ugliness:

 librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3

Using a weak one like CRC32 for a shorter name poses a risk of
collision. Moreover, the next time someone decides to update all version
notices or modify a comment, that hash will change. We'd need to isolate the
symbol definition itself, ignore parameter names in function prototypes and
only then we may get a somewhat meaningful hash describing a given ABI.

Given the added complexity, is there really a problem with simple version
numbers we increment every time something gets modified? (Note this is
already how our .map files work, they're not generated automatically)

How about keeping things as is?

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 13:44  3%               ` Adrien Mazarguil
@ 2018-02-05 14:16  0%                 ` Thomas Monjalon
  2018-02-05 14:33  0%                   ` Adrien Mazarguil
  2018-02-05 14:37  0%                   ` Marcelo Ricardo Leitner
  0 siblings, 2 replies; 200+ results
From: Thomas Monjalon @ 2018-02-05 14:16 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Marcelo Ricardo Leitner, Van Haaren, Harry, dev, Shahaf Shuler,
	Nelio Laranjeiro

05/02/2018 14:44, Adrien Mazarguil:
> On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > libraries
> > > > 
> > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > >
> > > > > > >  # Library name.
> > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > >
> > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > >
> > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > modified,
> > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > >
> > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > different sets of Verbs calls and thus a different version, hence the
> > > > added
> > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > possibly
> > > > > several times between official DPDK releases while work is being done on
> > > > the
> > > > > PMD (i.e. per commit).
> > > > >
> > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > really think it should, doing so will make things more complex in the
> > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > 
> > > > What about appending date +%s output to it? It would be stricter and
> > > > automated.
> > > 
> > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > generally not recommended.
> > 
> > Then the sha1sum of mlx4_glue.h.
> > With this the size check I mentioned on the other patch would become
> > redundant and unnecessary.
> 
> Using a strong hash algorithm to version a library/symbol, while possible,
> seems a bit overkill and results in ugliness:
> 
>  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> 
> Using a weak one like CRC32 for a shorter name poses a risk of
> collision. Moreover the next time someone decides to update all version
> notices or modify a comment will impact that hash. We'd need to isolate the
> symbol definition itself, ignore parameter names in function prototypes and
> only then we may get a somewhat meaningful hash describing a given ABI.
> 
> Given the added complexity, is there really a problem with simple version
> numbers we increment every time something gets modified? (Note this is
> already how our .map files work, they're not generated automatically)

Our map files show the major version where a symbol was introduced.
It is simple because no symbol can be introduced in a minor version.

> How about keeping things as is?

You are using 18.02.1 while it is introduced in 18.02.0.
If you don't want to correlate the .so version number with the DPDK version
number, maybe 1, 2, 3 would be a simpler choice (less confusing).

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 14:16  0%                 ` Thomas Monjalon
@ 2018-02-05 14:33  0%                   ` Adrien Mazarguil
  2018-02-05 14:37  0%                   ` Marcelo Ricardo Leitner
  1 sibling, 0 replies; 200+ results
From: Adrien Mazarguil @ 2018-02-05 14:33 UTC (permalink / raw)
  To: Thomas Monjalon, Shahaf Shuler
  Cc: Marcelo Ricardo Leitner, Van Haaren, Harry, dev, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> 05/02/2018 14:44, Adrien Mazarguil:
> > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > libraries
> > > > > 
> > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > >
> > > > > > > >  # Library name.
> > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > >
> > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > >
> > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > modified,
> > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > >
> > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > added
> > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > possibly
> > > > > > several times between official DPDK releases while work is being done on
> > > > > the
> > > > > > PMD (i.e. per commit).
> > > > > >
> > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > really think it should, doing so will make things more complex in the
> > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > 
> > > > > What about appending date +%s output to it? It would be stricter and
> > > > > automated.
> > > > 
> > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > generally not recommended.
> > > 
> > > Then the sha1sum of mlx4_glue.h.
> > > With this the size check I mentioned on the other patch would become
> > > redundant and unnecessary.
> > 
> > Using a strong hash algorithm to version a library/symbol, while possible,
> > seems a bit overkill and results in ugliness:
> > 
> >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> > 
> > Using a weak one like CRC32 for a shorter name poses a risk of
> > collision. Moreover the next time someone decides to update all version
> > notices or modify a comment will impact that hash. We'd need to isolate the
> > symbol definition itself, ignore parameter names in function prototypes and
> > only then we may get a somewhat meaningful hash describing a given ABI.
> > 
> > Given the added complexity, is there really a problem with simple version
> > numbers we increment every time something gets modified? (Note this is
> > already how our .map files work, they're not generated automatically)
> 
> Our map files show the major version where a symbol was introduced.
> It is simple because no symbol can be introduced in a minor version.
> 
> > How about keeping things as is?
> 
> You are using 18.02.1 while it is introduced in 18.02.0.
> If you don't want to correlate the .so version number with DPDK version
> number, maybe that 1, 2, 3 would be a simpler choice (less confusing).

I don't really care as long as there's no confusion with their backported
counterparts (namely 17.11 and 17.02). I understand the possible confusion
for someone who'd grep the sources though.

If 18.02.0 is OK in everyone's opinion, let's use that. It satisfies the
uniqueness requirement. We'll add a digit or find some other versioning
scheme later if necessary.

Shahaf, can you make a minor adjustment while applying this series?

Both drivers/net/mlx4/Makefile and drivers/net/mlx5/Makefile need to be
modified as follows in patch 3/4:

 -LIB_GLUE_VERSION = 18.02.1
 +LIB_GLUE_VERSION = 18.02.0

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 14:16  0%                 ` Thomas Monjalon
  2018-02-05 14:33  0%                   ` Adrien Mazarguil
@ 2018-02-05 14:37  0%                   ` Marcelo Ricardo Leitner
  2018-02-05 14:59  0%                     ` Adrien Mazarguil
  1 sibling, 1 reply; 200+ results
From: Marcelo Ricardo Leitner @ 2018-02-05 14:37 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Adrien Mazarguil, Van Haaren, Harry, dev, Shahaf Shuler,
	Nelio Laranjeiro

On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> 05/02/2018 14:44, Adrien Mazarguil:
> > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > libraries
> > > > > 
> > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > >
> > > > > > > >  # Library name.
> > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > >
> > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > >
> > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > modified,
> > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > >
> > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > added
> > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > possibly
> > > > > > several times between official DPDK releases while work is being done on
> > > > > the
> > > > > > PMD (i.e. per commit).
> > > > > >
> > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > really think it should, doing so will make things more complex in the
> > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > 
> > > > > What about appending date +%s output to it? It would be stricter and
> > > > > automated.
> > > > 
> > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > generally not recommended.
> > > 
> > > Then the sha1sum of mlx4_glue.h.
> > > With this the size check I mentioned on the other patch would become
> > > redundant and unnecessary.
> > 
> > Using a strong hash algorithm to version a library/symbol, while possible,
> > seems a bit overkill and results in ugliness:
> > 
> >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3

Ugh yes, but it wouldn't need to be that visible. A pointer on
mlx*_glue and a define on PMD would be enough already. As in, an
extended check to the versioning.

> > 
> > Using a weak one like CRC32 for a shorter name poses a risk of
> > collision. Moreover the next time someone decides to update all version
> > notices or modify a comment will impact that hash. We'd need to isolate the
> > symbol definition itself, ignore parameter names in function prototypes and
> > only then we may get a somewhat meaningful hash describing a given ABI.

That's what I meant with stricter. Yes it would catch such
situations, but you tell me on how much we want to protect/restrict
here.  Do you see a reason for building only the dpdk/pmd side and not
the glue library at a time?

> > 
> > Given the added complexity, is there really a problem with simple version
> > numbers we increment every time something gets modified? (Note this is
> > already how our .map files work, they're not generated automatically)
> 
> Our map files show the major version where a symbol was introduced.
> It is simple because no symbol can be introduced in a minor version.
> 
> > How about keeping things as is?

I don't really see the need of unique filenames. The next patch is
already leveraging RTE_EAL_PMD_PATH, which if versioned should be
enough for this, no?

> 
> You are using 18.02.1 while it is introduced in 18.02.0.
> If you don't want to correlate the .so version number with DPDK version
> number, maybe that 1, 2, 3 would be a simpler choice (less confusing).

+1

  Marcelo

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 14:37  0%                   ` Marcelo Ricardo Leitner
@ 2018-02-05 14:59  0%                     ` Adrien Mazarguil
  2018-02-05 15:29  0%                       ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 200+ results
From: Adrien Mazarguil @ 2018-02-05 14:59 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Thomas Monjalon, Van Haaren, Harry, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > 05/02/2018 14:44, Adrien Mazarguil:
> > > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > > libraries
> > > > > > 
> > > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > > >
> > > > > > > > >  # Library name.
> > > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > > >
> > > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > > >
> > > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > > modified,
> > > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > > >
> > > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > > added
> > > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > > possibly
> > > > > > > several times between official DPDK releases while work is being done on
> > > > > > the
> > > > > > > PMD (i.e. per commit).
> > > > > > >
> > > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > > really think it should, doing so will make things more complex in the
> > > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > > 
> > > > > > What about appending date +%s output to it? It would be stricter and
> > > > > > automated.
> > > > > 
> > > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > > generally not recommended.
> > > > 
> > > > Then the sha1sum of mlx4_glue.h.
> > > > With this the size check I mentioned on the other patch would become
> > > > redundant and unnecessary.
> > > 
> > > Using a strong hash algorithm to version a library/symbol, while possible,
> > > seems a bit overkill and results in ugliness:
> > > 
> > >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> 
> Ugh yes, but it wouldn't need to be that visible. A pointer on
> mlx*_glue and a define on PMD would be enough already. As in, an
> extended check to the versioning.

I thought you suggested this as a replacement. I'm not sure we need or want
to go this far. The current string comparison is really not worse than
standard symbol versioning, which doesn't check symbol properties besides
whether they are functions or other objects. We could have used C++ with
automatically mangled symbol names for that, however that again would make
things way more complex than necessary.

> > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > collision. Moreover the next time someone decides to update all version
> > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > symbol definition itself, ignore parameter names in function prototypes and
> > > only then we may get a somewhat meaningful hash describing a given ABI.
> 
> That's what I meant with stricter. Yes it would catch such
> situations, but you tell me on how much we want to protect/restrict
> here.  Do you see a reason for building only the dpdk/pmd side and not
> the glue library at a time?

No, they're always built together. We're only adding this versioning to
avoid issues when users somehow end up with several DPDK versions installed
on their system, or with leftovers of previous releases lying around. That's
all we need to solve here. dlopen()'ing the proper file takes care of that,
the symbol version number check afterward is performed just in case.
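
A minimal sketch of that load-then-check sequence, for illustration only.
Everything prefixed with demo_/DEMO_ is hypothetical (the struct layout,
the helper names, and the idea that the exported symbol is the structure
itself); it is not the code from this series. Only dlopen(), dlsym() and
dlclose() are real calls.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical constants mirroring LIB_GLUE_BASE/LIB_GLUE_VERSION. */
#define DEMO_GLUE_BASE "librte_pmd_mlx4_glue.so"
#define DEMO_GLUE_VERSION "18.02.0"

/* Hypothetical glue structure: only the version field matters here. */
struct demo_glue {
	const char *version; /* set by the plug-in at build time */
};

/* Build "<base>.<version>"; 0 on success, -1 if buf is too small. */
int demo_glue_name(char *buf, size_t len, const char *base,
		   const char *version)
{
	int n = snprintf(buf, len, "%s.%s", base, version);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Non-zero only when the plug-in reports exactly the expected version. */
int demo_glue_version_ok(const struct demo_glue *glue, const char *expected)
{
	return glue != NULL && glue->version != NULL &&
	       strcmp(glue->version, expected) == 0;
}

/*
 * Load the versioned plug-in from dir and double-check its version.
 * Returns the dlopen() handle, or NULL on any failure.
 */
void *demo_glue_load(const char *dir, const char *symbol)
{
	char name[96];
	char path[256];
	void *handle;
	const struct demo_glue *glue;
	int n;

	if (demo_glue_name(name, sizeof(name), DEMO_GLUE_BASE,
			   DEMO_GLUE_VERSION) != 0)
		return NULL;
	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return NULL;
	/* The versioned file name keeps stale installs from being picked up. */
	handle = dlopen(path, RTLD_LAZY);
	if (handle == NULL)
		return NULL;
	/* Belt and braces: compare the version string exported by the plug-in. */
	glue = dlsym(handle, symbol);
	if (!demo_glue_version_ok(glue, DEMO_GLUE_VERSION)) {
		dlclose(handle);
		return NULL;
	}
	return handle;
}
```

With this shape, a leftover plug-in from another release is simply not
found under the expected file name, and a file that was renamed by hand
still fails the string comparison. On older glibc versions the program
must be linked with -ldl.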

> > > Given the added complexity, is there really a problem with simple version
> > > numbers we increment every time something gets modified? (Note this is
> > > already how our .map files work, they're not generated automatically)
> > 
> > Our map files show the major version where a symbol was introduced.
> > It is simple because no symbol can be introduced in a minor version.
> > 
> > > How about keeping things as is?
> 
> I don't really see the need of unique filenames. The next patch is
> already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> enough for this, no?

As you said, "if" versioned. As an undocumented empty string by default,
there's no way to be sure. Letting the PMD version its internal but
(unfortunately) exposed bits will certainly prevent mistakes.

> > You are using 18.02.1 while it is introduced in 18.02.0.
> > If you don't want to correlate the .so version number with DPDK version
> > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> 
> +1

Then are you fine with the "18.02.0" suffix?

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 14:59  0%                     ` Adrien Mazarguil
@ 2018-02-05 15:29  0%                       ` Marcelo Ricardo Leitner
  2018-02-05 15:54  4%                         ` Adrien Mazarguil
  0 siblings, 1 reply; 200+ results
From: Marcelo Ricardo Leitner @ 2018-02-05 15:29 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Thomas Monjalon, Van Haaren, Harry, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > 05/02/2018 14:44, Adrien Mazarguil:
> > > > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > > > libraries
> > > > > > > 
> > > > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > > > >
> > > > > > > > > >  # Library name.
> > > > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > > > >
> > > > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > > > >
> > > > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > > > modified,
> > > > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > > > >
> > > > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > > > added
> > > > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > > > possibly
> > > > > > > > several times between official DPDK releases while work is being done on
> > > > > > > the
> > > > > > > > PMD (i.e. per commit).
> > > > > > > >
> > > > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > > > really think it should, doing so will make things more complex in the
> > > > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > > > 
> > > > > > > What about appending date +%s output to it? It would be stricter and
> > > > > > > automated.
> > > > > > 
> > > > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > > > generally not recommended.
> > > > > 
> > > > > Then the sha1sum of mlx4_glue.h.
> > > > > With this the size check I mentioned on the other patch would become
> > > > > redundant and unnecessary.
> > > > 
> > > > Using a strong hash algorithm to version a library/symbol, while possible,
> > > > seems a bit overkill and results in ugliness:
> > > > 
> > > >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> > 
> > Ugh yes, but it wouldn't need to be that visible. A pointer on
> > mlx*_glue and a define on PMD would be enough already. As in, an
> > extended check to the versioning.
> 
> I thought you suggested this as a replacement. I'm not sure we need or want
> to go this far. The current string comparison is really not worse than
> standard symbol versioning, which doesn't check symbol properties besides
> whether they are functions or other objects. We could have used C++ with
> automatically mangled symbol names for that, however that again would make
> things way more complex than necessary.
> 
> > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > collision. Moreover the next time someone decides to update all version
> > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > 
> > That's what I meant with stricter. Yes it would catch such
> > situations, but you tell me on how much we want to protect/restrict
> > here.  Do you see a reason for building only the dpdk/pmd side and not
> > the glue library at a time?
> 
> No, they're always built together. We're only adding this versioning to
> avoid issues when users somehow end up with several DPDK versions installed
> on their system, or with leftovers of previous releases lying around. That's
> all we need to solve here. dlopen()'ing the proper file takes care of that,
> the symbol version number check afterward is performed just in case.

Interesting. These leftovers probably wouldn't be there if it wasn't
versioned in the first place. :-)

> 
> > > > Given the added complexity, is there really a problem with simple version
> > > > numbers we increment every time something gets modified? (Note this is
> > > > already how our .map files work, they're not generated automatically)
> > > 
> > > Our map files show the major version where a symbol was introduced.
> > > It is simple because no symbol can be introduced in a minor version.
> > > 
> > > > How about keeping things as is?
> > 
> > I don't really see the need of unique filenames. The next patch is
> > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > enough for this, no?
> 
> As you said, "if" versioned. As an undocumented empty string by default,
> there's no way to be sure. Leaving the PMD version its internal but
> (unfortunately) exposed bits will certainly prevent mistakes.
> 
> > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > If you don't want to correlate the .so version number with DPDK version
> > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > 
> > +1
> 
> Then are you fine with the "18.02.0" suffix?

Not really, sorry. It was more for the "1, 2, 3" sequence or tying it
to dpdk version.

With the latest replies, I don't think the reasoning is enough to
justify these extra checks, but I won't oppose including it.

  Marcelo

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 15:29  0%                       ` Marcelo Ricardo Leitner
@ 2018-02-05 15:54  4%                         ` Adrien Mazarguil
  2018-02-05 17:06  0%                           ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 200+ results
From: Adrien Mazarguil @ 2018-02-05 15:54 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Thomas Monjalon, Van Haaren, Harry, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 01:29:42PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> > On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > > 05/02/2018 14:44, Adrien Mazarguil:
> > > > > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > > > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > > > > libraries
> > > > > > > > 
> > > > > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > > > > >
> > > > > > > > > > >  # Library name.
> > > > > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > > > > >
> > > > > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > > > > >
> > > > > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > > > > modified,
> > > > > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > > > > >
> > > > > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > > > > added
> > > > > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > > > > possibly
> > > > > > > > > several times between official DPDK releases while work is being done on
> > > > > > > > the
> > > > > > > > > PMD (i.e. per commit).
> > > > > > > > >
> > > > > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > > > > really think it should, doing so will make things more complex in the
> > > > > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > > > > 
> > > > > > > > What about appending date +%s output to it? It would be stricter and
> > > > > > > > automated.
> > > > > > > 
> > > > > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > > > > generally not recommended.
> > > > > > 
> > > > > > Then the sha1sum of mlx4_glue.h.
> > > > > > With this the size check I mentioned on the other patch would become
> > > > > > redundant and unnecessary.
> > > > > 
> > > > > Using a strong hash algorithm to version a library/symbol, while possible,
> > > > > seems a bit overkill and results in ugliness:
> > > > > 
> > > > >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> > > 
> > > Ugh yes, but it wouldn't need to be that visible. A pointer on
> > > mlx*_glue and a define on PMD would be enough already. As in, an
> > > extended check to the versioning.
> > 
> > I thought you suggested this as a replacement. I'm not sure we need or want
> > to go this far. The current string comparison is really not worse than
> > standard symbol versioning, which doesn't check symbol properties besides
> > whether they are functions or other objects. We could have used C++ with
> > automatically mangled symbol names for that, however that again would make
> > things way more complex than necessary.
> > 
> > > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > > collision. Moreover the next time someone decides to update all version
> > > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > > 
> > > That's what I meant with stricter. Yes it would catch such
> > > situations, but you tell me on how much we want to protect/restrict
> > > here.  Do you see a reason for building only the dpdk/pmd side and not
> > > the glue library at a time?
> > 
> > No, they're always built together. We're only adding this versioning to
> > avoid issues when users somehow end up with several DPDK versions installed
> > on their system, or with leftovers of previous releases lying around. That's
> > all we need to solve here. dlopen()'ing the proper file takes care of that,
> > the symbol version number check afterward is performed just in case.
> 
> Interesting. These leftovers probably wouldn't be there if it wasn't
> versioned in the first place. :-)

Seriously, we can't assume users will do everything using neat packages;
they may run an unfortunate "make install" from the DPDK source tree without
noticing they wrecked their system. Someone will have to mop up the ensuing
but preventable bug reports.

> > > > > Given the added complexity, is there really a problem with simple version
> > > > > numbers we increment every time something gets modified? (Note this is
> > > > > already how our .map files work, they're not generated automatically)
> > > > 
> > > > Our map files show the major version where a symbol was introduced.
> > > > It is simple because no symbol can be introduced in a minor version.
> > > > 
> > > > > How about keeping things as is?
> > > 
> > > I don't really see the need of unique filenames. The next patch is
> > > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > > enough for this, no?
> > 
> > As you said, "if" versioned. As an undocumented empty string by default,
> > there's no way to be sure. Leaving the PMD version its internal but
> > (unfortunately) exposed bits will certainly prevent mistakes.
> > 
> > > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > > If you don't want to correlate the .so version number with DPDK version
> > > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > > 
> > > +1
> > 
> > Then are you fine with the "18.02.0" suffix?
> 
> Not really, sorry. It was more for the "1, 2, 3" sequence or tying it
> to dpdk version.
> 
> With the latest replies, I don't think the reasoning is enough to
> justify these extra checks, but I won't oppose to including it.

18.02.0 makes it tied to the current release number, so I guess we agree.
The idea for now is this part remains tied to the DPDK release.

If a new ABI version is needed in a subsequent commit, the initial part gets
bumped to the current WIP DPDK release (say, 42.02.0). If subsequent
intermediate commits break the glue ABI, a fourth digit is added
(e.g. 42.02.0.1).

This role is currently held by the third digit but since there's a confusion
with DPDK revisions, it won't be used internally by the PMD. Hopefully this
fourth digit will remain unused (otherwise I can add as many digits as
necessary to make it acceptable, I'll then re-consider the SHA1 idea :)
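
To make the scheme above concrete, here is a minimal sketch (not the actual
PMD code). It assumes the glue file name is simply LIB_GLUE_BASE and
LIB_GLUE_VERSION joined by a dot, as in the Makefile hunk quoted earlier, and
that the later check is a plain string comparison. The helper names
glue_file_name() and glue_version_matches() are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative values mirroring LIB_GLUE_BASE / LIB_GLUE_VERSION. */
#define GLUE_BASE "librte_pmd_mlx4_glue.so"
#define GLUE_VERSION "18.02.0"

/*
 * Build "<base>.<version>" into buf.
 * Return 0 on success, -1 if buf is too small.
 */
static int
glue_file_name(char *buf, size_t size, const char *base, const char *version)
{
	int ret;

	assert(buf != NULL && base != NULL && version != NULL);
	ret = snprintf(buf, size, "%s.%s", base, version);
	return (ret < 0 || (size_t)ret >= size) ? -1 : 0;
}

/*
 * Compare the version the PMD was built against with the one reported
 * by the loaded glue library. Any difference, including an extra
 * fourth digit, counts as a mismatch.
 */
static int
glue_version_matches(const char *expected, const char *found)
{
	assert(expected != NULL);
	return found != NULL && strcmp(expected, found) == 0;
}
```

With these definitions, glue_file_name(buf, sizeof(buf), GLUE_BASE,
GLUE_VERSION) yields "librte_pmd_mlx4_glue.so.18.02.0", and a library
reporting "18.02.0.1" is rejected by a PMD built for "18.02.0".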

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v3] eal: add function to return number of detected sockets
       [not found]     ` <cover.1517848624.git.anatoly.burakov@intel.com>
@ 2018-02-05 16:37  8%   ` Anatoly Burakov
  2018-02-05 17:39  3%     ` Burakov, Anatoly
                       ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Anatoly Burakov @ 2018-02-05 16:37 UTC (permalink / raw)
  To: dev

During lcore scan, find maximum socket ID and store it.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Added ABI backwards compatibility
    
    v2:
    - checkpatch changes
    - check socket before deciding if the core is not to be used
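
The counting rule used by this patch (highest socket ID seen, plus one) can
be sketched in isolation as below. This is only an illustration:
MAX_NUMA_NODES stands in for RTE_MAX_NUMA_NODES, count_numa_nodes() is a
made-up helper, and the RTE_EAL_ALLOW_INV_SOCKET_ID fallback is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUMA_NODES 8 /* stand-in for RTE_MAX_NUMA_NODES */

/*
 * Return the number of NUMA nodes implied by per-lcore socket IDs
 * (highest ID + 1), or -1 if any ID is out of range.
 */
static int
count_numa_nodes(const unsigned int *socket_ids, size_t n)
{
	unsigned int max_socket_id = 0;
	size_t i;

	assert(socket_ids != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (socket_ids[i] >= MAX_NUMA_NODES)
			return -1;
		if (socket_ids[i] > max_socket_id)
			max_socket_id = socket_ids[i];
	}
	return (int)max_socket_id + 1;
}
```

Note that with this rule, sparse IDs such as {0, 3} give 4 rather than 2: the
result is an upper bound on socket IDs, not a count of distinct sockets.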

 lib/librte_eal/common/eal_common_lcore.c  | 37 +++++++++++++++++++++----------
 lib/librte_eal/common/include/rte_eal.h   | 25 +++++++++++++++++++++
 lib/librte_eal/common/include/rte_lcore.h |  8 +++++++
 lib/librte_eal/linuxapp/eal/eal.c         | 27 +++++++++++++++++++++-
 lib/librte_eal/rte_eal_version.map        |  9 +++++++-
 5 files changed, 92 insertions(+), 14 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_lcore.c b/lib/librte_eal/common/eal_common_lcore.c
index 7724fa4..827ddeb 100644
--- a/lib/librte_eal/common/eal_common_lcore.c
+++ b/lib/librte_eal/common/eal_common_lcore.c
@@ -28,6 +28,7 @@ rte_eal_cpu_init(void)
 	struct rte_config *config = rte_eal_get_configuration();
 	unsigned lcore_id;
 	unsigned count = 0;
+	unsigned int socket_id, max_socket_id = 0;
 
 	/*
 	 * Parse the maximum set of logical cores, detect the subset of running
@@ -39,6 +40,19 @@ rte_eal_cpu_init(void)
 		/* init cpuset for per lcore config */
 		CPU_ZERO(&lcore_config[lcore_id].cpuset);
 
+		/* find socket first */
+		socket_id = eal_cpu_socket_id(lcore_id);
+		if (socket_id >= RTE_MAX_NUMA_NODES) {
+#ifdef RTE_EAL_ALLOW_INV_SOCKET_ID
+			socket_id = 0;
+#else
+			RTE_LOG(ERR, EAL, "Socket ID (%u) is greater than RTE_MAX_NUMA_NODES (%d)\n",
+					socket_id, RTE_MAX_NUMA_NODES);
+			return -1;
+#endif
+		}
+		max_socket_id = RTE_MAX(max_socket_id, socket_id);
+
 		/* in 1:1 mapping, record related cpu detected state */
 		lcore_config[lcore_id].detected = eal_cpu_detected(lcore_id);
 		if (lcore_config[lcore_id].detected == 0) {
@@ -54,18 +68,7 @@ rte_eal_cpu_init(void)
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_role = ROLE_RTE;
 		lcore_config[lcore_id].core_id = eal_cpu_core_id(lcore_id);
-		lcore_config[lcore_id].socket_id = eal_cpu_socket_id(lcore_id);
-		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES) {
-#ifdef RTE_EAL_ALLOW_INV_SOCKET_ID
-			lcore_config[lcore_id].socket_id = 0;
-#else
-			RTE_LOG(ERR, EAL, "Socket ID (%u) is greater than "
-				"RTE_MAX_NUMA_NODES (%d)\n",
-				lcore_config[lcore_id].socket_id,
-				RTE_MAX_NUMA_NODES);
-			return -1;
-#endif
-		}
+		lcore_config[lcore_id].socket_id = socket_id;
 		RTE_LOG(DEBUG, EAL, "Detected lcore %u as "
 				"core %u on socket %u\n",
 				lcore_id, lcore_config[lcore_id].core_id,
@@ -79,5 +82,15 @@ rte_eal_cpu_init(void)
 		RTE_MAX_LCORE);
 	RTE_LOG(INFO, EAL, "Detected %u lcore(s)\n", config->lcore_count);
 
+	config->numa_node_count = max_socket_id + 1;
+	RTE_LOG(INFO, EAL, "Detected %u NUMA nodes\n", config->numa_node_count);
+
 	return 0;
 }
+
+unsigned int
+rte_num_sockets(void)
+{
+	const struct rte_config *config = rte_eal_get_configuration();
+	return config->numa_node_count;
+}
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 08c6637..bbf54e2 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -57,6 +57,29 @@ enum rte_proc_type_t {
 struct rte_config {
 	uint32_t master_lcore;       /**< Id of the master lcore */
 	uint32_t lcore_count;        /**< Number of available logical cores. */
+	uint32_t numa_node_count;    /**< Number of detected NUMA nodes. */
+	uint32_t service_lcore_count;/**< Number of available service cores. */
+	enum rte_lcore_role_t lcore_role[RTE_MAX_LCORE]; /**< State of cores. */
+
+	/** Primary or secondary configuration */
+	enum rte_proc_type_t process_type;
+
+	/** PA or VA mapping mode */
+	enum rte_iova_mode iova_mode;
+
+	/**
+	 * Pointer to memory configuration, which may be shared across multiple
+	 * DPDK instances
+	 */
+	struct rte_mem_config *mem_config;
+} __attribute__((__packed__));
+
+/**
+ * The global RTE configuration structure - 18.02 ABI version.
+ */
+struct rte_config_v1802 {
+	uint32_t master_lcore;       /**< Id of the master lcore */
+	uint32_t lcore_count;        /**< Number of available logical cores. */
 	uint32_t service_lcore_count;/**< Number of available service cores. */
 	enum rte_lcore_role_t lcore_role[RTE_MAX_LCORE]; /**< State of cores. */
 
@@ -80,6 +103,8 @@ struct rte_config {
  *   A pointer to the global configuration structure.
  */
 struct rte_config *rte_eal_get_configuration(void);
+struct rte_config_v1802 *rte_eal_get_configuration_v1802(void);
+struct rte_config *rte_eal_get_configuration_v1805(void);
 
 /**
  * Get a lcore's role.
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index d84bcff..ddf4c64 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -120,6 +120,14 @@ rte_lcore_index(int lcore_id)
 unsigned rte_socket_id(void);
 
 /**
+ * Return number of physical sockets on the system.
+ * @return
+ *   the number of physical sockets as recognized by EAL
+ *
+ */
+unsigned int rte_num_sockets(void);
+
+/**
  * Get the ID of the physical socket of the specified lcore
  *
  * @param lcore_id
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 451fdaf..757f404 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -67,6 +67,9 @@ static rte_usage_hook_t	rte_application_usage_hook = NULL;
 /* early configuration structure, when memory config is not mmapped */
 static struct rte_mem_config early_mem_config;
 
+/* compatibility structure to return to old ABI calls */
+static struct rte_config_v1802 v1802_config;
+
 /* define fd variable here, because file needs to be kept open for the
  * duration of the program, as we hold a write lock on it in the primary proc */
 static int mem_cfg_fd = -1;
@@ -103,11 +106,33 @@ rte_eal_mbuf_default_mempool_ops(void)
 }
 
 /* Return a pointer to the configuration structure */
+struct rte_config_v1802 *
+rte_eal_get_configuration_v1802(void)
+{
+	/* copy everything to old config so that it's up to date */
+	v1802_config.iova_mode = rte_config.iova_mode;
+	v1802_config.lcore_count = rte_config.lcore_count;
+	memcpy(v1802_config.lcore_role, rte_config.lcore_role,
+			sizeof(rte_config.lcore_role));
+	v1802_config.master_lcore = rte_config.master_lcore;
+	v1802_config.mem_config = rte_config.mem_config;
+	v1802_config.process_type = rte_config.process_type;
+	v1802_config.service_lcore_count = rte_config.service_lcore_count;
+
+	return &v1802_config;
+}
+VERSION_SYMBOL(rte_eal_get_configuration, _v1802, 18.02);
+
+/* Return a pointer to the configuration structure */
 struct rte_config *
-rte_eal_get_configuration(void)
+rte_eal_get_configuration_v1805(void)
 {
 	return &rte_config;
 }
+VERSION_SYMBOL(rte_eal_get_configuration, _v1805, 18.05);
+BIND_DEFAULT_SYMBOL(rte_eal_get_configuration, _v1805, 18.05);
+MAP_STATIC_SYMBOL(struct rte_config *rte_eal_get_configuration(void),
+		rte_eal_get_configuration_v1805);
 
 enum rte_iova_mode
 rte_eal_iova_mode(void)
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 4146907..fc83e74 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -211,6 +211,13 @@ DPDK_18.02 {
 
 }  DPDK_17.11;
 
+DPDK_18.05 {
+	global:
+
+	rte_num_sockets;
+
+} DPDK_18.02;
+
 EXPERIMENTAL {
 	global:
 
@@ -255,4 +262,4 @@ EXPERIMENTAL {
 	rte_service_set_stats_enable;
 	rte_service_start_with_defaults;
 
-} DPDK_18.02;
+} DPDK_18.05;
-- 
2.7.4
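
For readers unfamiliar with the compatibility approach in the eal.c hunk, the
idea can be reduced to the sketch below. It is a simplified stand-in, not the
patch itself: the structures are cut down to a few fields, the names
config_v1, config_v2, get_config_v1() and get_config_v2() are invented, and
the symbol versioning macros (VERSION_SYMBOL and friends) are omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_LCORE 4 /* stand-in for RTE_MAX_LCORE */

/* Old layout, as seen by binaries built against the previous ABI. */
struct config_v1 {
	uint32_t master_lcore;
	uint32_t lcore_count;
	uint8_t lcore_role[MAX_LCORE];
};

/* New layout: a field inserted in the middle shifts later offsets. */
struct config_v2 {
	uint32_t master_lcore;
	uint32_t lcore_count;
	uint32_t numa_node_count;
	uint8_t lcore_role[MAX_LCORE];
};

static struct config_v2 current_config;
static struct config_v1 compat_config;

/* New entry point: callers get the live structure. */
static struct config_v2 *
get_config_v2(void)
{
	return &current_config;
}

/* Old entry point: refresh a snapshot in the old layout, field by field. */
static struct config_v1 *
get_config_v1(void)
{
	assert(sizeof(compat_config.lcore_role) ==
	       sizeof(current_config.lcore_role));
	compat_config.master_lcore = current_config.master_lcore;
	compat_config.lcore_count = current_config.lcore_count;
	memcpy(compat_config.lcore_role, current_config.lcore_role,
	       sizeof(compat_config.lcore_role));
	return &compat_config;
}
```

One property of this design is visible in the sketch: old callers receive a
copy, so anything they write through the returned pointer does not reach the
live configuration, and the copy is only refreshed when the getter is called
again.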

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 15:54  4%                         ` Adrien Mazarguil
@ 2018-02-05 17:06  0%                           ` Marcelo Ricardo Leitner
  2018-02-06 11:06  4%                             ` Adrien Mazarguil
  0 siblings, 1 reply; 200+ results
From: Marcelo Ricardo Leitner @ 2018-02-05 17:06 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Thomas Monjalon, Van Haaren, Harry, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 04:54:55PM +0100, Adrien Mazarguil wrote:
> On Mon, Feb 05, 2018 at 01:29:42PM -0200, Marcelo Ricardo Leitner wrote:
> > On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> > > On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > > > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > > > 05/02/2018 14:44, Adrien Mazarguil:
...
> > > > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > > > collision. Moreover the next time someone decides to update all version
> > > > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > > > 
> > > > That's what I meant with stricter. Yes it would catch such
> > > > situations, but you tell me on how much we want to protect/restrict
> > > > here.  Do you see a reason for building only the dpdk/pmd side and not
> > > > the glue library at a time?
> > > 
> > > No, they're always built together. We're only adding this versioning to
> > > avoid issues when users somehow end up with several DPDK versions installed
> > > on their system, or with leftovers of previous releases lying around. That's
> > > all we need to solve here. dlopen()'ing the proper file takes care of that,
> > > the symbol version number check afterward is performed just in case.
> > 
> > Interesting. These leftovers probably wouldn't be there if it wasn't
> > versioned in the first place. :-)
> 
> Seriously, we can't assume users will do everything using neat packages and
> may run an unfortunate "make install" from the DPDK source tree without
> noticing they wrecked their system. Someone will have to mop the ensuing but
> preventable bug reports.
> 
> > > > > > Given the added complexity, is there really a problem with simple version
> > > > > > numbers we increment every time something gets modified? (Note this is
> > > > > > already how our .map files work, they're not generated automatically)
> > > > > 
> > > > > Our map files show the major version where a symbol was introduced.
> > > > > It is simple because no symbol can be introduced in a minor version.
> > > > > 
> > > > > > How about keeping things as is?
> > > > 
> > > > I don't really see the need of unique filenames. The next patch is
> > > > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > > > enough for this, no?
> > > 
> > > As you said, "if" versioned. As an undocumented empty string by default,
> > > there's no way to be sure. Leaving the PMD version its internal but
> > > (unfortunately) exposed bits will certainly prevent mistakes.
> > > 
> > > > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > > > If you don't want to correlate the .so version number with DPDK version
> > > > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > > > 
> > > > +1
> > > 
> > > Then are you fine with the "18.02.0" suffix?
> > 
> > Not really, sorry. It was more for the "1, 2, 3" sequence or tying it
> > to dpdk version.
> > 
> > With the latest replies, I don't think the reasoning is enough to
> > justify these extra checks, but I won't oppose to including it.
> 
> 18.02.0 makes it tied to the current release number, so I guess we agree.

It makes them equal, but not tied. If nobody patches it, when 18.02.1
is out, the glue lib will still be 18.02.0.

> The idea for now is this part remains tied to the DPDK release.
> 
> If a new ABI version is needed in a subsequent commit, the initial part gets
> bumped to the current WIP DPDK release (say, 42.02.0). If subsequent
> intermediate commits break the glue ABI, a fourth digit is added
> (e.g. 42.02.0.1).

I'll defer this to other project developers. This is more about a
project standard than anything here. I could even argue that this glue
should be named after the pmd lib, such as
   ./usr/local/lib/librte_pmd_mlx4_glue.so.1.1
The fact of not providing the _glue.so symlink is enough to avoid
others from linking against it. But it's more of a project standard
than a technical decision, I guess, whether this lib is seen as a
plugin or as a (private) library.

Considering the versioning used for the PMD libs, such easy versioning
is my preferred choice, FWIW.

> 
> This role is currently held by the third digit but since there's a confusion
> with DPDK revisions, it won't be used internally by the PMD. Hopefully this
> fourth digit will remain unused (otherwise I can add as many digits as
> necessary to make it acceptable, I'll then re-consider the SHA1 idea :)

hehe :-)

  Marcelo

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v4] checkpatches.sh: Add checks for ABI symbol addition
      2018-01-31 17:27  6% ` [dpdk-dev] [PATCH v3] " Neil Horman
@ 2018-02-05 17:29  6% ` Neil Horman
  2018-02-05 17:57  4%   ` Thomas Monjalon
  2018-02-09 15:21  6% ` [dpdk-dev] [PATCH v5] " Neil Horman
  2018-02-14 19:19  6% ` [dpdk-dev] [PATCH v6] " Neil Horman
  4 siblings, 1 reply; 200+ results
From: Neil Horman @ 2018-02-05 17:29 UTC (permalink / raw)
  To: dev
  Cc: Neil Horman, thomas, john.mcnamara, bruce.richardson,
	Ferruh Yigit, Stephen Hemminger

Recently, some additional patches were added to allow for programmatic
marking of C symbols as experimental.  The addition of these markers is
dependent on the manual addition of exported symbols to the EXPERIMENTAL
section of the corresponding library's version map file.  The consensus
on review is that, in addition to mandating the addition of symbols to
the EXPERIMENTAL version in the map, we need a mechanism to enforce our
documented process of mandating that addition when they are introduced.
To that end, I am proposing this change.  It is an addition to the
checkpatches script, which scans incoming patches for additions and
removals of symbols to the map file, and warns the user appropriately.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: thomas@monjalon.net
CC: john.mcnamara@intel.com
CC: bruce.richardson@intel.com
CC: Ferruh Yigit <ferruh.yigit@intel.com>
CC: Stephen Hemminger <stephen@networkplumber.org>

---
Change notes

v2)
 * Cleaned up and documented awk script (shemminger)
 * fixed sort/uniq usage (shemminger)
 * moved checking to new script (tmonjalon)
 * added maintainer entry (tmonjalon)
 * added license (tmonjalon)

v3)
 * Changed symbol check script name (tmonjalon)
 * Trapped exit to clean temp file (tmonjalon)
 * Honored verbose command (tmonjalon)
 * Cleaned left over debug bits (tmonjalon)
 * Updated location in MAINTAINERS file (tmonjalon)

v4)
 * Updated maintainers file (tmonjalon)
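
For reviewers who prefer C to awk, the line classification performed by the
new script can be approximated as follows. This is a rough sketch of the
matching rules only, under the assumption that the input is one line of a
unified diff against a .map file; classify_map_line() and the enum are
invented names, and the script in the patch remains the reference.

```c
#include <assert.h>
#include <string.h>

enum map_line {
	MAP_OTHER,       /* context, headers, "global:", "local: *;" */
	MAP_SECTION,     /* "+DPDK_18.05 {" */
	MAP_SECTION_END, /* any line containing "}" */
	MAP_SYM_ADD,     /* "+<tab>symbol;" */
	MAP_SYM_DEL      /* "-<tab>symbol;" */
};

/* Copy len bytes of src into dst, dropping trailing blanks. */
static void
copy_trimmed(char *dst, size_t size, const char *src, size_t len)
{
	while (len > 0 && (src[len - 1] == ' ' || src[len - 1] == '\t'))
		len--;
	if (len >= size)
		len = size - 1;
	memcpy(dst, src, len);
	dst[len] = '\0';
}

/*
 * Classify one diff line. For MAP_SECTION, MAP_SYM_ADD and MAP_SYM_DEL
 * the section or symbol name is stored in name.
 */
static enum map_line
classify_map_line(const char *line, char *name, size_t size)
{
	const char *p;
	size_t len;

	assert(line != NULL && name != NULL && size > 0);
	name[0] = '\0';
	if (strchr(line, '}') != NULL)
		return MAP_SECTION_END;
	if (line[0] != '+' && line[0] != '-')
		return MAP_OTHER;
	if (strncmp(line, "+++", 3) == 0 || strncmp(line, "---", 3) == 0)
		return MAP_OTHER; /* file header, not a content line */
	p = line + 1;
	p += strspn(p, " \t");
	len = strlen(p);
	while (len > 0 && strchr(" \t\r\n", p[len - 1]) != NULL)
		len--;
	if (len < 2)
		return MAP_OTHER;
	if (p[len - 1] == '{' && line[0] == '+') {
		copy_trimmed(name, size, p, len - 1);
		return MAP_SECTION;
	}
	if (p[len - 1] == ';' && p[len - 2] != '*' && p[len - 2] != ':') {
		copy_trimmed(name, size, p, len - 1);
		return line[0] == '+' ? MAP_SYM_ADD : MAP_SYM_DEL;
	}
	return MAP_OTHER;
}
```

A caller would keep the most recent MAP_SECTION name as the current section,
clear it on MAP_SECTION_END, and report each added or removed symbol together
with that section (or "unknown" when no section header is in the hunk).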
---
 MAINTAINERS                     |   2 +
 devtools/check-symbol-change.sh | 146 ++++++++++++++++++++++++++++++++++++++++
 devtools/checkpatches.sh        |  23 ++++++-
 3 files changed, 170 insertions(+), 1 deletion(-)
 create mode 100755 devtools/check-symbol-change.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index acd056134..d1ef43479 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -86,9 +86,11 @@ M: Neil Horman <nhorman@tuxdriver.com>
 F: lib/librte_compat/
 F: doc/guides/rel_notes/deprecation.rst
 F: devtools/validate-abi.sh
+F: devtools/check-symbol-change.sh
 F: buildtools/check-experimental-syms.sh
 
 Driver information
+M: Neil Horman <nhorman@tuxdriver.com>
 F: buildtools/pmdinfogen/
 F: usertools/dpdk-pmdinfo.py
 F: doc/guides/tools/pmdinfo.rst
diff --git a/devtools/check-symbol-change.sh b/devtools/check-symbol-change.sh
new file mode 100755
index 000000000..22b17e6f2
--- /dev/null
+++ b/devtools/check-symbol-change.sh
@@ -0,0 +1,146 @@
+#!/bin/sh
+
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Neil Horman <nhorman@tuxdriver.com>
+
+build_map_changes()
+{
+	local fname=$1
+	local mapdb=$2
+
+	cat "$fname" | filterdiff -i '*.map' | awk '
+		# Initialize our variables
+		BEGIN {map="";sym="";ar="";sec=""; in_sec=0}
+
+		# Anything that starts with + or -, followed by an a
+		# and ends in the string .map is the name of our map file
+		# This may appear multiple times in a patch if multiple
+		# map files are altered, and all section/symbol names
+		# appearing between a triggering of this rule and the
+		# next trigger of this rule are associated with this file
+		/[-+] a\/.*\.map/ {map=$2}
+
+		# Triggering this rule, which starts a line with a + and ends it
+		# with a { identifies a versioned section.  The section name is
+		# the rest of the line with the + and { symbols removed.
+		# Triggering this rule sets in_sec to 1, which activates the
+		# symbol rule below
+		/+.*{/ {gsub("+","");sec=$1; in_sec=1}
+
+		# This rule identifies the end of a section, and disables the
+		# symbol rule
+		/.*}/ {in_sec=0}
+
+		# This rule matches on a + followed by any characters except a :
+		# (which denotes a global vs local segment), and ends with a ;.
+		# The semicolon is removed and the symbol is printed with its
+		# association file name and version section, along with an
+		# indicator that the symbol is a new addition.  Note this rule
+		# only works if we have found a version section in the rule
+		# above (hence the in_sec check).  Otherwise we flag it as an
+		# unknown section
+		/^+[^}].*[^:*];/ {gsub(";","");sym=$2;
+			if (in_sec == 1) {
+				print map " " sym " " sec " add"
+			} else {
+				print map " " sym " unknown add"
+			}
+		}
+
+		# This is the same rule as above, but the rule matches on a
+		# leading - rather than a +, denoting that the symbol is being
+		# removed.
+		/^-[^}].*[^:*];/ {gsub(";","");sym=$2;
+			if (in_sec == 1) {
+				print map " " sym " " sec " del"
+			} else {
+				print map " " sym " unknown del"
+			}
+		}' > ./$mapdb
+
+		sort -u $mapdb > ./$mapdb.2
+		mv -f $mapdb.2 $mapdb
+
+}
+
+check_for_rule_violations()
+{
+	local mapdb=$1
+	local mname
+	local symname
+	local secname
+	local ar
+	local ret=0
+
+	while read mname symname secname ar
+	do
+		if [ "$ar" = "add" ]
+		then
+
+			if [ "$secname" = "unknown" ]
+			then
+				# Just inform the user of this occurrence, but
+				# don't flag it as an error
+				echo -n "INFO: symbol $symname is added but "
+				echo -n "patch has insufficient context "
+				echo -n "to determine the section name "
+				echo -n "please ensure the version is "
+				echo "EXPERIMENTAL"
+				continue
+			fi
+
+			if [ "$secname" != "EXPERIMENTAL" ]
+			then
+				# Symbols that are getting added in a section
+				# other than the experimental section need
+				# to be moving from an already supported
+				# section or it is a violation
+				grep "$mname $symname [^ ]* del" $mapdb |
+				grep -qv ' EXPERIMENTAL del$'
+				if [ $? -ne 0 ]
+				then
+					echo -n "ERROR: symbol $symname "
+					echo -n "is added in a section "
+					echo -n "other than the EXPERIMENTAL "
+					echo "section of the version map"
+					ret=1
+				fi
+			fi
+		else
+
+			if [ "$secname" != "EXPERIMENTAL" ]
+			then
+				# Just inform users that non-experimental
+				# symbols need to go through a deprecation
+				# process
+				echo -n "INFO: symbol $symname is being "
+				echo -n "removed, ensure that it has "
+				echo "gone through the deprecation process"
+			fi
+		fi
+	done < $mapdb
+
+	return $ret
+}
+
+trap clean_and_exit_on_sig EXIT
+
+mapfile=`mktemp mapdb.XXXXXX`
+patch=$1
+exit_code=1
+
+clean_and_exit_on_sig()
+{
+	rm -f $mapfile
+	exit $exit_code
+}
+
+build_map_changes $patch $mapfile
+check_for_rule_violations $mapfile
+exit_code=$?
+
+rm -f $mapfile
+
+exit $exit_code
+
+
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 7676a6b50..0b2b5f039 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -35,6 +35,8 @@
 # - DPDK_CHECKPATCH_LINE_LENGTH
 . $(dirname $(readlink -e $0))/load-devel-config
 
+VALIDATE_NEW_API=$(dirname $(readlink -e $0))/check-symbol-change.sh
+
 length=${DPDK_CHECKPATCH_LINE_LENGTH:-80}
 
 # override default Linux options
@@ -61,6 +63,7 @@ print_usage () {
 	END_OF_HELP
 }
 
+
 number=0
 quiet=false
 verbose=false
@@ -86,6 +89,7 @@ total=0
 status=0
 
 check () { # <patch> <commit> <title>
+	local reta
 	total=$(($total + 1))
 	! $verbose || printf '\n### %s\n\n' "$3"
 	if [ -n "$1" ] ; then
@@ -96,9 +100,26 @@ check () { # <patch> <commit> <title>
 	else
 		report=$($DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
 	fi
-	[ $? -ne 0 ] || return 0
+	reta=$?
+
 	$verbose || printf '\n### %s\n\n' "$3"
 	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
+
+	! $verbose || echo
+	! $verbose || echo "Checking API additions/removals:"
+
+	if [ -n "$1" ] ; then
+		report=$($VALIDATE_NEW_API $1)
+	elif [ -n "$2" ] ; then
+		report=$(git format-patch \
+			 --find-renames --no-stat --stdout -1 $commit |
+			$VALIDATE_NEW_API -)
+	else
+		report=$($VALIDATE_NEW_API -)
+	fi
+	[ $? -ne 0 -o $reta -ne 0 ] || return 0
+	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
+
 	status=$(($status + 1))
 }
 
-- 
2.14.3
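
The rule enforced by check_for_rule_violations() above can also be
sketched in C, which may make the decision order easier to follow. This is
only an illustration under stated assumptions, not part of the patch: the
struct and function names (map_change, moved_from_stable, count_violations)
are hypothetical, and each record mirrors one "<map> <symbol> <section>
<add|del>" line of the temporary map database.

```c
#include <stddef.h>
#include <string.h>

/* One record of the map database: "<map> <symbol> <section> <add|del>". */
struct map_change {
	const char *map; /* version map file name */
	const char *sym; /* symbol name */
	const char *sec; /* version section, or "unknown" */
	int is_add;      /* 1 for an addition, 0 for a removal */
};

/* Return 1 if the same symbol is removed from a non-EXPERIMENTAL section. */
static int
moved_from_stable(const struct map_change *db, size_t n,
		  const struct map_change *c)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!db[i].is_add &&
		    strcmp(db[i].map, c->map) == 0 &&
		    strcmp(db[i].sym, c->sym) == 0 &&
		    strcmp(db[i].sec, "EXPERIMENTAL") != 0)
			return 1;
	}
	return 0;
}

/* Count additions that violate the "new symbols start EXPERIMENTAL" rule. */
static int
count_violations(const struct map_change *db, size_t n)
{
	int violations = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct map_change *c = &db[i];

		if (!c->is_add)
			continue; /* removals are only reported, never errors */
		if (strcmp(c->sec, "unknown") == 0)
			continue; /* not enough context: informational only */
		if (strcmp(c->sec, "EXPERIMENTAL") == 0)
			continue; /* the expected case for a new symbol */
		if (!moved_from_stable(db, n, c))
			violations++;
	}
	return violations;
}
```

A symbol that is added to a versioned section and removed from another
versioned section in the same patch is treated as a move, so it is not
counted.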

^ permalink raw reply	[relevance 6%]

* Re: [dpdk-dev] [PATCH v3] eal: add function to return number of detected sockets
  2018-02-05 16:37  8%   ` [dpdk-dev] [PATCH v3] eal: add function to return number of detected sockets Anatoly Burakov
@ 2018-02-05 17:39  3%     ` Burakov, Anatoly
  2018-02-05 22:45  0%       ` Thomas Monjalon
  2018-02-07  9:58  5%     ` [dpdk-dev] [PATCH 18.05 v4] Add " Anatoly Burakov
  2018-02-07  9:58  5%     ` [dpdk-dev] [PATCH 18.05 v4] eal: add " Anatoly Burakov
  2 siblings, 1 reply; 200+ results
From: Burakov, Anatoly @ 2018-02-05 17:39 UTC (permalink / raw)
  To: dev

On 05-Feb-18 4:37 PM, Anatoly Burakov wrote:
> During lcore scan, find maximum socket ID and store it.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> 
> Notes:
>      v3:
>      - Added ABI backwards compatibility
>      
>      v2:
>      - checkpatch changes
>      - check socket before deciding if the core is not to be used
> 
>   lib/librte_eal/common/eal_common_lcore.c  | 37 +++++++++++++++++++++----------
>   lib/librte_eal/common/include/rte_eal.h   | 25 +++++++++++++++++++++
>   lib/librte_eal/common/include/rte_lcore.h |  8 +++++++
>   lib/librte_eal/linuxapp/eal/eal.c         | 27 +++++++++++++++++++++-
>   lib/librte_eal/rte_eal_version.map        |  9 +++++++-
>   5 files changed, 92 insertions(+), 14 deletions(-)
> 

This patch does not break ABI, but does it in a very ugly way. Is it 
worth it?

-- 
Thanks,
Anatoly

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v4] checkpatches.sh: Add checks for ABI symbol addition
  2018-02-05 17:29  6% ` [dpdk-dev] [PATCH v4] " Neil Horman
@ 2018-02-05 17:57  4%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-05 17:57 UTC (permalink / raw)
  To: Neil Horman
  Cc: dev, john.mcnamara, bruce.richardson, Ferruh Yigit, Stephen Hemminger

05/02/2018 18:29, Neil Horman:
>  Driver information
> +M: Neil Horman <nhorman@tuxdriver.com>
>  F: buildtools/pmdinfogen/
>  F: usertools/dpdk-pmdinfo.py
>  F: doc/guides/tools/pmdinfo.rst

This change deserves a separate patch announcing that you are
volunteering to maintain pmdinfo.

It is really not related to the rest of this patch,
but thank you for volunteering :)

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v3] eal: add function to return number of detected sockets
  2018-02-05 17:39  3%     ` Burakov, Anatoly
@ 2018-02-05 22:45  0%       ` Thomas Monjalon
  2018-02-06  9:28  0%         ` Burakov, Anatoly
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-02-05 22:45 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: dev

05/02/2018 18:39, Burakov, Anatoly:
> On 05-Feb-18 4:37 PM, Anatoly Burakov wrote:
> > During lcore scan, find maximum socket ID and store it.
> > 
> > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > ---
> > 
> > Notes:
> >      v3:
> >      - Added ABI backwards compatibility
> >      
> >      v2:
> >      - checkpatch changes
> >      - check socket before deciding if the core is not to be used
> > 
> >   lib/librte_eal/common/eal_common_lcore.c  | 37 +++++++++++++++++++++----------
> >   lib/librte_eal/common/include/rte_eal.h   | 25 +++++++++++++++++++++
> >   lib/librte_eal/common/include/rte_lcore.h |  8 +++++++
> >   lib/librte_eal/linuxapp/eal/eal.c         | 27 +++++++++++++++++++++-
> >   lib/librte_eal/rte_eal_version.map        |  9 +++++++-
> >   5 files changed, 92 insertions(+), 14 deletions(-)
> > 
> 
> This patch does not break ABI, but does it in a very ugly way. Is it 
> worth it?

I think we agreed to not get this patch in 18.02.
Did you change your mind?

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v2] doc: announce ABI change for RSS configuration structure
  2018-02-04  7:24  4% [dpdk-dev] [PATCH] doc: annouce ABI change for RSS configuraiton structure Xueming Li
@ 2018-02-06  7:38  4% ` Xueming Li
  2018-02-13  6:52  4%   ` Shahaf Shuler
  0 siblings, 1 reply; 200+ results
From: Xueming Li @ 2018-02-06  7:38 UTC (permalink / raw)
  To: Thomas Monjalon, Neil Horman; +Cc: Xueming Li, dev, Shahaf Shuler

Update deprecation notice for the new rss_level field of
rte_eth_rss_conf.

Link: http://www.dpdk.org/dev/patchwork/patch/31891

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
---
 doc/guides/rel_notes/deprecation.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad5988..4bfce3bd7 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -59,3 +59,7 @@ Deprecation Notices
   be added between the producer and consumer structures. The size of the
   structure and the offset of the fields will remain the same on
   platforms with 64B cache line, but will change on other platforms.
+
+* ethdev: A new rss level field is planned in 18.05.
+  The new API adds an rss_level field to ``rte_eth_rss_conf`` to enable a choice
+  of RSS hash calculation on the outer or inner header of a tunneled packet.
-- 
2.13.3

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v3] eal: add function to return number of detected sockets
  2018-02-05 22:45  0%       ` Thomas Monjalon
@ 2018-02-06  9:28  0%         ` Burakov, Anatoly
  2018-02-06  9:47  0%           ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Burakov, Anatoly @ 2018-02-06  9:28 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On 05-Feb-18 10:45 PM, Thomas Monjalon wrote:
> 05/02/2018 18:39, Burakov, Anatoly:
>> On 05-Feb-18 4:37 PM, Anatoly Burakov wrote:
>>> During lcore scan, find maximum socket ID and store it.
>>>
>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>> ---
>>>
>>> Notes:
>>>       v3:
>>>       - Added ABI backwards compatibility
>>>       
>>>       v2:
>>>       - checkpatch changes
>>>       - check socket before deciding if the core is not to be used
>>>
>>>    lib/librte_eal/common/eal_common_lcore.c  | 37 +++++++++++++++++++++----------
>>>    lib/librte_eal/common/include/rte_eal.h   | 25 +++++++++++++++++++++
>>>    lib/librte_eal/common/include/rte_lcore.h |  8 +++++++
>>>    lib/librte_eal/linuxapp/eal/eal.c         | 27 +++++++++++++++++++++-
>>>    lib/librte_eal/rte_eal_version.map        |  9 +++++++-
>>>    5 files changed, 92 insertions(+), 14 deletions(-)
>>>
>>
>> This patch does not break ABI, but does it in a very ugly way. Is it
>> worth it?
> 
> I think we agreed to not get this patch in 18.02.
> Did you change your mind?
> 

Sorry, how do I mark this patch as for 18.05? Is it a patch header?

-- 
Thanks,
Anatoly

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v3] eal: add function to return number of detected sockets
  2018-02-06  9:28  0%         ` Burakov, Anatoly
@ 2018-02-06  9:47  0%           ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-06  9:47 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: dev

06/02/2018 10:28, Burakov, Anatoly:
> On 05-Feb-18 10:45 PM, Thomas Monjalon wrote:
> > 05/02/2018 18:39, Burakov, Anatoly:
> >> On 05-Feb-18 4:37 PM, Anatoly Burakov wrote:
> >>> During lcore scan, find maximum socket ID and store it.
> >>>
> >>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> >>> ---
> >>>
> >>> Notes:
> >>>       v3:
> >>>       - Added ABI backwards compatibility
> >>>       
> >>>       v2:
> >>>       - checkpatch changes
> >>>       - check socket before deciding if the core is not to be used
> >>>
> >>>    lib/librte_eal/common/eal_common_lcore.c  | 37 +++++++++++++++++++++----------
> >>>    lib/librte_eal/common/include/rte_eal.h   | 25 +++++++++++++++++++++
> >>>    lib/librte_eal/common/include/rte_lcore.h |  8 +++++++
> >>>    lib/librte_eal/linuxapp/eal/eal.c         | 27 +++++++++++++++++++++-
> >>>    lib/librte_eal/rte_eal_version.map        |  9 +++++++-
> >>>    5 files changed, 92 insertions(+), 14 deletions(-)
> >>>
> >>
> >> This patch does not break ABI, but does it in a very ugly way. Is it
> >> worth it?
> > 
> > I think we agreed to not get this patch in 18.02.
> > Did you change your mind?
> > 
> 
> Sorry, how do I mark this patch as for 18.05? Is it a patch header?

So your answer is "yes, it is for 18.05" :)

Next time, you could add 18.05 near "PATCH v3", or say it in annotations.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 2/8] vhost: avoid enum fields in VhostUserMsg
  2018-02-05 12:16  3% ` [dpdk-dev] [PATCH 2/8] vhost: avoid enum fields in VhostUserMsg Stefan Hajnoczi
@ 2018-02-06  9:47  0%   ` Maxime Coquelin
  0 siblings, 0 replies; 200+ results
From: Maxime Coquelin @ 2018-02-06  9:47 UTC (permalink / raw)
  To: Stefan Hajnoczi, dev; +Cc: Yuanhan Liu



On 02/05/2018 01:16 PM, Stefan Hajnoczi wrote:
> The VhostUserMsg struct binary representation must match the vhost-user
> protocol specification since this struct is read from and written to the
> socket.
> 
> The VhostUserMsg.request union contains enum fields.  Enum binary
> representation is implementation-defined according to the C standard and
> it is unportable to make assumptions about the representation:
> 
>    6.7.2.2 Enumeration specifiers
>    ...
>    Each enumerated type shall be compatible with char, a signed integer
>    type, or an unsigned integer type. The choice of type is
>    implementation-defined, but shall be capable of representing the
>    values of all the members of the enumeration.
> 
> Additionally, librte_vhost relies on the enum type being unsigned when
> validating untrusted inputs:
> 
>    if (ret <= 0 || msg.request.master >= VHOST_USER_MAX) {
> 
> If msg.request.master is signed then negative values pass this check!
> 
> Even if we assume gcc on x86_64 (SysV amd64 ABI) and don't care about
> portability, the actual enum constants still affect the final type.  For
> example, if we add a negative constant then its type changes to signed
> int:
> 
>    typedef enum VhostUserRequest {
>        ...
>        VHOST_USER_INVALID = -1,
>    };
> 
> This is very fragile and it's unlikely that anyone changing the code
> would remember this.  A security hole can be introduced accidentally.
> 
> This patch switches VhostUserMsg.request fields to uint32_t to avoid the
> portability and potential security issues.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>   lib/librte_vhost/vhost_user.h | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
> index d4bd604b9..0fafbe6e0 100644
> --- a/lib/librte_vhost/vhost_user.h
> +++ b/lib/librte_vhost/vhost_user.h
> @@ -81,8 +81,8 @@ typedef struct VhostUserLog {
>   
>   typedef struct VhostUserMsg {
>   	union {
> -		VhostUserRequest master;
> -		VhostUserSlaveRequest slave;
> +		uint32_t master; /* a VhostUserRequest value */
> +		uint32_t slave;  /* a VhostUserSlaveRequest value*/
>   	} request;
>   
>   #define VHOST_USER_VERSION_MASK     0x3
> 

Maybe we could simplify to:

typedef struct VhostUserMsg {
  	uint32_t request; /* a VhostUserRequest or VhostUserSlaveRequest value */
...

Also, it seems QEMU's vhost-user master implementation uses an enum for
the request in its VhostUserMsg struct. Should it be fixed too?
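
For reference, the signedness problem described in the quoted commit
message can be reproduced in isolation. The sketch below is an
illustration only: SKETCH_REQUEST_MAX is a hypothetical stand-in for
VHOST_USER_MAX, and the two helpers are not librte_vhost functions.

```c
#include <stdint.h>

/* Hypothetical upper bound, standing in for VHOST_USER_MAX. */
#define SKETCH_REQUEST_MAX 32

/* Bounds check on a signed request field: negative values slip through. */
static int
request_rejected_signed(int32_t request)
{
	return request >= SKETCH_REQUEST_MAX;
}

/* Same check on a uint32_t field: -1 on the wire reads as 0xffffffff. */
static int
request_rejected_unsigned(uint32_t request)
{
	return request >= SKETCH_REQUEST_MAX;
}
```

With a signed field, -1 is below the bound and is accepted; with uint32_t
the same bit pattern compares as a very large value and is rejected by the
single upper-bound check.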

Thanks,
Maxime

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 17:06  0%                           ` Marcelo Ricardo Leitner
@ 2018-02-06 11:06  4%                             ` Adrien Mazarguil
  0 siblings, 0 replies; 200+ results
From: Adrien Mazarguil @ 2018-02-06 11:06 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Thomas Monjalon, Van Haaren, Harry, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 03:06:19PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 04:54:55PM +0100, Adrien Mazarguil wrote:
> > On Mon, Feb 05, 2018 at 01:29:42PM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> > > > On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > > > > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > > > > 05/02/2018 14:44, Adrien Mazarguil:
> ...
> > > > > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > > > > collision. Moreover the next time someone decides to update all version
> > > > > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > > > > 
> > > > > That's what I meant with stricter. Yes it would catch such
> > > > > situations, but you tell me on how much we want to protect/restrict
> > > > > here.  Do you see a reason for building only the dpdk/pmd side and not
> > > > > the glue library at a time?
> > > > 
> > > > No, they're always built together. We're only adding this versioning to
> > > > avoid issues when users somehow end up with several DPDK versions installed
> > > > on their system, or with leftovers of previous releases lying around. That's
> > > > all we need to solve here. dlopen()'ing the proper file takes care of that,
> > > > the symbol version number check afterward is performed just in case.
> > > 
> > > Interesting. These leftovers probably wouldn't be there if it wasn't
> > > versioned in the first place. :-)
> > 
> > Seriously, we can't assume users will do everything using neat packages and
> > may run an unfortunate "make install" from the DPDK source tree without
> > noticing they wrecked their system. Someone will have to mop the ensuing but
> > preventable bug reports.
> > 
> > > > > > > Given the added complexity, is there really a problem with simple version
> > > > > > > numbers we increment every time something gets modified? (Note this is
> > > > > > > already how our .map files work, they're not generated automatically)
> > > > > > 
> > > > > > Our map files show the major version where a symbol was introduced.
> > > > > > It is simple because no symbol can be introduced in a minor version.
> > > > > > 
> > > > > > > How about keeping things as is?
> > > > > 
> > > > > I don't really see the need of unique filenames. The next patch is
> > > > > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > > > > enough for this, no?
> > > > 
> > > > As you said, "if" versioned. As an undocumented empty string by default,
> > > > there's no way to be sure. Leaving the PMD version its internal but
> > > > (unfortunately) exposed bits will certainly prevent mistakes.
> > > > 
> > > > > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > > > > If you don't want to correlate the .so version number with DPDK version
> > > > > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > > > > 
> > > > > +1
> > > > 
> > > > Then are you fine with the "18.02.0" suffix?
> > > 
> > > Not really, sorry. It was more for the "1, 2, 3" sequence or tying it
> > > to dpdk version.
> > > 
> > > With the latest replies, I don't think the reasoning is enough to
> > > justify these extra checks, but I won't oppose to including it.
> > 
> > 18.02.0 makes it tied to the current release number, so I guess we agree.
> 
> It makes them equal, but not tied. If nobody patches it, when 18.02.1
> is out, the glue lib will still be 18.02.0.

Well this must be understood as "this plug-in implements 18.02.0's mlx4 glue
ABI", which remains true (and compatible) with subsequent DPDK releases as
long as the glue code is not updated.

Note this is no different from a single-digit suffix, which wouldn't be
updated either if the ABI isn't. Again, these initial digits are needed
because otherwise there is already a confusion with stable branches that
implement different ABIs and are therefore incompatible:

 librte_pmd_mlx4_glue.so.17.02.1
 librte_pmd_mlx4_glue.so.17.11.1
 librte_pmd_mlx4_glue.so.18.02.0

With a single digit, all of them would be named "librte_pmd_mlx4_glue.so.1",
rendering versioning basically useless.

> > The idea for now is this part remains tied to the DPDK release.
> > 
> > If a new ABI version is needed in a subsequent commit, the initial part gets
> > bumped to the current WIP DPDK release (say, 42.02.0). If subsequent
> > intermediate commits break the glue ABI, a fourth digit is added
> > (e.g. 42.02.0.1).
> 
> I'll defer this to other project developers. This is more about a
> project standard than anything here. I could even argue that this glue
> should be named after the pmd lib, such as
>    ./usr/local/lib/librte_pmd_mlx4_glue.so.1.1
> The fact of not providing the _glue.so symlink is enough to avoid
> others from linking against it. But it's more of a project standard
> than a technical decision, I guess, whether this lib is seen as a
> plugin or as a (private) library.

I think you nailed it, I call it a "plug-in" because dlopen() is manually
performed on it, however it's in fact a private library whose API is not
exposed and no application is supposed to use directly.

For this reason, while up to package maintainers, my suggestion is to not
install it in a public location like "/usr/local/lib" but configure
RTE_EAL_PMD_PATH to some DPDK-specific path, e.g. "/usr/share/dpdk/pmd",
which is possible since patch 4/4 of this series.
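
To illustrate how the versioned name and a DPDK-specific directory fit
together, here is a minimal sketch that only builds the file name. It is
not the code from this series: glue_path() is a hypothetical helper, and
the directory is simply the example path mentioned above.

```c
#include <stdio.h>

/*
 * Build "<dir>/<base>.<version>" into buf.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int
glue_path(char *buf, size_t len, const char *dir,
	  const char *base, const char *version)
{
	int ret = snprintf(buf, len, "%s/%s.%s", dir, base, version);

	return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

The resulting string is what would be handed to dlopen(). Since the ABI
version is part of the file name, a leftover plug-in from another release
has a different name and is never picked up by mistake.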

> Considering the versioning used for the PMD libs, such easy versioning
> is my preferred choice, FWIW.

The problem remains that the DPDK project manages its own backports/stable
release system instead of relying on package maintainers for that, so
properly versioning things from the beginning to avoid collisions is really
always a concern. Had backports not been a requirement in the first place,
I agree a single digit would have been enough.

My suggestion of using 18.02.0 (instead of 18.02.1) stands. It addresses
Thomas' concern by properly matching the DPDK release the ABI was last
updated for and mine for the backports issues mentioned above. Let's go with
that and move on.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration
  2018-02-02 16:46  3% ` [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
  2018-02-02 16:46  3%   ` [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
  2018-02-02 16:52  0%   ` [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Nélio Laranjeiro
@ 2018-02-06 11:31  0%   ` Shahaf Shuler
  2 siblings, 0 replies; 200+ results
From: Shahaf Shuler @ 2018-02-06 11:31 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Nélio Laranjeiro, dev, Marcelo Ricardo Leitner

Friday, February 2, 2018 6:46 PM, Adrien Mazarguil:
> The decision to deliver mlx4/mlx5 rdma-core glue plug-ins separately instead
> of generating them at run time due to security concerns [1] led to a few
> issues:
> 
> - They must be present on the file system before running DPDK.
> - Their location must be known to the dynamic linker.
> - Their names overlap and ABI compatibility is not guaranteed, which may
>   lead to crashes.
> 
> This series addresses the above by adding version information to plug-ins
> and taking CONFIG_RTE_EAL_PMD_PATH into account to locate them on the
> file system.

Series applied to next-net-mlx, with the following diff in patch 3/4:
diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index cc9db9977..cc800493b 100644                                 
--- a/drivers/net/mlx4/Makefile                                   
+++ b/drivers/net/mlx4/Makefile                                   
@@ -35,7 +35,7 @@ include $(RTE_SDK)/mk/rte.vars.mk               
 LIB = librte_pmd_mlx4.a                                          
 LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)                  
 LIB_GLUE_BASE = librte_pmd_mlx4_glue.so                          
-LIB_GLUE_VERSION = 18.02.1                                       
+LIB_GLUE_VERSION = 18.02.0                                       
                                                                  
 # Sources.                                                       
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4.c                     
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 4086f2039..3bc9736c9 100644                                 
--- a/drivers/net/mlx5/Makefile                                   
+++ b/drivers/net/mlx5/Makefile                                   
@@ -35,7 +35,7 @@ include $(RTE_SDK)/mk/rte.vars.mk               
 LIB = librte_pmd_mlx5.a                                          
 LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)                  
 LIB_GLUE_BASE = librte_pmd_mlx5_glue.so                          
-LIB_GLUE_VERSION = 18.02.1                                       
+LIB_GLUE_VERSION = 18.02.0             

Thanks. 


> 
> [1]
> http://dpdk.org/ml/archives/dev/2018-January/089617.html
> 
> v2 changes:
> 
> - Fixed extra "\n" in glue file name generation (although it didn't break
>   functionality).
> 
> Adrien Mazarguil (4):
>   net/mlx: add debug checks to glue structure
>   net/mlx: fix missing includes for rdma-core glue
>   net/mlx: version rdma-core glue libraries
>   net/mlx: make rdma-core glue path configurable
> 
>  doc/guides/nics/mlx4.rst     | 17 ++++++++++++
>  doc/guides/nics/mlx5.rst     | 14 ++++++++++
>  drivers/net/mlx4/Makefile    |  8 ++++--
>  drivers/net/mlx4/mlx4.c      | 57 ++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx4/mlx4_glue.c |  4 +++
>  drivers/net/mlx4/mlx4_glue.h |  9 +++++++
>  drivers/net/mlx5/Makefile    |  8 ++++--
>  drivers/net/mlx5/mlx5.c      | 57 ++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx5/mlx5_glue.c |  1 +
>  drivers/net/mlx5/mlx5_glue.h |  7 +++++
>  10 files changed, 176 insertions(+), 6 deletions(-)
> 
> --
> 2.11.0

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v1] doc: update deprecation notice of rte_devargs
@ 2018-02-07  9:26 11% Gaetan Rivet
  0 siblings, 0 replies; 200+ results
From: Gaetan Rivet @ 2018-02-07  9:26 UTC (permalink / raw)
  To: dev; +Cc: Gaetan Rivet

The declaration and identification of devices will change in v18.05.

Remove the previous deprecation notice.

Add a new one reflecting the planned changes more accurately,
updated for v18.05.

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
---
 doc/guides/rel_notes/deprecation.rst | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad5988..07312f59a 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -8,18 +8,23 @@ API and ABI deprecation notices are to be posted here.
 Deprecation Notices
 -------------------
 
-* eal: several API and ABI changes are planned for ``rte_devargs`` in v18.02.
-  The format of device command line parameters will change. The bus will need
-  to be explicitly stated in the device declaration. The enum ``rte_devtype``
-  was used to identify a bus and will disappear.
-  The structure ``rte_devargs`` will change.
-  The ``rte_devargs_list`` will be made private.
-  The following functions are deprecated starting from 17.08 and will either be
-  modified or removed in 18.02:
+* eal: both declaring and identifying devices will be streamlined in v18.05.
+  New functions will appear to query a specific port from buses, classes of
+  device and device drivers. Device declaration will be made coherent with the
+  new scheme of device identification.
+  As such, ``rte_devargs`` device representation will change.
 
-  - ``rte_eal_devargs_add``
-  - ``rte_eal_devargs_type_count``
-  - ``rte_eal_parse_devargs_str``, replaced by ``rte_eal_devargs_parse``
+  - removal of ``name`` and ``args`` fields.
+  - The enum ``rte_devtype`` was used to identify a bus and will disappear.
+  - The ``rte_devargs_list`` will be made private.
+  - Functions previously deprecated will change or disappear:
+
+    + ``rte_eal_devargs_add``
+    + ``rte_eal_devargs_type_count``
+    + ``rte_eal_parse_devargs_str``, replaced by ``rte_eal_devargs_parse``
+    + ``rte_eal_devargs_parse`` will change its format and use.
+    + all ``rte_devargs`` related functions will be renamed, changing the
+      ``rte_eal_devargs_`` prefix to ``rte_devargs_``.
 
 * pci: Several exposed functions are misnamed.
   The following functions are deprecated starting from v17.11 and are replaced:
-- 
2.11.0

^ permalink raw reply	[relevance 11%]

* [dpdk-dev] [PATCH 18.05 v4] eal: add function to return number of detected sockets
  2018-02-05 16:37  8%   ` [dpdk-dev] [PATCH v3] eal: add function to return number of detected sockets Anatoly Burakov
  2018-02-05 17:39  3%     ` Burakov, Anatoly
  2018-02-07  9:58  5%     ` [dpdk-dev] [PATCH 18.05 v4] Add " Anatoly Burakov
@ 2018-02-07  9:58  5%     ` Anatoly Burakov
  2018-03-08 12:12  3%       ` Bruce Richardson
  2 siblings, 1 reply; 200+ results
From: Anatoly Burakov @ 2018-02-07  9:58 UTC (permalink / raw)
  To: dev; +Cc: Bruce Richardson

During lcore scan, find maximum socket ID and store it. This will
break the ABI, so bump ABI version.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Remove backwards ABI compatibility, bump ABI instead
    
    v3:
    - Added ABI compatibility
    
    v2:
    - checkpatch changes
    - check socket before deciding if the core is not to be used
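
    For illustration, the counting approach ("highest socket ID seen,
    plus one") can be sketched on its own. count_sockets() is a
    hypothetical helper working on a plain array, not the EAL code in
    the diff below.

```c
#include <stddef.h>

/*
 * Return the number of sockets implied by a list of per-lcore socket IDs,
 * computed as the highest ID seen plus one.
 */
static unsigned int
count_sockets(const unsigned int *socket_ids, size_t n)
{
	unsigned int max_id = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (socket_ids[i] > max_id)
			max_id = socket_ids[i];
	}
	return max_id + 1;
}
```

    Note that this arithmetic yields an upper bound on socket IDs rather
    than a count of distinct sockets: if only IDs 0 and 3 are seen, the
    result is 4.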

 lib/librte_eal/bsdapp/eal/Makefile        |  2 +-
 lib/librte_eal/common/eal_common_lcore.c  | 37 +++++++++++++++++++++----------
 lib/librte_eal/common/include/rte_eal.h   |  1 +
 lib/librte_eal/common/include/rte_lcore.h |  8 +++++++
 lib/librte_eal/linuxapp/eal/Makefile      |  2 +-
 lib/librte_eal/rte_eal_version.map        |  9 +++++++-
 6 files changed, 44 insertions(+), 15 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index dd455e6..ed1d17b 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -21,7 +21,7 @@ LDLIBS += -lgcc_s
 
 EXPORT_MAP := ../../rte_eal_version.map
 
-LIBABIVER := 6
+LIBABIVER := 7
 
 # specific to bsdapp exec-env
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) := eal.c
diff --git a/lib/librte_eal/common/eal_common_lcore.c b/lib/librte_eal/common/eal_common_lcore.c
index 7724fa4..827ddeb 100644
--- a/lib/librte_eal/common/eal_common_lcore.c
+++ b/lib/librte_eal/common/eal_common_lcore.c
@@ -28,6 +28,7 @@ rte_eal_cpu_init(void)
 	struct rte_config *config = rte_eal_get_configuration();
 	unsigned lcore_id;
 	unsigned count = 0;
+	unsigned int socket_id, max_socket_id = 0;
 
 	/*
 	 * Parse the maximum set of logical cores, detect the subset of running
@@ -39,6 +40,19 @@ rte_eal_cpu_init(void)
 		/* init cpuset for per lcore config */
 		CPU_ZERO(&lcore_config[lcore_id].cpuset);
 
+		/* find socket first */
+		socket_id = eal_cpu_socket_id(lcore_id);
+		if (socket_id >= RTE_MAX_NUMA_NODES) {
+#ifdef RTE_EAL_ALLOW_INV_SOCKET_ID
+			socket_id = 0;
+#else
+			RTE_LOG(ERR, EAL, "Socket ID (%u) is greater than RTE_MAX_NUMA_NODES (%d)\n",
+					socket_id, RTE_MAX_NUMA_NODES);
+			return -1;
+#endif
+		}
+		max_socket_id = RTE_MAX(max_socket_id, socket_id);
+
 		/* in 1:1 mapping, record related cpu detected state */
 		lcore_config[lcore_id].detected = eal_cpu_detected(lcore_id);
 		if (lcore_config[lcore_id].detected == 0) {
@@ -54,18 +68,7 @@ rte_eal_cpu_init(void)
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_role = ROLE_RTE;
 		lcore_config[lcore_id].core_id = eal_cpu_core_id(lcore_id);
-		lcore_config[lcore_id].socket_id = eal_cpu_socket_id(lcore_id);
-		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES) {
-#ifdef RTE_EAL_ALLOW_INV_SOCKET_ID
-			lcore_config[lcore_id].socket_id = 0;
-#else
-			RTE_LOG(ERR, EAL, "Socket ID (%u) is greater than "
-				"RTE_MAX_NUMA_NODES (%d)\n",
-				lcore_config[lcore_id].socket_id,
-				RTE_MAX_NUMA_NODES);
-			return -1;
-#endif
-		}
+		lcore_config[lcore_id].socket_id = socket_id;
 		RTE_LOG(DEBUG, EAL, "Detected lcore %u as "
 				"core %u on socket %u\n",
 				lcore_id, lcore_config[lcore_id].core_id,
@@ -79,5 +82,15 @@ rte_eal_cpu_init(void)
 		RTE_MAX_LCORE);
 	RTE_LOG(INFO, EAL, "Detected %u lcore(s)\n", config->lcore_count);
 
+	config->numa_node_count = max_socket_id + 1;
+	RTE_LOG(INFO, EAL, "Detected %u NUMA nodes\n", config->numa_node_count);
+
 	return 0;
 }
+
+unsigned int
+rte_num_sockets(void)
+{
+	const struct rte_config *config = rte_eal_get_configuration();
+	return config->numa_node_count;
+}
diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 08c6637..63fcc2e 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -57,6 +57,7 @@ enum rte_proc_type_t {
 struct rte_config {
 	uint32_t master_lcore;       /**< Id of the master lcore */
 	uint32_t lcore_count;        /**< Number of available logical cores. */
+	uint32_t numa_node_count;    /**< Number of detected NUMA nodes. */
 	uint32_t service_lcore_count;/**< Number of available service cores. */
 	enum rte_lcore_role_t lcore_role[RTE_MAX_LCORE]; /**< State of cores. */
 
diff --git a/lib/librte_eal/common/include/rte_lcore.h b/lib/librte_eal/common/include/rte_lcore.h
index d84bcff..ddf4c64 100644
--- a/lib/librte_eal/common/include/rte_lcore.h
+++ b/lib/librte_eal/common/include/rte_lcore.h
@@ -120,6 +120,14 @@ rte_lcore_index(int lcore_id)
 unsigned rte_socket_id(void);
 
 /**
+ * Return number of physical sockets on the system.
+ * @return
+ *   the number of physical sockets as recognized by EAL
+ *
+ */
+unsigned int rte_num_sockets(void);
+
+/**
  * Get the ID of the physical socket of the specified lcore
  *
  * @param lcore_id
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 7e5bbe8..b9c7727 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -10,7 +10,7 @@ ARCH_DIR ?= $(RTE_ARCH)
 EXPORT_MAP := ../../rte_eal_version.map
 VPATH += $(RTE_SDK)/lib/librte_eal/common/arch/$(ARCH_DIR)
 
-LIBABIVER := 6
+LIBABIVER := 7
 
 VPATH += $(RTE_SDK)/lib/librte_eal/common
 
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 4146907..fc83e74 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -211,6 +211,13 @@ DPDK_18.02 {
 
 }  DPDK_17.11;
 
+DPDK_18.05 {
+	global:
+
+	rte_num_sockets;
+
+} DPDK_18.02;
+
 EXPERIMENTAL {
 	global:
 
@@ -255,4 +262,4 @@ EXPERIMENTAL {
 	rte_service_set_stats_enable;
 	rte_service_start_with_defaults;
 
-} DPDK_18.02;
+} DPDK_18.05;
-- 
2.7.4

^ permalink raw reply	[relevance 5%]

* [dpdk-dev] [PATCH 18.05 v4] Add function to return number of detected sockets
  2018-02-05 16:37  8%   ` [dpdk-dev] [PATCH v3] eal: add function to return number of detected sockets Anatoly Burakov
  2018-02-05 17:39  3%     ` Burakov, Anatoly
@ 2018-02-07  9:58  5%     ` Anatoly Burakov
  2018-02-07  9:58  5%     ` [dpdk-dev] [PATCH 18.05 v4] eal: add " Anatoly Burakov
  2 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2018-02-07  9:58 UTC (permalink / raw)
  To: dev

This patch is for 18.05 and implements the changes referenced
in the deprecation notice [1] (not yet merged as of this
writing).

This patchset breaks the EAL ABI and bumps its version. This
is arguably OK, as the planned memory changes will change a lot
in EAL and thus will likely break the ABI anyway. However, two
alternative implementations are possible:

1) a local static variable recording the number of detected
   sockets. This is arguably a less clean approach, but it will
   not break the ABI and will have relatively little impact on
   the codebase.

2) keeping ABI compatibility, as shown in v3 of this patch [2].

My preference would be to keep this one.
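
To illustrate why adding the member breaks the ABI, here is a minimal
sketch. The two structs are simplified, hypothetical copies of the
first fields of struct rte_config before and after the change (the
names config_v6 and config_v7 just echo the LIBABIVER values and do
not exist in DPDK). A binary built against the old layout would read
service_lcore_count at the old offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified "before" layout: leading members of struct rte_config. */
struct config_v6 {
	uint32_t master_lcore;
	uint32_t lcore_count;
	uint32_t service_lcore_count;
};

/* Same members with numa_node_count inserted where the patch puts it. */
struct config_v7 {
	uint32_t master_lcore;
	uint32_t lcore_count;
	uint32_t numa_node_count;
	uint32_t service_lcore_count;
};

/* Return by how many bytes service_lcore_count moved between the layouts. */
static size_t
service_lcore_count_shift(void)
{
	size_t before = offsetof(struct config_v6, service_lcore_count);
	size_t after = offsetof(struct config_v7, service_lcore_count);

	assert(after >= before);
	return after - before;
}
```

Every member after the insertion point shifts the same way, which is
why the library version has to be bumped.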

[1] http://dpdk.org/dev/patchwork/patch/33853/
[2] http://dpdk.org/dev/patchwork/patch/34994/

Anatoly Burakov (1):
  eal: add function to return number of detected sockets

 lib/librte_eal/bsdapp/eal/Makefile        |  2 +-
 lib/librte_eal/common/eal_common_lcore.c  | 37 +++++++++++++++++++++----------
 lib/librte_eal/common/include/rte_eal.h   |  1 +
 lib/librte_eal/common/include/rte_lcore.h |  8 +++++++
 lib/librte_eal/linuxapp/eal/Makefile      |  2 +-
 lib/librte_eal/rte_eal_version.map        |  9 +++++++-
 6 files changed, 44 insertions(+), 15 deletions(-)

-- 
2.7.4

^ permalink raw reply	[relevance 5%]

* Re: [dpdk-dev] [PATCH] doc: add ABI change notice for numa_node_count in eal
  2018-01-23 10:39  4%   ` Mcnamara, John
@ 2018-02-07 10:10  4%     ` Jerin Jacob
  2018-02-09 14:42  4%       ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2018-02-07 10:10 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: Burakov, Anatoly, dev, Neil Horman, Kovacevic, Marko

-----Original Message-----
> Date: Tue, 23 Jan 2018 10:39:58 +0000
> From: "Mcnamara, John" <john.mcnamara@intel.com>
> To: "Burakov, Anatoly" <anatoly.burakov@intel.com>, "dev@dpdk.org"
>  <dev@dpdk.org>
> CC: Neil Horman <nhorman@tuxdriver.com>, "Kovacevic, Marko"
>  <marko.kovacevic@intel.com>
> Subject: Re: [dpdk-dev] [PATCH] doc: add ABI change notice for
>  numa_node_count in eal
> 
> 
> 
> > -----Original Message-----
> > From: Burakov, Anatoly
> > Sent: Tuesday, January 16, 2018 5:54 PM
> > To: dev@dpdk.org
> > Cc: Neil Horman <nhorman@tuxdriver.com>; Mcnamara, John
> > <john.mcnamara@intel.com>; Kovacevic, Marko <marko.kovacevic@intel.com>
> > Subject: [PATCH] doc: add ABI change notice for numa_node_count in eal
> > 
> > There will be a new function added in v18.05 that will return number of
> > detected sockets, which will change the ABI.
> > 
> > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index 13e8543..9662150 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -8,6 +8,8 @@ API and ABI deprecation notices are to be posted here.
> >  Deprecation Notices
> >  -------------------
> > 
> > +* eal: new ``numa_node_count`` member will be added to ``rte_config``
> > +structure in v18.05.
> >  * eal: several API and ABI changes are planned for ``rte_devargs`` in  v18.02.
> 
> In general it is best to leave a blank line between the bullet points. But that
> doesn't affect the rendering so:
> 
> Acked-by: John McNamara <john.mcnamara@intel.com>

Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>


> 
> 

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes
  2018-02-05 11:47  0%   ` Bruce Richardson
@ 2018-02-07 10:11  0%     ` Jerin Jacob
  2018-02-14 14:48  0%       ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2018-02-07 10:11 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Anatoly Burakov, dev, Neil Horman, John McNamara, Marko Kovacevic

-----Original Message-----
> Date: Mon, 5 Feb 2018 11:47:42 +0000
> From: Bruce Richardson <bruce.richardson@intel.com>
> To: Anatoly Burakov <anatoly.burakov@intel.com>
> CC: dev@dpdk.org, Neil Horman <nhorman@tuxdriver.com>, John McNamara
>  <john.mcnamara@intel.com>, Marko Kovacevic <marko.kovacevic@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory
>  hotplug changes
> User-Agent: Mutt/1.9.1 (2017-09-22)
> 
> On Thu, Jan 18, 2018 at 10:32:28AM +0000, Anatoly Burakov wrote:
> > Due to coming changes outlined in memory hotplug RFC, there will
> > be several API/ABI changes.
> > 
> > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > ---
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: add ABI change notice for numa_node_count in eal
  2018-02-07 10:10  4%     ` Jerin Jacob
@ 2018-02-09 14:42  4%       ` Bruce Richardson
  2018-02-14  0:04  4%         ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2018-02-09 14:42 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Mcnamara, John, Burakov, Anatoly, dev, Neil Horman, Kovacevic, Marko

On Wed, Feb 07, 2018 at 03:40:20PM +0530, Jerin Jacob wrote:
> -----Original Message-----
> > Date: Tue, 23 Jan 2018 10:39:58 +0000
> > From: "Mcnamara, John" <john.mcnamara@intel.com>
> > To: "Burakov, Anatoly" <anatoly.burakov@intel.com>, "dev@dpdk.org"
> >  <dev@dpdk.org>
> > CC: Neil Horman <nhorman@tuxdriver.com>, "Kovacevic, Marko"
> >  <marko.kovacevic@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH] doc: add ABI change notice for
> >  numa_node_count in eal
> > 
> > 
> > 
> > > -----Original Message-----
> > > From: Burakov, Anatoly
> > > Sent: Tuesday, January 16, 2018 5:54 PM
> > > To: dev@dpdk.org
> > > Cc: Neil Horman <nhorman@tuxdriver.com>; Mcnamara, John
> > > <john.mcnamara@intel.com>; Kovacevic, Marko <marko.kovacevic@intel.com>
> > > Subject: [PATCH] doc: add ABI change notice for numa_node_count in eal
> > > 
> > > There will be a new function added in v18.05 that will return number of
> > > detected sockets, which will change the ABI.
> > > 
> > > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > > ---
> > >  doc/guides/rel_notes/deprecation.rst | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/doc/guides/rel_notes/deprecation.rst
> > > b/doc/guides/rel_notes/deprecation.rst
> > > index 13e8543..9662150 100644
> > > --- a/doc/guides/rel_notes/deprecation.rst
> > > +++ b/doc/guides/rel_notes/deprecation.rst
> > > @@ -8,6 +8,8 @@ API and ABI deprecation notices are to be posted here.
> > >  Deprecation Notices
> > >  -------------------
> > > 
> > > +* eal: new ``numa_node_count`` member will be added to ``rte_config``
> > > +structure in v18.05.
> > >  * eal: several API and ABI changes are planned for ``rte_devargs`` in  v18.02.
> > 
> > In general it is best to leave a blank line between the bullet points. But that
> > doesn't affect the rendering so:
> > 
> > Acked-by: John McNamara <john.mcnamara@intel.com>
> 
> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v5] checkpatches.sh: Add checks for ABI symbol addition
                     ` (2 preceding siblings ...)
  2018-02-05 17:29  6% ` [dpdk-dev] [PATCH v4] " Neil Horman
@ 2018-02-09 15:21  6% ` Neil Horman
  2018-02-13 22:57  4%   ` Thomas Monjalon
  2018-02-14 19:19  6% ` [dpdk-dev] [PATCH v6] " Neil Horman
  4 siblings, 1 reply; 200+ results
From: Neil Horman @ 2018-02-09 15:21 UTC (permalink / raw)
  To: dev
  Cc: Neil Horman, thomas, john.mcnamara, bruce.richardson,
	Ferruh Yigit, Stephen Hemminger

Recently, some additional patches were added to allow for programmatic
marking of C symbols as experimental.  The addition of these markers is
dependent on the manual addition of exported symbols to the EXPERIMENTAL
section of the corresponding library's version map file.  The consensus
on review is that, in addition to mandating the addition of symbols to
the EXPERIMENTAL version in the map, we need a mechanism to enforce our
documented process of mandating that addition when they are introduced.
To that end, I am proposing this change.  It is an addition to the
checkpatches script, which scans incoming patches for additions and
removals of symbols in the map file, and warns the user appropriately.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: thomas@monjalon.net
CC: john.mcnamara@intel.com
CC: bruce.richardson@intel.com
CC: Ferruh Yigit <ferruh.yigit@intel.com>
CC: Stephen Hemminger <stephen@networkplumber.org>

---
Change notes

v2)
 * Cleaned up and documented awk script (shemminger)
 * fixed sort/uniq usage (shemminger)
 * moved checking to new script (tmonjalon)
 * added maintainer entry (tmonjalon)
 * added license (tmonjalon)

v3)
 * Changed symbol check script name (tmonjalon)
 * Trapped exit to clean temp file (tmonjalon)
 * Honored verbose command (tmonjalon)
 * Cleaned left over debug bits (tmonjalon)
 * Updated location in MAINTAINERS file (tmonjalon)

v4)
 * Updated maintainers file (tmonjalon)

v5)
 * undo V4 (tmonjalon)
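
As a reading aid, the rule that the script below applies can be
sketched in C. This is not part of the patch: the struct and function
names are invented for illustration, and the records mirror the
"map symbol section add|del" lines that the awk stage writes. It
follows the script's intent as described in its comments: an added
symbol must land in EXPERIMENTAL unless the same patch also removes it
from a non-EXPERIMENTAL section of the same map file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One record of the intermediate map database (hypothetical type). */
struct map_change {
	const char *map;    /* map file name as seen in the patch */
	const char *sym;    /* symbol name */
	const char *sec;    /* version section, or "unknown" */
	const char *action; /* "add" or "del" */
};

/* Nonzero if the patch also deletes this symbol from a non-EXPERIMENTAL section. */
static int
moved_from_other_section(const struct map_change *db, size_t n,
			 const struct map_change *add)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(db[i].action, "del") == 0 &&
		    strcmp(db[i].map, add->map) == 0 &&
		    strcmp(db[i].sym, add->sym) == 0 &&
		    strcmp(db[i].sec, "EXPERIMENTAL") != 0)
			return 1;
	}
	return 0;
}

/* Count additions that would be reported as errors. */
static int
count_violations(const struct map_change *db, size_t n)
{
	int violations = 0;
	size_t i;

	assert(db != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (strcmp(db[i].action, "add") != 0)
			continue; /* removals only produce an INFO message */
		if (strcmp(db[i].sec, "unknown") == 0)
			continue; /* not enough context: INFO only */
		if (strcmp(db[i].sec, "EXPERIMENTAL") == 0)
			continue; /* the expected case */
		if (!moved_from_other_section(db, n, &db[i]))
			violations++;
	}
	return violations;
}
```

A symbol added straight to a versioned section such as DPDK_18.05,
with no matching removal elsewhere, counts as one violation here.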
---
 MAINTAINERS                     |   1 +
 devtools/check-symbol-change.sh | 146 ++++++++++++++++++++++++++++++++++++++++
 devtools/checkpatches.sh        |  23 ++++++-
 3 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100755 devtools/check-symbol-change.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index acd056134..d9d2abff8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -86,6 +86,7 @@ M: Neil Horman <nhorman@tuxdriver.com>
 F: lib/librte_compat/
 F: doc/guides/rel_notes/deprecation.rst
 F: devtools/validate-abi.sh
+F: devtools/check-symbol-change.sh
 F: buildtools/check-experimental-syms.sh
 
 Driver information
diff --git a/devtools/check-symbol-change.sh b/devtools/check-symbol-change.sh
new file mode 100755
index 000000000..22b17e6f2
--- /dev/null
+++ b/devtools/check-symbol-change.sh
@@ -0,0 +1,146 @@
+#!/bin/sh
+
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Neil Horman <nhorman@tuxdriver.com>
+
+build_map_changes()
+{
+	local fname=$1
+	local mapdb=$2
+
+	cat $fname | filterdiff -i "*.map" | awk '
+		# Initialize our variables
+		BEGIN {map="";sym="";ar="";sec=""; in_sec=0}
+
+		# Anything that starts with + or -, followed by an a
+		# and ends in the string .map is the name of our map file
+		# This may appear multiple times in a patch if multiple
+		# map files are altered, and all section/symbol names
+		# appearing between a triggering of this rule and the
+		# next trigger of this rule are associated with this file
+		/[-+] a\/.*\.map/ {map=$2}
+
+		# Triggering this rule, which starts a line with a + and ends it
+		# with a { identifies a versioned section.  The section name is
+		# the rest of the line with the + and { symbols removed.
+		# Triggering this rule sets in_sec to 1, which activates the
+		# symbol rule below
+		/+.*{/ {gsub("+","");sec=$1; in_sec=1}
+
+		# This rule identifies the end of a section, and disables the
+		# symbol rule
+		/.*}/ {in_sec=0}
+
+		# This rule matches on a + followed by any characters except a :
+		# (which denotes a global vs local segment), and ends with a ;.
+		# The semicolon is removed and the symbol is printed with its
+		# association file name and version section, along with an
+		# indicator that the symbol is a new addition.  Note this rule
+		# only works if we have found a version section in the rule
+		# above (hence the in_sec check).  Otherwise we flag it as an
+		# unknown section
+		/^+[^}].*[^:*];/ {gsub(";","");sym=$2;
+			if (in_sec == 1) {
+				print map " " sym " " sec " add"
+			} else {
+				print map " " sym " unknown add"
+			}
+		}
+
+		# This is the same rule as above, but the rule matches on a
+		# leading - rather than a +, denoting that the symbol is being
+		# removed.
+		/^-[^}].*[^:*];/ {gsub(";","");sym=$2;
+			if (in_sec == 1) {
+				print map " " sym " " sec " del"
+			} else {
+				print map " " sym " unknown del"
+			}
+		}' > ./$mapdb
+
+		sort -u $mapdb > ./$mapdb.2
+		mv -f $mapdb.2 $mapdb
+
+}
+
+check_for_rule_violations()
+{
+	local mapdb=$1
+	local mname
+	local symname
+	local secname
+	local ar
+	local ret=0
+
+	while read mname symname secname ar
+	do
+		if [ "$ar" = "add" ]
+		then
+
+			if [ "$secname" = "unknown" ]
+			then
+				# Just inform the user of this occurrence, but
+				# don't flag it as an error
+				echo -n "INFO: symbol $symname is added but "
+				echo -n "patch has insufficient context "
+				echo -n "to determine the section name "
+				echo -n "please ensure the version is "
+				echo "EXPERIMENTAL"
+				continue
+			fi
+
+			if [ "$secname" != "EXPERIMENTAL" ]
+			then
+				# Symbols that are getting added in a section
+				# other than the experimental section need
+				# to be moving from an already supported
+				# section or it's a violation
+				grep "^$mname $symname .* del\$" $mapdb | \
+				grep -qv " EXPERIMENTAL del\$"
+				if [ $? -ne 0 ]
+				then
+					echo -n "ERROR: symbol $symname "
+					echo -n "is added in a section "
+					echo -n "other than the EXPERIMENTAL "
+					echo "section of the version map"
+					ret=1
+				fi
+			fi
+		else
+
+			if [ "$secname" != "EXPERIMENTAL" ]
+			then
+				# Just inform users that non-experimental
+				# symbols need to go through a deprecation
+				# process
+				echo -n "INFO: symbol $symname is being "
+				echo -n "removed, ensure that it has "
+				echo "gone through the deprecation process"
+			fi
+		fi
+	done < $mapdb
+
+	return $ret
+}
+
+trap clean_and_exit_on_sig EXIT
+
+mapfile=`mktemp mapdb.XXXXXX`
+patch=$1
+exit_code=1
+
+clean_and_exit_on_sig()
+{
+	rm -f $mapfile
+	exit $exit_code
+}
+
+build_map_changes $patch $mapfile
+check_for_rule_violations $mapfile
+exit_code=$?
+
+rm -f $mapfile
+
+exit $exit_code
+
+
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 7676a6b50..0b2b5f039 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -35,6 +35,8 @@
 # - DPDK_CHECKPATCH_LINE_LENGTH
 . $(dirname $(readlink -e $0))/load-devel-config
 
+VALIDATE_NEW_API=$(dirname $(readlink -e $0))/check-symbol-change.sh
+
 length=${DPDK_CHECKPATCH_LINE_LENGTH:-80}
 
 # override default Linux options
@@ -61,6 +63,7 @@ print_usage () {
 	END_OF_HELP
 }
 
+
 number=0
 quiet=false
 verbose=false
@@ -86,6 +89,7 @@ total=0
 status=0
 
 check () { # <patch> <commit> <title>
+	local reta
 	total=$(($total + 1))
 	! $verbose || printf '\n### %s\n\n' "$3"
 	if [ -n "$1" ] ; then
@@ -96,9 +100,26 @@ check () { # <patch> <commit> <title>
 	else
 		report=$($DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
 	fi
-	[ $? -ne 0 ] || return 0
+	reta=$?
+
 	$verbose || printf '\n### %s\n\n' "$3"
 	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
+
+	! $verbose || echo
+	! $verbose || echo "Checking API additions/removals:"
+
+	if [ -n "$1" ] ; then
+		report=$($VALIDATE_NEW_API $1)
+	elif [ -n "$2" ] ; then
+		report=$(git format-patch \
+			 --find-renames --no-stat --stdout -1 $commit |
+			$VALIDATE_NEW_API -)
+	else
+		report=$($VALIDATE_NEW_API -)
+	fi
+	[ $? -ne 0 -o $reta -ne 0 ] || return 0
+	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
+
 	status=$(($status + 1))
 }
 
-- 
2.14.3

^ permalink raw reply	[relevance 6%]

* Re: [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes
  2018-01-18 10:32 13% ` [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes Anatoly Burakov
  2018-01-23 10:36  0%   ` Mcnamara, John
  2018-02-05 11:47  0%   ` Bruce Richardson
@ 2018-02-12 15:58  0%   ` Jonas Pfefferle
  2018-02-13  0:24  0%   ` Yongseok Koh
  3 siblings, 0 replies; 200+ results
From: Jonas Pfefferle @ 2018-02-12 15:58 UTC (permalink / raw)
  To: Anatoly Burakov, dev; +Cc: Neil Horman, John McNamara, Marko Kovacevic

On Thu, 18 Jan 2018 10:32:28 +0000
  Anatoly Burakov <anatoly.burakov@intel.com> wrote:
> Due to coming changes outlined in memory hotplug RFC, there will
> be several API/ABI changes.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> 
> Notes:
>    Patch outlining future changes:
>    http://dpdk.org/dev/patchwork/patch/32467/
>    
>    v2: added rte_mem_config and rte_memzone changes to the 
>announcement
> 
> doc/guides/rel_notes/deprecation.rst | 9 +++++++++
> 1 file changed, 9 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst 
>b/doc/guides/rel_notes/deprecation.rst
> index 13e8543..93cbeea 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -8,6 +8,15 @@ API and ABI deprecation notices are to be posted 
>here.
> Deprecation Notices
> -------------------
> 
> +* eal: due to internal data layout reorganization, there will be 
>changes to
> +  several structures and functions as a result of coming changes to 
>support
> +  memory hotplug in v18.05.
> +  ``rte_eal_get_physmem_layout`` will be deprecated and removed in 
>subsequent
> +  releases.
> +  ``rte_mem_config`` contents will change due to switch to memseg 
>lists.
> +  ``rte_memzone`` member ``memseg_id`` will no longer serve any 
>useful purpose
> +  and will be removed.
> +
> * eal: several API and ABI changes are planned for ``rte_devargs`` 
>in v18.02.
>   The format of device command line parameters will change. The bus 
>will need
>   to be explicitly stated in the device declaration. The enum 
>``rte_devtype``
> -- 
> 2.7.4

Acked-by: Jonas Pfefferle <pepperjo@japf.ch>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: add ABI change notice for numa_node_count in eal
    2018-01-23 10:39  4%   ` Mcnamara, John
@ 2018-02-12 16:00  4%   ` Jonas Pfefferle
  1 sibling, 0 replies; 200+ results
From: Jonas Pfefferle @ 2018-02-12 16:00 UTC (permalink / raw)
  To: Anatoly Burakov, dev; +Cc: Neil Horman, John McNamara, Marko Kovacevic

On Tue, 16 Jan 2018 17:53:40 +0000
  Anatoly Burakov <anatoly.burakov@intel.com> wrote:
> There will be a new function added in v18.05 that will return
> number of detected sockets, which will change the ABI.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> doc/guides/rel_notes/deprecation.rst | 2 ++
> 1 file changed, 2 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst 
>b/doc/guides/rel_notes/deprecation.rst
> index 13e8543..9662150 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -8,6 +8,8 @@ API and ABI deprecation notices are to be posted 
>here.
> Deprecation Notices
> -------------------
> 
> +* eal: new ``numa_node_count`` member will be added to 
>``rte_config`` structure
> +  in v18.05.
> * eal: several API and ABI changes are planned for ``rte_devargs`` 
>in v18.02.
>   The format of device command line parameters will change. The bus 
>will need
>   to be explicitly stated in the device declaration. The enum 
>``rte_devtype``
> -- 
> 2.7.4

Acked-by: Jonas Pfefferle <pepperjo@japf.ch>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes
  2018-01-18 10:32 13% ` [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes Anatoly Burakov
                     ` (2 preceding siblings ...)
  2018-02-12 15:58  0%   ` Jonas Pfefferle
@ 2018-02-13  0:24  0%   ` Yongseok Koh
  3 siblings, 0 replies; 200+ results
From: Yongseok Koh @ 2018-02-13  0:24 UTC (permalink / raw)
  To: Anatoly Burakov; +Cc: dev, Neil Horman, John McNamara, Marko Kovacevic


> On Jan 18, 2018, at 2:32 AM, Anatoly Burakov <anatoly.burakov@intel.com> wrote:
> 
> Due to coming changes outlined in memory hotplug RFC, there will
> be several API/ABI changes.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
Acked-by: Yongseok Koh <yskoh@mellanox.com>
 
Thanks

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for RSS configuration structure
  2018-02-06  7:38  4% ` [dpdk-dev] [PATCH v2] doc: announce ABI change for RSS configuration structure Xueming Li
@ 2018-02-13  6:52  4%   ` Shahaf Shuler
  2018-02-13 11:27  4%     ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Shahaf Shuler @ 2018-02-13  6:52 UTC (permalink / raw)
  To: Xueming(Steven) Li, Thomas Monjalon, Neil Horman; +Cc: Xueming(Steven) Li, dev

Tuesday, February 6, 2018 9:39 AM, Xueming Li:
> Subject: [PATCH v2] doc: announce ABI change for RSS configuration
> structure
> 
> Update deprecation notice for the new rss_level field of rte_eth_rss_conf.
> 
> Link: http://www.dpdk.org/dev/patchwork/patch/31891
> 
> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> ---
>  doc/guides/rel_notes/deprecation.rst | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index d59ad5988..4bfce3bd7 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -59,3 +59,7 @@ Deprecation Notices
>    be added between the producer and consumer structures. The size of the
>    structure and the offset of the fields will remain the same on
>    platforms with 64B cache line, but will change on other platforms.
> +
> +* ethdev: A new rss level field is planned in 18.05.
> +  The new API adds rss_level field to ``rte_eth_rss_conf`` to enable a choice
> +  of RSS hash calculation on outer or inner header of tunneled packet.

Acked-By: Shahaf Shuler <shahafs@mellanox.com>

> --
> 2.13.3

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for RSS configuration structure
  2018-02-13  6:52  4%   ` Shahaf Shuler
@ 2018-02-13 11:27  4%     ` Ferruh Yigit
  2018-02-13 12:10  4%       ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2018-02-13 11:27 UTC (permalink / raw)
  To: Shahaf Shuler, Xueming(Steven) Li, Thomas Monjalon, Neil Horman; +Cc: dev

On 2/13/2018 6:52 AM, Shahaf Shuler wrote:
> Tuesday, February 6, 2018 9:39 AM, Xueming Li:
>> Subject: [PATCH v2] doc: announce ABI change for RSS configuration
>> structure
>>
>> Update deprecation notice for the new rss_level field of rte_eth_rss_conf.
>>
>> Link: http://www.dpdk.org/dev/patchwork/patch/31891
>>
>> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
>> ---
>>  doc/guides/rel_notes/deprecation.rst | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/doc/guides/rel_notes/deprecation.rst
>> b/doc/guides/rel_notes/deprecation.rst
>> index d59ad5988..4bfce3bd7 100644
>> --- a/doc/guides/rel_notes/deprecation.rst
>> +++ b/doc/guides/rel_notes/deprecation.rst
>> @@ -59,3 +59,7 @@ Deprecation Notices
>>    be added between the producer and consumer structures. The size of the
>>    structure and the offset of the fields will remain the same on
>>    platforms with 64B cache line, but will change on other platforms.
>> +
>> +* ethdev: A new rss level field is planned in 18.05.
>> +  The new API adds rss_level field to ``rte_eth_rss_conf`` to enable a choice
>> +  of RSS hash calculation on outer or inner header of tunneled packet.
> 
> Acked-By: Shahaf Shuler <shahafs@mellanox.com>

Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: remove eal API for default mempool ops name
  2018-02-02 14:01  0%     ` Olivier Matz
@ 2018-02-13 11:28  0%       ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2018-02-13 11:28 UTC (permalink / raw)
  To: Olivier Matz, Hemant Agrawal
  Cc: thomas, pbhagavatula, nipun.gupta, jerin.jacob, santosh.shukla, dev

On 2/2/2018 2:01 PM, Olivier Matz wrote:
> On Fri, Feb 02, 2018 at 02:01:42PM +0530, Hemant Agrawal wrote:
>> Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
>> ---
>> v2: fix checkpatch errors
>>
>>  doc/guides/rel_notes/deprecation.rst | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
>> index d59ad59..c7d8f25 100644
>> --- a/doc/guides/rel_notes/deprecation.rst
>> +++ b/doc/guides/rel_notes/deprecation.rst
>> @@ -8,6 +8,15 @@ API and ABI deprecation notices are to be posted here.
>>  Deprecation Notices
>>  -------------------
>>  
>> +* eal: a new set of mbuf mempool ops name APIs for user, platform and best
>> +  mempool names have been defined in ``rte_mbuf`` in v18.02. The uses of
>> +  ``rte_eal_mbuf_default_mempool_ops`` shall be replaced by
>> +  ``rte_mbuf_best_mempool_ops``.
>> +  The following function is now redundant and it is targeted to be deprecated in
>> +  18.05:
>> +
>> +  - ``rte_eal_mbuf_default_mempool_ops``
>> +
>>  * eal: several API and ABI changes are planned for ``rte_devargs`` in v18.02.
>>    The format of device command line parameters will change. The bus will need
>>    to be explicitly stated in the device declaration. The enum ``rte_devtype``
> 
> Acked-by: Olivier Matz <olivier.matz@6wind.com>

Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2 0/3] Cryptodev API/ABI deprecation notices
  2018-01-30 12:14  7% ` [dpdk-dev] [PATCH v2 0/3] Cryptodev API/ABI deprecation notices Pablo de Lara
  2018-01-30 12:14  4%   ` [dpdk-dev] [PATCH v2 1/3] doc: announce ABI change for crypto info struct Pablo de Lara
@ 2018-02-13 11:45  4%   ` De Lara Guarch, Pablo
  1 sibling, 0 replies; 200+ results
From: De Lara Guarch, Pablo @ 2018-02-13 11:45 UTC (permalink / raw)
  To: akhil.goyal, hemant.agrawal, Doherty, Declan, jerin.jacob, Trahe,
	Fiona, Griffin, John, Jain, Deepak K, jck, tdu, dima, nsamsono,
	jianbo.liu
  Cc: dev



> -----Original Message-----
> From: De Lara Guarch, Pablo
> Sent: Tuesday, January 30, 2018 12:14 PM
> To: akhil.goyal@nxp.com; hemant.agrawal@nxp.com; Doherty, Declan
> <declan.doherty@intel.com>; jerin.jacob@caviumnetworks.com; Trahe,
> Fiona <fiona.trahe@intel.com>; Griffin, John <john.griffin@intel.com>; Jain,
> Deepak K <deepak.k.jain@intel.com>; jck@semihalf.com;
> tdu@semihalf.com; dima@marvell.com; nsamsono@marvell.com;
> jianbo.liu@arm.com
> Cc: dev@dpdk.org; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>
> Subject: [PATCH v2 0/3] Cryptodev API/ABI deprecation notices
> 
> v2:
> - Added an extra deprecation announcement
> - Bonded the other two deprecation notices with the new one in a
>   patchset
> 
> Pablo de Lara (3):
>   doc: announce ABI change for crypto info struct
>   doc: announce deprecation for attach/detach crypto session
>   doc: announce deprecation in crypto queue pair start/stop
> 
>  doc/guides/rel_notes/deprecation.rst | 15 +++++++++++++++
> lib/librte_cryptodev/rte_cryptodev.h |  4 ++++
>  2 files changed, 19 insertions(+)
> 
> --
> 2.14.3

Deferring this to 18.05, so we could discuss a replacement for pci_dev structure
in the cryptodev info structure, also needed for ethdev.

Pablo

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v3] doc: ethdev ABI change deprecation notice
  @ 2018-02-13 12:09  4%     ` Ferruh Yigit
  2018-02-13 13:21  4%       ` Olivier Matz
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2018-02-13 12:09 UTC (permalink / raw)
  To: Neil Horman, Kirill Rybalchenko; +Cc: dev, andrey.chilikin, thomas

On 1/12/2018 2:38 PM, Neil Horman wrote:
> On Fri, Jan 12, 2018 at 10:29:46AM +0000, Kirill Rybalchenko wrote:
>> Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
>>
>> Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
>> ---
>>  doc/guides/rel_notes/deprecation.rst | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
>> index 13e8543..aaf306a 100644
>> --- a/doc/guides/rel_notes/deprecation.rst
>> +++ b/doc/guides/rel_notes/deprecation.rst
>> @@ -45,6 +45,12 @@ Deprecation Notices
>>    Target release for removal of the legacy API will be defined once most
>>    PMDs have switched to rte_flow.
>>  
>> +* ethdev: announce ABI change
>> +  The size of variables flow_types_mask in rte_eth_fdir_info structure,
>> +  sym_hash_enable_mask and valid_bit_mask in rte_eth_hash_global_conf structure
>> +  will be increased from 32 to 64 bits to fulfill hardware requirements.
>> +  This change will break existing ABI as size of the structures will increase.
>> +
>>  * i40e: The default flexible payload configuration which extracts the first 16
>>    bytes of the payload for RSS will be deprecated starting from 18.02. If
>>    required the previous behavior can be configured using existing flow
>> -- 
>> 2.5.5
>>
>>
> Acked-by: Neil Horman <nhorman@tuxdriver.com>

Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
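
For readers following the thread, here is a minimal sketch of why widening a
mask from 32 to 64 bits breaks the ABI. The structures below are hypothetical
stand-ins (the demo_* names are not DPDK definitions and do not reproduce the
real rte_eth_fdir_info layout). They only show that the structure grows and
that the fields after the mask move.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "old ABI" layout: the mask is 32 bits wide. */
struct demo_info_v1 {
	uint32_t mode;
	uint32_t flow_types_mask;   /* at most 32 flow types */
	uint32_t flex_payload_unit;
};

/* Hypothetical "new ABI" layout: same fields, mask widened to 64 bits. */
struct demo_info_v2 {
	uint32_t mode;
	uint64_t flow_types_mask;   /* room for 64 flow types */
	uint32_t flex_payload_unit;
};

/* Number of bytes by which the structure grows. */
static inline size_t demo_size_growth(void)
{
	return sizeof(struct demo_info_v2) - sizeof(struct demo_info_v1);
}

/* Offset of the field that follows the mask, in the old layout. */
static inline size_t demo_next_field_offset_v1(void)
{
	return offsetof(struct demo_info_v1, flex_payload_unit);
}

/* Offset of the same field in the new layout. */
static inline size_t demo_next_field_offset_v2(void)
{
	return offsetof(struct demo_info_v2, flex_payload_unit);
}
```

An application compiled against the v1 layout would allocate too little memory
and read flex_payload_unit from the wrong offset if it were handed a v2
library, which is why the change has to be announced in advance.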

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for RSS configuration structure
  2018-02-13 11:27  4%     ` Ferruh Yigit
@ 2018-02-13 12:10  4%       ` Jerin Jacob
  2018-02-14 16:28  4%         ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2018-02-13 12:10 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Shahaf Shuler, Xueming(Steven) Li, Thomas Monjalon, Neil Horman, dev

-----Original Message-----
> Date: Tue, 13 Feb 2018 11:27:34 +0000
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> To: Shahaf Shuler <shahafs@mellanox.com>, "Xueming(Steven) Li"
>  <xuemingl@mellanox.com>, Thomas Monjalon <thomas@monjalon.net>, Neil
>  Horman <nhorman@tuxdriver.com>
> CC: "dev@dpdk.org" <dev@dpdk.org>
> Subject: Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for RSS
>  configuration structure
> User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
>  Thunderbird/52.6.0
> 
> On 2/13/2018 6:52 AM, Shahaf Shuler wrote:
> > Tuesday, February 6, 2018 9:39 AM, Xueming Li:
> >> Subject: [PATCH v2] doc: announce ABI change for RSS configuration
> >> structure
> >>
> >> Update deprecation notice for the new rss_level field of rte_eth_rss_conf.
> >>
> >> Link: http://www.dpdk.org/dev/patchwork/patch/31891
> >>
> >> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> >> ---
> >>  doc/guides/rel_notes/deprecation.rst | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/doc/guides/rel_notes/deprecation.rst
> >> b/doc/guides/rel_notes/deprecation.rst
> >> index d59ad5988..4bfce3bd7 100644
> >> --- a/doc/guides/rel_notes/deprecation.rst
> >> +++ b/doc/guides/rel_notes/deprecation.rst
> >> @@ -59,3 +59,7 @@ Deprecation Notices
> >>    be added between the producer and consumer structures. The size of the
> >>    structure and the offset of the fields will remain the same on
> >>    platforms with 64B cache line, but will change on other platforms.
> >> +
> >> +* ethdev: A new rss level field planned in 18.05.
> >> +  The new API add rss_level field to ``rte_eth_rss_conf`` to enable a choice
> >> +  of RSS hash calculation on outer or inner header of tunneled packet.
> > 
> > Acked-By: Shahaf Shuler <shahafs@mellanox.com>
> 
> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v3] doc: ethdev ABI change deprecation notice
  2018-02-13 12:09  4%     ` Ferruh Yigit
@ 2018-02-13 13:21  4%       ` Olivier Matz
  2018-02-14  0:14  4%         ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2018-02-13 13:21 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Neil Horman, Kirill Rybalchenko, dev, andrey.chilikin, thomas

On Tue, Feb 13, 2018 at 12:09:19PM +0000, Ferruh Yigit wrote:
> On 1/12/2018 2:38 PM, Neil Horman wrote:
> > On Fri, Jan 12, 2018 at 10:29:46AM +0000, Kirill Rybalchenko wrote:
> >> Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
> >>
> >> Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
> >> ---
> >>  doc/guides/rel_notes/deprecation.rst | 6 ++++++
> >>  1 file changed, 6 insertions(+)
> >>
> >> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> >> index 13e8543..aaf306a 100644
> >> --- a/doc/guides/rel_notes/deprecation.rst
> >> +++ b/doc/guides/rel_notes/deprecation.rst
> >> @@ -45,6 +45,12 @@ Deprecation Notices
> >>    Target release for removal of the legacy API will be defined once most
> >>    PMDs have switched to rte_flow.
> >>  
> >> +* ethdev: announce ABI change
> >> +  The size of variables flow_types_mask in rte_eth_fdir_info structure,
> >> +  sym_hash_enable_mask and valid_bit_mask in rte_eth_hash_global_conf structure
> >> +  will be increased from 32 to 64 bits to fulfill hardware requirements.
> >> +  This change will break existing ABI as size of the structures will increase.
> >> +
> >>  * i40e: The default flexible payload configuration which extracts the first 16
> >>    bytes of the payload for RSS will be deprecated starting from 18.02. If
> >>    required the previous behavior can be configured using existing flow
> >> -- 
> >> 2.5.5
> >>
> >>
> > Acked-by: Neil Horman <nhorman@tuxdriver.com>
> 
> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH 1/1] doc: announce API change to lcore role function
  @ 2018-02-13 14:37  0% ` Ferruh Yigit
  2018-02-13 14:43  0%   ` Van Haaren, Harry
  2018-02-14  0:09  0% ` Thomas Monjalon
  1 sibling, 1 reply; 200+ results
From: Ferruh Yigit @ 2018-02-13 14:37 UTC (permalink / raw)
  To: Erik Gabriel Carrillo, nhorman; +Cc: dev, pbhagavatula, aconole, thomas

On 1/12/2018 8:45 PM, Erik Gabriel Carrillo wrote:
> This is an API/ABI change notice for DPDK 18.05 announcing a change in
> the meaning of the return values of the rte_lcore_has_role() function.
> 
> Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>

Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 1/1] doc: announce API change to lcore role function
  2018-02-13 14:37  0% ` Ferruh Yigit
@ 2018-02-13 14:43  0%   ` Van Haaren, Harry
  2018-02-13 14:47  0%     ` Pavan Nikhilesh
  0 siblings, 1 reply; 200+ results
From: Van Haaren, Harry @ 2018-02-13 14:43 UTC (permalink / raw)
  To: pbhagavatula
  Cc: dev, aconole, thomas, nhorman, Carrillo, Erik G, Yigit, Ferruh

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
> Sent: Tuesday, February 13, 2018 2:38 PM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; nhorman@tuxdriver.com
> Cc: dev@dpdk.org; pbhagavatula@caviumnetworks.com; aconole@redhat.com;
> thomas@monjalon.net
> Subject: Re: [dpdk-dev] [PATCH 1/1] doc: announce API change to lcore role
> function
> 
> On 1/12/2018 8:45 PM, Erik Gabriel Carrillo wrote:
> > This is an API/ABI change notice for DPDK 18.05 announcing a change in
> > the meaning of the return values of the rte_lcore_has_role() function.
> >
> > Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> 
> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>


Ah yes, let's make the return value 1 if the correct RTE_ROLE is probed - makes sense.

@Pavan, as original author of code, do you have an Ack for this? :)


Acked-by: Harry van Haaren <harry.van.haaren@intel.com>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 1/1] doc: announce API change to lcore role function
  2018-02-13 14:43  0%   ` Van Haaren, Harry
@ 2018-02-13 14:47  0%     ` Pavan Nikhilesh
  0 siblings, 0 replies; 200+ results
From: Pavan Nikhilesh @ 2018-02-13 14:47 UTC (permalink / raw)
  To: Van Haaren, Harry, aconole, thomas, nhorman, Carrillo, Erik G,
	Yigit, Ferruh
  Cc: dev

On Tue, Feb 13, 2018 at 02:43:39PM +0000, Van Haaren, Harry wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
> > Sent: Tuesday, February 13, 2018 2:38 PM
> > To: Carrillo, Erik G <erik.g.carrillo@intel.com>; nhorman@tuxdriver.com
> > Cc: dev@dpdk.org; pbhagavatula@caviumnetworks.com; aconole@redhat.com;
> > thomas@monjalon.net
> > Subject: Re: [dpdk-dev] [PATCH 1/1] doc: announce API change to lcore role
> > function
> >
> > On 1/12/2018 8:45 PM, Erik Gabriel Carrillo wrote:
> > > This is an API/ABI change notice for DPDK 18.05 announcing a change in
> > > the meaning of the return values of the rte_lcore_has_role() function.
> > >
> > > Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> >
> > Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
>
>
> Ah yes, let's make the return value 1 if the correct RTE_ROLE is probed - makes sense.
>
> @Pavan, as original author of code, do you have an Ack for this? :)
>
>
> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>

Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] eal: fix rte_errno values for IPC API
  @ 2018-02-13 15:39  3%       ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2018-02-13 15:39 UTC (permalink / raw)
  To: Van Haaren, Harry; +Cc: Thomas Monjalon, Burakov, Anatoly, dev, Tan, Jianfeng

On Tue, Feb 13, 2018 at 02:16:08PM +0000, Van Haaren, Harry wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> > Sent: Tuesday, February 13, 2018 1:51 PM
> > To: Burakov, Anatoly <anatoly.burakov@intel.com>
> > Cc: dev@dpdk.org; Tan, Jianfeng <jianfeng.tan@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH] eal: fix rte_errno values for IPC API
> > 
> > > > rte_errno values should not be negative.
> > > >
> > > > Fixes: bacaa2754017 ("eal: add channel for multi-process communication")
> > > > Fixes: 783b6e54971d ("eal: add synchronous multi-process communication")
> > > > Cc: jianfeng.tan@intel.com
> > > > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > >
> > > Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
> > >
> > > Thanks for fixing this.
> > 
> > Applied, thanks
> > 
> > There are a lot of similar issues:
> > 
> > git grep -l 'rte_errno = -E' | sed 's,[^/]*$,,' | sort -u
> > 
> > 	drivers/event/opdl/
> > 	drivers/event/sw/
> <snip>
> > 	lib/librte_eventdev/
> 
> 
> I just checked the eventdev.h port_link() docs, which indicate negative return values.
> Perhaps the header is wrong too - but the PMDs adhere to the library header in this case.
> 
> Is there a requirement for rte_errno to be positive?
> It looks to be declared as per-lcore signed int in rte_errno.h +20
>
I think I wrote that part of the documentation, and it never crossed my
mind that people would set rte_errno to negative values, given that
errno values from system calls are always positive. However, I think this
omission should be rectified, and we should enforce that rte_errno
values are positive.
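
As a rough sketch of the convention I have in mind: the function reports
failure through its return value and stores a positive code in the errno-like
variable. Everything named demo_* below is hypothetical and is not eventdev or
EAL code. A plain static stands in for the per-lcore rte_errno to keep the
sketch portable.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for rte_errno (which is per-lcore in DPDK). */
static int demo_errno;

/*
 * Hypothetical link function: returns the number of links set up on
 * success, or -1 on failure with demo_errno set to a positive code.
 */
static int demo_port_link(const void *queues, int nb_links)
{
	if (queues == NULL || nb_links < 0) {
		demo_errno = EINVAL;   /* positive, as errno is after a failed syscall */
		return -1;
	}
	return nb_links;
}
```

A caller then checks the return value first and only looks at the error
variable on failure, e.g. "if (demo_port_link(q, n) < 0 && demo_errno ==
EINVAL)", never comparing against -EINVAL.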

> Either-way, if we want to change the PMDs, we should change the Eventdev APIs,
> which means API breakage, and application changes to handle changed return values.
> 
> Sounds like more work than it is worth to me?

I would view it as restoring sanity (or balance to the force if you
prefer! :-) ), so I'd definitely be ok with an ABI break to do that.

/Bruce

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v5] checkpatches.sh: Add checks for ABI symbol addition
  2018-02-09 15:21  6% ` [dpdk-dev] [PATCH v5] " Neil Horman
@ 2018-02-13 22:57  4%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-13 22:57 UTC (permalink / raw)
  To: Neil Horman
  Cc: dev, john.mcnamara, bruce.richardson, Ferruh Yigit, Stephen Hemminger

Hi,

I wanted to push this patch in 18.02, but when looking more closely,
I see a few things to improve.
As it is a tool, there is no harm in waiting one more week and pushing it
early in 18.05.

09/02/2018 16:21, Neil Horman:
>  check () { # <patch> <commit> <title>
> +	local reta
>  	total=$(($total + 1))
>  	! $verbose || printf '\n### %s\n\n' "$3"
>  	if [ -n "$1" ] ; then
> @@ -96,9 +100,26 @@ check () { # <patch> <commit> <title>
>  	else
>  		report=$($DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
>  	fi
> -	[ $? -ne 0 ] || return 0

You are removing the return, so the report will always be printed.
You must print the report only in case of error.

> +	reta=$?
> +
>  	$verbose || printf '\n### %s\n\n' "$3"
>  	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
> +
> +	! $verbose || echo
> +	! $verbose || echo "Checking API additions/removals:"

You can use printf to combine these lines.

> +
> +	if [ -n "$1" ] ; then
> +		report=$($VALIDATE_NEW_API $1)

Beware of spaces in file names: use quoted "$1".

> +	elif [ -n "$2" ] ; then
> +		report=$(git format-patch \
> +			 --find-renames --no-stat --stdout -1 $commit |
> +			$VALIDATE_NEW_API -)
> +	else
> +		report=$($VALIDATE_NEW_API -)

So your script supports "-" for stdin? Nice

> +	fi
> +	[ $? -ne 0 -o $reta -ne 0 ] || return 0

Suggestion of more explicit variable naming:
$reta -> style_result
$? -> symbol_result

> +	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'

Wrong copy/paste: the sed is useless for the API report.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: add ABI change notice for numa_node_count in eal
  2018-02-09 14:42  4%       ` Bruce Richardson
@ 2018-02-14  0:04  4%         ` Thomas Monjalon
  2018-02-14 14:25  4%           ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-02-14  0:04 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: dev, Bruce Richardson, Jerin Jacob, Mcnamara, John, Neil Horman,
	Kovacevic, Marko

> > > > There will be a new function added in v18.05 that will return number of
> > > > detected sockets, which will change the ABI.
> > > > 
> > > > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > > > ---
> > > > +* eal: new ``numa_node_count`` member will be added to ``rte_config``
> > > > +structure in v18.05.
> > > 
> > > Acked-by: John McNamara <john.mcnamara@intel.com>
> > 
> > Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> >
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Acked-by: Thomas Monjalon <thomas@monjalon.net>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH 1/1] doc: announce API change to lcore role function
    2018-02-13 14:37  0% ` Ferruh Yigit
@ 2018-02-14  0:09  0% ` Thomas Monjalon
  2018-02-14 10:59  0%   ` Thomas Monjalon
  1 sibling, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-02-14  0:09 UTC (permalink / raw)
  To: Erik Gabriel Carrillo; +Cc: dev, nhorman, pbhagavatula, aconole

12/01/2018 21:45, Erik Gabriel Carrillo:
> This is an API/ABI change notice for DPDK 18.05 announcing a change in
> the meaning of the return values of the rte_lcore_has_role() function.
> 
> Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> ---
> +* eal: The semantics of the return value for the ``rte_lcore_has_role`` function
> +  are planned to change in v18.05. The function currently returns 0 and <0 for
> +  success and failure, respectively.  This will change to 1 and 0 for true and
> +  false, respectively, to make use of the function more intuitive.

It will introduce some subtle bugs in applications.
We must clearly advertise this API change in the release notes.
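
To illustrate the kind of subtle bug I mean, here is a minimal sketch that
models only the two return conventions described in the notice. The demo_*
names, the role table and the out-of-range handling are assumptions made for
illustration and are not the EAL implementation.

```c
#include <stdbool.h>

enum demo_role { DEMO_ROLE_RTE, DEMO_ROLE_OFF, DEMO_ROLE_SERVICE };

/* Hypothetical per-lcore role table, for illustration only. */
static const enum demo_role demo_roles[] = {
	DEMO_ROLE_RTE, DEMO_ROLE_SERVICE, DEMO_ROLE_OFF,
};
#define DEMO_NB_LCORES (sizeof(demo_roles) / sizeof(demo_roles[0]))

/* Old convention: 0 when the lcore has the role, negative otherwise. */
static int demo_has_role_old(unsigned int lcore, enum demo_role role)
{
	if (lcore >= DEMO_NB_LCORES)
		return -1;
	return demo_roles[lcore] == role ? 0 : -1;
}

/* New convention: 1 when the lcore has the role, 0 otherwise. */
static int demo_has_role_new(unsigned int lcore, enum demo_role role)
{
	if (lcore >= DEMO_NB_LCORES)
		return 0; /* assumption of this sketch */
	return demo_roles[lcore] == role;
}

/* Application check written for the old convention: 0 means "has role". */
static bool demo_check_written_for_old_api(int ret)
{
	return ret == 0;
}
```

With the old function the check gives the expected answer. With the new one
the same check is silently inverted: lcore 0 (which has DEMO_ROLE_RTE) is
reported as not having it, and lcore 1 (which does not) is reported as having
it. Nothing fails at build time, hence the need for a visible release note.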

Acked-by: Thomas Monjalon <thomas@monjalon.net>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v3] doc: ethdev ABI change deprecation notice
  2018-02-13 13:21  4%       ` Olivier Matz
@ 2018-02-14  0:14  4%         ` Thomas Monjalon
  2018-02-14 17:18  4%           ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-02-14  0:14 UTC (permalink / raw)
  To: Kirill Rybalchenko
  Cc: dev, Olivier Matz, Ferruh Yigit, Neil Horman, andrey.chilikin,
	adrien.mazarguil

> > >> Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
> > >>
> > >> Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
> > >> ---
> > >> +* ethdev: announce ABI change
> > >> +  The size of variables flow_types_mask in rte_eth_fdir_info structure,
> > >> +  sym_hash_enable_mask and valid_bit_mask in rte_eth_hash_global_conf structure
> > >> +  will be increased from 32 to 64 bits to fulfill hardware requirements.
> > >> +  This change will break existing ABI as size of the structures will increase.
> > >> +
> > > Acked-by: Neil Horman <nhorman@tuxdriver.com>
> > 
> > Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 
> Acked-by: Olivier Matz <olivier.matz@6wind.com>

Acked-by: Thomas Monjalon <thomas@monjalon.net>

I would prefer you drop the legacy code to keep only rte_flow.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH 1/1] doc: announce API change to lcore role function
  2018-02-14  0:09  0% ` Thomas Monjalon
@ 2018-02-14 10:59  0%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-14 10:59 UTC (permalink / raw)
  To: Erik Gabriel Carrillo; +Cc: dev, nhorman, pbhagavatula, aconole

14/02/2018 01:09, Thomas Monjalon:
> 12/01/2018 21:45, Erik Gabriel Carrillo:
> > This is an API/ABI change notice for DPDK 18.05 announcing a change in
> > the meaning of the return values of the rte_lcore_has_role() function.
> > 
> > Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> > ---
> > +* eal: The semantics of the return value for the ``rte_lcore_has_role`` function
> > +  are planned to change in v18.05. The function currently returns 0 and <0 for
> > +  success and failure, respectively.  This will change to 1 and 0 for true and
> > +  false, respectively, to make use of the function more intuitive.
> 
> It will introduce some subtle bugs in applications.
> We must clearly advertise this API change in the release notes.
> 
> Acked-by: Thomas Monjalon <thomas@monjalon.net>

Applied

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v1] doc: update release notes for 18.02
@ 2018-02-14 12:21  6% John McNamara
  2018-02-14 13:50  6% ` [dpdk-dev] [PATCH v2] " John McNamara
  0 siblings, 1 reply; 200+ results
From: John McNamara @ 2018-02-14 12:21 UTC (permalink / raw)
  To: dev; +Cc: John McNamara

Fix grammar, spelling and formatting of DPDK 18.02 release notes.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/rel_notes/release_18_02.rst | 194 +++++++++++----------------------
 1 file changed, 64 insertions(+), 130 deletions(-)

diff --git a/doc/guides/rel_notes/release_18_02.rst b/doc/guides/rel_notes/release_18_02.rst
index 04202ba..fa41207 100644
--- a/doc/guides/rel_notes/release_18_02.rst
+++ b/doc/guides/rel_notes/release_18_02.rst
@@ -41,7 +41,7 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
-* **Add function to allow releasing internal EAL resources on exit**
+* **Added function to allow releasing internal EAL resources on exit.**
 
   During ``rte_eal_init()`` EAL allocates memory from hugepages to enable its
   core libraries to perform their tasks. The ``rte_eal_cleanup()`` function
@@ -50,32 +50,12 @@ New Features
   exiting. Not calling this function could result in leaking hugepages, leading
   to failure during initialization of secondary processes.
 
-* **Added the ixgbe ethernet driver to support RSS with flow API.**
+* **Added igb, ixgbe and i40e ethernet driver to support RSS with flow API.**
 
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support igb and ixgbe NIC with existing RSS
-  configuration using rte_flow API.
+  Added support for igb, ixgbe and i40e NICs with existing RSS configuration
+  using the ``rte_flow`` API.
 
-* **Add MAC loopback support for i40e.**
-
-  Add MAC loopback support for i40e in order to support test task asked by
-  users. According to the device configuration, it will setup TX->RX loopback
-  link or not.
-
-* **Add the support of run time determination of number of queues per i40e VF**
-
-  The number of queue per VF is determined by its host PF. If the PCI address
-  of an i40e PF is aaaa:bb.cc, the number of queues per VF can be configured
-  with EAL parameter like -w aaaa:bb.cc,queue-num-per-vf=n. The value n can be
-  1, 2, 4, 8 or 16. If no such parameter is configured, the number of queues
-  per VF is 4 by default.
-
-* **Added the i40e ethernet driver to support RSS with flow API.**
-
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support i40e NIC with existing RSS
-  configuration using rte_flow API.It also enable queue region configuration
-  using flow API for i40e.
+  Also enabled queue region configuration using the ``rte_flow`` API for i40e.
 
 * **Updated i40e driver to support PPPoE/PPPoL2TP.**
 
@@ -83,6 +63,20 @@ New Features
   profiles which can be programmed by dynamic device personalization (DDP)
   process.
 
+* **Added MAC loopback support for i40e.**
+
+  Added MAC loopback support for i40e in order to support test tasks requested
+  by users. It will setup ``Tx -> Rx`` loopback link according to the device
+  configuration.
+
+* **Added support of run time determination of number of queues per i40e VF.**
+
+  The number of queue per VF is determined by its host PF. If the PCI address
+  of an i40e PF is ``aaaa:bb.cc``, the number of queues per VF can be
+  configured with EAL parameter like ``-w aaaa:bb.cc,queue-num-per-vf=n``. The
+  value n can be 1, 2, 4, 8 or 16. If no such parameter is configured, the
+  number of queues per VF is 4 by default.
+
 * **Updated mlx5 driver.**
 
   Updated the mlx5 driver including the following changes:
@@ -117,16 +111,10 @@ New Features
   * Added tunneled packets classification.
   * Added inner checksum offload.
 
-* **Added the igb ethernet driver to support RSS with flow API.**
-
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support igb NIC with existing RSS configuration
-  using rte_flow API.
-
-* **Add AVF (Adaptive Virtual Function) net PMD.**
+* **Added AVF (Adaptive Virtual Function) net PMD.**
 
-  A new net PMD has been added, which supports Intel® Ethernet Adaptive
-  Virtual Function (AVF) with features list below:
+  Added a new net PMD called AVF (Adaptive Virtual Function), which supports
+  Intel® Ethernet Adaptive Virtual Function (AVF) with features such as:
 
   * Basic Rx/Tx burst
   * SSE vectorized Rx/Tx burst
@@ -140,16 +128,16 @@ New Features
   * Rx/Tx descriptor status
   * Link status update/event
 
-* **Add feature supports for live migration from vhost-net to vhost-user.**
+* **Added feature supports for live migration from vhost-net to vhost-user.**
 
-  To make live migration from vhost-net to vhost-user possible, added
-  feature supports for vhost-user. The features include:
+  Added feature supports for vhost-user to make live migration from vhost-net
+  to vhost-user possible. The features include:
 
-  * VIRTIO_F_ANY_LAYOUT
-  * VIRTIO_F_EVENT_IDX
-  * VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_HOST_ECN
-  * VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_HOST_UFO
-  * VIRTIO_NET_F_GSO
+  * ``VIRTIO_F_ANY_LAYOUT``
+  * ``VIRTIO_F_EVENT_IDX``
+  * ``VIRTIO_NET_F_GUEST_ECN``, ``VIRTIO_NET_F_HOST_ECN``
+  * ``VIRTIO_NET_F_GUEST_UFO``, ``VIRTIO_NET_F_HOST_UFO``
+  * ``VIRTIO_NET_F_GSO``
 
 * **Updated the AESNI-MB PMD.**
 
@@ -160,62 +148,65 @@ New Features
 * **Updated the DPAA_SEC crypto driver to support rte_security.**
 
   Updated the ``dpaa_sec`` crypto PMD to support ``rte_security`` lookaside
-  protocol offload for IPSec.
+  protocol offload for IPsec.
 
 * **Added Wireless Base Band Device (bbdev) abstraction.**
 
   The Wireless Baseband Device library is an acceleration abstraction
   framework for 3gpp Layer 1 processing functions that provides a common
-  programming interface for seamless opeartion on integrated or discrete
+  programming interface for seamless operation on integrated or discrete
   hardware accelerators or using optimized software libraries for signal
   processing.
+
   The current release only supports 3GPP CRC, Turbo Coding and Rate
   Matching operations, as specified in 3GPP TS 36.212.
 
   See the :doc:`../prog_guide/bbdev` programmer's guide for more details.
 
-* **Added New eventdev OPDL PMD**
+* **Added New eventdev Ordered Packet Distribution Library (OPDL) PMD.**
 
   The OPDL (Ordered Packet Distribution Library) eventdev is a specific
   implementation of the eventdev API. It is particularly suited to packet
   processing workloads that have high throughput and low latency requirements.
   All packets follow the same path through the device. The order in which
-  packets  follow is determinted by the order in which queues are set up.
+  packets follow is determined by the order in which queues are set up.
   Events are left on the ring until they are transmitted. As a result packets
   do not go out of order.
 
-  With this change, application can use OPDL PMD by eventdev api.
+  With this change, applications can use the OPDL PMD via the eventdev api.
 
-* **Added New pipeline use case for dpdk-test-eventdev application**
+* **Added new pipeline use case for dpdk-test-eventdev application.**
 
+  Added a new "pipeline" use case for the ``dpdk-test-eventdev`` application.
   The pipeline case can be used to simulate various stages in a real world
   application from packet receive to transmit while maintaining the packet
-  ordering also measure the performance of the event device across the stages
-  of the pipeline.
+  ordering. It can also be used to measure the performance of the event device
+  across the stages of the pipeline.
 
-  The pipeline use case has been made generic to work will all the event
+  The pipeline use case has been made generic to work with all the event
   devices based on the capabilities.
 
-* **Updated Eventdev Sample application to support event devices based on capability**
+* **Updated Eventdev sample application to support event devices based on capability.**
 
-  Updated Eventdev pipeline sample application to support various types of pipelines
-  based on the capabilities of the attached event and ethernet devices. Also,
-  renamed the application from SW PMD specific ``eventdev_pipeline_sw_pmd``
-  to PMD agnostic ``eventdev_pipeline``.
+  Updated the Eventdev pipeline sample application to support various types of
+  pipelines based on the capabilities of the attached event and ethernet
+  devices. Also, renamed the application from software PMD specific
+  ``eventdev_pipeline_sw_pmd`` to the more generic ``eventdev_pipeline``.
 
 * **Added Rawdev, a generic device support library.**
 
-  Rawdev library provides support for integrating any generic device type with
-  DPDK framework. Generic devices are those which do not have a pre-defined
+  The Rawdev library provides support for integrating any generic device type with
+  the DPDK framework. Generic devices are those which do not have a pre-defined
   type within DPDK, for example, ethernet, crypto, event etc.
+
   A set of northbound APIs have been defined which encompass a generic set of
   operations by allowing applications to interact with device using opaque
-  structures/buffers. Also, southbound APIs provide APIs for integrating device
+  structures/buffers. Also, southbound APIs provide a means of integrating devices
   either as as part of a physical bus (PCI, FSLMC etc) or through ``vdev``.
 
   See the :doc:`../prog_guide/rawdev` programmer's guide for more details.
 
-* **Added new multi-process communication channel**
+* **Added new multi-process communication channel.**
 
   Added a generic channel in EAL for multi-process (primary/secondary) communication.
   Consumers of this channel need to register an action with an action name to response
@@ -227,14 +218,14 @@ New Features
   * ``rte_mp_request`` is for sending a request message and will block until
     it gets a reply message which is sent from the peer by ``rte_mp_reply``.
 
-* **Add GRO support for VxLAN-tunneled packets.**
+* **Added GRO support for VxLAN-tunneled packets.**
 
-  Add GRO support for VxLAN-tunneled packets. Supported VxLAN packets
+  Added GRO support for VxLAN-tunneled packets. Supported VxLAN packets
   must contain an outer IPv4 header and inner TCP/IPv4 headers. VxLAN
   GRO doesn't check if input packets have correct checksums and doesn't
   update checksums for output packets. Additionally, it assumes the
-  packets are complete (i.e., MF==0 && frag_off==0), when IP
-  fragmentation is possible (i.e., DF==0).
+  packets are complete (i.e., ``MF==0 && frag_off==0``), when IP
+  fragmentation is possible (i.e., ``DF==0``).
 
 * **Increased default Rx and Tx ring size in sample applications.**
 
@@ -243,75 +234,19 @@ New Features
   general case. The user should experiment with various Rx and Tx ring sizes
   for their specific application to get best performance.
 
-* **Added new DPDK build system using the tools "meson" and "ninja" [EXPERIMENTAL]**
+* **Added new DPDK build system using the tools "meson" and "ninja" [EXPERIMENTAL].**
 
-  Added in support for building DPDK using ``meson`` and ``ninja``, which gives
+  Added support for building DPDK using ``meson`` and ``ninja``, which gives
   additional features, such as automatic build-time configuration, over the
   current build system using ``make``. For instructions on how to do a DPDK build
   using the new system, see the instructions in ``doc/build-sdk-meson.txt``.
 
-.. note::
-
-    This new build system support is incomplete at this point and is added
-    as experimental in this release. The existing build system using ``make``
-    is unaffected by these changes, and can continue to be used for this
-    and subsequent releases until such time as it's deprecation is announced.
-
-
-API Changes
------------
-
-.. This section should contain API changes. Sample format:
-
-   * Add a short 1-2 sentence description of the API change. Use fixed width
-     quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past
-     tense.
-
-   This section is a comment. do not overwrite or remove it.
-   Also, make sure to start the actual text at the margin.
-   =========================================================
-
+  .. note::
 
-ABI Changes
------------
-
-.. This section should contain ABI changes. Sample format:
-
-   * Add a short 1-2 sentence description of the ABI change that was announced
-     in the previous releases and made in this release. Use fixed width quotes
-     for ``rte_function_names`` or ``rte_struct_names``. Use the past tense.
-
-   This section is a comment. do not overwrite or remove it.
-   Also, make sure to start the actual text at the margin.
-   =========================================================
-
-
-Removed Items
--------------
-
-.. This section should contain removed items in this release. Sample format:
-
-   * Add a short 1-2 sentence description of the removed item in the past
-     tense.
-
-   This section is a comment. do not overwrite or remove it.
-   Also, make sure to start the actual text at the margin.
-   =========================================================
-
-
-Known Issues
-------------
-
-.. This section should contain new known issues in this release. Sample format:
-
-   * **Add title in present tense with full stop.**
-
-     Add a short 1-2 sentence description of the known issue in the present
-     tense. Add information on any known workarounds.
-
-   This section is a comment. do not overwrite or remove it.
-   Also, make sure to start the actual text at the margin.
-   =========================================================
+      This new build system support is incomplete at this point and is added
+      as experimental in this release. The existing build system using ``make``
+      is unaffected by these changes, and can continue to be used for this
+      and subsequent releases until such time as it's deprecation is announced.
 
 
 Shared Library Versions
@@ -428,10 +363,10 @@ Tested Platforms
      * Red Hat Enterprise Linux Server release 7.3
      * SUSE Enterprise Linux 12
      * Wind River Linux 8
-     * Ubantu 14.04
+     * Ubuntu 14.04
      * Ubuntu 16.04
      * Ubuntu 16.10
-     * Ubantu 17.10
+     * Ubuntu 17.10
 
    * NICs:
 
@@ -476,4 +411,3 @@ Tested Platforms
        * Firmware version: 1.63, 0x80000dda
        * Device id (pf/vf): 8086:1521 / 8086:1520
        * Driver version: 5.3.0-k (igb)
-
-- 
2.7.5

^ permalink raw reply	[relevance 6%]

* [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors
@ 2018-02-14 12:32  4% Shahaf Shuler
  2018-02-14 13:50  4% ` Thomas Monjalon
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Shahaf Shuler @ 2018-02-14 12:32 UTC (permalink / raw)
  To: nhorman, thomas
  Cc: dev, declan.doherty, mohammad.abdul.awal, ferruh.yigit, remy.horton

This is following the RFC being discussed and targets 18.05

http://dpdk.org/ml/archives/dev/2018-January/085716.html

Cc: declan.doherty@intel.com
Cc: mohammad.abdul.awal@intel.com
Cc: ferruh.yigit@intel.com
Cc: remy.horton@intel.com

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
---
 doc/guides/rel_notes/deprecation.rst | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad5988..f6151de63 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -59,3 +59,9 @@ Deprecation Notices
   be added between the producer and consumer structures. The size of the
   structure and the offset of the fields will remain the same on
   platforms with 64B cache line, but will change on other platforms.
+
+* ethdev: A work is being planned for 18.05 to expose VF port representors
+  as a mean to perform control and data path operation on the different VFs.
+  As VF representor is an ethdev port, new fields are needed in order to map
+  between the VF representor and the VF or the parent PF. Those new fields
+  are to be included in ``rte_eth_dev_info`` struct.
-- 
2.12.0

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors
  2018-02-14 12:32  4% [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors Shahaf Shuler
@ 2018-02-14 13:50  4% ` Thomas Monjalon
  2018-02-14 13:54  4% ` Doherty, Declan
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-14 13:50 UTC (permalink / raw)
  To: Shahaf Shuler
  Cc: nhorman, dev, declan.doherty, mohammad.abdul.awal, ferruh.yigit,
	remy.horton

14/02/2018 13:32, Shahaf Shuler:
> This is following the RFC being discussed and targets 18.05
> 
> http://dpdk.org/ml/archives/dev/2018-January/085716.html
> 
> Cc: declan.doherty@intel.com
> Cc: mohammad.abdul.awal@intel.com
> Cc: ferruh.yigit@intel.com
> Cc: remy.horton@intel.com
> 
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>

Acked-by: Thomas Monjalon <thomas@monjalon.net>

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2] doc: update release notes for 18.02
  2018-02-14 12:21  6% [dpdk-dev] [PATCH v1] doc: update release notes for 18.02 John McNamara
@ 2018-02-14 13:50  6% ` John McNamara
  0 siblings, 0 replies; 200+ results
From: John McNamara @ 2018-02-14 13:50 UTC (permalink / raw)
  To: dev; +Cc: John McNamara

Fix grammar, spelling and formatting of DPDK 18.02 release notes.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/rel_notes/release_18_02.rst | 199 ++++++++++++---------------------
 1 file changed, 69 insertions(+), 130 deletions(-)

diff --git a/doc/guides/rel_notes/release_18_02.rst b/doc/guides/rel_notes/release_18_02.rst
index 04202ba..bc08118 100644
--- a/doc/guides/rel_notes/release_18_02.rst
+++ b/doc/guides/rel_notes/release_18_02.rst
@@ -41,7 +41,7 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
-* **Add function to allow releasing internal EAL resources on exit**
+* **Added function to allow releasing internal EAL resources on exit.**
 
   During ``rte_eal_init()`` EAL allocates memory from hugepages to enable its
   core libraries to perform their tasks. The ``rte_eal_cleanup()`` function
@@ -50,32 +50,12 @@ New Features
   exiting. Not calling this function could result in leaking hugepages, leading
   to failure during initialization of secondary processes.
 
-* **Added the ixgbe ethernet driver to support RSS with flow API.**
+* **Added igb, ixgbe and i40e ethernet driver to support RSS with flow API.**
 
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support igb and ixgbe NIC with existing RSS
-  configuration using rte_flow API.
+  Added support for igb, ixgbe and i40e NICs with existing RSS configuration
+  using the ``rte_flow`` API.
 
-* **Add MAC loopback support for i40e.**
-
-  Add MAC loopback support for i40e in order to support test task asked by
-  users. According to the device configuration, it will setup TX->RX loopback
-  link or not.
-
-* **Add the support of run time determination of number of queues per i40e VF**
-
-  The number of queue per VF is determined by its host PF. If the PCI address
-  of an i40e PF is aaaa:bb.cc, the number of queues per VF can be configured
-  with EAL parameter like -w aaaa:bb.cc,queue-num-per-vf=n. The value n can be
-  1, 2, 4, 8 or 16. If no such parameter is configured, the number of queues
-  per VF is 4 by default.
-
-* **Added the i40e ethernet driver to support RSS with flow API.**
-
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support i40e NIC with existing RSS
-  configuration using rte_flow API.It also enable queue region configuration
-  using flow API for i40e.
+  Also enabled queue region configuration using the ``rte_flow`` API for i40e.
 
 * **Updated i40e driver to support PPPoE/PPPoL2TP.**
 
@@ -83,6 +63,20 @@ New Features
   profiles which can be programmed by dynamic device personalization (DDP)
   process.
 
+* **Added MAC loopback support for i40e.**
+
+  Added MAC loopback support for i40e in order to support test tasks requested
+  by users. It will setup ``Tx -> Rx`` loopback link according to the device
+  configuration.
+
+* **Added support of run time determination of number of queues per i40e VF.**
+
+  The number of queue per VF is determined by its host PF. If the PCI address
+  of an i40e PF is ``aaaa:bb.cc``, the number of queues per VF can be
+  configured with EAL parameter like ``-w aaaa:bb.cc,queue-num-per-vf=n``. The
+  value n can be 1, 2, 4, 8 or 16. If no such parameter is configured, the
+  number of queues per VF is 4 by default.
+
 * **Updated mlx5 driver.**
 
   Updated the mlx5 driver including the following changes:
@@ -117,16 +111,10 @@ New Features
   * Added tunneled packets classification.
   * Added inner checksum offload.
 
-* **Added the igb ethernet driver to support RSS with flow API.**
-
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support igb NIC with existing RSS configuration
-  using rte_flow API.
-
-* **Add AVF (Adaptive Virtual Function) net PMD.**
+* **Added AVF (Adaptive Virtual Function) net PMD.**
 
-  A new net PMD has been added, which supports Intel® Ethernet Adaptive
-  Virtual Function (AVF) with features list below:
+  Added a new net PMD called AVF (Adaptive Virtual Function), which supports
+  Intel® Ethernet Adaptive Virtual Function (AVF) with features such as:
 
   * Basic Rx/Tx burst
   * SSE vectorized Rx/Tx burst
@@ -140,17 +128,22 @@ New Features
   * Rx/Tx descriptor status
   * Link status update/event
 
-* **Add feature supports for live migration from vhost-net to vhost-user.**
+* **Added feature supports for live migration from vhost-net to vhost-user.**
 
-  To make live migration from vhost-net to vhost-user possible, added
-  feature supports for vhost-user. The features include:
+  Added feature supports for vhost-user to make live migration from vhost-net
+  to vhost-user possible. The features include:
 
-  * VIRTIO_F_ANY_LAYOUT
-  * VIRTIO_F_EVENT_IDX
-  * VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_HOST_ECN
-  * VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_HOST_UFO
-  * VIRTIO_NET_F_GSO
+  * ``VIRTIO_F_ANY_LAYOUT``
+  * ``VIRTIO_F_EVENT_IDX``
+  * ``VIRTIO_NET_F_GUEST_ECN``, ``VIRTIO_NET_F_HOST_ECN``
+  * ``VIRTIO_NET_F_GUEST_UFO``, ``VIRTIO_NET_F_HOST_UFO``
+  * ``VIRTIO_NET_F_GSO``
 
+  Also added ``VIRTIO_NET_F_GUEST_ANNOUNCE`` feature support in virtio pmd.
+  In a scenario where the vhost backend doesn't have the ability to generate
+  RARP packets, the VM running virtio pmd can still be live migrated if
+  ``VIRTIO_NET_F_GUEST_ANNOUNCE`` feature is negotiated.
+    
 * **Updated the AESNI-MB PMD.**
 
   The AESNI-MB PMD has been updated with additional support for:
@@ -160,62 +153,65 @@ New Features
 * **Updated the DPAA_SEC crypto driver to support rte_security.**
 
   Updated the ``dpaa_sec`` crypto PMD to support ``rte_security`` lookaside
-  protocol offload for IPSec.
+  protocol offload for IPsec.
 
 * **Added Wireless Base Band Device (bbdev) abstraction.**
 
   The Wireless Baseband Device library is an acceleration abstraction
   framework for 3gpp Layer 1 processing functions that provides a common
-  programming interface for seamless opeartion on integrated or discrete
+  programming interface for seamless operation on integrated or discrete
   hardware accelerators or using optimized software libraries for signal
   processing.
+
   The current release only supports 3GPP CRC, Turbo Coding and Rate
   Matching operations, as specified in 3GPP TS 36.212.
 
   See the :doc:`../prog_guide/bbdev` programmer's guide for more details.
 
-* **Added New eventdev OPDL PMD**
+* **Added New eventdev Ordered Packet Distribution Library (OPDL) PMD.**
 
   The OPDL (Ordered Packet Distribution Library) eventdev is a specific
   implementation of the eventdev API. It is particularly suited to packet
   processing workloads that have high throughput and low latency requirements.
   All packets follow the same path through the device. The order in which
-  packets  follow is determinted by the order in which queues are set up.
+  packets follow is determined by the order in which queues are set up.
   Events are left on the ring until they are transmitted. As a result packets
   do not go out of order.
 
-  With this change, application can use OPDL PMD by eventdev api.
+  With this change, applications can use the OPDL PMD via the eventdev api.
 
-* **Added New pipeline use case for dpdk-test-eventdev application**
+* **Added new pipeline use case for dpdk-test-eventdev application.**
 
+  Added a new "pipeline" use case for the ``dpdk-test-eventdev`` application.
   The pipeline case can be used to simulate various stages in a real world
   application from packet receive to transmit while maintaining the packet
-  ordering also measure the performance of the event device across the stages
-  of the pipeline.
+  ordering. It can also be used to measure the performance of the event device
+  across the stages of the pipeline.
 
-  The pipeline use case has been made generic to work will all the event
+  The pipeline use case has been made generic to work with all the event
   devices based on the capabilities.
 
-* **Updated Eventdev Sample application to support event devices based on capability**
+* **Updated Eventdev sample application to support event devices based on capability.**
 
-  Updated Eventdev pipeline sample application to support various types of pipelines
-  based on the capabilities of the attached event and ethernet devices. Also,
-  renamed the application from SW PMD specific ``eventdev_pipeline_sw_pmd``
-  to PMD agnostic ``eventdev_pipeline``.
+  Updated the Eventdev pipeline sample application to support various types of
+  pipelines based on the capabilities of the attached event and ethernet
+  devices. Also, renamed the application from software PMD specific
+  ``eventdev_pipeline_sw_pmd`` to the more generic ``eventdev_pipeline``.
 
 * **Added Rawdev, a generic device support library.**
 
-  Rawdev library provides support for integrating any generic device type with
-  DPDK framework. Generic devices are those which do not have a pre-defined
+  The Rawdev library provides support for integrating any generic device type with
+  the DPDK framework. Generic devices are those which do not have a pre-defined
   type within DPDK, for example, ethernet, crypto, event etc.
+
   A set of northbound APIs have been defined which encompass a generic set of
   operations by allowing applications to interact with device using opaque
-  structures/buffers. Also, southbound APIs provide APIs for integrating device
+  structures/buffers. Also, southbound APIs provide a means of integrating devices
   either as as part of a physical bus (PCI, FSLMC etc) or through ``vdev``.
 
   See the :doc:`../prog_guide/rawdev` programmer's guide for more details.
 
-* **Added new multi-process communication channel**
+* **Added new multi-process communication channel.**
 
   Added a generic channel in EAL for multi-process (primary/secondary) communication.
   Consumers of this channel need to register an action with an action name to response
@@ -227,14 +223,14 @@ New Features
   * ``rte_mp_request`` is for sending a request message and will block until
     it gets a reply message which is sent from the peer by ``rte_mp_reply``.
 
-* **Add GRO support for VxLAN-tunneled packets.**
+* **Added GRO support for VxLAN-tunneled packets.**
 
-  Add GRO support for VxLAN-tunneled packets. Supported VxLAN packets
+  Added GRO support for VxLAN-tunneled packets. Supported VxLAN packets
   must contain an outer IPv4 header and inner TCP/IPv4 headers. VxLAN
   GRO doesn't check if input packets have correct checksums and doesn't
   update checksums for output packets. Additionally, it assumes the
-  packets are complete (i.e., MF==0 && frag_off==0), when IP
-  fragmentation is possible (i.e., DF==0).
+  packets are complete (i.e., ``MF==0 && frag_off==0``), when IP
+  fragmentation is possible (i.e., ``DF==0``).
 
 * **Increased default Rx and Tx ring size in sample applications.**
 
@@ -243,75 +239,19 @@ New Features
   general case. The user should experiment with various Rx and Tx ring sizes
   for their specific application to get best performance.
 
-* **Added new DPDK build system using the tools "meson" and "ninja" [EXPERIMENTAL]**
+* **Added new DPDK build system using the tools "meson" and "ninja" [EXPERIMENTAL].**
 
-  Added in support for building DPDK using ``meson`` and ``ninja``, which gives
+  Added support for building DPDK using ``meson`` and ``ninja``, which gives
   additional features, such as automatic build-time configuration, over the
   current build system using ``make``. For instructions on how to do a DPDK build
   using the new system, see the instructions in ``doc/build-sdk-meson.txt``.
 
-.. note::
-
-    This new build system support is incomplete at this point and is added
-    as experimental in this release. The existing build system using ``make``
-    is unaffected by these changes, and can continue to be used for this
-    and subsequent releases until such time as it's deprecation is announced.
-
-
-API Changes
------------
-
-.. This section should contain API changes. Sample format:
-
-   * Add a short 1-2 sentence description of the API change. Use fixed width
-     quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past
-     tense.
-
-   This section is a comment. do not overwrite or remove it.
-   Also, make sure to start the actual text at the margin.
-   =========================================================
-
+  .. note::
 
-ABI Changes
------------
-
-.. This section should contain ABI changes. Sample format:
-
-   * Add a short 1-2 sentence description of the ABI change that was announced
-     in the previous releases and made in this release. Use fixed width quotes
-     for ``rte_function_names`` or ``rte_struct_names``. Use the past tense.
-
-   This section is a comment. do not overwrite or remove it.
-   Also, make sure to start the actual text at the margin.
-   =========================================================
-
-
-Removed Items
--------------
-
-.. This section should contain removed items in this release. Sample format:
-
-   * Add a short 1-2 sentence description of the removed item in the past
-     tense.
-
-   This section is a comment. do not overwrite or remove it.
-   Also, make sure to start the actual text at the margin.
-   =========================================================
-
-
-Known Issues
-------------
-
-.. This section should contain new known issues in this release. Sample format:
-
-   * **Add title in present tense with full stop.**
-
-     Add a short 1-2 sentence description of the known issue in the present
-     tense. Add information on any known workarounds.
-
-   This section is a comment. do not overwrite or remove it.
-   Also, make sure to start the actual text at the margin.
-   =========================================================
+      This new build system support is incomplete at this point and is added
+      as experimental in this release. The existing build system using ``make``
+      is unaffected by these changes, and can continue to be used for this
+      and subsequent releases until such time as it's deprecation is announced.
 
 
 Shared Library Versions
@@ -428,10 +368,10 @@ Tested Platforms
      * Red Hat Enterprise Linux Server release 7.3
      * SUSE Enterprise Linux 12
      * Wind River Linux 8
-     * Ubantu 14.04
+     * Ubuntu 14.04
      * Ubuntu 16.04
      * Ubuntu 16.10
-     * Ubantu 17.10
+     * Ubuntu 17.10
 
    * NICs:
 
@@ -476,4 +416,3 @@ Tested Platforms
        * Firmware version: 1.63, 0x80000dda
        * Device id (pf/vf): 8086:1521 / 8086:1520
        * Driver version: 5.3.0-k (igb)
-
-- 
2.7.5
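
As an illustration of the ``queue-num-per-vf`` note in the patch above, here is
a minimal sketch of the documented rule: the value must be 1, 2, 4, 8 or 16,
and 4 is used when the parameter is absent. The helper name
parse_queue_num_per_vf() and the DEFAULT_QUEUES_PER_VF macro are hypothetical
and exist only for this example; this is not the i40e driver's actual devargs
parser.

```c
#include <stdlib.h>
#include <string.h>

/* Default stated in the release note when the parameter is not given. */
#define DEFAULT_QUEUES_PER_VF 4

/*
 * Hypothetical helper, NOT the i40e PMD implementation.
 * Takes a device argument string such as "aaaa:bb.cc,queue-num-per-vf=8".
 * Returns the queue count, the default when the key is absent,
 * or -1 when the value is not one of 1, 2, 4, 8, 16.
 */
static int parse_queue_num_per_vf(const char *devargs)
{
    static const char key[] = "queue-num-per-vf=";
    const char *p;
    char *end;
    long n;

    if (devargs == NULL)
        return DEFAULT_QUEUES_PER_VF;

    /* Simplification: the key is matched anywhere in the string. */
    p = strstr(devargs, key);
    if (p == NULL)
        return DEFAULT_QUEUES_PER_VF;

    p += sizeof(key) - 1;
    n = strtol(p, &end, 10);

    /* Reject an empty value or trailing garbage before the next ','. */
    if (end == p || (*end != '\0' && *end != ','))
        return -1;

    if (n == 1 || n == 2 || n == 4 || n == 8 || n == 16)
        return (int)n;

    return -1;
}
```

With this sketch, "queue-num-per-vf=8" yields 8, a string without the key
yields the default of 4, and "queue-num-per-vf=3" is rejected.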

^ permalink raw reply	[relevance 6%]
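
The VxLAN GRO note in the patch above refers to the IPv4 flags
(``MF==0 && frag_off==0``, ``DF==0``). A minimal sketch of those checks on the
16-bit flags/fragment-offset field of an IPv4 header follows. The bit
positions are those of the IPv4 header layout; the macro and function names
are made up for this example and are not the GRO library's API. The field is
assumed to be already converted to host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* IPv4 flags/fragment-offset field, host byte order. */
#define IPV4_FLAG_DF     0x4000u /* Don't Fragment */
#define IPV4_FLAG_MF     0x2000u /* More Fragments */
#define IPV4_OFFSET_MASK 0x1fffu /* fragment offset, in 8-byte units */

/* True when routers are allowed to fragment the packet (DF == 0). */
static bool ipv4_may_fragment(uint16_t frag_field)
{
    return (frag_field & IPV4_FLAG_DF) == 0;
}

/*
 * True when the packet is not a fragment:
 * no further fragments follow (MF == 0) and it starts at offset 0.
 */
static bool ipv4_is_complete(uint16_t frag_field)
{
    return (frag_field & (IPV4_FLAG_MF | IPV4_OFFSET_MASK)) == 0;
}
```

Per the release note, VxLAN GRO does not perform the second check itself: when
DF is 0 it assumes the input is complete, so an application that may receive
fragments would have to filter them before calling GRO.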

* Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors
  2018-02-14 12:32  4% [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors Shahaf Shuler
  2018-02-14 13:50  4% ` Thomas Monjalon
@ 2018-02-14 13:54  4% ` Doherty, Declan
  2018-02-14 14:50  4% ` Remy Horton
  2018-02-14 15:27  4% ` Boccassi, Luca
  3 siblings, 0 replies; 200+ results
From: Doherty, Declan @ 2018-02-14 13:54 UTC (permalink / raw)
  To: Shahaf Shuler
  Cc: dev, mohammad.abdul.awal, ferruh.yigit, remy.horton, nhorman, thomas

On 14/02/2018 12:32 PM, Shahaf Shuler wrote:
> This is following the RFC being discussed and targets 18.05
>
> http://dpdk.org/ml/archives/dev/2018-January/085716.html
>
> Cc: declan.doherty@intel.com
> Cc: mohammad.abdul.awal@intel.com
> Cc: ferruh.yigit@intel.com
> Cc: remy.horton@intel.com
>
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> ---
>   doc/guides/rel_notes/deprecation.rst | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index d59ad5988..f6151de63 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -59,3 +59,9 @@ Deprecation Notices
>     be added between the producer and consumer structures. The size of the
>     structure and the offset of the fields will remain the same on
>     platforms with 64B cache line, but will change on other platforms.
> +
> +* ethdev: A work is being planned for 18.05 to expose VF port representors
> +  as a mean to perform control and data path operation on the different VFs.
> +  As VF representor is an ethdev port, new fields are needed in order to map
> +  between the VF representor and the VF or the parent PF. Those new fields
> +  are to be included in ``rte_eth_dev_info`` struct.
>
Acked-by: Declan Doherty <declan.doherty@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: add ABI change notice for numa_node_count in eal
  2018-02-14  0:04  4%         ` Thomas Monjalon
@ 2018-02-14 14:25  4%           ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-14 14:25 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: dev, Bruce Richardson, Jerin Jacob, Mcnamara, John, Neil Horman,
	Kovacevic, Marko

14/02/2018 01:04, Thomas Monjalon:
> > > > > There will be a new function added in v18.05 that will return number of
> > > > > detected sockets, which will change the ABI.
> > > > > 
> > > > > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > > > > ---
> > > > > +* eal: new ``numa_node_count`` member will be added to ``rte_config``
> > > > > +structure in v18.05.
> > > > 
> > > > Acked-by: John McNamara <john.mcnamara@intel.com>
> > > 
> > > Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > >
> > Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> 
> Acked-by: Thomas Monjalon <thomas@monjalon.net>

Applied

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes
  2018-02-07 10:11  0%     ` Jerin Jacob
@ 2018-02-14 14:48  0%       ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-14 14:48 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, Jerin Jacob, Bruce Richardson, Neil Horman, John McNamara,
	Marko Kovacevic

07/02/2018 11:11, Jerin Jacob:
> -----Original Message-----
> > Date: Mon, 5 Feb 2018 11:47:42 +0000
> > From: Bruce Richardson <bruce.richardson@intel.com>
> > To: Anatoly Burakov <anatoly.burakov@intel.com>
> > CC: dev@dpdk.org, Neil Horman <nhorman@tuxdriver.com>, John McNamara
> >  <john.mcnamara@intel.com>, Marko Kovacevic <marko.kovacevic@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory
> >  hotplug changes
> > User-Agent: Mutt/1.9.1 (2017-09-22)
> > 
> > On Thu, Jan 18, 2018 at 10:32:28AM +0000, Anatoly Burakov wrote:
> > > Due to coming changes outlined in memory hotplug RFC, there will
> > > be several API/ABI changes.
> > > 
> > > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > > ---
> > Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> 
> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

Applied

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors
  2018-02-14 12:32  4% [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors Shahaf Shuler
  2018-02-14 13:50  4% ` Thomas Monjalon
  2018-02-14 13:54  4% ` Doherty, Declan
@ 2018-02-14 14:50  4% ` Remy Horton
  2018-02-14 15:27  4% ` Boccassi, Luca
  3 siblings, 0 replies; 200+ results
From: Remy Horton @ 2018-02-14 14:50 UTC (permalink / raw)
  To: Shahaf Shuler, nhorman, thomas
  Cc: dev, declan.doherty, mohammad.abdul.awal, ferruh.yigit


On 14/02/2018 12:32, Shahaf Shuler wrote:
[..]
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> ---
>  doc/guides/rel_notes/deprecation.rst | 6 ++++++
>  1 file changed, 6 insertions(+)

Acked-by: Remy Horton <remy.horton@intel.com>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool
  2018-02-01 12:53  4%     ` Hemant Agrawal
@ 2018-02-14 15:23  4%       ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-14 15:23 UTC (permalink / raw)
  To: Andrew Rybchenko; +Cc: dev, Hemant Agrawal, Jerin Jacob, Olivier Matz

> >>> An API/ABI changes are planned for 18.05 [1]:
> >>>
> >>>   * Allow to customize how mempool objects are stored in memory.
> >>>   * Deprecate mempool XMEM API.
> >>>   * Add mempool driver ops to get information from mempool driver and
> >>>     dequeue contiguous blocks of objects if driver supports it.
> >>>
> >>> [1] http://dpdk.org/ml/archives/dev/2018-January/088698.html
> >>>
> >>> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
> >>
> >> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > 
> > Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > 
> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

Applied

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors
  2018-02-14 12:32  4% [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors Shahaf Shuler
                   ` (2 preceding siblings ...)
  2018-02-14 14:50  4% ` Remy Horton
@ 2018-02-14 15:27  4% ` Boccassi, Luca
  2018-02-14 15:54  4%   ` Jerin Jacob
  3 siblings, 1 reply; 200+ results
From: Boccassi, Luca @ 2018-02-14 15:27 UTC (permalink / raw)
  To: shahafs, thomas, nhorman
  Cc: remy.horton, mohammad.abdul.awal, declan.doherty, ferruh.yigit, dev

On Wed, 2018-02-14 at 14:32 +0200, Shahaf Shuler wrote:
> This is following the RFC being discussed and targets 18.05
> 
> http://dpdk.org/ml/archives/dev/2018-January/085716.html
> 
> Cc: declan.doherty@intel.com
> Cc: mohammad.abdul.awal@intel.com
> Cc: ferruh.yigit@intel.com
> Cc: remy.horton@intel.com
> 
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> ---
>  doc/guides/rel_notes/deprecation.rst | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index d59ad5988..f6151de63 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -59,3 +59,9 @@ Deprecation Notices
>    be added between the producer and consumer structures. The size of
> the
>    structure and the offset of the fields will remain the same on
>    platforms with 64B cache line, but will change on other platforms.
> +
> +* ethdev: A work is being planned for 18.05 to expose VF port
> representors
> +  as a mean to perform control and data path operation on the
> different VFs.
> +  As VF representor is an ethdev port, new fields are needed in
> order to map
> +  between the VF representor and the VF or the parent PF. Those new
> fields
> +  are to be included in ``rte_eth_dev_info`` struct.

Acked-by: Luca Boccassi <luca.boccassi@intl.att.com>
Acked-by: Alex Zelezniak <alexz@att.com>

Acking on behalf of my colleague Alex as well, who replied privately.

-- 
Kind regards,
Luca Boccassi

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules
@ 2018-02-14 15:37  4% Adrien Mazarguil
  2018-02-14 15:37  3% ` [dpdk-dev] [PATCH v1 3/4] doc: announce API change for flow RSS/RAW actions Adrien Mazarguil
  2018-02-14 15:55  0% ` [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules Nélio Laranjeiro
  0 siblings, 2 replies; 200+ results
From: Adrien Mazarguil @ 2018-02-14 15:37 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ferruh Yigit, dev, Doherty, Declan, Shahaf Shuler,
	John Daley (johndale),
	Nelio Laranjeiro, Xueming(Steven) Li, Thomas Monjalon

Series of API/ABI change announcements for rte_flow to enable proper
encap/decap support and address various design issues that can't be
addressed without ABI impact.

Adrien Mazarguil (4):
  doc: announce API change for flow actions
  doc: announce API change for flow RSS action
  doc: announce API change for flow RSS/RAW actions
  doc: announce API change for flow VLAN pattern item

 doc/guides/rel_notes/deprecation.rst | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

-- 
2.11.0

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v1 3/4] doc: announce API change for flow RSS/RAW actions
  2018-02-14 15:37  4% [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules Adrien Mazarguil
@ 2018-02-14 15:37  3% ` Adrien Mazarguil
  2018-02-14 15:55  0% ` [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules Nélio Laranjeiro
  1 sibling, 0 replies; 200+ results
From: Adrien Mazarguil @ 2018-02-14 15:37 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ferruh Yigit, dev, Doherty, Declan, Shahaf Shuler,
	John Daley (johndale),
	Nelio Laranjeiro, Xueming(Steven) Li, Thomas Monjalon

C99-style flexible arrays were a bad idea for this API. This announces a
minor API/ABI change to remove them.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 doc/guides/rel_notes/deprecation.rst | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 40b76b391..77390ce9f 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -72,3 +72,8 @@ Deprecation Notices
   rte_eth_rss_conf`` due to its limitations. All parameters, including the
   currently missing hash function to use will be made part of ``struct
   rte_flow_action_rss`` directly.
+
+* rte_flow: C99-style flexible arrays in ``struct rte_flow_action_rss`` and
+  ``struct rte_flow_item_raw`` will be replaced by standard pointers to the
+  same data. They proved difficult to use in the field (e.g. no possibility
+  of static initialization) and are unsuitable for C++ applications.
-- 
2.11.0

^ permalink raw reply	[relevance 3%]
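
To illustrate the static initialization point made in the notice above, here
is a minimal sketch contrasting the two layouts. The structures ``rss_flex``
and ``rss_ptr`` are simplified, hypothetical stand-ins and not the real
``struct rte_flow_action_rss`` definitions. Standard C does not allow a
flexible array member to be given a static initializer (GCC accepts it only as
an extension) and standard C++ has no flexible array members at all, whereas a
plain pointer works in both.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in using a C99 flexible array member. */
struct rss_flex {
    uint16_t num;
    uint16_t queue[]; /* storage must be allocated together with the struct */
};

/* Hypothetical stand-in using a standard pointer to the same data. */
struct rss_ptr {
    uint16_t num;
    const uint16_t *queue;
};

/* The pointer variant can be initialized statically. */
static const uint16_t static_queues[] = { 0, 1, 2, 3 };
static const struct rss_ptr static_rss = {
    .num = sizeof(static_queues) / sizeof(static_queues[0]),
    .queue = static_queues,
};

/*
 * The flexible array variant needs a run-time allocation sized for the
 * trailing array. The caller frees the result with free().
 */
static struct rss_flex *rss_flex_create(const uint16_t *queues, uint16_t num)
{
    struct rss_flex *rss;

    rss = malloc(sizeof(*rss) + num * sizeof(rss->queue[0]));
    if (rss == NULL)
        return NULL;

    rss->num = num;
    if (num > 0)
        memcpy(rss->queue, queues, num * sizeof(rss->queue[0]));

    return rss;
}
```

The pointer layout costs one extra indirection, but lets an application
declare its configuration as a constant table instead of building it on the
heap.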

* Re: [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules
  2018-02-14 15:37  4% [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules Adrien Mazarguil
  2018-02-14 15:37  3% ` [dpdk-dev] [PATCH v1 3/4] doc: announce API change for flow RSS/RAW actions Adrien Mazarguil
@ 2018-02-14 15:55  0% ` Nélio Laranjeiro
  2018-02-14 16:06  0%   ` Andrew Rybchenko
  1 sibling, 1 reply; 200+ results
From: Nélio Laranjeiro @ 2018-02-14 15:55 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Neil Horman, Ferruh Yigit, dev, Doherty, Declan, Shahaf Shuler,
	John Daley (johndale),
	Xueming(Steven) Li, Thomas Monjalon

On Wed, Feb 14, 2018 at 04:37:26PM +0100, Adrien Mazarguil wrote:
> Series of API/ABI change announcements for rte_flow to enable proper
> encap/decap support and address various design issues that can't be
> addressed without ABI impact.
> 
> Adrien Mazarguil (4):
>   doc: announce API change for flow actions
>   doc: announce API change for flow RSS action
>   doc: announce API change for flow RSS/RAW actions
>   doc: announce API change for flow VLAN pattern item
> 
>  doc/guides/rel_notes/deprecation.rst | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> -- 
> 2.11.0

For the series,

Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors
  2018-02-14 15:27  4% ` Boccassi, Luca
@ 2018-02-14 15:54  4%   ` Jerin Jacob
  2018-02-14 16:50  4%     ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2018-02-14 15:54 UTC (permalink / raw)
  To: Boccassi, Luca
  Cc: shahafs, thomas, nhorman, remy.horton, mohammad.abdul.awal,
	declan.doherty, ferruh.yigit, dev

-----Original Message-----
> Date: Wed, 14 Feb 2018 15:27:50 +0000
> From: "Boccassi, Luca" <luca.boccassi@intl.att.com>
> To: "shahafs@mellanox.com" <shahafs@mellanox.com>, "thomas@monjalon.net"
>  <thomas@monjalon.net>, "nhorman@tuxdriver.com" <nhorman@tuxdriver.com>
> CC: "remy.horton@intel.com" <remy.horton@intel.com>,
>  "mohammad.abdul.awal@intel.com" <mohammad.abdul.awal@intel.com>,
>  "declan.doherty@intel.com" <declan.doherty@intel.com>,
>  "ferruh.yigit@intel.com" <ferruh.yigit@intel.com>, "dev@dpdk.org"
>  <dev@dpdk.org>
> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF
>  representors
> 
> On Wed, 2018-02-14 at 14:32 +0200, Shahaf Shuler wrote:
> > This is following the RFC being discussed and targets 18.05
> > 
> > http://dpdk.org/ml/archives/dev/2018-January/085716.html
> > 
> > Cc: declan.doherty@intel.com
> > Cc: mohammad.abdul.awal@intel.com
> > Cc: ferruh.yigit@intel.com
> > Cc: remy.horton@intel.com
> > 
> > Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index d59ad5988..f6151de63 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -59,3 +59,9 @@ Deprecation Notices
> >    be added between the producer and consumer structures. The size of
> > the
> >    structure and the offset of the fields will remain the same on
> >    platforms with 64B cache line, but will change on other platforms.
> > +
> > +* ethdev: A work is being planned for 18.05 to expose VF port
> > representors
> > +  as a mean to perform control and data path operation on the
> > different VFs.
> > +  As VF representor is an ethdev port, new fields are needed in
> > order to map
> > +  between the VF representor and the VF or the parent PF. Those new
> > fields
> > +  are to be included in ``rte_eth_dev_info`` struct.
> 
> Acked-by: Luca Boccassi <luca.boccassi@intl.att.com>
> Acked-by: Alex Zelezniak <alexz@att.com>

Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

> 
> Acking on behalf of my colleague Alex as well, who replied privately.
> 
> -- 
> Kind regards,
> Luca Boccassi

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules
  2018-02-14 15:55  0% ` [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules Nélio Laranjeiro
@ 2018-02-14 16:06  0%   ` Andrew Rybchenko
  0 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2018-02-14 16:06 UTC (permalink / raw)
  To: Nélio Laranjeiro, Adrien Mazarguil
  Cc: Neil Horman, Ferruh Yigit, dev, Doherty, Declan, Shahaf Shuler,
	John Daley (johndale),
	Xueming(Steven) Li, Thomas Monjalon

On 02/14/2018 06:55 PM, Nélio Laranjeiro wrote:
> On Wed, Feb 14, 2018 at 04:37:26PM +0100, Adrien Mazarguil wrote:
>> Series of API/ABI change announcements for rte_flow to enable proper
>> encap/decap support and address various design issues that can't be
>> addressed without ABI impact.
>>
>> Adrien Mazarguil (4):
>>    doc: announce API change for flow actions
>>    doc: announce API change for flow RSS action
>>    doc: announce API change for flow RSS/RAW actions
>>    doc: announce API change for flow VLAN pattern item
>>
>>   doc/guides/rel_notes/deprecation.rst | 23 +++++++++++++++++++++++
>>   1 file changed, 23 insertions(+)
>>
>> -- 
>> 2.11.0
> For the series,
>
> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>

For the series,

Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for RSS configuration structure
  2018-02-13 12:10  4%       ` Jerin Jacob
@ 2018-02-14 16:28  4%         ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-14 16:28 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: dev, Jerin Jacob, Ferruh Yigit, Shahaf Shuler, Neil Horman

> > >> Update deprecation notice for the new rss_level field of rte_eth_rss_conf.
> > >>
> > >> Link: http://www.dpdk.org/dev/patchwork/patch/31891
> > >>
> > >> Signed-off-by: Xueming Li <xuemingl@mellanox.com>
> > >> ---
> > >> +* ethdev: A new rss level field planned in 18.05.
> > >> +  The new API add rss_level field to ``rte_eth_rss_conf`` to enable a
> > >> +choice
> > >> +  of RSS hash calculation on outer or inner header of tunneled packet.
> > > 
> > > Acked-By: Shahaf Shuler <shahafs@mellanox.com>
> > 
> > Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 
> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

Applied

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors
  2018-02-14 15:54  4%   ` Jerin Jacob
@ 2018-02-14 16:50  4%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-14 16:50 UTC (permalink / raw)
  To: shahafs
  Cc: dev, Jerin Jacob, Boccassi, Luca, nhorman, remy.horton,
	mohammad.abdul.awal, declan.doherty, ferruh.yigit

> > > This is following the RFC being discussed and targets 18.05
> > > 
> > > http://dpdk.org/ml/archives/dev/2018-January/085716.html
> > > 
> > > Cc: declan.doherty@intel.com
> > > Cc: mohammad.abdul.awal@intel.com
> > > Cc: ferruh.yigit@intel.com
> > > Cc: remy.horton@intel.com
> > > 
> > > Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> > > ---
> > > +* ethdev: A work is being planned for 18.05 to expose VF port
> > > representors
> > > +  as a mean to perform control and data path operation on the
> > > different VFs.
> > > +  As VF representor is an ethdev port, new fields are needed in
> > > order to map
> > > +  between the VF representor and the VF or the parent PF. Those new
> > > fields
> > > +  are to be included in ``rte_eth_dev_info`` struct.
> > 
> > Acked-by: Luca Boccassi <luca.boccassi@intl.att.com>
> > Acked-by: Alex Zelezniak <alexz@att.com>
> 
> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

Applied

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v3] doc: ethdev ABI change deprecation notice
  2018-02-14  0:14  4%         ` Thomas Monjalon
@ 2018-02-14 17:18  4%           ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-14 17:18 UTC (permalink / raw)
  To: Kirill Rybalchenko
  Cc: dev, Olivier Matz, Ferruh Yigit, Neil Horman, andrey.chilikin,
	adrien.mazarguil

14/02/2018 01:14, Thomas Monjalon:
> > > >> Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
> > > >>
> > > >> Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
> > > >> ---
> > > >> +* ethdev: announce ABI change
> > > >> +  The size of variables flow_types_mask in rte_eth_fdir_info structure,
> > > >> +  sym_hash_enable_mask and valid_bit_mask in rte_eth_hash_global_conf structure
> > > >> +  will be increased from 32 to 64 bits to fulfill hardware requirements.
> > > >> +  This change will break existing ABI as size of the structures will increase.
> > > >> +
> > > > Acked-by: Neil Horman <nhorman@tuxdriver.com>
> > > 
> > > Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
> > 
> > Acked-by: Olivier Matz <olivier.matz@6wind.com>
> 
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> 
> I would prefer you drop the legacy code to keep only rte_flow.

Applied

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [dpdk-announce] DPDK 18.02 released
@ 2018-02-14 19:11  3% Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-14 19:11 UTC (permalink / raw)
  To: announce

A new major release is available:
	http://fast.dpdk.org/rel/dpdk-18.02.tar.xz

Special attention was paid to not break the ABI in this release.
It means 18.02 could replace 17.11 without rebuilding the applications.
However it is advised to keep using 17.11 LTS for long term deployments.

Some highlights:
	- new license header (SPDX tag)
	- bbdev (Wireless Base Band) device class
	- rawdev device class
	- ethdev probe notifications and port ownership
	- Hyper-V platform driver
	- AVF (Adaptive Virtual Function) ethdev driver
	- IPsec offload in DPAA
	- DPAA eventdev driver
	- OPDL (Ordered Packet Distribution Library) eventdev driver
	- experimental tags and automatic check
	- meson build system (beta)

More details in the release notes:
	http://dpdk.org/doc/guides/rel_notes/release_18_02.html

The statistics are similar to previous release:
	1315 patches from 145 authors
	2316 files changed, 100569 insertions(+), 77209 deletions(-)

There are 46 new contributors
(including authors, reviewers and testers):
Thanks to Aleksey Baulin, Amr Mokhtar, Andrea Grandi, Andrew Jackson,
Anoob Joseph, Avi Kivity, Bao-Long Tran, Bharat Mota, Cheryl Houser,
Ciara Power, David Coyle, Dustin Lundquist, Erik Gabriel Carrillo,
George Wilkie, Georgios Katsikas, Gong Deli, Hyong Youb Kim,
Jerry Lilijun, Jun Yang, Junjie Chen, Kefu Chai, Kevin Laatz,
Laszlo Ersek, Liang Ma, Mallesh Koujalagi, Martin Klozik,
Matthew Smith, Michael McConville, Natalie Samsonov, Nikhil Agarwal,
Peter Mccarthy, Prashant Bhole, Rafal Kozik, Rosen Xu, Roy Franz,
Sharmila Podury, Stefan Hajnoczi, Sunil Kumar Kori, Thomas Speier,
Tomasz Jozwiak, Vijay Srivastava, Wisam Jaddo, Xin Long, Yang Zhang,
Yanglong Wu and Zhike Wang.

Below is the number of patches per company (accuracy not perfect):
    463     Intel (57)
    213     Mellanox (11)
    132     NXP (7)
    131     Cavium (9)
    102     6WIND (8)
     83     Solarflare (6)
     27     Broadcom (2)
     24     RedHat (5)
     21     Semihalf (3)
     20     Microsoft (2)
     17     Cisco (3)
     16     OKTET Labs (2)
      9     AT&T (4)
      6     Marvell (1)
      5     Netronome (1)
      5     IBM (2)
      4     ZTE (1)
      4     Linaro (1)
      4     HXT Semiconductor (1)
      4     ARM (2)

The new features for 18.05 must be submitted before the next month,
in order to be reviewed and integrated during March.
The next release is expected to happen at the beginning of May.

Thanks everyone

PS: Like last year, this release is done during Valentine's day.
It is an opportunity to stop working and offer a day to your Valentine!

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v6] checkpatches.sh: Add checks for ABI symbol addition
                     ` (3 preceding siblings ...)
  2018-02-09 15:21  6% ` [dpdk-dev] [PATCH v5] " Neil Horman
@ 2018-02-14 19:19  6% ` Neil Horman
  4 siblings, 0 replies; 200+ results
From: Neil Horman @ 2018-02-14 19:19 UTC (permalink / raw)
  To: dev
  Cc: Neil Horman, thomas, john.mcnamara, bruce.richardson,
	Ferruh Yigit, Stephen Hemminger

Recently, some additional patches were added to allow for programmatic
marking of C symbols as experimental.  The addition of these markers is
dependent on the manual addition of exported symbols to the EXPERIMENTAL
section of the corresponding library's version map file.  The consensus
on review is that, in addition to requiring new symbols to be added to
the EXPERIMENTAL version in the map, we need a mechanism to enforce that
documented process when the symbols are introduced.
To that end, I am proposing this change.  It is an addition to the
checkpatches script, which scans incoming patches for additions and
removals of symbols in the map file, and warns the user appropriately.
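
As an illustration of the kind of scan described here, the following is a
minimal sketch and not the script from this patch. The map file path, the
symbol name and the function names are hypothetical, and the awk rules are a
simplified variant that handles only added lines.

```shell
#!/bin/sh
# Hypothetical patch fragment that adds one symbol to a version map file.
sample_patch() {
	cat <<'EOF'
--- a/lib/librte_foo/rte_foo_version.map
+++ b/lib/librte_foo/rte_foo_version.map
@@ -1,4 +1,10 @@
+EXPERIMENTAL {
+	global:
+
+	rte_foo_new_api;
+};
EOF
}

# Read a patch on stdin and print "<map> <symbol> <section> add"
# for every symbol line that the patch adds to a map file.
list_added_symbols() {
	awk '
		# "+++ b/<path>.map" names the map file being changed
		/^\+\+\+ b\/.*\.map/ { map = $2; sub(/^b\//, "", map); next }
		# An added line containing "{" opens a version section
		/^\+.*\{/ { sec = $1; sub(/^\+/, "", sec); in_sec = 1; next }
		# Any line containing "}" closes the current section
		/^[-+ ].*\}/ { in_sec = 0; next }
		# An added line ending in ";" is a symbol
		/^\+.*;$/ {
			sym = $2; sub(/;$/, "", sym)
			print map, sym, (in_sec ? sec : "unknown"), "add"
		}'
}

result=$(sample_patch | list_added_symbols)
echo "$result"
```

A symbol reported with a section other than EXPERIMENTAL is the case that
the check is meant to flag.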

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: thomas@monjalon.net
CC: john.mcnamara@intel.com
CC: bruce.richardson@intel.com
CC: Ferruh Yigit <ferruh.yigit@intel.com>
CC: Stephen Hemminger <stephen@networkplumber.org>

---
Change notes

v2)
 * Cleaned up and documented awk script (shemminger)
 * fixed sort/uniq usage (shemminger)
 * moved checking to new script (tmonjalon)
 * added maintainer entry (tmonjalon)
 * added license (tmonjalon)

v3)
 * Changed symbol check script name (tmonjalon)
 * Trapped exit to clean temp file (tmonjalon)
 * Honored verbose command (tmonjalon)
 * Cleaned left over debug bits (tmonjalon)
 * Updated location in MAINTAINERS file (tmonjalon)

v4)
 * Updated maintainers file (tmonjalon)

v5)
 * undo V4 (tmonjalon)

v6)
 * Cleaning up more nits (tmonjalon)
 * Combining some lines (tmonjalon)
 * Fixing error print condition (tmonjalon)
 * Redirect stdin to a file to allow rewinding for
   multiple passes on tools (nhorman)
---
 MAINTAINERS                     |   1 +
 devtools/check-symbol-change.sh | 146 ++++++++++++++++++++++++++++++++++++++++
 devtools/checkpatches.sh        |  46 +++++++++++--
 3 files changed, 188 insertions(+), 5 deletions(-)
 create mode 100755 devtools/check-symbol-change.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index a646ca3e1..f83b9ab33 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -87,6 +87,7 @@ M: Neil Horman <nhorman@tuxdriver.com>
 F: lib/librte_compat/
 F: doc/guides/rel_notes/deprecation.rst
 F: devtools/validate-abi.sh
+F: devtools/check-symbol-change.sh
 F: buildtools/check-experimental-syms.sh
 
 Driver information
diff --git a/devtools/check-symbol-change.sh b/devtools/check-symbol-change.sh
new file mode 100755
index 000000000..22b17e6f2
--- /dev/null
+++ b/devtools/check-symbol-change.sh
@@ -0,0 +1,146 @@
+#!/bin/sh
+
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Neil Horman <nhorman@tuxdriver.com>
+
+build_map_changes()
+{
+	local fname=$1
+	local mapdb=$2
+
+	cat $fname | filterdiff -i *.map | awk '
+		# Initialize our variables
+		BEGIN {map="";sym="";ar="";sec=""; in_sec=0}
+
+		# Anything that starts with + or -, followed by an a
+		# and ends in the string .map is the name of our map file
+		# This may appear multiple times in a patch if multiple
+		# map files are altered, and all section/symbol names
+		# appearing between a triggering of this rule and the
+		# next trigger of this rule are associated with this file
+		/[-+] a\/.*\.map/ {map=$2}
+
+		# Triggering this rule, which starts a line with a + and ends it
+		# with a { identifies a versioned section.  The section name is
+		# the rest of the line with the + and { symbols removed.
+		# Triggering this rule sets in_sec to 1, which activates the
+		# symbol rule below
+		/+.*{/ {gsub("+","");sec=$1; in_sec=1}
+
+		# This rule identifies the end of a section, and disables the
+		# symbol rule
+		/.*}/ {in_sec=0}
+
+		# This rule matches on a + followed by any characters except a :
+		# (which denotes a global vs local segment), and ends with a ;.
+		# The semicolon is removed and the symbol is printed with its
+		# association file name and version section, along with an
+		# indicator that the symbol is a new addition.  Note this rule
+		# only works if we have found a version section in the rule
+		# above (hence the in_sec check).  Otherwise we flag it as an
+		# unknown section
+		/^+[^}].*[^:*];/ {gsub(";","");sym=$2;
+			if (in_sec == 1) {
+				print map " " sym " " sec " add"
+			} else {
+				print map " " sym " unknown add"
+			}
+		}
+
+		# This is the same rule as above, but the rule matches on a
+		# leading - rather than a +, denoting that the symbol is being
+		# removed.
+		/^-[^}].*[^:*];/ {gsub(";","");sym=$2;
+			if (in_sec == 1) {
+				print map " " sym " " sec " del"
+			} else {
+				print map " " sym " unknown del"
+			}
+		}' > ./$mapdb
+
+		sort -u $mapdb > ./$mapdb.2
+		mv -f $mapdb.2 $mapdb
+
+}
+
+check_for_rule_violations()
+{
+	local mapdb=$1
+	local mname
+	local symname
+	local secname
+	local ar
+	local ret=0
+
+	while read mname symname secname ar
+	do
+		if [ "$ar" = "add" ]
+		then
+
+			if [ "$secname" = "unknown" ]
+			then
+				# Just inform the user of this occurrence, but
+				# don't flag it as an error
+				echo -n "INFO: symbol $symname is added but "
+				echo -n "patch has insufficient context "
+				echo -n "to determine the section name "
+				echo -n "please ensure the version is "
+				echo "EXPERIMENTAL"
+				continue
+			fi
+
+			if [ "$secname" != "EXPERIMENTAL" ]
+			then
+				# Symbols that are getting added in a section
+				# other than the experimental section need
+				# to be moving from an already supported
+				# section or it's a violation
+				grep "^$mname $symname .* del\$" $mapdb | \
+				grep -qv " EXPERIMENTAL del\$"
+				if [ $? -ne 0 ]
+				then
+					echo -n "ERROR: symbol $symname "
+					echo -n "is added in a section "
+					echo -n "other than the EXPERIMENTAL "
+					echo "section of the version map"
+					ret=1
+				fi
+			fi
+		else
+
+			if [ "$secname" != "EXPERIMENTAL" ]
+			then
+				# Just inform users that non-experimental
+				# symbols need to go through a deprecation
+				# process
+				echo -n "INFO: symbol $symname is being "
+				echo -n "removed, ensure that it has "
+				echo "gone through the deprecation process"
+			fi
+		fi
+	done < $mapdb
+
+	return $ret
+}
+
+trap clean_and_exit_on_sig EXIT
+
+mapfile=`mktemp mapdb.XXXXXX`
+patch=$1
+exit_code=1
+
+clean_and_exit_on_sig()
+{
+	rm -f $mapfile
+	exit $exit_code
+}
+
+build_map_changes $patch $mapfile
+check_for_rule_violations $mapfile
+exit_code=$?
+
+rm -f $mapfile
+
+exit $exit_code
+
+
diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 7676a6b50..fa36b0d98 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -35,6 +35,10 @@
 # - DPDK_CHECKPATCH_LINE_LENGTH
 . $(dirname $(readlink -e $0))/load-devel-config
 
+trap 'rm -f "$TMPINPUT"' INT
+
+VALIDATE_NEW_API=$(dirname $(readlink -e $0))/check-symbol-change.sh
+
 length=${DPDK_CHECKPATCH_LINE_LENGTH:-80}
 
 # override default Linux options
@@ -61,6 +65,7 @@ print_usage () {
 	END_OF_HELP
 }
 
+
 number=0
 quiet=false
 verbose=false
@@ -86,19 +91,50 @@ total=0
 status=0
 
 check () { # <patch> <commit> <title>
+	local ret=0
+	TMPINPUT=`mktemp checkpatches.XXXXXX`
+
 	total=$(($total + 1))
 	! $verbose || printf '\n### %s\n\n' "$3"
 	if [ -n "$1" ] ; then
 		report=$($DPDK_CHECKPATCH_PATH $options "$1" 2>/dev/null)
 	elif [ -n "$2" ] ; then
-		report=$(git format-patch --find-renames --no-stat --stdout -1 $commit |
+		git format-patch --find-renames --no-stat --stdout -1 $commit > ./$TMPINPUT
+		report=$(cat ./$TMPINPUT |
 			$DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
 	else
-		report=$($DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
+		cat > ./$TMPINPUT
+		report=$(cat ./$TMPINPUT | $DPDK_CHECKPATCH_PATH $options - 2>/dev/null)
+	fi
+	if [ $? -ne 0 ]
+	then
+		$verbose || printf '\n### %s\n\n' "$3"
+		printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
+		ret=1
+	fi
+
+	! $verbose || printf '\nChecking API additions/removals:\n'
+
+	if [ -n "$1" ] ; then
+		report=$($VALIDATE_NEW_API "$1")
+	elif [ -n "$2" ] ; then
+		report=$(cat ./$TMPINPUT |
+			$VALIDATE_NEW_API -)
+	else
+		report=$(cat ./$TMPINPUT | $VALIDATE_NEW_API -)
+	fi
+
+	if [ $? -ne 0 ]
+	then
+		printf '%s\n' "$report"
+		ret=1
+	fi
+
+	rm -f ./$TMPINPUT
+	if [ $ret -eq 0 ]
+	then
+		return 0
 	fi
-	[ $? -ne 0 ] || return 0
-	$verbose || printf '\n### %s\n\n' "$3"
-	printf '%s\n' "$report" | sed -n '1,/^total:.*lines checked$/p'
 	status=$(($status + 1))
 }
 
-- 
2.14.3
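
The v6 change note about redirecting stdin to a file can be illustrated with
a small sketch. It is not taken from checkpatches.sh: the function name
`run_passes` and the two stand-in "tools" (a line count and a count of added
lines) are assumptions made for the example.

```shell
#!/bin/sh
# stdin can only be consumed once, so spool it into a temporary file
# and let each pass read that file from the beginning.
run_passes() {
	spool=$(mktemp) || return 1
	cat > "$spool"                          # the only read of stdin

	lines=$(wc -l < "$spool" | tr -d ' ')   # first pass over the patch
	added=$(grep -c '^+' "$spool")          # second pass over the same data

	rm -f "$spool"
	echo "lines=$lines added=$added"
}

summary=$(printf '%s\n' ' context' '+added one' '+added two' | run_passes)
echo "$summary"
```

Without the spool file, the second tool would see an empty stdin because the
first one had already read it to the end.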

^ permalink raw reply	[relevance 6%]

* [dpdk-dev] [PATCH v1] doc: add template release notes for 18.05
@ 2018-02-15 13:04  6% John McNamara
  2018-02-16 22:54  0% ` Carrillo, Erik G
  0 siblings, 1 reply; 200+ results
From: John McNamara @ 2018-02-15 13:04 UTC (permalink / raw)
  To: dev; +Cc: John McNamara

Add template release notes for DPDK 18.05 with inline
comments and explanations of the various sections.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/rel_notes/release_18_05.rst | 187 +++++++++++++++++++++++++++++++++
 1 file changed, 187 insertions(+)
 create mode 100644 doc/guides/rel_notes/release_18_05.rst

diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
new file mode 100644
index 0000000..85f4dc5
--- /dev/null
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -0,0 +1,187 @@
+DPDK Release 18.05
+==================
+
+.. **Read this first.**
+
+   The text in the sections below explains how to update the release notes.
+
+   Use proper spelling, capitalization and punctuation in all sections.
+
+   Variable and config names should be quoted as fixed width text:
+   ``LIKE_THIS``.
+
+   Build the docs and view the output file to ensure the changes are correct::
+
+      make doc-guides-html
+
+      xdg-open build/doc/html/guides/rel_notes/release_18_05.html
+
+
+New Features
+------------
+
+.. This section should contain new features added in this release. Sample
+   format:
+
+   * **Add a title in the past tense with a full stop.**
+
+     Add a short 1-2 sentence description in the past tense. The description
+     should be enough to allow someone scanning the release notes to
+     understand the new feature.
+
+     If the feature adds a lot of sub-features you can use a bullet list like
+     this:
+
+     * Added feature foo to do something.
+     * Enhanced feature bar to do something else.
+
+     Refer to the previous release notes for examples.
+
+     This section is a comment. Do not overwrite or remove it.
+     Also, make sure to start the actual text at the margin.
+     =========================================================
+
+
+API Changes
+-----------
+
+.. This section should contain API changes. Sample format:
+
+   * Add a short 1-2 sentence description of the API change. Use fixed width
+     quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past
+     tense.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =========================================================
+
+
+ABI Changes
+-----------
+
+.. This section should contain ABI changes. Sample format:
+
+   * Add a short 1-2 sentence description of the ABI change that was announced
+     in the previous releases and made in this release. Use fixed width quotes
+     for ``rte_function_names`` or ``rte_struct_names``. Use the past tense.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =========================================================
+
+
+Removed Items
+-------------
+
+.. This section should contain removed items in this release. Sample format:
+
+   * Add a short 1-2 sentence description of the removed item in the past
+     tense.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =========================================================
+
+
+Known Issues
+------------
+
+.. This section should contain new known issues in this release. Sample format:
+
+   * **Add title in present tense with full stop.**
+
+     Add a short 1-2 sentence description of the known issue in the present
+     tense. Add information on any known workarounds.
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =========================================================
+
+
+Shared Library Versions
+-----------------------
+
+.. Update any library version updated in this release and prepend with a ``+``
+   sign, like this:
+
+     librte_acl.so.2
+   + librte_cfgfile.so.2
+     librte_cmdline.so.2
+
+   This section is a comment. Do not overwrite or remove it.
+   =========================================================
+
+
+The libraries prepended with a plus sign were incremented in this version.
+
+.. code-block:: diff
+
+     librte_acl.so.2
+     librte_bbdev.so.1
+     librte_bitratestats.so.2
+     librte_bus_dpaa.so.1
+     librte_bus_fslmc.so.1
+     librte_bus_pci.so.1
+     librte_bus_vdev.so.1
+     librte_cfgfile.so.2
+     librte_cmdline.so.2
+     librte_cryptodev.so.4
+     librte_distributor.so.1
+     librte_eal.so.6
+     librte_ethdev.so.8
+     librte_eventdev.so.3
+     librte_flow_classify.so.1
+     librte_gro.so.1
+     librte_gso.so.1
+     librte_hash.so.2
+     librte_ip_frag.so.1
+     librte_jobstats.so.1
+     librte_kni.so.2
+     librte_kvargs.so.1
+     librte_latencystats.so.1
+     librte_lpm.so.2
+     librte_mbuf.so.3
+     librte_mempool.so.3
+     librte_meter.so.1
+     librte_metrics.so.1
+     librte_net.so.1
+     librte_pci.so.1
+     librte_pdump.so.2
+     librte_pipeline.so.3
+     librte_pmd_bnxt.so.2
+     librte_pmd_bond.so.2
+     librte_pmd_i40e.so.2
+     librte_pmd_ixgbe.so.2
+     librte_pmd_ring.so.2
+     librte_pmd_softnic.so.1
+     librte_pmd_vhost.so.2
+     librte_port.so.3
+     librte_power.so.1
+     librte_rawdev.so.1
+     librte_reorder.so.1
+     librte_ring.so.1
+     librte_sched.so.1
+     librte_security.so.1
+     librte_table.so.3
+     librte_timer.so.1
+     librte_vhost.so.3
+
+
+Tested Platforms
+----------------
+
+.. This section should contain a list of platforms that were tested with this
+   release.
+
+   The format is:
+
+   * <vendor> platform with <vendor> <type of devices> combinations
+
+     * List of CPU
+     * List of OS
+     * List of devices
+     * Other relevant details...
+
+   This section is a comment. Do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =========================================================
-- 
2.7.5
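
The "Shared Library Versions" convention above (prefix a library with ``+``
when its version was incremented) could be generated rather than edited by
hand. The following is only a sketch: the function name `mark_bumped_libs`
and the version numbers in the sample lists are hypothetical, and nothing
like it is part of this patch.

```shell
#!/bin/sh
# Compare two lists of sonames (one per line) and print the new list,
# prefixing every entry that is absent from the old list with "+".
mark_bumped_libs() {
	awk '
		NR == FNR { old[$1] = 1; next }     # first file: remember old sonames
		{ printf "%s %s\n", (($1 in old) ? " " : "+"), $1 }
	' "$1" "$2"
}

old=$(mktemp) || exit 1
new=$(mktemp) || exit 1
printf '%s\n' librte_acl.so.2 librte_cfgfile.so.2 > "$old"
printf '%s\n' librte_acl.so.2 librte_cfgfile.so.3 > "$new"   # made-up bump

marked=$(mark_bumped_libs "$old" "$new")
rm -f "$old" "$new"
echo "$marked"
```

The sketch assumes the old list is not empty, since the `NR == FNR` idiom
cannot tell the two files apart otherwise.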

^ permalink raw reply	[relevance 6%]

* Re: [dpdk-dev] [PATCH v2] net/tap: add CRC stripping capability
  @ 2018-02-15 21:55  3%   ` Stephen Hemminger
  2018-02-16 13:00  0%     ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2018-02-15 21:55 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Ophir Munk, dev, Pascal Mazon, Olga Shern, stable

On Tue, 13 Feb 2018 17:35:20 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:

> 13/02/2018 09:14, Ophir Munk:
> > CRC stripping is executed in the kernel outside of TAP PMD scope.
> > Nothing prevents the TAP PMD from reporting the Rx CRC
> > stripping capability.
> > In the faulty code, the TAP PMD did not report this capability.
> > The fix enables the TAP PMD to report that Rx CRC stripping is supported.
> > 
> > Fixes: 02f96a0a82d1 ("net/tap: add TUN/TAP device PMD")
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Ophir Munk <ophirmu@mellanox.com>  
> 
> Applied, thanks
> 

The whole CRC strip flag notion is backwards. It really should have been
a bit set if the driver allows preserving the CRC.

Since changing the ABI is not possible right now,
the ethdev core ought to log a warning whenever a driver is registered
without the CRC_STRIP flag.

Or does the lack of CRC_STRIP in the offload flags imply that the driver
can do both stripping and not stripping?

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2] net/tap: add CRC stripping capability
  2018-02-15 21:55  3%   ` Stephen Hemminger
@ 2018-02-16 13:00  0%     ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-16 13:00 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Ophir Munk, dev, Pascal Mazon, Olga Shern

15/02/2018 22:55, Stephen Hemminger:
> On Tue, 13 Feb 2018 17:35:20 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > 13/02/2018 09:14, Ophir Munk:
> > > CRC stripping is executed in the kernel outside of TAP PMD scope.
> > > Nothing prevents the TAP PMD from reporting the Rx CRC
> > > stripping capability.
> > > In the faulty code, the TAP PMD did not report this capability.
> > > The fix enables the TAP PMD to report that Rx CRC stripping is supported.
> > > 
> > > Fixes: 02f96a0a82d1 ("net/tap: add TUN/TAP device PMD")
> > > Cc: stable@dpdk.org
> > > 
> > > Signed-off-by: Ophir Munk <ophirmu@mellanox.com>  
> > 
> > Applied, thanks
> > 
> 
> The whole CRC strip flag notion is backwards. It really should have been
> a bit set if the driver allows preserving the CRC.
> 
> Since changing the ABI is not possible right now,
> the ethdev core ought to log a warning whenever a driver is registered
> without the CRC_STRIP flag.
> 
> Or does the lack of CRC_STRIP in the offload flags imply that the driver
> can do both stripping and not stripping?

I agree we should change the API.
Let's open a new thread to discuss it with a wider audience.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v1] doc: add template release notes for 18.05
  2018-02-15 13:04  6% [dpdk-dev] [PATCH v1] doc: add template release notes for 18.05 John McNamara
@ 2018-02-16 22:54  0% ` Carrillo, Erik G
  0 siblings, 0 replies; 200+ results
From: Carrillo, Erik G @ 2018-02-16 22:54 UTC (permalink / raw)
  To: Mcnamara, John, dev; +Cc: Mcnamara, John

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of John McNamara
> Sent: Thursday, February 15, 2018 7:04 AM
> To: dev@dpdk.org
> Cc: Mcnamara, John <john.mcnamara@intel.com>
> Subject: [dpdk-dev] [PATCH v1] doc: add template release notes for 18.05
> 
> Add template release notes for DPDK 18.05 with inline comments and
> explanations of the various sections.
> 
> Signed-off-by: John McNamara <john.mcnamara@intel.com>
> ---
>  doc/guides/rel_notes/release_18_05.rst | 187 +++++++++++++++++++++++++++++++++
>  1 file changed, 187 insertions(+)
>  create mode 100644 doc/guides/rel_notes/release_18_05.rst
> 
> diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
> new file mode 100644
> index 0000000..85f4dc5
> --- /dev/null
> +++ b/doc/guides/rel_notes/release_18_05.rst
> @@ -0,0 +1,187 @@
> +DPDK Release 18.05
> +==================
> +
> +.. **Read this first.**
> +
> +   The text in the sections below explains how to update the release notes.
> +
> +   Use proper spelling, capitalization and punctuation in all sections.
> +
> +   Variable and config names should be quoted as fixed width text:
> +   ``LIKE_THIS``.
> +
> +   Build the docs and view the output file to ensure the changes are correct::
> +
> +      make doc-guides-html
> +
> +      xdg-open build/doc/html/guides/rel_notes/release_18_05.html
> +
> +
> +New Features
> +------------
> +
> +.. This section should contain new features added in this release. Sample
> +   format:
> +
> +   * **Add a title in the past tense with a full stop.**
> +
> +     Add a short 1-2 sentence description in the past tense. The description
> +     should be enough to allow someone scanning the release notes to
> +     understand the new feature.
> +
> +     If the feature adds a lot of sub-features you can use a bullet list like
> +     this:
> +
> +     * Added feature foo to do something.
> +     * Enhanced feature bar to do something else.
> +
> +     Refer to the previous release notes for examples.
> +
> +     This section is a comment. Do not overwrite or remove it.
> +     Also, make sure to start the actual text at the margin.
> +     =========================================================
> +
> +
> +API Changes
> +-----------
> +
> +.. This section should contain API changes. Sample format:
> +
> +   * Add a short 1-2 sentence description of the API change. Use fixed width
> +     quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past
> +     tense.
> +
> +   This section is a comment. Do not overwrite or remove it.
> +   Also, make sure to start the actual text at the margin.
> +   =========================================================
> +
> +
> +ABI Changes
> +-----------
> +
> +.. This section should contain ABI changes. Sample format:
> +
> +   * Add a short 1-2 sentence description of the ABI change that was announced
> +     in the previous releases and made in this release. Use fixed width quotes
> +     for ``rte_function_names`` or ``rte_struct_names``. Use the past tense.
> +
> +   This section is a comment. Do not overwrite or remove it.
> +   Also, make sure to start the actual text at the margin.
> +   =========================================================
> +
> +
> +Removed Items
> +-------------
> +
> +.. This section should contain removed items in this release. Sample format:
> +
> +   * Add a short 1-2 sentence description of the removed item in the past
> +     tense.
> +
> +   This section is a comment. Do not overwrite or remove it.
> +   Also, make sure to start the actual text at the margin.
> +   =========================================================
> +
> +
> +Known Issues
> +------------
> +
> +.. This section should contain new known issues in this release. Sample format:
> +
> +   * **Add title in present tense with full stop.**
> +
> +     Add a short 1-2 sentence description of the known issue in the present
> +     tense. Add information on any known workarounds.
> +
> +   This section is a comment. Do not overwrite or remove it.
> +   Also, make sure to start the actual text at the margin.
> +
> =========================================================
> +
> +
> +Shared Library Versions
> +-----------------------
> +
> +.. Update any library version updated in this release and prepend with a ``+``
> +   sign, like this:
> +
> +     librte_acl.so.2
> +   + librte_cfgfile.so.2
> +     librte_cmdline.so.2
> +
> +   This section is a comment. Do not overwrite or remove it.
> +
> =========================================================
> +
> +
> +The libraries prepended with a plus sign were incremented in this version.
> +
> +.. code-block:: diff
> +
> +     librte_acl.so.2
> +     librte_bbdev.so.1
> +     librte_bitratestats.so.2
> +     librte_bus_dpaa.so.1
> +     librte_bus_fslmc.so.1
> +     librte_bus_pci.so.1
> +     librte_bus_vdev.so.1
> +     librte_cfgfile.so.2
> +     librte_cmdline.so.2
> +     librte_cryptodev.so.4
> +     librte_distributor.so.1
> +     librte_eal.so.6
> +     librte_ethdev.so.8
> +     librte_eventdev.so.3
> +     librte_flow_classify.so.1
> +     librte_gro.so.1
> +     librte_gso.so.1
> +     librte_hash.so.2
> +     librte_ip_frag.so.1
> +     librte_jobstats.so.1
> +     librte_kni.so.2
> +     librte_kvargs.so.1
> +     librte_latencystats.so.1
> +     librte_lpm.so.2
> +     librte_mbuf.so.3
> +     librte_mempool.so.3
> +     librte_meter.so.1
> +     librte_metrics.so.1
> +     librte_net.so.1
> +     librte_pci.so.1
> +     librte_pdump.so.2
> +     librte_pipeline.so.3
> +     librte_pmd_bnxt.so.2
> +     librte_pmd_bond.so.2
> +     librte_pmd_i40e.so.2
> +     librte_pmd_ixgbe.so.2
> +     librte_pmd_ring.so.2
> +     librte_pmd_softnic.so.1
> +     librte_pmd_vhost.so.2
> +     librte_port.so.3
> +     librte_power.so.1
> +     librte_rawdev.so.1
> +     librte_reorder.so.1
> +     librte_ring.so.1
> +     librte_sched.so.1
> +     librte_security.so.1
> +     librte_table.so.3
> +     librte_timer.so.1
> +     librte_vhost.so.3
> +
> +
> +Tested Platforms
> +----------------
> +
> +.. This section should contain a list of platforms that were tested with this
> +   release.
> +
> +   The format is:
> +
> +   * <vendor> platform with <vendor> <type of devices> combinations
> +
> +     * List of CPU
> +     * List of OS
> +     * List of devices
> +     * Other relevant details...
> +
> +   This section is a comment. Do not overwrite or remove it.
> +   Also, make sure to start the actual text at the margin.
> +
> =========================================================
> --
> 2.7.5

Acked-by:  Erik Gabriel Carrillo <erik.g.carrillo@intel.com>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v4] lib/librte_meter: add meter configuration profile
  @ 2018-02-19 21:12  3%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-19 21:12 UTC (permalink / raw)
  To: Jasvinder Singh, cristian.dumitrescu; +Cc: dev

08/01/2018 16:43, Jasvinder Singh:
> From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> 
> This patch adds support for meter configuration profiles.
> Benefits: simplified configuration procedure, improved performance.
> 
> Q1: What is the configuration profile and why does it make sense?
> A1: The configuration profile represents the set of configuration
>     parameters for a given meter object, such as the rates and sizes for
>     the token buckets. The configuration profile concept makes sense when
>     many meter objects share the same configuration, which is the typical
>     usage model: thousands of traffic flows are each individually metered
>     according to just a few service levels (i.e. profiles).
> 
> Q2: How is the configuration profile improving the performance?
> A2: The performance improvement is achieved by reducing the memory
>     footprint of a meter object, which results in better cache utilization
>     for the typical case when large arrays of meter objects are used. The
>     internal data structures stored for each meter object contain:
>        a) Constant fields: Low level translation of the configuration
>           parameters that does not change post-configuration. This is
>           really duplicated for all meters that use the same
>           configuration. This is the configuration profile data that is
>           moved away from the meter object. Current size (implementation
>           dependent): srTCM = 32 bytes, trTCM = 32 bytes.
>        b) Variable fields: Time stamps and running counters that change
>           during the on-going traffic metering process. Current size
>           (implementation dependent): srTCM = 24 bytes, trTCM = 32 bytes.
>           Therefore, by moving the constant fields to a separate profile
>           data structure shared by all the meters with the same
>           configuration, the size of the meter object is reduced by ~50%.
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>

Applied for 18.05 (was postponed to preserve 18.02 ABI), thanks.
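
For readers following the Q2/A2 reasoning above, here is a minimal sketch of
the footprint arithmetic. The struct and field names are hypothetical and are
not the real librte_meter definitions; only the srTCM sizes (32 bytes of
constant fields, 24 bytes of variable fields) come from the commit message.
The sketch also assumes the caller passes the profile alongside the meter, so
no per-meter pointer to the profile is stored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant part: derived from the configuration, never
 * modified after setup, so one copy can serve many meters. 32 bytes. */
struct srtcm_profile_sketch {
	uint64_t cbs;
	uint64_t ebs;
	uint64_t cir_period;
	uint64_t cir_bytes_per_period;
};

/* Hypothetical variable part: time stamp and running token counters
 * that change while traffic is metered. 24 bytes. */
struct srtcm_state_sketch {
	uint64_t time;
	uint64_t tc;
	uint64_t te;
};

/* Layout before the split: every meter carries its own copy of the
 * constant fields. 56 bytes per meter. */
struct srtcm_combined_sketch {
	struct srtcm_profile_sketch params;
	struct srtcm_state_sketch state;
};

/* Bytes needed for n_meters meters when each embeds its parameters. */
static size_t
footprint_combined(size_t n_meters)
{
	return n_meters * sizeof(struct srtcm_combined_sketch);
}

/* Bytes needed when n_meters meters share n_profiles profiles. */
static size_t
footprint_split(size_t n_meters, size_t n_profiles)
{
	return n_meters * sizeof(struct srtcm_state_sketch) +
	       n_profiles * sizeof(struct srtcm_profile_sketch);
}
```

With these assumed layouts, 1000 srTCM meters sharing 4 profiles need
24128 bytes instead of 56000, which is in line with the roughly 50%
reduction claimed in the commit message.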

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH] doc: fixing grammar
@ 2018-02-22 12:15  8% Alejandro Lucero
  0 siblings, 0 replies; 200+ results
From: Alejandro Lucero @ 2018-02-22 12:15 UTC (permalink / raw)
  To: dev; +Cc: stable

My English is far worse than that of the marketing team.

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
---
 doc/guides/nics/nfp.rst | 43 ++++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst
index 99a3b76..67e574e 100644
--- a/doc/guides/nics/nfp.rst
+++ b/doc/guides/nics/nfp.rst
@@ -34,14 +34,14 @@ NFP poll mode driver library
 Netronome's sixth generation of flow processors pack 216 programmable
 cores and over 100 hardware accelerators that uniquely combine packet,
 flow, security and content processing in a single device that scales
-up to 400 Gbps.
+up to 400-Gb/s.
 
 This document explains how to use DPDK with the Netronome Poll Mode
 Driver (PMD) supporting Netronome's Network Flow Processor 6xxx
 (NFP-6xxx) and Netronome's Flow Processor 4xxx (NFP-4xxx).
 
 NFP is a SRIOV capable device and the PMD driver supports the physical
-function (PF) and virtual functions (VFs).
+function (PF) and the virtual functions (VFs).
 
 Dependencies
 ------------
@@ -49,17 +49,18 @@ Dependencies
 Before using the Netronome's DPDK PMD some NFP configuration,
 which is not related to DPDK, is required. The system requires
 installation of **Netronome's BSP (Board Support Package)** along
-with some specific NFP firmware application. Netronome's NSP ABI
+with a specific NFP firmware application. Netronome's NSP ABI
 version should be 0.20 or higher.
 
 If you have a NFP device you should already have the code and
-documentation for doing all this configuration. Contact
+documentation for this configuration. Contact
 **support@netronome.com** to obtain the latest available firmware.
 
-The NFP Linux netdev kernel driver for VFs is part of vanilla kernel
-since kernel version 4.5, and support for the PF since kernel version
-4.11. Support for older kernels can be obtained on Github at
-**https://github.com/Netronome/nfp-drv-kmods** along with build
+The NFP Linux netdev kernel driver for VFs has been a part of the
+vanilla kernel since kernel version 4.5, and support for the PF
+since kernel version 4.11. Support for older kernels can be obtained
+on Github at
+**https://github.com/Netronome/nfp-drv-kmods** along with the build
 instructions.
 
 NFP PMD needs to be used along with UIO ``igb_uio`` or VFIO (``vfio-pci``)
@@ -70,15 +71,15 @@ Building the software
 
 Netronome's PMD code is provided in the **drivers/net/nfp** directory.
 Although NFP PMD has Netronome´s BSP dependencies, it is possible to
-compile it along with other DPDK PMDs even if no BSP was installed before.
+compile it along with other DPDK PMDs even if no BSP was installed previously.
 Of course, a DPDK app will require such a BSP installed for using the
 NFP PMD, along with a specific NFP firmware application.
 
-Default PMD configuration is at **common_linuxapp configuration** file:
+Default PMD configuration is at the **common_linuxapp configuration** file:
 
 - **CONFIG_RTE_LIBRTE_NFP_PMD=y**
 
-Once DPDK is built all the DPDK apps and examples include support for
+Once the DPDK is built all the DPDK apps and examples include support for
 the NFP PMD.
 
 
@@ -91,18 +92,18 @@ for details.
 Using the PF
 ------------
 
-NFP PMD has support for using the NFP PF as another DPDK port, but it does not
+NFP PMD supports using the NFP PF as another DPDK port, but it does not
 have any functionality for controlling VFs. In fact, it is not possible to use
 the PMD with the VFs if the PF is being used by DPDK, that is, with the NFP PF
-bound to ``igb_uio`` or ``vfio-pci`` kernel drivers. Future DPDK version will
+bound to ``igb_uio`` or ``vfio-pci`` kernel drivers. Future DPDK versions will
 have a PMD able to work with the PF and VFs at the same time and with the PF
 implementing VF management along with other PF-only functionalities/offloads.
 
-The PMD PF has extra work to do which will delay the DPDK app initialization
-like checking if a firmware is already available in the device, uploading the
-firmware if necessary, and configure the Link state properly when starting or
-stopping a PF port. Note that firmware upload is not always necessary which is
-the main delay for NFP PF PMD initialization.
+The PMD PF has extra work to do which will delay the DPDK app initialization.
+This additional effort could be checking if a firmware is already available in
+the device, uploading the firmware if necessary or configuring the Link state
+properly when starting or stopping a PF port. Note that firmware upload is not
+always necessary which is the main delay for NFP PF PMD initialization.
 
 Depending on the Netronome product installed in the system, firmware files
 should be available under ``/lib/firmware/netronome``. DPDK PMD supporting the
@@ -114,14 +115,14 @@ PF multiport support
 --------------------
 
 Some NFP cards support several physical ports with just one single PCI device.
-DPDK core is designed with the 1:1 relationship between PCI devices and DPDK
+The DPDK core is designed with a 1:1 relationship between PCI devices and DPDK
 ports, so NFP PMD PF support requires handling the multiport case specifically.
 During NFP PF initialization, the PMD will extract the information about the
 number of PF ports from the firmware and will create as many DPDK ports as
 needed.
 
 Because the unusual relationship between a single PCI device and several DPDK
-ports, there are some limitations when using more than one PF DPDK ports: there
+ports, there are some limitations when using more than one PF DPDK port: there
 is no support for RX interrupts and it is not possible either to use those PF
 ports with the device hotplug functionality.
 
@@ -136,7 +137,7 @@ System configuration
    get the drivers from the above Github repository and follow the instructions
    for building and installing it.
 
-   Virtual Functions need to be enabled before they can be used with the PMD.
+   VFs need to be enabled before they can be used with the PMD.
    Before enabling the VFs it is useful to obtain information about the
    current NFP PCI device detected by the system:
 
-- 
1.9.1

^ permalink raw reply	[relevance 8%]

* [dpdk-dev] [PATCH v2 1/7] crypto/virtio: add virtio related fundamental functions
  @ 2018-02-24 13:14  2% ` Jay Zhou
  0 siblings, 0 replies; 200+ results
From: Jay Zhou @ 2018-02-24 13:14 UTC (permalink / raw)
  To: dev
  Cc: pablo.de.lara.guarch, roy.fan.zhang, thomas, arei.gonglei,
	xin.zeng, weidong.huang, wangxinxin.wang, longpeng2,
	jianjay.zhou

Since there is no common virtio library, we have to put these files
here. They are basically the same as the virtio net related files,
with some minor changes.

Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
---
 config/common_base                  |  20 ++
 drivers/crypto/virtio/virtio_logs.h |  47 ++++
 drivers/crypto/virtio/virtio_pci.c  | 460 ++++++++++++++++++++++++++++++++++++
 drivers/crypto/virtio/virtio_pci.h  | 252 ++++++++++++++++++++
 drivers/crypto/virtio/virtio_ring.h | 137 +++++++++++
 drivers/crypto/virtio/virtqueue.c   |  43 ++++
 drivers/crypto/virtio/virtqueue.h   | 176 ++++++++++++++
 7 files changed, 1135 insertions(+)
 create mode 100644 drivers/crypto/virtio/virtio_logs.h
 create mode 100644 drivers/crypto/virtio/virtio_pci.c
 create mode 100644 drivers/crypto/virtio/virtio_pci.h
 create mode 100644 drivers/crypto/virtio/virtio_ring.h
 create mode 100644 drivers/crypto/virtio/virtqueue.c
 create mode 100644 drivers/crypto/virtio/virtqueue.h

diff --git a/config/common_base b/config/common_base
index ad03cf4..19d0cdd 100644
--- a/config/common_base
+++ b/config/common_base
@@ -482,6 +482,26 @@ CONFIG_RTE_LIBRTE_PMD_QAT_DEBUG_DRIVER=n
 CONFIG_RTE_QAT_PMD_MAX_NB_SESSIONS=2048
 
 #
+# Compile PMD for virtio crypto devices
+#
+CONFIG_RTE_LIBRTE_PMD_VIRTIO_CRYPTO=n
+CONFIG_RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_INIT=n
+CONFIG_RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_SESSION=n
+CONFIG_RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_DRIVER=n
+CONFIG_RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_DUMP=n
+#
+# Number of maximum virtio crypto devices
+#
+CONFIG_RTE_MAX_VIRTIO_CRYPTO=32
+#
+# Number of sessions to create in the session memory pool
+# on a single virtio crypto device.
+#
+CONFIG_RTE_VIRTIO_CRYPTO_PMD_MAX_NB_SESSIONS=1024
+
+#
 # Compile PMD for AESNI backed device
 #
 CONFIG_RTE_LIBRTE_PMD_AESNI_MB=n
diff --git a/drivers/crypto/virtio/virtio_logs.h b/drivers/crypto/virtio/virtio_logs.h
new file mode 100644
index 0000000..20582a4
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_logs.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 HUAWEI TECHNOLOGIES CO., LTD.
+ */
+
+#ifndef _VIRTIO_LOGS_H_
+#define _VIRTIO_LOGS_H_
+
+#include <rte_log.h>
+
+#ifdef RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_INIT
+#define PMD_INIT_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args)
+#define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
+#else
+#define PMD_INIT_LOG(level, fmt, args...) do { } while (0)
+#define PMD_INIT_FUNC_TRACE() do { } while (0)
+#endif
+
+#ifdef RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_SESSION
+#define PMD_SESSION_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s() session: " fmt "\n", __func__, ## args)
+#else
+#define PMD_SESSION_LOG(level, fmt, args...) do { } while (0)
+#endif
+
+#ifdef RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_RX
+#define PMD_RX_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s() rx: " fmt "\n", __func__, ## args)
+#else
+#define PMD_RX_LOG(level, fmt, args...) do { } while (0)
+#endif
+
+#ifdef RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_TX
+#define PMD_TX_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s() tx: " fmt "\n", __func__, ## args)
+#else
+#define PMD_TX_LOG(level, fmt, args...) do { } while (0)
+#endif
+
+#ifdef RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_DRIVER
+#define PMD_DRV_LOG(level, fmt, args...) \
+	RTE_LOG(level, PMD, "%s(): driver " fmt "\n", __func__, ## args)
+#else
+#define PMD_DRV_LOG(level, fmt, args...) do { } while (0)
+#endif
+
+#endif /* _VIRTIO_LOGS_H_ */
diff --git a/drivers/crypto/virtio/virtio_pci.c b/drivers/crypto/virtio/virtio_pci.c
new file mode 100644
index 0000000..7aa5cdd
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_pci.c
@@ -0,0 +1,460 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 HUAWEI TECHNOLOGIES CO., LTD.
+ */
+
+#include <stdint.h>
+
+#ifdef RTE_EXEC_ENV_LINUXAPP
+ #include <dirent.h>
+ #include <fcntl.h>
+#endif
+
+#include <rte_io.h>
+#include <rte_bus.h>
+
+#include "virtio_pci.h"
+#include "virtqueue.h"
+
+/*
+ * Following macros are derived from linux/pci_regs.h, however,
+ * we can't simply include that header here, as there is no such
+ * file for non-Linux platform.
+ */
+#define PCI_CAPABILITY_LIST	0x34
+#define PCI_CAP_ID_VNDR		0x09
+#define PCI_CAP_ID_MSIX		0x11
+
+/*
+ * The remaining space is defined by each driver as the per-driver
+ * configuration space.
+ */
+#define VIRTIO_PCI_CONFIG(hw) \
+		(((hw)->use_msix == VIRTIO_MSIX_ENABLED) ? 24 : 20)
+
+static inline int
+check_vq_phys_addr_ok(struct virtqueue *vq)
+{
+	/* Virtio PCI device VIRTIO_PCI_QUEUE_PFN register is 32 bit,
+	 * and only accepts 32 bit page frame number.
+	 * Check if the allocated physical memory exceeds 16TB.
+	 */
+	if ((vq->vq_ring_mem + vq->vq_ring_size - 1) >>
+			(VIRTIO_PCI_QUEUE_ADDR_SHIFT + 32)) {
+		PMD_INIT_LOG(ERR, "vring address shouldn't be above 16TB!");
+		return 0;
+	}
+
+	return 1;
+}
+
+static inline void
+io_write64_twopart(uint64_t val, uint32_t *lo, uint32_t *hi)
+{
+	rte_write32(val & ((1ULL << 32) - 1), lo);
+	rte_write32(val >> 32,		     hi);
+}
+
+static void
+modern_read_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+		       void *dst, int length)
+{
+	int i;
+	uint8_t *p;
+	uint8_t old_gen, new_gen;
+
+	do {
+		old_gen = rte_read8(&hw->common_cfg->config_generation);
+
+		p = dst;
+		for (i = 0;  i < length; i++)
+			*p++ = rte_read8((uint8_t *)hw->dev_cfg + offset + i);
+
+		new_gen = rte_read8(&hw->common_cfg->config_generation);
+	} while (old_gen != new_gen);
+}
+
+static void
+modern_write_dev_config(struct virtio_crypto_hw *hw, size_t offset,
+			const void *src, int length)
+{
+	int i;
+	const uint8_t *p = src;
+
+	for (i = 0;  i < length; i++)
+		rte_write8((*p++), (((uint8_t *)hw->dev_cfg) + offset + i));
+}
+
+static uint64_t
+modern_get_features(struct virtio_crypto_hw *hw)
+{
+	uint32_t features_lo, features_hi;
+
+	rte_write32(0, &hw->common_cfg->device_feature_select);
+	features_lo = rte_read32(&hw->common_cfg->device_feature);
+
+	rte_write32(1, &hw->common_cfg->device_feature_select);
+	features_hi = rte_read32(&hw->common_cfg->device_feature);
+
+	return ((uint64_t)features_hi << 32) | features_lo;
+}
+
+static void
+modern_set_features(struct virtio_crypto_hw *hw, uint64_t features)
+{
+	rte_write32(0, &hw->common_cfg->guest_feature_select);
+	rte_write32(features & ((1ULL << 32) - 1),
+		    &hw->common_cfg->guest_feature);
+
+	rte_write32(1, &hw->common_cfg->guest_feature_select);
+	rte_write32(features >> 32,
+		    &hw->common_cfg->guest_feature);
+}
+
+static uint8_t
+modern_get_status(struct virtio_crypto_hw *hw)
+{
+	return rte_read8(&hw->common_cfg->device_status);
+}
+
+static void
+modern_set_status(struct virtio_crypto_hw *hw, uint8_t status)
+{
+	rte_write8(status, &hw->common_cfg->device_status);
+}
+
+static void
+modern_reset(struct virtio_crypto_hw *hw)
+{
+	modern_set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
+	modern_get_status(hw);
+}
+
+static uint8_t
+modern_get_isr(struct virtio_crypto_hw *hw)
+{
+	return rte_read8(hw->isr);
+}
+
+static uint16_t
+modern_set_config_irq(struct virtio_crypto_hw *hw, uint16_t vec)
+{
+	rte_write16(vec, &hw->common_cfg->msix_config);
+	return rte_read16(&hw->common_cfg->msix_config);
+}
+
+static uint16_t
+modern_set_queue_irq(struct virtio_crypto_hw *hw, struct virtqueue *vq,
+		uint16_t vec)
+{
+	rte_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);
+	rte_write16(vec, &hw->common_cfg->queue_msix_vector);
+	return rte_read16(&hw->common_cfg->queue_msix_vector);
+}
+
+static uint16_t
+modern_get_queue_num(struct virtio_crypto_hw *hw, uint16_t queue_id)
+{
+	rte_write16(queue_id, &hw->common_cfg->queue_select);
+	return rte_read16(&hw->common_cfg->queue_size);
+}
+
+static int
+modern_setup_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	uint64_t desc_addr, avail_addr, used_addr;
+	uint16_t notify_off;
+
+	if (!check_vq_phys_addr_ok(vq))
+		return -1;
+
+	desc_addr = vq->vq_ring_mem;
+	avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
+	used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
+							 ring[vq->vq_nentries]),
+				   VIRTIO_PCI_VRING_ALIGN);
+
+	rte_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);
+
+	io_write64_twopart(desc_addr, &hw->common_cfg->queue_desc_lo,
+				      &hw->common_cfg->queue_desc_hi);
+	io_write64_twopart(avail_addr, &hw->common_cfg->queue_avail_lo,
+				       &hw->common_cfg->queue_avail_hi);
+	io_write64_twopart(used_addr, &hw->common_cfg->queue_used_lo,
+				      &hw->common_cfg->queue_used_hi);
+
+	notify_off = rte_read16(&hw->common_cfg->queue_notify_off);
+	vq->notify_addr = (void *)((uint8_t *)hw->notify_base +
+				notify_off * hw->notify_off_multiplier);
+
+	rte_write16(1, &hw->common_cfg->queue_enable);
+
+	PMD_INIT_LOG(DEBUG, "queue %u addresses:", vq->vq_queue_index);
+	PMD_INIT_LOG(DEBUG, "\t desc_addr: %" PRIx64, desc_addr);
+	PMD_INIT_LOG(DEBUG, "\t avail_addr: %" PRIx64, avail_addr);
+	PMD_INIT_LOG(DEBUG, "\t used_addr: %" PRIx64, used_addr);
+	PMD_INIT_LOG(DEBUG, "\t notify addr: %p (notify offset: %u)",
+		vq->notify_addr, notify_off);
+
+	return 0;
+}
+
+static void
+modern_del_queue(struct virtio_crypto_hw *hw, struct virtqueue *vq)
+{
+	rte_write16(vq->vq_queue_index, &hw->common_cfg->queue_select);
+
+	io_write64_twopart(0, &hw->common_cfg->queue_desc_lo,
+				  &hw->common_cfg->queue_desc_hi);
+	io_write64_twopart(0, &hw->common_cfg->queue_avail_lo,
+				  &hw->common_cfg->queue_avail_hi);
+	io_write64_twopart(0, &hw->common_cfg->queue_used_lo,
+				  &hw->common_cfg->queue_used_hi);
+
+	rte_write16(0, &hw->common_cfg->queue_enable);
+}
+
+static void
+modern_notify_queue(struct virtio_crypto_hw *hw __rte_unused,
+		struct virtqueue *vq)
+{
+	rte_write16(vq->vq_queue_index, vq->notify_addr);
+}
+
+const struct virtio_pci_ops virtio_crypto_modern_ops = {
+	.read_dev_cfg	= modern_read_dev_config,
+	.write_dev_cfg	= modern_write_dev_config,
+	.reset		= modern_reset,
+	.get_status	= modern_get_status,
+	.set_status	= modern_set_status,
+	.get_features	= modern_get_features,
+	.set_features	= modern_set_features,
+	.get_isr	= modern_get_isr,
+	.set_config_irq	= modern_set_config_irq,
+	.set_queue_irq  = modern_set_queue_irq,
+	.get_queue_num	= modern_get_queue_num,
+	.setup_queue	= modern_setup_queue,
+	.del_queue	= modern_del_queue,
+	.notify_queue	= modern_notify_queue,
+};
+
+void
+vtpci_read_cryptodev_config(struct virtio_crypto_hw *hw, size_t offset,
+		void *dst, int length)
+{
+	VTPCI_OPS(hw)->read_dev_cfg(hw, offset, dst, length);
+}
+
+void
+vtpci_write_cryptodev_config(struct virtio_crypto_hw *hw, size_t offset,
+		const void *src, int length)
+{
+	VTPCI_OPS(hw)->write_dev_cfg(hw, offset, src, length);
+}
+
+uint64_t
+vtpci_cryptodev_negotiate_features(struct virtio_crypto_hw *hw,
+		uint64_t host_features)
+{
+	uint64_t features;
+
+	/*
+	 * Limit negotiated features to what the driver, virtqueue, and
+	 * host all support.
+	 */
+	features = host_features & hw->guest_features;
+	VTPCI_OPS(hw)->set_features(hw, features);
+
+	return features;
+}
+
+void
+vtpci_cryptodev_reset(struct virtio_crypto_hw *hw)
+{
+	VTPCI_OPS(hw)->set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
+	/* flush status write */
+	VTPCI_OPS(hw)->get_status(hw);
+}
+
+void
+vtpci_cryptodev_reinit_complete(struct virtio_crypto_hw *hw)
+{
+	vtpci_cryptodev_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER_OK);
+}
+
+void
+vtpci_cryptodev_set_status(struct virtio_crypto_hw *hw, uint8_t status)
+{
+	if (status != VIRTIO_CONFIG_STATUS_RESET)
+		status |= VTPCI_OPS(hw)->get_status(hw);
+
+	VTPCI_OPS(hw)->set_status(hw, status);
+}
+
+uint8_t
+vtpci_cryptodev_get_status(struct virtio_crypto_hw *hw)
+{
+	return VTPCI_OPS(hw)->get_status(hw);
+}
+
+uint8_t
+vtpci_cryptodev_isr(struct virtio_crypto_hw *hw)
+{
+	return VTPCI_OPS(hw)->get_isr(hw);
+}
+
+static void *
+get_cfg_addr(struct rte_pci_device *dev, struct virtio_pci_cap *cap)
+{
+	uint8_t  bar    = cap->bar;
+	uint32_t length = cap->length;
+	uint32_t offset = cap->offset;
+	uint8_t *base;
+
+	if (bar >= PCI_MAX_RESOURCE) {
+		PMD_INIT_LOG(ERR, "invalid bar: %u", bar);
+		return NULL;
+	}
+
+	if (offset + length < offset) {
+		PMD_INIT_LOG(ERR, "offset(%u) + length(%u) overflows",
+			offset, length);
+		return NULL;
+	}
+
+	if (offset + length > dev->mem_resource[bar].len) {
+		PMD_INIT_LOG(ERR,
+			"invalid cap: overflows bar space: %u > %" PRIu64,
+			offset + length, dev->mem_resource[bar].len);
+		return NULL;
+	}
+
+	base = dev->mem_resource[bar].addr;
+	if (base == NULL) {
+		PMD_INIT_LOG(ERR, "bar %u base addr is NULL", bar);
+		return NULL;
+	}
+
+	return base + offset;
+}
+
+#define PCI_MSIX_ENABLE 0x8000
+
+static int
+virtio_read_caps(struct rte_pci_device *dev, struct virtio_crypto_hw *hw)
+{
+	uint8_t pos;
+	struct virtio_pci_cap cap;
+	int ret;
+
+	if (rte_pci_map_device(dev)) {
+		PMD_INIT_LOG(DEBUG, "failed to map pci device!");
+		return -1;
+	}
+
+	ret = rte_pci_read_config(dev, &pos, 1, PCI_CAPABILITY_LIST);
+	if (ret < 0) {
+		PMD_INIT_LOG(DEBUG, "failed to read pci capability list");
+		return -1;
+	}
+
+	while (pos) {
+		ret = rte_pci_read_config(dev, &cap, sizeof(cap), pos);
+		if (ret < 0) {
+			PMD_INIT_LOG(ERR,
+				"failed to read pci cap at pos: %x", pos);
+			break;
+		}
+
+		if (cap.cap_vndr == PCI_CAP_ID_MSIX) {
+			/* Transitional devices would also have this capability,
+			 * that's why we also check if msix is enabled.
+			 * 1st byte is cap ID; 2nd byte is the position of next
+			 * cap; next two bytes are the flags.
+			 */
+			uint16_t flags = ((uint16_t *)&cap)[1];
+
+			if (flags & PCI_MSIX_ENABLE)
+				hw->use_msix = VIRTIO_MSIX_ENABLED;
+			else
+				hw->use_msix = VIRTIO_MSIX_DISABLED;
+		}
+
+		if (cap.cap_vndr != PCI_CAP_ID_VNDR) {
+			PMD_INIT_LOG(DEBUG,
+				"[%2x] skipping non VNDR cap id: %02x",
+				pos, cap.cap_vndr);
+			goto next;
+		}
+
+		PMD_INIT_LOG(DEBUG,
+			"[%2x] cfg type: %u, bar: %u, offset: %04x, len: %u",
+			pos, cap.cfg_type, cap.bar, cap.offset, cap.length);
+
+		switch (cap.cfg_type) {
+		case VIRTIO_PCI_CAP_COMMON_CFG:
+			hw->common_cfg = get_cfg_addr(dev, &cap);
+			break;
+		case VIRTIO_PCI_CAP_NOTIFY_CFG:
+			rte_pci_read_config(dev, &hw->notify_off_multiplier,
+					4, pos + sizeof(cap));
+			hw->notify_base = get_cfg_addr(dev, &cap);
+			break;
+		case VIRTIO_PCI_CAP_DEVICE_CFG:
+			hw->dev_cfg = get_cfg_addr(dev, &cap);
+			break;
+		case VIRTIO_PCI_CAP_ISR_CFG:
+			hw->isr = get_cfg_addr(dev, &cap);
+			break;
+		}
+
+next:
+		pos = cap.cap_next;
+	}
+
+	if (hw->common_cfg == NULL || hw->notify_base == NULL ||
+	    hw->dev_cfg == NULL    || hw->isr == NULL) {
+		PMD_INIT_LOG(INFO, "no modern virtio pci device found.");
+		return -1;
+	}
+
+	PMD_INIT_LOG(INFO, "found modern virtio pci device.");
+
+	PMD_INIT_LOG(DEBUG, "common cfg mapped at: %p", hw->common_cfg);
+	PMD_INIT_LOG(DEBUG, "device cfg mapped at: %p", hw->dev_cfg);
+	PMD_INIT_LOG(DEBUG, "isr cfg mapped at: %p", hw->isr);
+	PMD_INIT_LOG(DEBUG, "notify base: %p, notify off multiplier: %u",
+		hw->notify_base, hw->notify_off_multiplier);
+
+	return 0;
+}
+
+/*
+ * Return -1:
+ *   if there is error mapping with VFIO/UIO.
+ *   if port map error when driver type is KDRV_NONE.
+ *   if whitelisted but driver type is KDRV_UNKNOWN.
+ * Return 1 if kernel driver is managing the device.
+ * Return 0 on success.
+ */
+int
+vtpci_cryptodev_init(struct rte_pci_device *dev, struct virtio_crypto_hw *hw)
+{
+	/*
+	 * Try to read the virtio pci caps, which exist only on modern
+	 * pci devices. If that fails, the device is rejected, since
+	 * legacy virtio handling is not supported.
+	 */
+	if (virtio_read_caps(dev, hw) == 0) {
+		PMD_INIT_LOG(INFO, "modern virtio pci detected.");
+		virtio_hw_internal[hw->dev_id].vtpci_ops =
+					&virtio_crypto_modern_ops;
+		hw->modern = 1;
+		return 0;
+	}
+
+	/*
+	 * virtio crypto conforms to virtio 1.0 and doesn't support
+	 * legacy mode
+	 */
+	return -1;
+}
diff --git a/drivers/crypto/virtio/virtio_pci.h b/drivers/crypto/virtio/virtio_pci.h
new file mode 100644
index 0000000..a469ea3
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_pci.h
@@ -0,0 +1,252 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 HUAWEI TECHNOLOGIES CO., LTD.
+ */
+
+#ifndef _VIRTIO_PCI_H_
+#define _VIRTIO_PCI_H_
+
+#include <linux/virtio_crypto.h>
+
+#include <stdint.h>
+
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+#include <rte_cryptodev.h>
+
+struct virtqueue;
+
+/* VirtIO PCI vendor/device ID. */
+#define VIRTIO_CRYPTO_PCI_VENDORID 0x1AF4
+#define VIRTIO_CRYPTO_PCI_DEVICEID 0x1054
+
+/* VirtIO ABI version, this must match exactly. */
+#define VIRTIO_PCI_ABI_VERSION 0
+
+/*
+ * VirtIO Header, located in BAR 0.
+ */
+#define VIRTIO_PCI_HOST_FEATURES  0  /* host's supported features (32bit, RO)*/
+#define VIRTIO_PCI_GUEST_FEATURES 4  /* guest's supported features (32, RW) */
+#define VIRTIO_PCI_QUEUE_PFN      8  /* physical address of VQ (32, RW) */
+#define VIRTIO_PCI_QUEUE_NUM      12 /* number of ring entries (16, RO) */
+#define VIRTIO_PCI_QUEUE_SEL      14 /* current VQ selection (16, RW) */
+#define VIRTIO_PCI_QUEUE_NOTIFY   16 /* notify host regarding VQ (16, RW) */
+#define VIRTIO_PCI_STATUS         18 /* device status register (8, RW) */
+#define VIRTIO_PCI_ISR            19 /* interrupt status register, reading
+				      * also clears the register (8, RO)
+				      */
+/* Only if MSIX is enabled: */
+
+/* configuration change vector (16, RW) */
+#define VIRTIO_MSI_CONFIG_VECTOR  20
+/* vector for selected VQ notifications */
+#define VIRTIO_MSI_QUEUE_VECTOR	  22
+
+/* The bit of the ISR which indicates a device has an interrupt. */
+#define VIRTIO_PCI_ISR_INTR   0x1
+/* The bit of the ISR which indicates a device configuration change. */
+#define VIRTIO_PCI_ISR_CONFIG 0x2
+/* Vector value used to disable MSI for queue. */
+#define VIRTIO_MSI_NO_VECTOR 0xFFFF
+
+/* Status byte for guest to report progress. */
+#define VIRTIO_CONFIG_STATUS_RESET     0x00
+#define VIRTIO_CONFIG_STATUS_ACK       0x01
+#define VIRTIO_CONFIG_STATUS_DRIVER    0x02
+#define VIRTIO_CONFIG_STATUS_DRIVER_OK 0x04
+#define VIRTIO_CONFIG_STATUS_FEATURES_OK 0x08
+#define VIRTIO_CONFIG_STATUS_FAILED    0x80
+
+/*
+ * Each virtqueue indirect descriptor list must be physically contiguous.
+ * To allow us to malloc(9) each list individually, limit the number
+ * supported to what will fit in one page. With 4KB pages, this is a limit
+ * of 256 descriptors. If there is ever a need for more, we can switch to
+ * contigmalloc(9) for the larger allocations, similar to what
+ * bus_dmamem_alloc(9) does.
+ *
+ * Note the sizeof(struct vring_desc) is 16 bytes.
+ */
+#define VIRTIO_MAX_INDIRECT ((int) (PAGE_SIZE / 16))
+
+/* Do we get callbacks when the ring is completely used, even if we've
+ * suppressed them?
+ */
+#define VIRTIO_F_NOTIFY_ON_EMPTY	24
+
+/* Can the device handle any descriptor layout? */
+#define VIRTIO_F_ANY_LAYOUT		27
+
+/* We support indirect buffer descriptors */
+#define VIRTIO_RING_F_INDIRECT_DESC	28
+
+#define VIRTIO_F_VERSION_1		32
+#define VIRTIO_F_IOMMU_PLATFORM	33
+
+/* The Guest publishes the used index for which it expects an interrupt
+ * at the end of the avail ring. Host should ignore the avail->flags field.
+ */
+/* The Host publishes the avail index for which it expects a kick
+ * at the end of the used ring. Guest should ignore the used->flags field.
+ */
+#define VIRTIO_RING_F_EVENT_IDX		29
+
+/* Common configuration */
+#define VIRTIO_PCI_CAP_COMMON_CFG	1
+/* Notifications */
+#define VIRTIO_PCI_CAP_NOTIFY_CFG	2
+/* ISR Status */
+#define VIRTIO_PCI_CAP_ISR_CFG		3
+/* Device specific configuration */
+#define VIRTIO_PCI_CAP_DEVICE_CFG	4
+/* PCI configuration access */
+#define VIRTIO_PCI_CAP_PCI_CFG		5
+
+/* This is the PCI capability header: */
+struct virtio_pci_cap {
+	uint8_t cap_vndr;	/* Generic PCI field: PCI_CAP_ID_VNDR */
+	uint8_t cap_next;	/* Generic PCI field: next ptr. */
+	uint8_t cap_len;	/* Generic PCI field: capability length */
+	uint8_t cfg_type;	/* Identifies the structure. */
+	uint8_t bar;		/* Where to find it. */
+	uint8_t padding[3];	/* Pad to full dword. */
+	uint32_t offset;	/* Offset within bar. */
+	uint32_t length;	/* Length of the structure, in bytes. */
+};
+
+struct virtio_pci_notify_cap {
+	struct virtio_pci_cap cap;
+	uint32_t notify_off_multiplier;	/* Multiplier for queue_notify_off. */
+};
+
+/* Fields in VIRTIO_PCI_CAP_COMMON_CFG: */
+struct virtio_pci_common_cfg {
+	/* About the whole device. */
+	uint32_t device_feature_select;	/* read-write */
+	uint32_t device_feature;	/* read-only */
+	uint32_t guest_feature_select;	/* read-write */
+	uint32_t guest_feature;		/* read-write */
+	uint16_t msix_config;		/* read-write */
+	uint16_t num_queues;		/* read-only */
+	uint8_t device_status;		/* read-write */
+	uint8_t config_generation;	/* read-only */
+
+	/* About a specific virtqueue. */
+	uint16_t queue_select;		/* read-write */
+	uint16_t queue_size;		/* read-write, power of 2. */
+	uint16_t queue_msix_vector;	/* read-write */
+	uint16_t queue_enable;		/* read-write */
+	uint16_t queue_notify_off;	/* read-only */
+	uint32_t queue_desc_lo;		/* read-write */
+	uint32_t queue_desc_hi;		/* read-write */
+	uint32_t queue_avail_lo;	/* read-write */
+	uint32_t queue_avail_hi;	/* read-write */
+	uint32_t queue_used_lo;		/* read-write */
+	uint32_t queue_used_hi;		/* read-write */
+};
+
+struct virtio_crypto_hw;
+
+struct virtio_pci_ops {
+	void (*read_dev_cfg)(struct virtio_crypto_hw *hw, size_t offset,
+			     void *dst, int len);
+	void (*write_dev_cfg)(struct virtio_crypto_hw *hw, size_t offset,
+			      const void *src, int len);
+	void (*reset)(struct virtio_crypto_hw *hw);
+
+	uint8_t (*get_status)(struct virtio_crypto_hw *hw);
+	void (*set_status)(struct virtio_crypto_hw *hw, uint8_t status);
+
+	uint64_t (*get_features)(struct virtio_crypto_hw *hw);
+	void (*set_features)(struct virtio_crypto_hw *hw, uint64_t features);
+
+	uint8_t (*get_isr)(struct virtio_crypto_hw *hw);
+
+	uint16_t (*set_config_irq)(struct virtio_crypto_hw *hw, uint16_t vec);
+
+	uint16_t (*set_queue_irq)(struct virtio_crypto_hw *hw,
+			struct virtqueue *vq, uint16_t vec);
+
+	uint16_t (*get_queue_num)(struct virtio_crypto_hw *hw,
+			uint16_t queue_id);
+	int (*setup_queue)(struct virtio_crypto_hw *hw, struct virtqueue *vq);
+	void (*del_queue)(struct virtio_crypto_hw *hw, struct virtqueue *vq);
+	void (*notify_queue)(struct virtio_crypto_hw *hw, struct virtqueue *vq);
+};
+
+struct virtio_crypto_hw {
+	/* control queue */
+	struct virtqueue *cvq;
+	uint16_t    dev_id;
+	uint16_t    max_dataqueues;
+	uint64_t    req_guest_features;
+	uint64_t    guest_features;
+	uint8_t	    use_msix;
+	uint8_t     modern;
+	uint32_t    notify_off_multiplier;
+	uint8_t     *isr;
+	uint16_t    *notify_base;
+	struct virtio_pci_common_cfg *common_cfg;
+	struct virtio_crypto_config *dev_cfg;
+};
+
+/*
+ * While virtio_crypto_hw is stored in shared memory, this structure holds
+ * process-local data that may differ between processes in the multi-process
+ * model, for example the vtpci_ops pointer.
+ */
+struct virtio_hw_internal {
+	const struct virtio_pci_ops *vtpci_ops;
+	struct rte_pci_ioport io;
+};
+
+#define VTPCI_OPS(hw)	(virtio_hw_internal[(hw)->dev_id].vtpci_ops)
+#define VTPCI_IO(hw)	(&virtio_hw_internal[(hw)->dev_id].io)
+
+extern struct virtio_hw_internal virtio_hw_internal[RTE_MAX_VIRTIO_CRYPTO];
+
+/*
+ * How many bits to shift physical queue address written to QUEUE_PFN.
+ * 12 is historical, and due to x86 page size.
+ */
+#define VIRTIO_PCI_QUEUE_ADDR_SHIFT 12
+
+/* The alignment to use between consumer and producer parts of vring. */
+#define VIRTIO_PCI_VRING_ALIGN 4096
+
+enum virtio_msix_status {
+	VIRTIO_MSIX_NONE = 0,
+	VIRTIO_MSIX_DISABLED = 1,
+	VIRTIO_MSIX_ENABLED = 2
+};
+
+static inline int
+vtpci_with_feature(struct virtio_crypto_hw *hw, uint64_t bit)
+{
+	return (hw->guest_features & (1ULL << bit)) != 0;
+}
+
+/*
+ * Function declarations from virtio_pci.c
+ */
+int vtpci_cryptodev_init(struct rte_pci_device *dev,
+	struct virtio_crypto_hw *hw);
+void vtpci_cryptodev_reset(struct virtio_crypto_hw *hw);
+
+void vtpci_cryptodev_reinit_complete(struct virtio_crypto_hw *hw);
+
+uint8_t vtpci_cryptodev_get_status(struct virtio_crypto_hw *hw);
+void vtpci_cryptodev_set_status(struct virtio_crypto_hw *hw, uint8_t status);
+
+uint64_t vtpci_cryptodev_negotiate_features(struct virtio_crypto_hw *hw,
+	uint64_t host_features);
+
+void vtpci_write_cryptodev_config(struct virtio_crypto_hw *hw, size_t offset,
+	const void *src, int length);
+
+void vtpci_read_cryptodev_config(struct virtio_crypto_hw *hw, size_t offset,
+	void *dst, int length);
+
+uint8_t vtpci_cryptodev_isr(struct virtio_crypto_hw *hw);
+
+#endif /* _VIRTIO_PCI_H_ */
diff --git a/drivers/crypto/virtio/virtio_ring.h b/drivers/crypto/virtio/virtio_ring.h
new file mode 100644
index 0000000..ee30674
--- /dev/null
+++ b/drivers/crypto/virtio/virtio_ring.h
@@ -0,0 +1,137 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 HUAWEI TECHNOLOGIES CO., LTD.
+ */
+
+#ifndef _VIRTIO_RING_H_
+#define _VIRTIO_RING_H_
+
+#include <stdint.h>
+
+#include <rte_common.h>
+
+/* This marks a buffer as continuing via the next field. */
+#define VRING_DESC_F_NEXT       1
+/* This marks a buffer as write-only (otherwise read-only). */
+#define VRING_DESC_F_WRITE      2
+/* This means the buffer contains a list of buffer descriptors. */
+#define VRING_DESC_F_INDIRECT   4
+
+/* The Host uses this in used->flags to advise the Guest: don't kick me
+ * when you add a buffer.  It's unreliable, so it's simply an
+ * optimization.  Guest will still kick if it's out of buffers.
+ */
+#define VRING_USED_F_NO_NOTIFY  1
+/* The Guest uses this in avail->flags to advise the Host: don't
+ * interrupt me when you consume a buffer.  It's unreliable, so it's
+ * simply an optimization.
+ */
+#define VRING_AVAIL_F_NO_INTERRUPT  1
+
+/* VirtIO ring descriptors: 16 bytes.
+ * These can chain together via "next".
+ */
+struct vring_desc {
+	uint64_t addr;  /*  Address (guest-physical). */
+	uint32_t len;   /* Length. */
+	uint16_t flags; /* The flags as indicated above. */
+	uint16_t next;  /* We chain unused descriptors via this. */
+};
+
+struct vring_avail {
+	uint16_t flags;
+	uint16_t idx;
+	uint16_t ring[0];
+};
+
+/* id is a 16bit index. uint32_t is used here for ids for padding reasons. */
+struct vring_used_elem {
+	/* Index of start of used descriptor chain. */
+	uint32_t id;
+	/* Total length of the descriptor chain which was written to. */
+	uint32_t len;
+};
+
+struct vring_used {
+	uint16_t flags;
+	volatile uint16_t idx;
+	struct vring_used_elem ring[0];
+};
+
+struct vring {
+	unsigned int num;
+	struct vring_desc  *desc;
+	struct vring_avail *avail;
+	struct vring_used  *used;
+};
+
+/* The standard layout for the ring is a continuous chunk of memory which
+ * looks like this.  We assume num is a power of 2.
+ *
+ * struct vring {
+ *      // The actual descriptors (16 bytes each)
+ *      struct vring_desc desc[num];
+ *
+ *      // A ring of available descriptor heads with free-running index.
+ *      __u16 avail_flags;
+ *      __u16 avail_idx;
+ *      __u16 available[num];
+ *      __u16 used_event_idx;
+ *
+ *      // Padding to the next align boundary.
+ *      char pad[];
+ *
+ *      // A ring of used descriptor heads with free-running index.
+ *      __u16 used_flags;
+ *      __u16 used_idx;
+ *      struct vring_used_elem used[num];
+ *      __u16 avail_event_idx;
+ * };
+ *
+ * NOTE: for VirtIO PCI, align is 4096.
+ */
+
+/*
+ * We publish the used event index at the end of the available ring, and vice
+ * versa. They are at the end for backwards compatibility.
+ */
+#define vring_used_event(vr)  ((vr)->avail->ring[(vr)->num])
+#define vring_avail_event(vr) (*(uint16_t *)&(vr)->used->ring[(vr)->num])
+
+static inline size_t
+vring_size(unsigned int num, unsigned long align)
+{
+	size_t size;
+
+	size = num * sizeof(struct vring_desc);
+	size += sizeof(struct vring_avail) + (num * sizeof(uint16_t));
+	size = RTE_ALIGN_CEIL(size, align);
+	size += sizeof(struct vring_used) +
+		(num * sizeof(struct vring_used_elem));
+	return size;
+}
+
+static inline void
+vring_init(struct vring *vr, unsigned int num, uint8_t *p,
+	unsigned long align)
+{
+	vr->num = num;
+	vr->desc = (struct vring_desc *) p;
+	vr->avail = (struct vring_avail *) (p +
+		num * sizeof(struct vring_desc));
+	vr->used = (void *)
+		RTE_ALIGN_CEIL((uintptr_t)(&vr->avail->ring[num]), align);
+}
+
+/*
+ * The following is used with VIRTIO_RING_F_EVENT_IDX.
+ * Assuming a given event_idx value from the other side, if we have
+ * just incremented index from old to new_idx, should we trigger an
+ * event?
+ */
+static inline int
+vring_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
+{
+	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
+}
+
+#endif /* _VIRTIO_RING_H_ */
diff --git a/drivers/crypto/virtio/virtqueue.c b/drivers/crypto/virtio/virtqueue.c
new file mode 100644
index 0000000..fd8be58
--- /dev/null
+++ b/drivers/crypto/virtio/virtqueue.c
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 HUAWEI TECHNOLOGIES CO., LTD.
+ */
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_crypto.h>
+#include <rte_malloc.h>
+
+#include "virtqueue.h"
+
+void
+virtqueue_disable_intr(struct virtqueue *vq)
+{
+	/*
+	 * Set VRING_AVAIL_F_NO_INTERRUPT to hint host
+	 * not to interrupt when it consumes packets
+	 * Note: this is only considered a hint to the host
+	 */
+	vq->vq_ring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
+}
+
+void
+virtqueue_detatch_unused(struct virtqueue *vq)
+{
+	struct rte_crypto_op *cop = NULL;
+
+	int idx;
+
+	if (vq != NULL)
+		for (idx = 0; idx < vq->vq_nentries; idx++) {
+			cop = vq->vq_descx[idx].crypto_op;
+			if (cop) {
+				if (cop->sym->m_src)
+					rte_pktmbuf_free(cop->sym->m_src);
+				if (cop->sym->m_dst)
+					rte_pktmbuf_free(cop->sym->m_dst);
+				rte_crypto_op_free(cop);
+				vq->vq_descx[idx].crypto_op = NULL;
+			}
+		}
+}
diff --git a/drivers/crypto/virtio/virtqueue.h b/drivers/crypto/virtio/virtqueue.h
new file mode 100644
index 0000000..1bd0e89
--- /dev/null
+++ b/drivers/crypto/virtio/virtqueue.h
@@ -0,0 +1,176 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 HUAWEI TECHNOLOGIES CO., LTD.
+ */
+
+#ifndef _VIRTQUEUE_H_
+#define _VIRTQUEUE_H_
+
+#include <linux/virtio_crypto.h>
+
+#include <stdint.h>
+
+#include <rte_atomic.h>
+#include <rte_memory.h>
+#include <rte_memzone.h>
+#include <rte_mempool.h>
+
+#include "virtio_pci.h"
+#include "virtio_ring.h"
+#include "virtio_logs.h"
+
+struct rte_mbuf;
+
+/*
+ * Per virtio_config.h in Linux.
+ *     For virtio_pci on SMP, we don't need to order with respect to MMIO
+ *     accesses through relaxed memory I/O windows, so smp_mb() et al are
+ *     sufficient.
+ *
+ */
+#define virtio_mb()	rte_smp_mb()
+#define virtio_rmb()	rte_smp_rmb()
+#define virtio_wmb()	rte_smp_wmb()
+
+#define VIRTQUEUE_MAX_NAME_SZ 32
+
+enum { VTCRYPTO_DATAQ = 0, VTCRYPTO_CTRLQ = 1 };
+
+/**
+ * The maximum virtqueue size is 2^15. Use that value as the end of
+ * descriptor chain terminator since it will never be a valid index
+ * in the descriptor table. This is used to verify we are correctly
+ * handling vq_free_cnt.
+ */
+#define VQ_RING_DESC_CHAIN_END 32768
+
+struct vq_desc_extra {
+	void     *crypto_op;
+	void     *cookie;
+	uint16_t ndescs;
+};
+
+struct virtqueue {
+	/** virtio_crypto_hw structure pointer. */
+	struct virtio_crypto_hw *hw;
+	/** mem zone to populate RX ring. */
+	const struct rte_memzone *mz;
+	/** mempool to populate hdr and request. */
+	struct rte_mempool *mpool;
+	uint8_t     dev_id;              /**< Device identifier. */
+	uint16_t    vq_queue_index;       /**< PCI queue index */
+
+	void        *vq_ring_virt_mem;    /**< linear address of vring*/
+	unsigned int vq_ring_size;
+	phys_addr_t vq_ring_mem;          /**< physical address of vring */
+
+	struct vring vq_ring;    /**< vring keeping desc, used and avail */
+	uint16_t    vq_free_cnt; /**< num of desc available */
+	uint16_t    vq_nentries; /**< vring desc numbers */
+
+	/**
+	 * Head of the free chain in the descriptor table. If
+	 * there are no free descriptors, this will be set to
+	 * VQ_RING_DESC_CHAIN_END.
+	 */
+	uint16_t  vq_desc_head_idx;
+	uint16_t  vq_desc_tail_idx;
+	/**
+	 * Last consumed descriptor in the used table,
+	 * trails vq_ring.used->idx.
+	 */
+	uint16_t vq_used_cons_idx;
+	uint16_t vq_avail_idx;
+
+	/* Statistics */
+	uint64_t	packets_sent_total;
+	uint64_t	packets_sent_failed;
+	uint64_t	packets_received_total;
+	uint64_t	packets_received_failed;
+
+	uint16_t  *notify_addr;
+
+	struct vq_desc_extra vq_descx[0];
+};
+
+/**
+ * Tell the backend not to interrupt us.
+ */
+void virtqueue_disable_intr(struct virtqueue *vq);
+
+/**
+ * Free the crypto ops still attached to the queue, and their mbufs.
+ */
+void virtqueue_detatch_unused(struct virtqueue *vq);
+
+static inline int
+virtqueue_full(const struct virtqueue *vq)
+{
+	return vq->vq_free_cnt == 0;
+}
+
+#define VIRTQUEUE_NUSED(vq) \
+	((uint16_t)((vq)->vq_ring.used->idx - (vq)->vq_used_cons_idx))
+
+static inline void
+vq_update_avail_idx(struct virtqueue *vq)
+{
+	virtio_wmb();
+	vq->vq_ring.avail->idx = vq->vq_avail_idx;
+}
+
+static inline void
+vq_update_avail_ring(struct virtqueue *vq, uint16_t desc_idx)
+{
+	uint16_t avail_idx;
+	/*
+	 * Place the head of the descriptor chain into the next slot and make
+	 * it usable to the host. The chain is made available now rather than
+	 * deferring to virtqueue_notify() in the hopes that if the host is
+	 * currently running on another CPU, we can keep it processing the new
+	 * descriptor.
+	 */
+	avail_idx = (uint16_t)(vq->vq_avail_idx & (vq->vq_nentries - 1));
+	if (unlikely(vq->vq_ring.avail->ring[avail_idx] != desc_idx))
+		vq->vq_ring.avail->ring[avail_idx] = desc_idx;
+	vq->vq_avail_idx++;
+}
+
+static inline int
+virtqueue_kick_prepare(struct virtqueue *vq)
+{
+	return !(vq->vq_ring.used->flags & VRING_USED_F_NO_NOTIFY);
+}
+
+static inline void
+virtqueue_notify(struct virtqueue *vq)
+{
+	/*
+	 * Ensure updated avail->idx is visible to host.
+	 * For virtio on IA, the notification is through an I/O port operation,
+	 * which is itself a serializing instruction.
+	 */
+	VTPCI_OPS(vq->hw)->notify_queue(vq->hw, vq);
+}
+
+/**
+ * Dump virtqueue internal structures, for debug purpose only.
+ */
+#ifdef RTE_LIBRTE_PMD_VIRTIO_CRYPTO_DEBUG_DUMP
+#define VIRTQUEUE_DUMP(vq) do { \
+	uint16_t used_idx, nused; \
+	used_idx = (vq)->vq_ring.used->idx; \
+	nused = (uint16_t)(used_idx - (vq)->vq_used_cons_idx); \
+	PMD_INIT_LOG(DEBUG, \
+	  "VQ: - size=%d; free=%d; used=%d; desc_head_idx=%d;" \
+	  " avail.idx=%d; used_cons_idx=%d; used.idx=%d;" \
+	  " avail.flags=0x%x; used.flags=0x%x", \
+	  (vq)->vq_nentries, (vq)->vq_free_cnt, nused, \
+	  (vq)->vq_desc_head_idx, (vq)->vq_ring.avail->idx, \
+	  (vq)->vq_used_cons_idx, (vq)->vq_ring.used->idx, \
+	  (vq)->vq_ring.avail->flags, (vq)->vq_ring.used->flags); \
+} while (0)
+#else
+#define VIRTQUEUE_DUMP(vq) do { } while (0)
+#endif
+
+#endif /* _VIRTQUEUE_H_ */
-- 
1.8.3.1
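
The event-index test in vring_need_event() and the size computed by
vring_size() both depend on arithmetic that is easy to misread. A minimal
standalone sketch follows, assuming nothing from DPDK: need_event() copies
the predicate from the patch, while ring_bytes() is a hypothetical helper
that mirrors vring_size() with the structure sizes hard-coded (16-byte
descriptors, 4-byte ring headers, 2-byte avail entries, 8-byte used
elements) and a hand-written align-up in place of RTE_ALIGN_CEIL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Same predicate as vring_need_event() in the patch. It is true when
 * event_idx falls in the window [old, new_idx) of free-running 16-bit
 * indices. The casts to uint16_t make the comparison wrap at 65536.
 */
static inline int
need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

/*
 * Hypothetical stand-in for vring_size(): descriptor table, then the
 * avail ring, padding up to 'align', then the used ring.
 * Assumes 'align' is a non-zero power of two.
 */
static inline size_t
ring_bytes(unsigned int num, size_t align)
{
	size_t size;

	assert(align != 0 && (align & (align - 1)) == 0);

	size = (size_t)num * 16;                  /* struct vring_desc[num] */
	size += 4 + (size_t)num * 2;              /* avail: flags, idx, ring[num] */
	size = (size + align - 1) & ~(align - 1); /* round up to 'align' */
	size += 4 + (size_t)num * 8;              /* used: flags, idx, ring[num] */
	return size;
}
```

With 256 entries and the 4096-byte alignment used for VirtIO PCI,
ring_bytes(256, 4096) is 4096 bytes of descriptors, 516 bytes of avail ring
padded up to offset 8192, plus 2052 bytes of used ring, 10244 bytes in total.
need_event(65535, 1, 65534) is true because index 65535 was passed while
moving from 65534 to 1 across the wrap, whereas need_event(1, 1, 65534) is
false because index 1 has not been reached yet.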

* [dpdk-dev] [PATCH] ethdev: remove versioning of ethdev filter control function
@ 2018-02-27 10:29  3% Kirill Rybalchenko
  2018-02-27 11:01  0% ` Ferruh Yigit
  2018-02-27 14:18  7% ` [dpdk-dev] [PATCH v2] " Kirill Rybalchenko
  0 siblings, 2 replies; 200+ results
From: Kirill Rybalchenko @ 2018-02-27 10:29 UTC (permalink / raw)
  To: dev; +Cc: kirill.rybalchenko, andrey.chilikin, thomas, ferruh.yigit

In the 18.02 release the ABI of the ethdev component was changed.
To keep compatibility with previous versions of the library,
versioning of the rte_eth_dev_filter_ctrl function was implemented.
Since a deprecation notice was issued in the 18.02 release, there is
no longer any need to keep compatibility with previous versions.
Remove the versioning of the rte_eth_dev_filter_ctrl function.

Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 155 +-----------------------------------------
 1 file changed, 2 insertions(+), 153 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 0590f0c..78b8376 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -34,7 +34,6 @@
 #include <rte_errno.h>
 #include <rte_spinlock.h>
 #include <rte_string_fns.h>
-#include <rte_compat.h>
 
 #include "rte_ether.h"
 #include "rte_ethdev.h"
@@ -3490,153 +3489,8 @@ rte_eth_dev_filter_supported(uint16_t port_id,
 }
 
 int
-rte_eth_dev_filter_ctrl_v22(uint16_t port_id,
-			    enum rte_filter_type filter_type,
-			    enum rte_filter_op filter_op, void *arg);
-
-int
-rte_eth_dev_filter_ctrl_v22(uint16_t port_id,
-			    enum rte_filter_type filter_type,
-			    enum rte_filter_op filter_op, void *arg)
-{
-	struct rte_eth_fdir_info_v22 {
-		enum rte_fdir_mode mode;
-		struct rte_eth_fdir_masks mask;
-		struct rte_eth_fdir_flex_conf flex_conf;
-		uint32_t guarant_spc;
-		uint32_t best_spc;
-		uint32_t flow_types_mask[1];
-		uint32_t max_flexpayload;
-		uint32_t flex_payload_unit;
-		uint32_t max_flex_payload_segment_num;
-		uint16_t flex_payload_limit;
-		uint32_t flex_bitmask_unit;
-		uint32_t max_flex_bitmask_num;
-	};
-
-	struct rte_eth_hash_global_conf_v22 {
-		enum rte_eth_hash_function hash_func;
-		uint32_t sym_hash_enable_mask[1];
-		uint32_t valid_bit_mask[1];
-	};
-
-	struct rte_eth_hash_filter_info_v22 {
-		enum rte_eth_hash_filter_info_type info_type;
-		union {
-			uint8_t enable;
-			struct rte_eth_hash_global_conf_v22 global_conf;
-			struct rte_eth_input_set_conf input_set_conf;
-		} info;
-	};
-
-	struct rte_eth_dev *dev;
-
-	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
-
-	dev = &rte_eth_devices[port_id];
-	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->filter_ctrl, -ENOTSUP);
-	if (filter_op == RTE_ETH_FILTER_INFO) {
-		int retval;
-		struct rte_eth_fdir_info_v22 *fdir_info_v22;
-		struct rte_eth_fdir_info fdir_info;
-
-		fdir_info_v22 = (struct rte_eth_fdir_info_v22 *)arg;
-
-		retval = (*dev->dev_ops->filter_ctrl)(dev, filter_type,
-			  filter_op, (void *)&fdir_info);
-		fdir_info_v22->mode = fdir_info.mode;
-		fdir_info_v22->mask = fdir_info.mask;
-		fdir_info_v22->flex_conf = fdir_info.flex_conf;
-		fdir_info_v22->guarant_spc = fdir_info.guarant_spc;
-		fdir_info_v22->best_spc = fdir_info.best_spc;
-		fdir_info_v22->flow_types_mask[0] =
-			(uint32_t)fdir_info.flow_types_mask[0];
-		fdir_info_v22->max_flexpayload = fdir_info.max_flexpayload;
-		fdir_info_v22->flex_payload_unit = fdir_info.flex_payload_unit;
-		fdir_info_v22->max_flex_payload_segment_num =
-			fdir_info.max_flex_payload_segment_num;
-		fdir_info_v22->flex_payload_limit =
-			fdir_info.flex_payload_limit;
-		fdir_info_v22->flex_bitmask_unit = fdir_info.flex_bitmask_unit;
-		fdir_info_v22->max_flex_bitmask_num =
-			fdir_info.max_flex_bitmask_num;
-		return retval;
-	} else if (filter_op == RTE_ETH_FILTER_GET) {
-		int retval;
-		struct rte_eth_hash_filter_info f_info;
-		struct rte_eth_hash_filter_info_v22 *f_info_v22 =
-			(struct rte_eth_hash_filter_info_v22 *)arg;
-
-		f_info.info_type = f_info_v22->info_type;
-		retval = (*dev->dev_ops->filter_ctrl)(dev, filter_type,
-			  filter_op, (void *)&f_info);
-
-		switch (f_info_v22->info_type) {
-		case RTE_ETH_HASH_FILTER_SYM_HASH_ENA_PER_PORT:
-			f_info_v22->info.enable = f_info.info.enable;
-			break;
-		case RTE_ETH_HASH_FILTER_GLOBAL_CONFIG:
-			f_info_v22->info.global_conf.hash_func =
-				f_info.info.global_conf.hash_func;
-			f_info_v22->info.global_conf.sym_hash_enable_mask[0] =
-				(uint32_t)
-				f_info.info.global_conf.sym_hash_enable_mask[0];
-			f_info_v22->info.global_conf.valid_bit_mask[0] =
-				(uint32_t)
-				f_info.info.global_conf.valid_bit_mask[0];
-			break;
-		case RTE_ETH_HASH_FILTER_INPUT_SET_SELECT:
-			f_info_v22->info.input_set_conf =
-				f_info.info.input_set_conf;
-			break;
-		default:
-			break;
-		}
-		return retval;
-	} else if (filter_op == RTE_ETH_FILTER_SET) {
-		struct rte_eth_hash_filter_info f_info;
-		struct rte_eth_hash_filter_info_v22 *f_v22 =
-			(struct rte_eth_hash_filter_info_v22 *)arg;
-
-		f_info.info_type = f_v22->info_type;
-		switch (f_v22->info_type) {
-		case RTE_ETH_HASH_FILTER_SYM_HASH_ENA_PER_PORT:
-			f_info.info.enable = f_v22->info.enable;
-			break;
-		case RTE_ETH_HASH_FILTER_GLOBAL_CONFIG:
-			f_info.info.global_conf.hash_func =
-				f_v22->info.global_conf.hash_func;
-			f_info.info.global_conf.sym_hash_enable_mask[0] =
-				(uint32_t)
-				f_v22->info.global_conf.sym_hash_enable_mask[0];
-			f_info.info.global_conf.valid_bit_mask[0] =
-				(uint32_t)
-				f_v22->info.global_conf.valid_bit_mask[0];
-			break;
-		case RTE_ETH_HASH_FILTER_INPUT_SET_SELECT:
-			f_info.info.input_set_conf =
-				f_v22->info.input_set_conf;
-			break;
-		default:
-			break;
-		}
-		return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op,
-						    (void *)&f_info);
-	} else
-		return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op,
-						    arg);
-}
-VERSION_SYMBOL(rte_eth_dev_filter_ctrl, _v22, 2.2);
-
-int
-rte_eth_dev_filter_ctrl_v1802(uint16_t port_id,
-			      enum rte_filter_type filter_type,
-			      enum rte_filter_op filter_op, void *arg);
-
-int
-rte_eth_dev_filter_ctrl_v1802(uint16_t port_id,
-			      enum rte_filter_type filter_type,
-			      enum rte_filter_op filter_op, void *arg)
+rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
+			enum rte_filter_op filter_op, void *arg)
 {
 	struct rte_eth_dev *dev;
 
@@ -3647,11 +3501,6 @@ rte_eth_dev_filter_ctrl_v1802(uint16_t port_id,
 	return eth_err(port_id, (*dev->dev_ops->filter_ctrl)(dev, filter_type,
 							     filter_op, arg));
 }
-BIND_DEFAULT_SYMBOL(rte_eth_dev_filter_ctrl, _v1802, 18.02);
-MAP_STATIC_SYMBOL(int rte_eth_dev_filter_ctrl(uint16_t port_id,
-		  enum rte_filter_type filter_type,
-		  enum rte_filter_op filter_op, void *arg),
-		  rte_eth_dev_filter_ctrl_v1802);
 
 void *
 rte_eth_add_rx_callback(uint16_t port_id, uint16_t queue_id,
-- 
2.5.5
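
The layout change that the removed _v22 shim compensated for can be shown
in isolation. The sketch below is an illustration under stated assumptions,
not ethdev code: struct conf_v22 and struct conf_v1802 are hypothetical
cut-down stand-ins for the old and new layouts of rte_eth_hash_global_conf
(an int replaces the enum, and the mask arrays are fixed at one element as
in the shim), and conf_to_v22() mimics the narrowing copy that
rte_eth_dev_filter_ctrl_v22 performed for RTE_ETH_FILTER_GET.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: 32-bit masks. */
struct conf_v22 {
	int hash_func;
	uint32_t sym_hash_enable_mask[1];
	uint32_t valid_bit_mask[1];
};

/* Hypothetical new layout: the same fields with 64-bit masks. */
struct conf_v1802 {
	int hash_func;
	uint64_t sym_hash_enable_mask[1];
	uint64_t valid_bit_mask[1];
};

/*
 * Field-by-field copy from the new layout to the old one.
 * The casts keep only the low 32 bits of each mask.
 */
static void
conf_to_v22(const struct conf_v1802 *in, struct conf_v22 *out)
{
	assert(in != NULL && out != NULL);

	out->hash_func = in->hash_func;
	out->sym_hash_enable_mask[0] = (uint32_t)in->sym_hash_enable_mask[0];
	out->valid_bit_mask[0] = (uint32_t)in->valid_bit_mask[0];
}
```

Widening the masks grows the structure and moves valid_bit_mask to a
different offset, so a binary built against the old layout cannot pass its
structure straight through to code built against the new one. A
compatibility wrapper has to copy each field, and any bits above bit 31 are
lost on the way back to the old caller.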

* Re: [dpdk-dev] [PATCH] ethdev: remove versioning of ethdev filter control function
  2018-02-27 10:29  3% [dpdk-dev] [PATCH] ethdev: remove versioning of ethdev filter control function Kirill Rybalchenko
@ 2018-02-27 11:01  0% ` Ferruh Yigit
  2018-02-27 13:45  3%   ` Thomas Monjalon
  2018-02-27 14:18  7% ` [dpdk-dev] [PATCH v2] " Kirill Rybalchenko
  1 sibling, 1 reply; 200+ results
From: Ferruh Yigit @ 2018-02-27 11:01 UTC (permalink / raw)
  To: Kirill Rybalchenko, dev; +Cc: andrey.chilikin, thomas

On 2/27/2018 10:29 AM, Kirill Rybalchenko wrote:
> In the 18.02 release the ABI of the ethdev component was changed.
> To keep compatibility with previous versions of the library,
> versioning of the rte_eth_dev_filter_ctrl function was implemented.
> Since a deprecation notice was issued in the 18.02 release, there is
> no longer any need to keep compatibility with previous versions.
> Remove the versioning of the rte_eth_dev_filter_ctrl function.
> 
> Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
> ---
>  lib/librte_ether/rte_ethdev.c | 155 +-----------------------------------------

Hi Kirill,

You need to update the .map file and remove the deprecation notice in this patch.

Thanks,
ferruh

* Re: [dpdk-dev] [PATCH] ethdev: remove versioning of ethdev filter control function
  2018-02-27 11:01  0% ` Ferruh Yigit
@ 2018-02-27 13:45  3%   ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2018-02-27 13:45 UTC (permalink / raw)
  To: Ferruh Yigit, Kirill Rybalchenko; +Cc: dev, andrey.chilikin

27/02/2018 12:01, Ferruh Yigit:
> On 2/27/2018 10:29 AM, Kirill Rybalchenko wrote:
> > In the 18.02 release the ABI of the ethdev component was changed.
> > To keep compatibility with previous versions of the library,
> > versioning of the rte_eth_dev_filter_ctrl function was implemented.
> > Since a deprecation notice was issued in the 18.02 release, there is
> > no longer any need to keep compatibility with previous versions.
> > Remove the versioning of the rte_eth_dev_filter_ctrl function.
> > 
> > Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
> > ---
> >  lib/librte_ether/rte_ethdev.c | 155 +-----------------------------------------
> 
> Hi Kirill,
> 
> You need to update the .map file and remove the deprecation notice in this patch.

And bump the ABI version in the Makefile and the release notes.

* [dpdk-dev] [PATCH v2] ethdev: remove versioning of ethdev filter control function
  2018-02-27 10:29  3% [dpdk-dev] [PATCH] ethdev: remove versioning of ethdev filter control function Kirill Rybalchenko
  2018-02-27 11:01  0% ` Ferruh Yigit
@ 2018-02-27 14:18  7% ` Kirill Rybalchenko
  2018-03-07 17:17  0%   ` Ferruh Yigit
  1 sibling, 1 reply; 200+ results
From: Kirill Rybalchenko @ 2018-02-27 14:18 UTC (permalink / raw)
  To: dev; +Cc: kirill.rybalchenko, andrey.chilikin, thomas, ferruh.yigit

In the 18.02 release the ABI of the ethdev component was changed.
To keep compatibility with previous versions of the library,
versioning of the rte_eth_dev_filter_ctrl function was implemented.
Since a deprecation notice was issued in the 18.02 release, there is
no longer any need to keep compatibility with previous versions.
Remove the versioning of the rte_eth_dev_filter_ctrl function.

v2:
Modify map file, increment library version,
remove deprecation notice

Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
---
 doc/guides/rel_notes/deprecation.rst    |   6 --
 doc/guides/rel_notes/release_18_05.rst  |   2 +-
 lib/librte_ether/Makefile               |   2 +-
 lib/librte_ether/rte_ethdev.c           | 155 +-------------------------------
 lib/librte_ether/rte_ethdev_version.map |   1 -
 5 files changed, 4 insertions(+), 162 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 74c18ed..6594585 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -149,12 +149,6 @@ Deprecation Notices
   as parameter. For consistency functions adding callback will return
   ``struct rte_eth_rxtx_callback \*`` instead of ``void \*``.
 
-* ethdev: The size of variables ``flow_types_mask`` in
-  ``rte_eth_fdir_info structure``, ``sym_hash_enable_mask`` and
-  ``valid_bit_mask`` in ``rte_eth_hash_global_conf`` structure
-  will be increased from 32 to 64 bits to fulfill hardware requirements.
-  This change will break existing ABI as size of the structures will increase.
-
 * ethdev: ``rte_eth_dev_get_sec_ctx()`` fix port id storage
   ``rte_eth_dev_get_sec_ctx()`` is using ``uint8_t`` for ``port_id``,
   which should be ``uint16_t``.
diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
index 3923dc2..22da411 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -128,7 +128,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_cryptodev.so.4
      librte_distributor.so.1
      librte_eal.so.6
-     librte_ethdev.so.8
+   + librte_ethdev.so.9
      librte_eventdev.so.3
      librte_flow_classify.so.1
      librte_gro.so.1
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index 3ca5782..c2f2f7d 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -16,7 +16,7 @@ LDLIBS += -lrte_mbuf
 
 EXPORT_MAP := rte_ethdev_version.map
 
-LIBABIVER := 8
+LIBABIVER := 9
 
 SRCS-y += rte_ethdev.c
 SRCS-y += rte_flow.c
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 0590f0c..78b8376 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -34,7 +34,6 @@
 #include <rte_errno.h>
 #include <rte_spinlock.h>
 #include <rte_string_fns.h>
-#include <rte_compat.h>
 
 #include "rte_ether.h"
 #include "rte_ethdev.h"
@@ -3490,153 +3489,8 @@ rte_eth_dev_filter_supported(uint16_t port_id,
 }
 
 int
-rte_eth_dev_filter_ctrl_v22(uint16_t port_id,
-			    enum rte_filter_type filter_type,
-			    enum rte_filter_op filter_op, void *arg);
-
-int
-rte_eth_dev_filter_ctrl_v22(uint16_t port_id,
-			    enum rte_filter_type filter_type,
-			    enum rte_filter_op filter_op, void *arg)
-{
-	struct rte_eth_fdir_info_v22 {
-		enum rte_fdir_mode mode;
-		struct rte_eth_fdir_masks mask;
-		struct rte_eth_fdir_flex_conf flex_conf;
-		uint32_t guarant_spc;
-		uint32_t best_spc;
-		uint32_t flow_types_mask[1];
-		uint32_t max_flexpayload;
-		uint32_t flex_payload_unit;
-		uint32_t max_flex_payload_segment_num;
-		uint16_t flex_payload_limit;
-		uint32_t flex_bitmask_unit;
-		uint32_t max_flex_bitmask_num;
-	};
-
-	struct rte_eth_hash_global_conf_v22 {
-		enum rte_eth_hash_function hash_func;
-		uint32_t sym_hash_enable_mask[1];
-		uint32_t valid_bit_mask[1];
-	};
-
-	struct rte_eth_hash_filter_info_v22 {
-		enum rte_eth_hash_filter_info_type info_type;
-		union {
-			uint8_t enable;
-			struct rte_eth_hash_global_conf_v22 global_conf;
-			struct rte_eth_input_set_conf input_set_conf;
-		} info;
-	};
-
-	struct rte_eth_dev *dev;
-
-	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
-
-	dev = &rte_eth_devices[port_id];
-	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->filter_ctrl, -ENOTSUP);
-	if (filter_op == RTE_ETH_FILTER_INFO) {
-		int retval;
-		struct rte_eth_fdir_info_v22 *fdir_info_v22;
-		struct rte_eth_fdir_info fdir_info;
-
-		fdir_info_v22 = (struct rte_eth_fdir_info_v22 *)arg;
-
-		retval = (*dev->dev_ops->filter_ctrl)(dev, filter_type,
-			  filter_op, (void *)&fdir_info);
-		fdir_info_v22->mode = fdir_info.mode;
-		fdir_info_v22->mask = fdir_info.mask;
-		fdir_info_v22->flex_conf = fdir_info.flex_conf;
-		fdir_info_v22->guarant_spc = fdir_info.guarant_spc;
-		fdir_info_v22->best_spc = fdir_info.best_spc;
-		fdir_info_v22->flow_types_mask[0] =
-			(uint32_t)fdir_info.flow_types_mask[0];
-		fdir_info_v22->max_flexpayload = fdir_info.max_flexpayload;
-		fdir_info_v22->flex_payload_unit = fdir_info.flex_payload_unit;
-		fdir_info_v22->max_flex_payload_segment_num =
-			fdir_info.max_flex_payload_segment_num;
-		fdir_info_v22->flex_payload_limit =
-			fdir_info.flex_payload_limit;
-		fdir_info_v22->flex_bitmask_unit = fdir_info.flex_bitmask_unit;
-		fdir_info_v22->max_flex_bitmask_num =
-			fdir_info.max_flex_bitmask_num;
-		return retval;
-	} else if (filter_op == RTE_ETH_FILTER_GET) {
-		int retval;
-		struct rte_eth_hash_filter_info f_info;
-		struct rte_eth_hash_filter_info_v22 *f_info_v22 =
-			(struct rte_eth_hash_filter_info_v22 *)arg;
-
-		f_info.info_type = f_info_v22->info_type;
-		retval = (*dev->dev_ops->filter_ctrl)(dev, filter_type,
-			  filter_op, (void *)&f_info);
-
-		switch (f_info_v22->info_type) {
-		case RTE_ETH_HASH_FILTER_SYM_HASH_ENA_PER_PORT:
-			f_info_v22->info.enable = f_info.info.enable;
-			break;
-		case RTE_ETH_HASH_FILTER_GLOBAL_CONFIG:
-			f_info_v22->info.global_conf.hash_func =
-				f_info.info.global_conf.hash_func;
-			f_info_v22->info.global_conf.sym_hash_enable_mask[0] =
-				(uint32_t)
-				f_info.info.global_conf.sym_hash_enable_mask[0];
-			f_info_v22->info.global_conf.valid_bit_mask[0] =
-				(uint32_t)
-				f_info.info.global_conf.valid_bit_mask[0];
-			break;
-		case RTE_ETH_HASH_FILTER_INPUT_SET_SELECT:
-			f_info_v22->info.input_set_conf =
-				f_info.info.input_set_conf;
-			break;
-		default:
-			break;
-		}
-		return retval;
-	} else if (filter_op == RTE_ETH_FILTER_SET) {
-		struct rte_eth_hash_filter_info f_info;
-		struct rte_eth_hash_filter_info_v22 *f_v22 =
-			(struct rte_eth_hash_filter_info_v22 *)arg;
-
-		f_info.info_type = f_v22->info_type;
-		switch (f_v22->info_type) {
-		case RTE_ETH_HASH_FILTER_SYM_HASH_ENA_PER_PORT:
-			f_info.info.enable = f_v22->info.enable;
-			break;
-		case RTE_ETH_HASH_FILTER_GLOBAL_CONFIG:
-			f_info.info.global_conf.hash_func =
-				f_v22->info.global_conf.hash_func;
-			f_info.info.global_conf.sym_hash_enable_mask[0] =
-				(uint32_t)
-				f_v22->info.global_conf.sym_hash_enable_mask[0];
-			f_info.info.global_conf.valid_bit_mask[0] =
-				(uint32_t)
-				f_v22->info.global_conf.valid_bit_mask[0];
-			break;
-		case RTE_ETH_HASH_FILTER_INPUT_SET_SELECT:
-			f_info.info.input_set_conf =
-				f_v22->info.input_set_conf;
-			break;
-		default:
-			break;
-		}
-		return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op,
-						    (void *)&f_info);
-	} else
-		return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op,
-						    arg);
-}
-VERSION_SYMBOL(rte_eth_dev_filter_ctrl, _v22, 2.2);
-
-int
-rte_eth_dev_filter_ctrl_v1802(uint16_t port_id,
-			      enum rte_filter_type filter_type,
-			      enum rte_filter_op filter_op, void *arg);
-
-int
-rte_eth_dev_filter_ctrl_v1802(uint16_t port_id,
-			      enum rte_filter_type filter_type,
-			      enum rte_filter_op filter_op, void *arg)
+rte_eth_dev_filter_ctrl(uint16_t port_id, enum rte_filter_type filter_type,
+			enum rte_filter_op filter_op, void *arg)
 {
 	struct rte_eth_dev *dev;
 
@@ -3647,11 +3501,6 @@ rte_eth_dev_filter_ctrl_v1802(uint16_t port_id,
 	return eth_err(port_id, (*dev->dev_ops->filter_ctrl)(dev, filter_type,
 							     filter_op, arg));
 }
-BIND_DEFAULT_SYMBOL(rte_eth_dev_filter_ctrl, _v1802, 18.02);
-MAP_STATIC_SYMBOL(int rte_eth_dev_filter_ctrl(uint16_t port_id,
-		  enum rte_filter_type filter_type,
-		  enum rte_filter_op filter_op, void *arg),
-		  rte_eth_dev_filter_ctrl_v1802);
 
 void *
 rte_eth_add_rx_callback(uint16_t port_id, uint16_t queue_id,
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index 87f02fb..34df6c8 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -16,7 +16,6 @@ DPDK_2.2 {
 	rte_eth_dev_count;
 	rte_eth_dev_default_mac_addr_set;
 	rte_eth_dev_detach;
-	rte_eth_dev_filter_ctrl;
 	rte_eth_dev_filter_supported;
 	rte_eth_dev_flow_ctrl_get;
 	rte_eth_dev_flow_ctrl_set;
-- 
2.5.5

^ permalink raw reply	[relevance 7%]

* [dpdk-dev] [PATCH] eal: register rte_panic user callback
@ 2018-03-06 18:28  3% Arnon Warshavsky
  2018-03-07  8:32  0% ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Arnon Warshavsky @ 2018-03-06 18:28 UTC (permalink / raw)
  To: thomas, bruce.richardson; +Cc: dev, Arnon Warshavsky

The use case addressed here is dpdk environment init
aborting the process due to panic,
preventing the calling process from running its own tear-down actions.
A preferred, though ABI breaking solution would be
to have the environment init always return a value
rather than abort upon distress.

This patch defines a couple of callback registration functions,
one for panic and one for exit
in case one wishes to distinguish between these events.
Once a callback is set and panic takes place,
it will be called prior to calling abort.

Maiden voyage patch for Qwilt and myself.

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
---
 lib/librte_eal/bsdapp/eal/eal_debug.c     | 37 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_debug.h | 24 +++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_debug.c   | 38 +++++++++++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map        |  2 ++
 4 files changed, 101 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_debug.c b/lib/librte_eal/bsdapp/eal/eal_debug.c
index 5d92500..010859d 100644
--- a/lib/librte_eal/bsdapp/eal/eal_debug.c
+++ b/lib/librte_eal/bsdapp/eal/eal_debug.c
@@ -18,6 +18,39 @@
 
 #define BACKTRACE_SIZE 256
 
+/*
+ * user function pointer that, when assigned, gets called
+ * during rte_exit()
+ */
+static rte_user_abort_callback_t *exit_user_callback;
+
+/*
+ * user function pointer that, when assigned, gets called
+ * during rte_panic()
+ */
+static rte_user_abort_callback_t *panic_user_callback;
+
+/**
+ * Register user callback function to be called during rte_panic()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_panic_user_callback_register(rte_user_abort_callback_t *cb)
+{
+	panic_user_callback = cb;
+}
+
+/**
+ * Register user callback function to be called during rte_exit()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_exit_user_callback_register(rte_user_abort_callback_t *cb)
+{
+	exit_user_callback = cb;
+}
+
+
 /* dump the stack of the calling core */
 void rte_dump_stack(void)
 {
@@ -59,6 +92,8 @@ void __rte_panic(const char *funcname, const char *format, ...)
 	va_end(ap);
 	rte_dump_stack();
 	rte_dump_registers();
+	if (panic_user_callback)
+		(*panic_user_callback)();
 	abort();
 }
 
@@ -78,6 +113,8 @@ rte_exit(int exit_code, const char *format, ...)
 	va_start(ap, format);
 	rte_vlog(RTE_LOG_CRIT, RTE_LOGTYPE_EAL, format, ap);
 	va_end(ap);
+	if (exit_user_callback)
+		(*exit_user_callback)();
 
 #ifndef RTE_EAL_ALWAYS_PANIC_ON_ERROR
 	if (rte_eal_cleanup() != 0)
diff --git a/lib/librte_eal/common/include/rte_debug.h b/lib/librte_eal/common/include/rte_debug.h
index 272df49..7e3d0a2 100644
--- a/lib/librte_eal/common/include/rte_debug.h
+++ b/lib/librte_eal/common/include/rte_debug.h
@@ -16,11 +16,35 @@
 
 #include "rte_log.h"
 #include "rte_branch_prediction.h"
+#include <rte_compat.h>
 
 #ifdef __cplusplus
 extern "C" {
 #endif
 
+
+/*
+ * Definition of user function pointer type to be called during
+ * the execution of rte_panic
+ */
+
+typedef void  (*rte_user_abort_callback_t)(void);
+/**< User callback type invoked from rte_panic() and rte_exit(). */
+
+/**
+ * Register user callback function to be called during rte_panic()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_panic_user_callback_register(rte_user_abort_callback_t *cb);
+
+/**
+ * Register user callback function to be called during rte_exit()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_exit_user_callback_register(rte_user_abort_callback_t *cb);
+
 /**
  * Dump the stack of the calling core to the console.
  */
diff --git a/lib/librte_eal/linuxapp/eal/eal_debug.c b/lib/librte_eal/linuxapp/eal/eal_debug.c
index 5d92500..b1748b8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_debug.c
+++ b/lib/librte_eal/linuxapp/eal/eal_debug.c
@@ -16,8 +16,42 @@
 #include <rte_common.h>
 #include <rte_eal.h>
 
+
 #define BACKTRACE_SIZE 256
 
+/*
+ * user function pointer that, when assigned, gets called
+ * during rte_exit()
+ */
+static rte_user_abort_callback_t *exit_user_callback;
+
+/*
+ * user function pointer that, when assigned, gets called
+ * during rte_panic()
+ */
+static rte_user_abort_callback_t *panic_user_callback;
+
+/**
+ * Register user callback function to be called during rte_panic()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_panic_user_callback_register(rte_user_abort_callback_t *cb)
+{
+	panic_user_callback = cb;
+}
+
+/**
+ * Register user callback function to be called during rte_exit()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_exit_user_callback_register(rte_user_abort_callback_t *cb)
+{
+	exit_user_callback = cb;
+}
+
+
 /* dump the stack of the calling core */
 void rte_dump_stack(void)
 {
@@ -59,6 +93,8 @@ void __rte_panic(const char *funcname, const char *format, ...)
 	va_end(ap);
 	rte_dump_stack();
 	rte_dump_registers();
+	if (panic_user_callback)
+		(*panic_user_callback)();
 	abort();
 }
 
@@ -78,6 +114,8 @@ rte_exit(int exit_code, const char *format, ...)
 	va_start(ap, format);
 	rte_vlog(RTE_LOG_CRIT, RTE_LOGTYPE_EAL, format, ap);
 	va_end(ap);
+	if (exit_user_callback)
+		(*exit_user_callback)();
 
 #ifndef RTE_EAL_ALWAYS_PANIC_ON_ERROR
 	if (rte_eal_cleanup() != 0)
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index d123602..7b8f55d 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -221,11 +221,13 @@ EXPERIMENTAL {
 	rte_eal_hotplug_add;
 	rte_eal_hotplug_remove;
 	rte_eal_mbuf_user_pool_ops;
+	rte_exit_user_callback_register;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
 	rte_mp_sendmsg;
 	rte_mp_request;
 	rte_mp_reply;
+	rte_panic_user_callback_register;
 	rte_service_attr_get;
 	rte_service_attr_reset_all;
 	rte_service_component_register;
-- 
2.7.4
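
For readers following the thread, the behaviour the patch describes (a
user hook that runs just before abort) can be sketched stand-alone as
below. This is only an illustration and does not link against DPDK: the
names panic_callback_register, panic_path and app_teardown are
hypothetical stand-ins, and the sketch stores the hook as a plain
function pointer, which is an assumption about the intent (the patch as
posted declares the parameter as rte_user_abort_callback_t *).

```c
#include <stddef.h>

/* Same shape as the callback type proposed in the patch. */
typedef void (*abort_callback_t)(void);

/* Hypothetical stand-ins for the static state in eal_debug.c. */
abort_callback_t panic_cb;	/* NULL means "no callback registered" */
int teardown_calls;		/* counts how often the user hook ran */

/* Mirrors the proposed register function; passing NULL deregisters. */
void panic_callback_register(abort_callback_t cb)
{
	panic_cb = cb;
}

/*
 * Models the tail of __rte_panic(): run the user hook if one is set.
 * Returns 1 if the hook ran, 0 otherwise. The real function would
 * call abort() right after this point.
 */
int panic_path(void)
{
	if (panic_cb == NULL)
		return 0;
	panic_cb();
	return 1;
}

/* Example application hook: release resources before the abort. */
void app_teardown(void)
{
	teardown_calls++;
}
```

An application would register app_teardown once at start-up; the hook
then runs on the panic path only, and never if it was deregistered with
NULL beforehand.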

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] eal: register rte_panic user callback
  2018-03-06 18:28  3% [dpdk-dev] [PATCH] eal: register rte_panic user callback Arnon Warshavsky
@ 2018-03-07  8:32  0% ` Thomas Monjalon
  2018-03-07  9:05  0%   ` Burakov, Anatoly
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-03-07  8:32 UTC (permalink / raw)
  To: Arnon Warshavsky; +Cc: bruce.richardson, dev

Hi,

06/03/2018 19:28, Arnon Warshavsky:
> The use case addressed here is dpdk environment init
> aborting the process due to panic,
> preventing the calling process from running its own tear-down actions.

Thank you for working on this long standing issue.

> A preferred, though ABI breaking solution would be
> to have the environment init always return a value
> rather than abort upon distress.

Yes, it is the preferred solution.
We should not use exit (panic & co) inside a library.
It is important enough to break the API.
I would be in favor of accepting such breakage in 18.05.

> This patch defines a couple of callback registration functions,
> one for panic and one for exit
> in case one wishes to distinguish between these events.
> Once a callback is set and panic takes place,
> it will be called prior to calling abort.
> 
> Maiden voyage patch for Qwilt and myself.

Are you OK to visit the other side of the solution?

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] eal: register rte_panic user callback
  2018-03-07  8:32  0% ` Thomas Monjalon
@ 2018-03-07  9:05  0%   ` Burakov, Anatoly
  2018-03-07  9:59  0%     ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Burakov, Anatoly @ 2018-03-07  9:05 UTC (permalink / raw)
  To: Thomas Monjalon, Arnon Warshavsky; +Cc: bruce.richardson, dev

On 07-Mar-18 8:32 AM, Thomas Monjalon wrote:
> Hi,
> 
> 06/03/2018 19:28, Arnon Warshavsky:
>> The use case addressed here is dpdk environment init
>> aborting the process due to panic,
>> preventing the calling process from running its own tear-down actions.
> 
> Thank you for working on this long standing issue.
> 
>> A preferred, though ABI breaking solution would be
>> to have the environment init always return a value
>> rather than abort upon distress.
> 
> Yes, it is the preferred solution.
> We should not use exit (panic & co) inside a library.
> It is important enough to break the API.

+1, panic exists mostly for historical reasons AFAIK. it's a pity i 
didn't think of it at the time of submitting the memory hotplug RFC, 
because i now hit the same issue with the v1 - we might panic while 
holding a lock, and didn't realize that it was an API break to change 
this behavior.

Can this really go into current release without deprecation notices?

-- 
Thanks,
Anatoly

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] eal: register rte_panic user callback
  2018-03-07  9:05  0%   ` Burakov, Anatoly
@ 2018-03-07  9:59  0%     ` Thomas Monjalon
  2018-03-07 11:29  0%       ` Burakov, Anatoly
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-03-07  9:59 UTC (permalink / raw)
  To: Burakov, Anatoly, Arnon Warshavsky; +Cc: bruce.richardson, dev

07/03/2018 10:05, Burakov, Anatoly:
> On 07-Mar-18 8:32 AM, Thomas Monjalon wrote:
> > Hi,
> > 
> > 06/03/2018 19:28, Arnon Warshavsky:
> >> The use case addressed here is dpdk environment init
> >> aborting the process due to panic,
> >> preventing the calling process from running its own tear-down actions.
> > 
> > Thank you for working on this long standing issue.
> > 
> >> A preferred, though ABI breaking solution would be
> >> to have the environment init always return a value
> >> rather than abort upon distress.
> > 
> > Yes, it is the preferred solution.
> > We should not use exit (panic & co) inside a library.
> > It is important enough to break the API.
> 
> +1, panic exists mostly for historical reasons AFAIK. it's a pity i 
> didn't think of it at the time of submitting the memory hotplug RFC, 
> because i now hit the same issue with the v1 - we might panic while 
> holding a lock, and didn't realize that it was an API break to change 
> this behavior.
> 
> Can this really go into current release without deprecation notices?

If such an exception is done, it must be approved by the technical board.
We need to check a few criteria:
	- which functions need to be changed
	- how the application is impacted
	- what is the urgency

If a panic is removed and the application is not already checking some
error code, the execution will continue without considering the error.

Some rte_panic could probably be removed without any impact on applications.
Some rte_panic could wait for 18.08 with a notice in 18.05.
If some rte_panic cannot wait, it must be discussed specifically.
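
The concern about unchecked error codes can be illustrated with a small
stand-alone sketch. It assumes a hypothetical env_init() that reports
failure through a negative return value instead of aborting; none of
these names come from DPDK. The point is that the caller must test the
result, otherwise execution continues as if initialisation had
succeeded.

```c
#include <errno.h>

/* Set to 1 once the application has run its own tear-down. */
int app_teardown_done;

/*
 * Hypothetical library init that returns a negative errno value on
 * failure instead of calling abort().
 */
int env_init(int simulate_failure)
{
	return simulate_failure ? -ENOMEM : 0;
}

/*
 * Application start-up that checks the result. On failure it runs its
 * tear-down and propagates the error; on success it returns 0.
 */
int app_start(int simulate_failure)
{
	int ret = env_init(simulate_failure);

	if (ret < 0) {
		app_teardown_done = 1;	/* application-specific cleanup */
		return ret;
	}
	return 0;
}
```

An application that ignored the return value of env_init() would carry
on after a failure, which is the silent behaviour change the criteria
above are meant to catch.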

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] eal: register rte_panic user callback
  2018-03-07  9:59  0%     ` Thomas Monjalon
@ 2018-03-07 11:29  0%       ` Burakov, Anatoly
  0 siblings, 0 replies; 200+ results
From: Burakov, Anatoly @ 2018-03-07 11:29 UTC (permalink / raw)
  To: Thomas Monjalon, Arnon Warshavsky; +Cc: bruce.richardson, dev

On 07-Mar-18 9:59 AM, Thomas Monjalon wrote:
> 07/03/2018 10:05, Burakov, Anatoly:
>> On 07-Mar-18 8:32 AM, Thomas Monjalon wrote:
>>> Hi,
>>>
>>> 06/03/2018 19:28, Arnon Warshavsky:
>>>> The use case addressed here is dpdk environment init
>>>> aborting the process due to panic,
>>>> preventing the calling process from running its own tear-down actions.
>>>
>>> Thank you for working on this long standing issue.
>>>
>>>> A preferred, though ABI breaking solution would be
>>>> to have the environment init always return a value
>>>> rather than abort upon distress.
>>>
>>> Yes, it is the preferred solution.
>>> We should not use exit (panic & co) inside a library.
>>> It is important enough to break the API.
>>
>> +1, panic exists mostly for historical reasons AFAIK. it's a pity i
>> didn't think of it at the time of submitting the memory hotplug RFC,
>> because i now hit the same issue with the v1 - we might panic while
>> holding a lock, and didn't realize that it was an API break to change
>> this behavior.
>>
>> Can this really go into current release without deprecation notices?
> 
> If such an exception is done, it must be approved by the technical board.
> We need to check a few criteria:
> 	- which functions need to be changed
> 	- how the application is impacted
> 	- what is the urgency
> 
> If a panic is removed and the application is not already checking some
> error code, the execution will continue without considering the error.
> 
> Some rte_panic could probably be removed without any impact on applications.
> Some rte_panic could wait for 18.08 with a notice in 18.05.
> If some rte_panic cannot wait, it must be discussed specifically.
> 

Can we add a compile warning for adding new rte_panic's into code? It's 
a nice tool while debugging, but it probably shouldn't be in any new 
production code.

-- 
Thanks,
Anatoly

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [RFC PATCH v1 0/4] ethdev: add per-PMD tuning of RxTx parameters
@ 2018-03-07 12:08  3% Remy Horton
  0 siblings, 0 replies; 200+ results
From: Remy Horton @ 2018-03-07 12:08 UTC (permalink / raw)
  To: dev
  Cc: Wenzhuo Lu, Jingjing Wu, Qi Zhang, Beilei Xing, Shreyansh Jain,
	Thomas Monjalon

The optimal values of several transmission & reception related parameters,
such as burst sizes, descriptor ring sizes, and number of queues, vary
between different network interface devices. This patchset allows individual
PMDs to specify their preferred parameter values, and if so indicated by an
application, for them to be used automatically by the ethdev layer.

This RFC/V1 includes per-PMD values for e1000 and i40e but it is expected
that subsequent patchsets will cover other PMDs. A deprecation notice
covering the API/ABI change is in place.

Remy Horton (4):
  ethdev: add support for PMD-tuned Tx/Rx parameters
  net/e1000: add TxRx tuning parameters
  net/i40e: add TxRx tuning parameters
  testpmd: make use of per-PMD TxRx parameters

 app/test-pmd/testpmd.c         |  5 +++--
 drivers/net/e1000/em_ethdev.c  |  8 ++++++++
 drivers/net/i40e/i40e_ethdev.c | 35 ++++++++++++++++++++++++++++++++---
 lib/librte_ether/rte_ethdev.c  | 18 ++++++++++++++++++
 lib/librte_ether/rte_ethdev.h  | 15 +++++++++++++++
 5 files changed, 76 insertions(+), 5 deletions(-)

-- 
2.9.5

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2] ethdev: remove versioning of ethdev filter control function
  2018-02-27 14:18  7% ` [dpdk-dev] [PATCH v2] " Kirill Rybalchenko
@ 2018-03-07 17:17  0%   ` Ferruh Yigit
  2018-03-07 17:47  0%     ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2018-03-07 17:17 UTC (permalink / raw)
  To: Kirill Rybalchenko, dev; +Cc: andrey.chilikin, thomas

On 2/27/2018 2:18 PM, Kirill Rybalchenko wrote:
> In the 18.02 release, the ABI of the ethdev component was changed.
> To keep compatibility with previous versions of the library,
> versioning of the rte_eth_dev_filter_ctrl function was implemented.
> Since a deprecation notice was issued in the 18.02 release, there is
> no need to keep compatibility with previous versions.
> Remove the versioning of the rte_eth_dev_filter_ctrl function.
> 
> v2:
> Modify map file, increment library version,
> remove deprecation notice
> 
> Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>

Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI
@ 2018-03-07 17:44 23% Ferruh Yigit
  2018-03-07 18:06  0% ` Luca Boccassi
  2018-03-08  8:05  5% ` Thomas Monjalon
  0 siblings, 2 replies; 200+ results
From: Ferruh Yigit @ 2018-03-07 17:44 UTC (permalink / raw)
  To: Thomas Monjalon, Neil Horman, John McNamara, Marko Kovacevic
  Cc: dev, Ferruh Yigit, Luca Boccassi, Christian Ehrhardt

Now that the experimental API process is defined, do we still need the
RTE_NEXT_ABI config and process, which have similar goals?

Do distros disable experimental APIs when delivering DPDK? And is there
any config required to control this, as RTE_NEXT_ABI was intended to do?

Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Thomas Monjalon <thomas@monjalon.net>
Cc: Luca Boccassi <bluca@debian.org>
Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com>

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 config/common_base                     |  5 -----
 devtools/test-build.sh                 |  2 --
 devtools/validate-abi.sh               |  1 -
 doc/guides/contributing/versioning.rst | 10 ----------
 mk/rte.lib.mk                          |  5 -----
 pkg/dpdk.spec                          |  1 -
 6 files changed, 24 deletions(-)

diff --git a/config/common_base b/config/common_base
index ad03cf433..6b867f6a9 100644
--- a/config/common_base
+++ b/config/common_base
@@ -41,11 +41,6 @@ CONFIG_RTE_ARCH_STRICT_ALIGN=n
 CONFIG_RTE_BUILD_SHARED_LIB=n
 
 #
-# Use newest code breaking previous ABI
-#
-CONFIG_RTE_NEXT_ABI=y
-
-#
 # Major ABI to overwrite library specific LIBABIVER
 #
 CONFIG_RTE_MAJOR_ABI=
diff --git a/devtools/test-build.sh b/devtools/test-build.sh
index 3362edcc5..22b4e1a98 100755
--- a/devtools/test-build.sh
+++ b/devtools/test-build.sh
@@ -154,8 +154,6 @@ config () # <directory> <target> <options>
 		# Built-in options (lowercase)
 		! echo $3 | grep -q '+default' || \
 		sed -ri 's,(RTE_MACHINE=")native,\1default,' $1/.config
-		echo $3 | grep -q '+next' || \
-		sed -ri           's,(NEXT_ABI=)y,\1n,' $1/.config
 		! echo $3 | grep -q '+shared' || \
 		sed -ri         's,(SHARED_LIB=)n,\1y,' $1/.config
 		! echo $3 | grep -q '+debug' || ( \
diff --git a/devtools/validate-abi.sh b/devtools/validate-abi.sh
index 138436d93..a64edf92f 100755
--- a/devtools/validate-abi.sh
+++ b/devtools/validate-abi.sh
@@ -105,7 +105,6 @@ set_log_file() {
 fixup_config() {
 	local conf=config/defconfig_$target
 	cmd sed -i -e"$ a\CONFIG_RTE_BUILD_SHARED_LIB=y" $conf
-	cmd sed -i -e"$ a\CONFIG_RTE_NEXT_ABI=n" $conf
 	cmd sed -i -e"$ a\CONFIG_RTE_EAL_IGB_UIO=n" $conf
 	cmd sed -i -e"$ a\CONFIG_RTE_LIBRTE_KNI=n" $conf
 	cmd sed -i -e"$ a\CONFIG_RTE_KNI_KMOD=n" $conf
diff --git a/doc/guides/contributing/versioning.rst b/doc/guides/contributing/versioning.rst
index c495294db..59ff0e8b7 100644
--- a/doc/guides/contributing/versioning.rst
+++ b/doc/guides/contributing/versioning.rst
@@ -91,19 +91,9 @@ being provided. The requirements for doing so are:
      interest" be sought for each deprecation, for example: from NIC vendors,
      CPU vendors, end-users, etc.
 
-#. The changes (including an alternative map file) must be gated with
-   the ``RTE_NEXT_ABI`` option, and provided with a deprecation notice at the
-   same time.
-   It will become the default ABI in the next release.
-
 #. A full deprecation cycle, as explained above, must be made to offer
    downstream consumers sufficient warning of the change.
 
-#. At the beginning of the next release cycle, every ``RTE_NEXT_ABI``
-   conditions will be removed, the ``LIBABIVER`` variable in the makefile(s)
-   where the ABI is changed will be incremented, and the map files will
-   be updated.
-
 Note that the above process for ABI deprecation should not be undertaken
 lightly. ABI stability is extremely important for downstream consumers of the
 DPDK, especially when distributed in shared object form. Every effort should
diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index c696a2174..8ac26face 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -20,11 +20,6 @@ endif
 ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
 LIB := $(patsubst %.a,%.so.$(LIBABIVER),$(LIB))
 ifeq ($(EXTLIB_BUILD),n)
-ifeq ($(CONFIG_RTE_MAJOR_ABI),)
-ifeq ($(CONFIG_RTE_NEXT_ABI),y)
-LIB := $(LIB).1
-endif
-endif
 CPU_LDFLAGS += --version-script=$(SRCDIR)/$(EXPORT_MAP)
 endif
 endif
diff --git a/pkg/dpdk.spec b/pkg/dpdk.spec
index 4d3b5745c..d118f0463 100644
--- a/pkg/dpdk.spec
+++ b/pkg/dpdk.spec
@@ -84,7 +84,6 @@ make O=%{target} T=%{config} config
 sed -ri 's,(RTE_MACHINE=).*,\1%{machine},' %{target}/.config
 sed -ri 's,(RTE_APP_TEST=).*,\1n,'         %{target}/.config
 sed -ri 's,(RTE_BUILD_SHARED_LIB=).*,\1y,' %{target}/.config
-sed -ri 's,(RTE_NEXT_ABI=).*,\1n,'         %{target}/.config
 sed -ri 's,(LIBRTE_VHOST=).*,\1y,'         %{target}/.config
 sed -ri 's,(LIBRTE_PMD_PCAP=).*,\1y,'      %{target}/.config
 make O=%{target} %{?_smp_mflags}
-- 
2.13.6

^ permalink raw reply	[relevance 23%]

* Re: [dpdk-dev] [PATCH v2] ethdev: remove versioning of ethdev filter control function
  2018-03-07 17:17  0%   ` Ferruh Yigit
@ 2018-03-07 17:47  0%     ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2018-03-07 17:47 UTC (permalink / raw)
  To: Kirill Rybalchenko, dev; +Cc: andrey.chilikin, thomas

On 3/7/2018 5:17 PM, Ferruh Yigit wrote:
> On 2/27/2018 2:18 PM, Kirill Rybalchenko wrote:
>> In the 18.02 release, the ABI of the ethdev component was changed.
>> To keep compatibility with previous versions of the library,
>> versioning of the rte_eth_dev_filter_ctrl function was implemented.
>> Since a deprecation notice was issued in the 18.02 release, there is
>> no need to keep compatibility with previous versions.
>> Remove the versioning of the rte_eth_dev_filter_ctrl function.
>>
>> v2:
>> Modify map file, increment library version,
>> remove deprecation notice
>>
>> Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
> 
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

Applied to dpdk-next-net/master, thanks.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI
  2018-03-07 17:44 23% [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI Ferruh Yigit
@ 2018-03-07 18:06  0% ` Luca Boccassi
  2018-03-08  8:05  5% ` Thomas Monjalon
  1 sibling, 0 replies; 200+ results
From: Luca Boccassi @ 2018-03-07 18:06 UTC (permalink / raw)
  To: Ferruh Yigit, Thomas Monjalon, Neil Horman, John McNamara,
	Marko Kovacevic
  Cc: dev, Christian Ehrhardt

On Wed, 2018-03-07 at 17:44 +0000, Ferruh Yigit wrote:
> Now that the experimental API process is defined, do we still need the
> RTE_NEXT_ABI config and process, which have similar goals?
> 
> Do distros disable experimental APIs when delivering DPDK? And is there
> any config required to control this, as RTE_NEXT_ABI was intended to do?

I tried to tinker with not exporting experimental APIs, but the
problem is intra-project dependencies. In other words: librte_foo has a
foo_experimental API that librte_bar uses, so if librte_foo's
foo_experimental symbol is not available, everything breaks down. I need
to spend a bit more time on this problem but -ENOTIME

> Cc: Neil Horman <nhorman@tuxdriver.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>
> Cc: Luca Boccassi <bluca@debian.org>
> Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> 
> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> ---
>  config/common_base                     |  5 -----
>  devtools/test-build.sh                 |  2 --
>  devtools/validate-abi.sh               |  1 -
>  doc/guides/contributing/versioning.rst | 10 ----------
>  mk/rte.lib.mk                          |  5 -----
>  pkg/dpdk.spec                          |  1 -
>  6 files changed, 24 deletions(-)
> 
> diff --git a/config/common_base b/config/common_base
> index ad03cf433..6b867f6a9 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -41,11 +41,6 @@ CONFIG_RTE_ARCH_STRICT_ALIGN=n
>  CONFIG_RTE_BUILD_SHARED_LIB=n
>  
>  #
> -# Use newest code breaking previous ABI
> -#
> -CONFIG_RTE_NEXT_ABI=y
> -
> -#
>  # Major ABI to overwrite library specific LIBABIVER
>  #
>  CONFIG_RTE_MAJOR_ABI=
> diff --git a/devtools/test-build.sh b/devtools/test-build.sh
> index 3362edcc5..22b4e1a98 100755
> --- a/devtools/test-build.sh
> +++ b/devtools/test-build.sh
> @@ -154,8 +154,6 @@ config () # <directory> <target> <options>
>  		# Built-in options (lowercase)
>  		! echo $3 | grep -q '+default' || \
>  		sed -ri 's,(RTE_MACHINE=")native,\1default,'
> $1/.config
> -		echo $3 | grep -q '+next' || \
> -		sed -ri           's,(NEXT_ABI=)y,\1n,' $1/.config
>  		! echo $3 | grep -q '+shared' || \
>  		sed -ri         's,(SHARED_LIB=)n,\1y,' $1/.config
>  		! echo $3 | grep -q '+debug' || ( \
> diff --git a/devtools/validate-abi.sh b/devtools/validate-abi.sh
> index 138436d93..a64edf92f 100755
> --- a/devtools/validate-abi.sh
> +++ b/devtools/validate-abi.sh
> @@ -105,7 +105,6 @@ set_log_file() {
>  fixup_config() {
>  	local conf=config/defconfig_$target
>  	cmd sed -i -e"$ a\CONFIG_RTE_BUILD_SHARED_LIB=y" $conf
> -	cmd sed -i -e"$ a\CONFIG_RTE_NEXT_ABI=n" $conf
>  	cmd sed -i -e"$ a\CONFIG_RTE_EAL_IGB_UIO=n" $conf
>  	cmd sed -i -e"$ a\CONFIG_RTE_LIBRTE_KNI=n" $conf
>  	cmd sed -i -e"$ a\CONFIG_RTE_KNI_KMOD=n" $conf
> diff --git a/doc/guides/contributing/versioning.rst
> b/doc/guides/contributing/versioning.rst
> index c495294db..59ff0e8b7 100644
> --- a/doc/guides/contributing/versioning.rst
> +++ b/doc/guides/contributing/versioning.rst
> @@ -91,19 +91,9 @@ being provided. The requirements for doing so are:
>       interest" be sought for each deprecation, for example: from NIC
> vendors,
>       CPU vendors, end-users, etc.
>  
> -#. The changes (including an alternative map file) must be gated
> with
> -   the ``RTE_NEXT_ABI`` option, and provided with a deprecation
> notice at the
> -   same time.
> -   It will become the default ABI in the next release.
> -
>  #. A full deprecation cycle, as explained above, must be made to
> offer
>     downstream consumers sufficient warning of the change.
>  
> -#. At the beginning of the next release cycle, every
> ``RTE_NEXT_ABI``
> -   conditions will be removed, the ``LIBABIVER`` variable in the
> makefile(s)
> -   where the ABI is changed will be incremented, and the map files
> will
> -   be updated.
> -
>  Note that the above process for ABI deprecation should not be
> undertaken
>  lightly. ABI stability is extremely important for downstream
> consumers of the
>  DPDK, especially when distributed in shared object form. Every
> effort should
> diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
> index c696a2174..8ac26face 100644
> --- a/mk/rte.lib.mk
> +++ b/mk/rte.lib.mk
> @@ -20,11 +20,6 @@ endif
>  ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
>  LIB := $(patsubst %.a,%.so.$(LIBABIVER),$(LIB))
>  ifeq ($(EXTLIB_BUILD),n)
> -ifeq ($(CONFIG_RTE_MAJOR_ABI),)
> -ifeq ($(CONFIG_RTE_NEXT_ABI),y)
> -LIB := $(LIB).1
> -endif
> -endif
>  CPU_LDFLAGS += --version-script=$(SRCDIR)/$(EXPORT_MAP)
>  endif
>  endif
> diff --git a/pkg/dpdk.spec b/pkg/dpdk.spec
> index 4d3b5745c..d118f0463 100644
> --- a/pkg/dpdk.spec
> +++ b/pkg/dpdk.spec
> @@ -84,7 +84,6 @@ make O=%{target} T=%{config} config
>  sed -ri 's,(RTE_MACHINE=).*,\1%{machine},' %{target}/.config
>  sed -ri 's,(RTE_APP_TEST=).*,\1n,'         %{target}/.config
>  sed -ri 's,(RTE_BUILD_SHARED_LIB=).*,\1y,' %{target}/.config
> -sed -ri 's,(RTE_NEXT_ABI=).*,\1n,'         %{target}/.config
>  sed -ri 's,(LIBRTE_VHOST=).*,\1y,'         %{target}/.config
>  sed -ri 's,(LIBRTE_PMD_PCAP=).*,\1y,'      %{target}/.config
>  make O=%{target} %{?_smp_mflags}

-- 
Kind regards,
Luca Boccassi

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [RFC PATCH 1/5] bpf: add BPF loading and execution framework
  @ 2018-03-08  1:29  2% ` Konstantin Ananyev
  0 siblings, 0 replies; 200+ results
From: Konstantin Ananyev @ 2018-03-08  1:29 UTC (permalink / raw)
  To: dev; +Cc: Konstantin Ananyev

librte_bpf provides a framework to load and execute eBPF bytecode
inside user-space dpdk based applications.

Not currently supported features:
 - JIT
 - cBPF
 - tail-pointer call
 - eBPF MAP
 - skb

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 config/common_base                 |   5 +
 config/common_linuxapp             |   1 +
 lib/Makefile                       |   2 +
 lib/librte_bpf/Makefile            |  30 +++
 lib/librte_bpf/bpf.c               |  48 ++++
 lib/librte_bpf/bpf_exec.c          | 453 +++++++++++++++++++++++++++++++++++++
 lib/librte_bpf/bpf_impl.h          |  37 +++
 lib/librte_bpf/bpf_load.c          | 344 ++++++++++++++++++++++++++++
 lib/librte_bpf/bpf_validate.c      |  55 +++++
 lib/librte_bpf/rte_bpf.h           | 154 +++++++++++++
 lib/librte_bpf/rte_bpf_version.map |  12 +
 mk/rte.app.mk                      |   2 +
 12 files changed, 1143 insertions(+)
 create mode 100644 lib/librte_bpf/Makefile
 create mode 100644 lib/librte_bpf/bpf.c
 create mode 100644 lib/librte_bpf/bpf_exec.c
 create mode 100644 lib/librte_bpf/bpf_impl.h
 create mode 100644 lib/librte_bpf/bpf_load.c
 create mode 100644 lib/librte_bpf/bpf_validate.c
 create mode 100644 lib/librte_bpf/rte_bpf.h
 create mode 100644 lib/librte_bpf/rte_bpf_version.map

diff --git a/config/common_base b/config/common_base
index ad03cf433..2205b684f 100644
--- a/config/common_base
+++ b/config/common_base
@@ -823,3 +823,8 @@ CONFIG_RTE_APP_CRYPTO_PERF=y
 # Compile the eventdev application
 #
 CONFIG_RTE_APP_EVENTDEV=y
+
+#
+# Compile librte_bpf
+#
+CONFIG_RTE_LIBRTE_BPF=n
diff --git a/config/common_linuxapp b/config/common_linuxapp
index ff98f2355..7b4a0ce7d 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -10,6 +10,7 @@ CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=y
 CONFIG_RTE_EAL_IGB_UIO=y
 CONFIG_RTE_EAL_VFIO=y
 CONFIG_RTE_KNI_KMOD=y
+CONFIG_RTE_LIBRTE_BPF=y
 CONFIG_RTE_LIBRTE_KNI=y
 CONFIG_RTE_LIBRTE_PMD_KNI=y
 CONFIG_RTE_LIBRTE_VHOST=y
diff --git a/lib/Makefile b/lib/Makefile
index ec965a606..a4a2329f9 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -97,6 +97,8 @@ DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
 DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
 DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 DEPDIRS-librte_gso += librte_mempool
+DIRS-$(CONFIG_RTE_LIBRTE_BPF) += librte_bpf
+DEPDIRS-librte_bpf := librte_eal librte_mempool librte_mbuf librte_ether
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_bpf/Makefile b/lib/librte_bpf/Makefile
new file mode 100644
index 000000000..e0f434e77
--- /dev/null
+++ b/lib/librte_bpf/Makefile
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_bpf.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+LDLIBS += -lrte_net -lrte_eal
+LDLIBS += -lrte_mempool -lrte_ring
+LDLIBS += -lrte_mbuf -lrte_ethdev
+LDLIBS += -lelf
+
+EXPORT_MAP := rte_bpf_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf.c
+SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf_exec.c
+SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf_load.c
+SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf_validate.c
+
+# install header files
+SYMLINK-$(CONFIG_RTE_LIBRTE_BPF)-include += rte_bpf.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_bpf/bpf.c b/lib/librte_bpf/bpf.c
new file mode 100644
index 000000000..4727d2251
--- /dev/null
+++ b/lib/librte_bpf/bpf.c
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_common.h>
+#include <rte_eal.h>
+
+#include "bpf_impl.h"
+
+__rte_experimental void
+rte_bpf_destroy(struct rte_bpf *bpf)
+{
+	if (bpf != NULL) {
+		if (bpf->jit.func != NULL)
+			munmap(bpf->jit.func, bpf->jit.sz);
+		munmap(bpf, bpf->sz);
+	}
+}
+
+__rte_experimental int
+rte_bpf_get_jit(const struct rte_bpf *bpf, struct rte_bpf_jit *jit)
+{
+	if (bpf == NULL || jit == NULL)
+		return -EINVAL;
+
+	jit[0] = bpf->jit;
+	return 0;
+}
+
+int
+bpf_jit(struct rte_bpf *bpf)
+{
+	int32_t rc;
+
+	rc = -ENOTSUP;
+
+	if (rc != 0)
+		RTE_LOG(WARNING, USER1, "%s(%p) failed, error code: %d;\n",
+			__func__, bpf, rc);
+	return rc;
+}
diff --git a/lib/librte_bpf/bpf_exec.c b/lib/librte_bpf/bpf_exec.c
new file mode 100644
index 000000000..4bad0cc9e
--- /dev/null
+++ b/lib/librte_bpf/bpf_exec.c
@@ -0,0 +1,453 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_debug.h>
+#include <rte_memory.h>
+#include <rte_eal.h>
+#include <rte_byteorder.h>
+
+#include "bpf_impl.h"
+
+#define BPF_JMP_UNC(ins)	((ins) += (ins)->off)
+
+#define BPF_JMP_CND_REG(reg, ins, op, type)	\
+	((ins) += \
+		((type)(reg)[(ins)->dst_reg] op (type)(reg)[(ins)->src_reg]) ? \
+		(ins)->off : 0)
+
+#define BPF_JMP_CND_IMM(reg, ins, op, type)	\
+	((ins) += \
+		((type)(reg)[(ins)->dst_reg] op (type)(ins)->imm) ? \
+		(ins)->off : 0)
+
+#define BPF_NEG_ALU(reg, ins, type)	\
+	((reg)[(ins)->dst_reg] = (type)(-(reg)[(ins)->dst_reg]))
+
+#define BPF_MOV_ALU_REG(reg, ins, type)	\
+	((reg)[(ins)->dst_reg] = (type)(reg)[(ins)->src_reg])
+
+#define BPF_OP_ALU_REG(reg, ins, op, type)	\
+	((reg)[(ins)->dst_reg] = \
+		(type)(reg)[(ins)->dst_reg] op (type)(reg)[(ins)->src_reg])
+
+#define BPF_MOV_ALU_IMM(reg, ins, type)	\
+	((reg)[(ins)->dst_reg] = (type)(ins)->imm)
+
+#define BPF_OP_ALU_IMM(reg, ins, op, type)	\
+	((reg)[(ins)->dst_reg] = \
+		(type)(reg)[(ins)->dst_reg] op (type)(ins)->imm)
+
+#define BPF_DIV_ZERO_CHECK(bpf, reg, ins, type) do { \
+	if ((type)(reg)[(ins)->src_reg] == 0) { \
+		RTE_LOG(ERR, USER1, \
+			"%s(%p): division by 0 at pc: %#zx;\n", \
+			__func__, bpf, \
+			(uintptr_t)(ins) - (uintptr_t)(bpf)->prm.ins); \
+		return 0; \
+	} \
+} while (0)
+
+#define BPF_LD_REG(reg, ins, type)	\
+	((reg)[(ins)->dst_reg] = \
+		*(type *)(uintptr_t)((reg)[(ins)->src_reg] + (ins)->off))
+
+#define BPF_ST_IMM(reg, ins, type)	\
+	(*(type *)(uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off) = \
+		(type)(ins)->imm)
+
+#define BPF_ST_REG(reg, ins, type)	\
+	(*(type *)(uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off) = \
+		(type)(reg)[(ins)->src_reg])
+
+#define BPF_ST_XADD_REG(reg, ins, tp)	\
+	(rte_atomic##tp##_add((rte_atomic##tp##_t *) \
+		(uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off), \
+		reg[ins->src_reg]))
+
+static inline void
+bpf_alu_be(uint64_t reg[MAX_BPF_REG], const struct bpf_insn *ins)
+{
+	uint64_t *v;
+
+	v = reg + ins->dst_reg;
+	switch (ins->imm) {
+	case 16:
+		*v = rte_cpu_to_be_16(*v);
+		break;
+	case 32:
+		*v = rte_cpu_to_be_32(*v);
+		break;
+	case 64:
+		*v = rte_cpu_to_be_64(*v);
+		break;
+	}
+}
+
+static inline void
+bpf_alu_le(uint64_t reg[MAX_BPF_REG], const struct bpf_insn *ins)
+{
+	uint64_t *v;
+
+	v = reg + ins->dst_reg;
+	switch (ins->imm) {
+	case 16:
+		*v = rte_cpu_to_le_16(*v);
+		break;
+	case 32:
+		*v = rte_cpu_to_le_32(*v);
+		break;
+	case 64:
+		*v = rte_cpu_to_le_64(*v);
+		break;
+	}
+}
+
+static inline uint64_t
+bpf_exec(const struct rte_bpf *bpf, uint64_t reg[MAX_BPF_REG])
+{
+	const struct bpf_insn *ins;
+
+	for (ins = bpf->prm.ins; ; ins++) {
+		switch (ins->code) {
+		/* 32 bit ALU IMM operations */
+		case (BPF_ALU | BPF_ADD | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, +, uint32_t);
+			break;
+		case (BPF_ALU | BPF_SUB | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, -, uint32_t);
+			break;
+		case (BPF_ALU | BPF_AND | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, &, uint32_t);
+			break;
+		case (BPF_ALU | BPF_OR | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, |, uint32_t);
+			break;
+		case (BPF_ALU | BPF_LSH | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, <<, uint32_t);
+			break;
+		case (BPF_ALU | BPF_RSH | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, >>, uint32_t);
+			break;
+		case (BPF_ALU | BPF_XOR | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, ^, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MUL | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, *, uint32_t);
+			break;
+		case (BPF_ALU | BPF_DIV | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, /, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MOD | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, %, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MOV | BPF_K):
+			BPF_MOV_ALU_IMM(reg, ins, uint32_t);
+			break;
+		/* 32 bit ALU REG operations */
+		case (BPF_ALU | BPF_ADD | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, +, uint32_t);
+			break;
+		case (BPF_ALU | BPF_SUB | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, -, uint32_t);
+			break;
+		case (BPF_ALU | BPF_AND | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, &, uint32_t);
+			break;
+		case (BPF_ALU | BPF_OR | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, |, uint32_t);
+			break;
+		case (BPF_ALU | BPF_LSH | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, <<, uint32_t);
+			break;
+		case (BPF_ALU | BPF_RSH | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, >>, uint32_t);
+			break;
+		case (BPF_ALU | BPF_XOR | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, ^, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MUL | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, *, uint32_t);
+			break;
+		case (BPF_ALU | BPF_DIV | BPF_X):
+			BPF_DIV_ZERO_CHECK(bpf, reg, ins, uint32_t);
+			BPF_OP_ALU_REG(reg, ins, /, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MOD | BPF_X):
+			BPF_DIV_ZERO_CHECK(bpf, reg, ins, uint32_t);
+			BPF_OP_ALU_REG(reg, ins, %, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MOV | BPF_X):
+			BPF_MOV_ALU_REG(reg, ins, uint32_t);
+			break;
+		case (BPF_ALU | BPF_NEG):
+			BPF_NEG_ALU(reg, ins, uint32_t);
+			break;
+		case (BPF_ALU | BPF_END | BPF_TO_BE):
+			bpf_alu_be(reg, ins);
+			break;
+		case (BPF_ALU | BPF_END | BPF_TO_LE):
+			bpf_alu_le(reg, ins);
+			break;
+		/* 64 bit ALU IMM operations */
+		case (BPF_ALU64 | BPF_ADD | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, +, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_SUB | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, -, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_AND | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, &, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_OR | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, |, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_LSH | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, <<, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_RSH | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, >>, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_ARSH | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, >>, int64_t);
+			break;
+		case (BPF_ALU64 | BPF_XOR | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, ^, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MUL | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, *, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_DIV | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, /, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MOD | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, %, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MOV | BPF_K):
+			BPF_MOV_ALU_IMM(reg, ins, uint64_t);
+			break;
+		/* 64 bit ALU REG operations */
+		case (BPF_ALU64 | BPF_ADD | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, +, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_SUB | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, -, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_AND | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, &, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_OR | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, |, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_LSH | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, <<, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_RSH | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, >>, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_ARSH | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, >>, int64_t);
+			break;
+		case (BPF_ALU64 | BPF_XOR | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, ^, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MUL | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, *, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_DIV | BPF_X):
+			BPF_DIV_ZERO_CHECK(bpf, reg, ins, uint64_t);
+			BPF_OP_ALU_REG(reg, ins, /, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MOD | BPF_X):
+			BPF_DIV_ZERO_CHECK(bpf, reg, ins, uint64_t);
+			BPF_OP_ALU_REG(reg, ins, %, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MOV | BPF_X):
+			BPF_MOV_ALU_REG(reg, ins, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_NEG):
+			BPF_NEG_ALU(reg, ins, uint64_t);
+			break;
+		/* load instructions */
+		case (BPF_LDX | BPF_MEM | BPF_B):
+			BPF_LD_REG(reg, ins, uint8_t);
+			break;
+		case (BPF_LDX | BPF_MEM | BPF_H):
+			BPF_LD_REG(reg, ins, uint16_t);
+			break;
+		case (BPF_LDX | BPF_MEM | BPF_W):
+			BPF_LD_REG(reg, ins, uint32_t);
+			break;
+		case (BPF_LDX | BPF_MEM | BPF_DW):
+			BPF_LD_REG(reg, ins, uint64_t);
+			break;
+		/* load 64 bit immediate value */
+		case (BPF_LD | BPF_IMM | BPF_DW):
+			reg[ins->dst_reg] = (uint32_t)ins[0].imm |
+				(uint64_t)(uint32_t)ins[1].imm << 32;
+			ins++;
+			break;
+		/* store instructions */
+		case (BPF_STX | BPF_MEM | BPF_B):
+			BPF_ST_REG(reg, ins, uint8_t);
+			break;
+		case (BPF_STX | BPF_MEM | BPF_H):
+			BPF_ST_REG(reg, ins, uint16_t);
+			break;
+		case (BPF_STX | BPF_MEM | BPF_W):
+			BPF_ST_REG(reg, ins, uint32_t);
+			break;
+		case (BPF_STX | BPF_MEM | BPF_DW):
+			BPF_ST_REG(reg, ins, uint64_t);
+			break;
+		case (BPF_ST | BPF_MEM | BPF_B):
+			BPF_ST_IMM(reg, ins, uint8_t);
+			break;
+		case (BPF_ST | BPF_MEM | BPF_H):
+			BPF_ST_IMM(reg, ins, uint16_t);
+			break;
+		case (BPF_ST | BPF_MEM | BPF_W):
+			BPF_ST_IMM(reg, ins, uint32_t);
+			break;
+		case (BPF_ST | BPF_MEM | BPF_DW):
+			BPF_ST_IMM(reg, ins, uint64_t);
+			break;
+		/* atomic add instructions */
+		case (BPF_STX | BPF_XADD | BPF_W):
+			BPF_ST_XADD_REG(reg, ins, 32);
+			break;
+		case (BPF_STX | BPF_XADD | BPF_DW):
+			BPF_ST_XADD_REG(reg, ins, 64);
+			break;
+		/* jump instructions */
+		case (BPF_JMP | BPF_JA):
+			BPF_JMP_UNC(ins);
+			break;
+		/* jump IMM instructions */
+		case (BPF_JMP | BPF_JEQ | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, ==, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JNE | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, !=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JGT | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, >, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JLT | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, <, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JGE | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, >=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JLE | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, <=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JSGT | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, >, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSLT | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, <, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSGE | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, >=, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSLE | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, <=, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSET | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, &, uint64_t);
+			break;
+		/* jump REG instructions */
+		case (BPF_JMP | BPF_JEQ | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, ==, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JNE | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, !=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JGT | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, >, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JLT | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, <, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JGE | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, >=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JLE | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, <=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JSGT | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, >, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSLT | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, <, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSGE | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, >=, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSLE | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, <=, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSET | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, &, uint64_t);
+			break;
+		/* call instructions */
+		case (BPF_JMP | BPF_CALL):
+			reg[BPF_REG_0] = bpf->prm.xsym[ins->imm].func(
+				reg[BPF_REG_1], reg[BPF_REG_2], reg[BPF_REG_3],
+				reg[BPF_REG_4], reg[BPF_REG_5]);
+			break;
+		/* return instruction */
+		case (BPF_JMP | BPF_EXIT):
+			return reg[BPF_REG_0];
+		default:
+			RTE_LOG(ERR, USER1,
+				"%s(%p): invalid opcode %#x at pc: %#zx;\n",
+				__func__, bpf, ins->code,
+				(uintptr_t)ins - (uintptr_t)bpf->prm.ins);
+			return 0;
+		}
+	}
+
+	/* should never be reached */
+	RTE_VERIFY(0);
+	return 0;
+}
+
+__rte_experimental uint32_t
+rte_bpf_exec_burst(const struct rte_bpf *bpf, void *ctx[], uint64_t rc[],
+	uint32_t num)
+{
+	uint32_t i;
+	uint64_t reg[MAX_BPF_REG];
+	uint64_t stack[MAX_BPF_STACK_SIZE / sizeof(uint64_t)];
+
+	for (i = 0; i != num; i++) {
+
+		reg[BPF_REG_1] = (uintptr_t)ctx[i];
+		reg[BPF_REG_10] = (uintptr_t)(stack + RTE_DIM(stack));
+
+		rc[i] = bpf_exec(bpf, reg);
+	}
+
+	return i;
+}
+
+__rte_experimental uint64_t
+rte_bpf_exec(const struct rte_bpf *bpf, void *ctx)
+{
+	uint64_t rc;
+
+	rte_bpf_exec_burst(bpf, &ctx, &rc, 1);
+	return rc;
+}
+
diff --git a/lib/librte_bpf/bpf_impl.h b/lib/librte_bpf/bpf_impl.h
new file mode 100644
index 000000000..f09417088
--- /dev/null
+++ b/lib/librte_bpf/bpf_impl.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef _BPF_H_
+#define _BPF_H_
+
+#include <rte_bpf.h>
+#include <sys/mman.h>
+#include <linux/bpf.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define MAX_BPF_STACK_SIZE	0x200
+
+struct rte_bpf {
+	struct rte_bpf_prm prm;
+	struct rte_bpf_jit jit;
+	size_t sz;
+	uint32_t stack_sz;
+};
+
+extern int bpf_validate(struct rte_bpf *bpf);
+
+extern int bpf_jit(struct rte_bpf *bpf);
+
+#ifdef RTE_ARCH_X86_64
+extern int bpf_jit_x86(struct rte_bpf *);
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _BPF_H_ */
diff --git a/lib/librte_bpf/bpf_load.c b/lib/librte_bpf/bpf_load.c
new file mode 100644
index 000000000..84c6b9417
--- /dev/null
+++ b/lib/librte_bpf/bpf_load.c
@@ -0,0 +1,344 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <inttypes.h>
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/queue.h>
+#include <fcntl.h>
+
+#include <libelf.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_debug.h>
+#include <rte_memory.h>
+#include <rte_eal.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+
+#include "bpf_impl.h"
+
+static uint32_t
+bpf_find_func(const char *sn, const struct rte_bpf_xsym fp[], uint32_t fn)
+{
+	uint32_t i;
+
+	if (sn == NULL || fp == NULL)
+		return UINT32_MAX;
+
+	for (i = 0; i != fn; i++) {
+		if (fp[i].type == RTE_BPF_XTYPE_FUNC &&
+				strcmp(sn, fp[i].name) == 0)
+			break;
+	}
+
+	return (i != fn) ? i : UINT32_MAX;
+}
+
+static int
+check_elf_header(const Elf64_Ehdr * eh)
+{
+	const char *err;
+
+	err = NULL;
+
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+	if (eh->e_ident[EI_DATA] != ELFDATA2LSB)
+#else
+	if (eh->e_ident[EI_DATA] != ELFDATA2MSB)
+#endif
+		err = "not native byte order";
+	else if (eh->e_ident[EI_OSABI] != ELFOSABI_NONE)
+		err = "unexpected OS ABI";
+	else if (eh->e_type != ET_REL)
+		err = "unexpected ELF type";
+	else if (eh->e_machine != EM_NONE && eh->e_machine != EM_BPF)
+		err = "unexpected machine type";
+
+	if (err != NULL) {
+		RTE_LOG(ERR, USER1, "%s(): %s\n", __func__, err);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/*
+ * helper function, find executable section by name.
+ */
+static int
+find_elf_code(Elf *elf, const char *section, Elf_Data **psd, size_t *pidx)
+{
+	Elf_Scn *sc;
+	const Elf64_Ehdr *eh;
+	const Elf64_Shdr *sh;
+	Elf_Data *sd;
+	const char *sn;
+	int32_t rc;
+
+	eh = elf64_getehdr(elf);
+	if (eh == NULL) {
+		rc = elf_errno();
+		RTE_LOG(ERR, USER1, "%s(%p, %s) error code: %d(%s)\n",
+			__func__, elf, section, rc, elf_errmsg(rc));
+		return -EINVAL;
+	}
+
+	if (check_elf_header(eh) != 0)
+		return -EINVAL;
+
+	/* find given section by name */
+	for (sc = elf_nextscn(elf, NULL); sc != NULL;
+			sc = elf_nextscn(elf, sc)) {
+		sh = elf64_getshdr(sc);
+		sn = elf_strptr(elf, eh->e_shstrndx, sh->sh_name);
+		if (sn != NULL && strcmp(section, sn) == 0 &&
+				sh->sh_type == SHT_PROGBITS &&
+				sh->sh_flags == (SHF_ALLOC | SHF_EXECINSTR))
+			break;
+	}
+
+	sd = elf_getdata(sc, NULL);
+	if (sd == NULL || sd->d_size == 0 ||
+			sd->d_size % sizeof(struct bpf_insn) != 0) {
+		rc = elf_errno();
+		RTE_LOG(ERR, USER1, "%s(%p, %s) error code: %d(%s)\n",
+			__func__, elf, section, rc, elf_errmsg(rc));
+		return -EINVAL;
+	}
+
+	*psd = sd;
+	*pidx = elf_ndxscn(sc);
+	return 0;
+}
+
+/*
+ * helper function to process data from relocation table.
+ */
+static int
+process_reloc(Elf *elf, size_t sym_idx, Elf64_Rel *re, size_t re_sz,
+	struct bpf_insn *ins, size_t ins_sz, const struct rte_bpf_prm *prm)
+{
+	uint32_t i, idx, fidx, n;
+	size_t ofs, sym;
+	const char *sn;
+	const Elf64_Ehdr *eh;
+	Elf_Scn *sc;
+	const Elf_Data *sd;
+	Elf64_Sym *sm;
+
+	eh = elf64_getehdr(elf);
+
+	/* get symtable by section index */
+	sc = elf_getscn(elf, sym_idx);
+	sd = elf_getdata(sc, NULL);
+	if (sd == NULL)
+		return -EINVAL;
+	sm = sd->d_buf;
+
+	n = re_sz / sizeof(re[0]);
+	for (i = 0; i != n; i++) {
+
+		ofs = re[i].r_offset;
+		if (ofs % sizeof(ins[0]) != 0 || ofs >= ins_sz)
+			return -EINVAL;
+
+		idx = ofs / sizeof(ins[0]);
+		if (ins[idx].code != (BPF_JMP | BPF_CALL))
+			return -EINVAL;
+
+		/* retrieve index in the symtable */
+		sym = ELF64_R_SYM(re[i].r_info);
+		if (sym * sizeof(sm[0]) >= sd->d_size)
+			return -EINVAL;
+
+		sn = elf_strptr(elf, eh->e_shstrndx, sm[sym].st_name);
+
+		fidx = bpf_find_func(sn, prm->xsym, prm->nb_xsym);
+		if (fidx == UINT32_MAX)
+			return -EINVAL;
+
+		ins[idx].imm = fidx;
+	}
+
+	return 0;
+}
+
+/*
+ * helper function, find relocation information (if any)
+ * and update bpf code.
+ */
+static int
+elf_reloc_code(Elf *elf, Elf_Data *ed, size_t sidx,
+	const struct rte_bpf_prm *prm)
+{
+	Elf64_Rel *re;
+	Elf_Scn *sc;
+	const Elf64_Shdr *sh;
+	const Elf_Data *sd;
+	int32_t rc;
+
+	rc = 0;
+
+	/* walk through all sections */
+	for (sc = elf_nextscn(elf, NULL); sc != NULL && rc == 0;
+			sc = elf_nextscn(elf, sc)) {
+
+		sh = elf64_getshdr(sc);
+
+		/* relocation data for our code section */
+		if (sh->sh_type == SHT_REL && sh->sh_info == sidx) {
+			sd = elf_getdata(sc, NULL);
+			if (sd == NULL || sd->d_size == 0 ||
+					sd->d_size % sizeof(re[0]) != 0)
+				return -EINVAL;
+			rc = process_reloc(elf, sh->sh_link,
+				sd->d_buf, sd->d_size, ed->d_buf, ed->d_size,
+				prm);
+		}
+	}
+
+	return rc;
+}
+
+static struct rte_bpf *
+bpf_load(const struct rte_bpf_prm *prm)
+{
+	uint8_t *buf;
+	struct rte_bpf *bpf;
+	size_t sz, bsz, insz, xsz;
+
+	xsz =  prm->nb_xsym * sizeof(prm->xsym[0]);
+	insz = prm->nb_ins * sizeof(prm->ins[0]);
+	bsz = sizeof(bpf[0]);
+	sz = insz + xsz + bsz;
+
+	buf = mmap(NULL, sz, PROT_READ | PROT_WRITE,
+		MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (buf == MAP_FAILED)
+		return NULL;
+
+	bpf = (void *)buf;
+	bpf->sz = sz;
+
+	memcpy(&bpf->prm, prm, sizeof(bpf->prm));
+
+	memcpy(buf + bsz, prm->xsym, xsz);
+	memcpy(buf + bsz + xsz, prm->ins, insz);
+
+	bpf->prm.xsym = (void *)(buf + bsz);
+	bpf->prm.ins = (void *)(buf + bsz + xsz);
+
+	return bpf;
+}
+
+__rte_experimental struct rte_bpf *
+rte_bpf_load(const struct rte_bpf_prm *prm)
+{
+	struct rte_bpf *bpf;
+	int32_t rc;
+
+	if (prm == NULL || prm->ins == NULL) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	bpf = bpf_load(prm);
+	if (bpf == NULL) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+
+	rc = bpf_validate(bpf);
+	if (rc == 0) {
+		bpf_jit(bpf);
+		if (mprotect(bpf, bpf->sz, PROT_READ) != 0)
+			rc = -ENOMEM;
+	}
+
+	if (rc != 0) {
+		rte_bpf_destroy(bpf);
+		rte_errno = -rc;
+		return NULL;
+	}
+
+	return bpf;
+}
+
+static struct rte_bpf *
+bpf_load_elf(const struct rte_bpf_prm *prm, int32_t fd, const char *section)
+{
+	Elf *elf;
+	Elf_Data *sd;
+	size_t sidx;
+	int32_t rc;
+	struct rte_bpf *bpf;
+	struct rte_bpf_prm np;
+
+	elf_version(EV_CURRENT);
+	elf = elf_begin(fd, ELF_C_READ, NULL);
+
+	rc = find_elf_code(elf, section, &sd, &sidx);
+	if (rc == 0)
+		rc = elf_reloc_code(elf, sd, sidx, prm);
+
+	if (rc == 0) {
+		np = prm[0];
+		np.ins = sd->d_buf;
+		np.nb_ins = sd->d_size / sizeof(struct bpf_insn);
+		bpf = rte_bpf_load(&np);
+	} else {
+		bpf = NULL;
+		rte_errno = -rc;
+	}
+
+	elf_end(elf);
+	return bpf;
+}
+
+__rte_experimental struct rte_bpf *
+rte_bpf_elf_load(const struct rte_bpf_prm *prm, const char *fname,
+	const char *sname)
+{
+	int32_t fd, rc;
+	struct rte_bpf *bpf;
+
+	if (prm == NULL || fname == NULL || sname == NULL) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	fd = open(fname, O_RDONLY);
+	if (fd < 0) {
+		rc = errno;
+		RTE_LOG(ERR, USER1, "%s(%s) error code: %d(%s)\n",
+			__func__, fname, rc, strerror(rc));
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	bpf = bpf_load_elf(prm, fd, sname);
+	close(fd);
+
+	if (bpf == NULL) {
+		RTE_LOG(ERR, USER1,
+			"%s(fname=\"%s\", sname=\"%s\") failed, "
+			"error code: %d\n",
+			__func__, fname, sname, rte_errno);
+		return NULL;
+	}
+
+	RTE_LOG(INFO, USER1, "%s(fname=\"%s\", sname=\"%s\") "
+		"successfully creates %p;\n",
+		__func__, fname, sname, bpf);
+	return bpf;
+}
diff --git a/lib/librte_bpf/bpf_validate.c b/lib/librte_bpf/bpf_validate.c
new file mode 100644
index 000000000..7c1267cbd
--- /dev/null
+++ b/lib/librte_bpf/bpf_validate.c
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_common.h>
+#include <rte_eal.h>
+
+#include "bpf_impl.h"
+
+/*
+ * dummy validator for now; needs more work.
+ */
+int
+bpf_validate(struct rte_bpf *bpf)
+{
+	int32_t rc, ofs, stack_sz;
+	uint32_t i, op, dr;
+	const struct bpf_insn *ins;
+
+	rc = 0;
+	stack_sz = 0;
+	for (i = 0; i != bpf->prm.nb_ins; i++) {
+
+		ins = bpf->prm.ins + i;
+		op = ins->code;
+		dr = ins->dst_reg;
+		ofs = ins->off;
+
+		if ((BPF_CLASS(op) == BPF_STX || BPF_CLASS(op) == BPF_ST) &&
+				dr == BPF_REG_10) {
+			ofs -= sizeof(uint64_t);
+			stack_sz = RTE_MIN(ofs, stack_sz);
+		}
+	}
+
+	if (stack_sz != 0) {
+		stack_sz = -stack_sz;
+		if (stack_sz > MAX_BPF_STACK_SIZE)
+			rc = -ERANGE;
+		else
+			bpf->stack_sz = stack_sz;
+	}
+
+	if (rc != 0)
+		RTE_LOG(ERR, USER1, "%s(%p) failed, error code: %d;\n",
+			__func__, bpf, rc);
+	return rc;
+}
diff --git a/lib/librte_bpf/rte_bpf.h b/lib/librte_bpf/rte_bpf.h
new file mode 100644
index 000000000..45f622818
--- /dev/null
+++ b/lib/librte_bpf/rte_bpf.h
@@ -0,0 +1,154 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef _RTE_BPF_H_
+#define _RTE_BPF_H_
+
+#include <rte_common.h>
+#include <rte_mbuf.h>
+#include <linux/bpf.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Possible types for external symbols.
+ */
+enum rte_bpf_xtype {
+	RTE_BPF_XTYPE_FUNC, /**< function */
+	RTE_BPF_XTYPE_NUM
+};
+
+/**
+ * Definition for external symbols available in the BPF program.
+ */
+struct rte_bpf_xsym {
+	const char *name;        /**< name */
+	enum rte_bpf_xtype type; /**< type */
+	uint64_t (*func)(uint64_t, uint64_t, uint64_t, uint64_t, uint64_t);
+	/**< value */
+};
+
+/**
+ * Possible BPF program types.
+ */
+enum rte_bpf_prog_type {
+	RTE_BPF_PROG_TYPE_UNSPEC = BPF_PROG_TYPE_UNSPEC,
+	/**< input is a pointer to raw data */
+	RTE_BPF_PROG_TYPE_MBUF,
+	/**< input is a pointer to rte_mbuf */
+};
+
+/**
+ * Input parameters for loading eBPF code.
+ */
+struct rte_bpf_prm {
+	const struct bpf_insn *ins; /**< array of eBPF instructions */
+	uint32_t nb_ins;            /**< number of instructions in ins */
+	const struct rte_bpf_xsym *xsym;
+	/**< array of external symbols that eBPF code is allowed to reference */
+	uint32_t nb_xsym; /**< number of elements in xsym */
+	enum rte_bpf_prog_type prog_type; /**< eBPF program type */
+};
+
+/**
+ * Information about eBPF code compiled into native ISA.
+ */
+struct rte_bpf_jit {
+	uint64_t (*func)(void *);
+	size_t sz;
+};
+
+struct rte_bpf;
+
+/**
+ * De-allocate all memory used by this eBPF execution context.
+ *
+ * @param bpf
+ *   BPF handle to destroy.
+ */
+void rte_bpf_destroy(struct rte_bpf *bpf);
+
+/**
+ * Create a new eBPF execution context and load given BPF code into it.
+ *
+ * @param prm
+ *  Parameters used to create and initialise the BPF execution context.
+ * @return
+ *   BPF handle that is used in future BPF operations,
+ *   or NULL on error, with error code set in rte_errno.
+ *   Possible rte_errno errors include:
+ *   - EINVAL - invalid parameter passed to function
+ *   - ENOMEM - can't reserve enough memory
+ */
+struct rte_bpf *rte_bpf_load(const struct rte_bpf_prm *prm);
+
+/**
+ * Create a new eBPF execution context and load BPF code from given ELF
+ * file into it.
+ *
+ * @param prm
+ *  Parameters used to create and initialise the BPF execution context.
+ * @param fname
+ *  Pathname for an ELF file.
+ * @param sname
+ *  Name of the executable section within the file to load.
+ * @return
+ *   BPF handle that is used in future BPF operations,
+ *   or NULL on error, with error code set in rte_errno.
+ *   Possible rte_errno errors include:
+ *   - EINVAL - invalid parameter passed to function
+ *   - ENOMEM - can't reserve enough memory
+ */
+struct rte_bpf *rte_bpf_elf_load(const struct rte_bpf_prm *prm,
+	const char *fname, const char *sname);
+
+/**
+ * Execute given BPF bytecode.
+ *
+ * @param bpf
+ *   handle for the BPF code to execute.
+ * @param ctx
+ *   pointer to input context.
+ * @return
+ *   BPF execution return value.
+ */
+uint64_t rte_bpf_exec(const struct rte_bpf *bpf, void *ctx);
+
+/**
+ * Execute given BPF bytecode over a set of input contexts.
+ *
+ * @param bpf
+ *   handle for the BPF code to execute.
+ * @param ctx
+ *   array of pointers to the input contexts.
+ * @param rc
+ *   array of return values (one per input).
+ * @param num
+ *   number of elements in ctx[] (and rc[]).
+ * @return
+ *   number of successfully processed inputs.
+ */
+uint32_t rte_bpf_exec_burst(const struct rte_bpf *bpf, void *ctx[],
+	uint64_t rc[], uint32_t num);
+
+/**
+ * Provide information about natively compiled code for given BPF handle.
+ *
+ * @param bpf
+ *   handle for the BPF code.
+ * @param jit
+ *   pointer to the rte_bpf_jit structure to be filled with related data.
+ * @return
+ *   - -EINVAL if the parameters are invalid.
+ *   - Zero if operation completed successfully.
+ */
+int rte_bpf_get_jit(const struct rte_bpf *bpf, struct rte_bpf_jit *jit);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BPF_H_ */
diff --git a/lib/librte_bpf/rte_bpf_version.map b/lib/librte_bpf/rte_bpf_version.map
new file mode 100644
index 000000000..ff65144df
--- /dev/null
+++ b/lib/librte_bpf/rte_bpf_version.map
@@ -0,0 +1,12 @@
+EXPERIMENTAL {
+	global:
+
+	rte_bpf_destroy;
+	rte_bpf_elf_load;
+	rte_bpf_exec;
+	rte_bpf_exec_burst;
+	rte_bpf_get_jit;
+	rte_bpf_load;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 3eb41d176..fb41c77d2 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -83,6 +83,8 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_POWER)          += -lrte_power
 _LDLIBS-$(CONFIG_RTE_LIBRTE_TIMER)          += -lrte_timer
 _LDLIBS-$(CONFIG_RTE_LIBRTE_EFD)            += -lrte_efd
 
+_LDLIBS-$(CONFIG_RTE_LIBRTE_BPF)            += -lrte_bpf -lelf
+
 _LDLIBS-y += --whole-archive
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CFGFILE)        += -lrte_cfgfile
-- 
2.13.6

^ permalink raw reply	[relevance 2%]

* Re: [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI
  2018-03-07 17:44 23% [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI Ferruh Yigit
  2018-03-07 18:06  0% ` Luca Boccassi
@ 2018-03-08  8:05  5% ` Thomas Monjalon
  2018-03-08 11:43  3%   ` Ferruh Yigit
  1 sibling, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-03-08  8:05 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Neil Horman, John McNamara, Marko Kovacevic, dev, Luca Boccassi,
	Christian Ehrhardt

07/03/2018 18:44, Ferruh Yigit:
> After experimental API process defined do we still need RTE_NEXT_ABI
> config and process which has similar targets?

They are different targets.
Experimental API is always enabled but may be avoided by applications.
Next ABI can be used to break ABI without notice and disabled to keep
old ABI compatibility. It is almost never used because it is preferred
to keep ABI compatibility with rte_compat macros, or wait a deprecation
period after notice.

^ permalink raw reply	[relevance 5%]

* Re: [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI
  2018-03-08  8:05  5% ` Thomas Monjalon
@ 2018-03-08 11:43  3%   ` Ferruh Yigit
  2018-03-08 15:17  0%     ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2018-03-08 11:43 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Neil Horman, John McNamara, Marko Kovacevic, dev, Luca Boccassi,
	Christian Ehrhardt

On 3/8/2018 8:05 AM, Thomas Monjalon wrote:
> 07/03/2018 18:44, Ferruh Yigit:
>> After experimental API process defined do we still need RTE_NEXT_ABI
>> config and process which has similar targets?
> 
> They are different targets.
> Experimental API is always enabled but may be avoided by applications.
> Next ABI can be used to break ABI without notice and disabled to keep
> old ABI compatibility. It is almost never used because it is preferred
> to keep ABI compatibility with rte_compat macros, or wait a deprecation
> period after notice.

OK, I see.

Shouldn't we disable it by default at least? Otherwise anyone who is not paying
attention to this config option will get an ABI/API break.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH 18.05 v4] eal: add function to return number of detected sockets
  2018-02-07  9:58  5%     ` [dpdk-dev] [PATCH 18.05 v4] eal: add " Anatoly Burakov
@ 2018-03-08 12:12  3%       ` Bruce Richardson
  2018-03-08 14:38  0%         ` Burakov, Anatoly
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2018-03-08 12:12 UTC (permalink / raw)
  To: Anatoly Burakov; +Cc: dev

On Wed, Feb 07, 2018 at 09:58:36AM +0000, Anatoly Burakov wrote:
> During lcore scan, find maximum socket ID and store it. This will
> break the ABI, so bump ABI version.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> 
> Notes:
>     v4:
>     - Remove backwards ABI compatibility, bump ABI instead
>     
>     v3:
>     - Added ABI compatibility
>     
>     v2:
>     - checkpatch changes
>     - check socket before deciding if the core is not to be used
> 
>  lib/librte_eal/bsdapp/eal/Makefile        |  2 +-
>  lib/librte_eal/common/eal_common_lcore.c  | 37 +++++++++++++++++++++----------
>  lib/librte_eal/common/include/rte_eal.h   |  1 +
>  lib/librte_eal/common/include/rte_lcore.h |  8 +++++++
>  lib/librte_eal/linuxapp/eal/Makefile      |  2 +-
>  lib/librte_eal/rte_eal_version.map        |  9 +++++++-
>  6 files changed, 44 insertions(+), 15 deletions(-)
> 
Breaking the ABI is the best way to implement this change, and given the
deprecation was previously announced I'm ok with that.

Question: are we ok assuming that the socket numbers are sequential, or
nearly so, and that the maximum socket number seen is a good
approximation of the actual number of physical sockets? I know in terms
of cores on a system, the core IDs often jump - are there systems where
the socket numbers do too?

/Bruce
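
For reference, the concern can be sketched as follows: if socket IDs have
gaps, "highest ID seen plus one" overestimates the number of sockets that
are actually present. This is an illustration only; the helper names and the
64-socket bound are made up for the example and are not the code from the
patch.

```c
#include <stdint.h>

#define EXAMPLE_MAX_SOCKETS 64 /* hypothetical bound for the bitmap below */

/* Estimate based on the highest socket ID seen during the lcore scan. */
static unsigned int
sockets_from_max_id(const unsigned int *socket_ids, unsigned int n)
{
	unsigned int i, max_id = 0;

	if (n == 0)
		return 0;
	for (i = 0; i < n; i++)
		if (socket_ids[i] > max_id)
			max_id = socket_ids[i];
	return max_id + 1;
}

/* Exact count of distinct socket IDs, tracked in a 64-bit bitmap. */
static unsigned int
sockets_distinct(const unsigned int *socket_ids, unsigned int n)
{
	uint64_t seen = 0;
	unsigned int i, count = 0;

	for (i = 0; i < n; i++) {
		uint64_t bit;

		if (socket_ids[i] >= EXAMPLE_MAX_SOCKETS)
			continue; /* out of range for this sketch */
		bit = UINT64_C(1) << socket_ids[i];
		if (!(seen & bit)) {
			seen |= bit;
			count++;
		}
	}
	return count;
}
```

With per-lcore socket IDs {0, 0, 1, 1} both helpers agree; with {0, 0, 2, 2}
the first one reports one socket more than the second.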

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH 18.05 v4] eal: add function to return number of detected sockets
  2018-03-08 12:12  3%       ` Bruce Richardson
@ 2018-03-08 14:38  0%         ` Burakov, Anatoly
  2018-03-09 16:32  0%           ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Burakov, Anatoly @ 2018-03-08 14:38 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On 08-Mar-18 12:12 PM, Bruce Richardson wrote:
> On Wed, Feb 07, 2018 at 09:58:36AM +0000, Anatoly Burakov wrote:
>> During lcore scan, find maximum socket ID and store it. This will
>> break the ABI, so bump ABI version.
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---
>>
>> Notes:
>>      v4:
>>      - Remove backwards ABI compatibility, bump ABI instead
>>      
>>      v3:
>>      - Added ABI compatibility
>>      
>>      v2:
>>      - checkpatch changes
>>      - check socket before deciding if the core is not to be used
>>
>>   lib/librte_eal/bsdapp/eal/Makefile        |  2 +-
>>   lib/librte_eal/common/eal_common_lcore.c  | 37 +++++++++++++++++++++----------
>>   lib/librte_eal/common/include/rte_eal.h   |  1 +
>>   lib/librte_eal/common/include/rte_lcore.h |  8 +++++++
>>   lib/librte_eal/linuxapp/eal/Makefile      |  2 +-
>>   lib/librte_eal/rte_eal_version.map        |  9 +++++++-
>>   6 files changed, 44 insertions(+), 15 deletions(-)
>>
> Breaking the ABI is the best way to implement this change, and given the
> deprecation was previously announced I'm ok with that.
> 
> Question: are we ok assuming that the socket numbers are sequential, or
> nearly so, and that the maximum socket number seen is a good
> approximation of the actual number of physical sockets? I know in terms
> of cores on a system, the core IDs often jump - are there systems where
> the socket numbers do too?
> 
> /Bruce
> 

I am not aware of any system that would jump sockets like that. I'm open 
to corrections, however :)

-- 
Thanks,
Anatoly

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI
  2018-03-08 11:43  3%   ` Ferruh Yigit
@ 2018-03-08 15:17  0%     ` Thomas Monjalon
  2018-03-08 15:35  0%       ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-03-08 15:17 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Neil Horman, John McNamara, Marko Kovacevic, dev, Luca Boccassi,
	Christian Ehrhardt

08/03/2018 12:43, Ferruh Yigit:
> On 3/8/2018 8:05 AM, Thomas Monjalon wrote:
> > 07/03/2018 18:44, Ferruh Yigit:
> >> After experimental API process defined do we still need RTE_NEXT_ABI
> >> config and process which has similar targets?
> > 
> > They are different targets.
> > Experimental API is always enabled but may be avoided by applications.
> > Next ABI can be used to break ABI without notice and disabled to keep
> > old ABI compatibility. It is almost never used because it is preferred
> > to keep ABI compatibility with rte_compat macros, or wait a deprecation
> > period after notice.
> 
> OK, I see.
> 
> Shouldn't we disable it by default at least? Otherwise anyone who is not paying
> attention to this config option will get an ABI/API break.

Yes I think you are right, it can be disabled by default.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI
  2018-03-08 15:17  0%     ` Thomas Monjalon
@ 2018-03-08 15:35  0%       ` Neil Horman
  2018-03-08 16:04  0%         ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2018-03-08 15:35 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ferruh Yigit, John McNamara, Marko Kovacevic, dev, Luca Boccassi,
	Christian Ehrhardt

On Thu, Mar 08, 2018 at 04:17:00PM +0100, Thomas Monjalon wrote:
> 08/03/2018 12:43, Ferruh Yigit:
> > On 3/8/2018 8:05 AM, Thomas Monjalon wrote:
> > > 07/03/2018 18:44, Ferruh Yigit:
> > >> After experimental API process defined do we still need RTE_NEXT_ABI
> > >> config and process which has similar targets?
> > > 
> > > They are different targets.
> > > Experimental API is always enabled but may be avoided by applications.
> > > Next ABI can be used to break ABI without notice and disabled to keep
> > > old ABI compatibility. It is almost never used because it is preferred
> > > to keep ABI compatibility with rte_compat macros, or wait a deprecation
> > > period after notice.
> > 
> > OK, I see.
> > 
> > Shouldn't we disable it by default at least? Otherwise anyone who is not paying
> > attention to this config option will get an ABI/API break.
> 
> Yes I think you are right, it can be disabled by default.
> 
I would agree, there seems to be overlap here, and the experimental tagging can
cover what the NEXT_API flag is meant to do.  It can be removed I think.
Neil

> 
> 
> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI
  2018-03-08 15:35  0%       ` Neil Horman
@ 2018-03-08 16:04  0%         ` Thomas Monjalon
  2018-03-08 19:40  3%           ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-03-08 16:04 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ferruh Yigit, John McNamara, Marko Kovacevic, dev, Luca Boccassi,
	Christian Ehrhardt

08/03/2018 16:35, Neil Horman:
> On Thu, Mar 08, 2018 at 04:17:00PM +0100, Thomas Monjalon wrote:
> > 08/03/2018 12:43, Ferruh Yigit:
> > > On 3/8/2018 8:05 AM, Thomas Monjalon wrote:
> > > > 07/03/2018 18:44, Ferruh Yigit:
> > > >> After experimental API process defined do we still need RTE_NEXT_ABI
> > > >> config and process which has similar targets?
> > > > 
> > > > They are different targets.
> > > > Experimental API is always enabled but may be avoided by applications.
> > > > Next ABI can be used to break ABI without notice and disabled to keep
> > > > old ABI compatibility. It is almost never used because it is preferred
> > > > to keep ABI compatibility with rte_compat macros, or wait a deprecation
> > > > period after notice.
> > > 
> > > OK, I see.
> > > 
> > > Shouldn't we disable it by default at least? Otherwise anyone who is not paying
> > > attention to this config option will get an ABI/API break.
> > 
> > Yes I think you are right, it can be disabled by default.
> > 
> I would agree, there seems to be overlap here, and the experimental tagging can
> cover what the NEXT_API flag is meant to do.  It can be removed I think.

It is not NEXT_API but NEXT_ABI.
Why do you think it overlaps experimental API tagging?

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI
  2018-03-08 16:04  0%         ` Thomas Monjalon
@ 2018-03-08 19:40  3%           ` Neil Horman
  2018-03-08 21:34  4%             ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2018-03-08 19:40 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ferruh Yigit, John McNamara, Marko Kovacevic, dev, Luca Boccassi,
	Christian Ehrhardt

On Thu, Mar 08, 2018 at 05:04:01PM +0100, Thomas Monjalon wrote:
> 08/03/2018 16:35, Neil Horman:
> > On Thu, Mar 08, 2018 at 04:17:00PM +0100, Thomas Monjalon wrote:
> > > 08/03/2018 12:43, Ferruh Yigit:
> > > > On 3/8/2018 8:05 AM, Thomas Monjalon wrote:
> > > > > 07/03/2018 18:44, Ferruh Yigit:
> > > > >> After experimental API process defined do we still need RTE_NEXT_ABI
> > > > >> config and process which has similar targets?
> > > > > 
> > > > > They are different targets.
> > > > > Experimental API is always enabled but may be avoided by applications.
> > > > > Next ABI can be used to break ABI without notice and disabled to keep
> > > > > old ABI compatibility. It is almost never used because it is preferred
> > > > > to keep ABI compatibility with rte_compat macros, or wait a deprecation
> > > > > period after notice.
> > > > 
> > > > OK, I see.
> > > > 
> > > > Shouldn't we disable it by default at least? Otherwise anyone who is not paying
> > > > attention to this config option will get an ABI/API break.
> > > 
> > > Yes I think you are right, it can be disabled by default.
> > > 
> > I would agree, there seems to be overlap here, and the experimental tagging can
> > cover what the NEXT_API flag is meant to do.  It can be removed I think.
> 
> It is not NEXT_API but NEXT_ABI.
Sorry, typo, though I'm sure you got that, since the former doesn't exist,
right?
> Why do you think it overlaps experimental API tagging?

I assert that because the compat lib has macros to map common symbols to
version-specific ones.  That is to say, if you change a data structure, you can
set up the API calls that use said structure such that version 1 of the symbol
maps to an internal function that uses the old structure, while version 2 maps
to an internal function that uses the new structure.

That is to say, if you're planning on introducing ABI changes, the experimental
API tagging can be used to implement what the NEXT_ABI macro does.

Neil

> 
> 
> 

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI
  2018-03-08 19:40  3%           ` Neil Horman
@ 2018-03-08 21:34  4%             ` Thomas Monjalon
  2018-03-09  0:18  4%               ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2018-03-08 21:34 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ferruh Yigit, John McNamara, Marko Kovacevic, dev, Luca Boccassi,
	Christian Ehrhardt

08/03/2018 20:40, Neil Horman:
> On Thu, Mar 08, 2018 at 05:04:01PM +0100, Thomas Monjalon wrote:
> > 08/03/2018 16:35, Neil Horman:
> > > On Thu, Mar 08, 2018 at 04:17:00PM +0100, Thomas Monjalon wrote:
> > > > 08/03/2018 12:43, Ferruh Yigit:
> > > > > On 3/8/2018 8:05 AM, Thomas Monjalon wrote:
> > > > > > 07/03/2018 18:44, Ferruh Yigit:
> > > > > >> After experimental API process defined do we still need RTE_NEXT_ABI
> > > > > >> config and process which has similar targets?
> > > > > > 
> > > > > > They are different targets.
> > > > > > Experimental API is always enabled but may be avoided by applications.
> > > > > > Next ABI can be used to break ABI without notice and disabled to keep
> > > > > > old ABI compatibility. It is almost never used because it is preferred
> > > > > > to keep ABI compatibility with rte_compat macros, or wait a deprecation
> > > > > > period after notice.
> > > > > 
> > > > > OK, I see.
> > > > > 
> > > > > Shouldn't we disable it by default at least? Otherwise anyone who is not paying
> > > > > attention to this config option will get an ABI/API break.
> > > > 
> > > > Yes I think you are right, it can be disabled by default.
> > > > 
> > > I would agree, there seems to be overlap here, and the experimental tagging can
> > > cover what the NEXT_API flag is meant to do.  It can be removed I think.
> > 
> > It is not NEXT_API but NEXT_ABI.
> Sorry, typo, though I'm sure you got that, since the former doesn't exist,
> right?
> > Why do you think it overlaps experimental API tagging?
> 
> I assert that because the compat lib has macros to map common symbols to
> version-specific ones.  That is to say, if you change a data structure, you can
> set up the API calls that use said structure such that version 1 of the symbol
> maps to an internal function that uses the old structure, while version 2 maps
> to an internal function that uses the new structure.
> 
> That is to say, if you're planning on introducing ABI changes, the experimental
> API tagging can be used to implement what the NEXT_ABI macro does.

It is a different usage.
Experimental API tagging is for new functions.
rte_compat is used to avoid breaking the ABI when changing old code.
NEXT_ABI has been used in the past to disable an ABI breakage which could
not be mitigated with rte_compat because it impacted too many functions.

I am not saying that I like NEXT_ABI, but it could be useful in exceptional cases.
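
As a rough illustration of the rte_compat case, the sketch below keeps two
implementations of one API call side by side, one per structure layout. All
names here are hypothetical. In a real library the public symbol would be
bound to these per-version functions at link time (for example with the
versioning macros in rte_compat.h and the library's version map); that
binding step is deliberately left out so the example stays plain C.

```c
#include <stdint.h>

/* Hypothetical old and new layouts of the same public structure. */
struct conf_v1 {
	uint32_t flags;
};

struct conf_v2 {
	uint64_t flags;	/* field widened in the newer ABI */
};

/* Kept for binaries that were built against the old layout. */
static uint64_t
conf_flags_v1(const struct conf_v1 *c)
{
	return c->flags;
}

/* Default for binaries built against the new layout. */
static uint64_t
conf_flags_v2(const struct conf_v2 *c)
{
	return c->flags;
}
```

This works well when one or two functions touch the changed structure. When
the change reaches many functions, every one of them needs such a pair, which
is the situation where a global switch was used instead.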

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v7 2/7] eventtimer: add common code
  @ 2018-03-08 21:54  2%   ` Erik Gabriel Carrillo
  0 siblings, 0 replies; 200+ results
From: Erik Gabriel Carrillo @ 2018-03-08 21:54 UTC (permalink / raw)
  To: pbhagavatula; +Cc: dev, jerin.jacob, nipun.gupta, hemant.agrawal

This commit adds the logic that is shared by all event timer adapter
drivers; the common code handles instance allocation and some
initialization.

Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
 config/common_base                                |   1 +
 drivers/event/sw/sw_evdev.c                       |  18 +
 lib/librte_eventdev/Makefile                      |   2 +
 lib/librte_eventdev/rte_event_timer_adapter.c     | 459 ++++++++++++++++++++++
 lib/librte_eventdev/rte_event_timer_adapter_pmd.h | 150 +++++++
 lib/librte_eventdev/rte_eventdev.h                |   3 +
 lib/librte_eventdev/rte_eventdev_pmd.h            |  35 ++
 lib/librte_eventdev/rte_eventdev_version.map      |  20 +
 8 files changed, 688 insertions(+)
 create mode 100644 lib/librte_eventdev/rte_event_timer_adapter.c
 create mode 100644 lib/librte_eventdev/rte_event_timer_adapter_pmd.h

diff --git a/config/common_base b/config/common_base
index ad03cf4..286df74 100644
--- a/config/common_base
+++ b/config/common_base
@@ -546,6 +546,7 @@ CONFIG_RTE_LIBRTE_EVENTDEV=y
 CONFIG_RTE_LIBRTE_EVENTDEV_DEBUG=n
 CONFIG_RTE_EVENT_MAX_DEVS=16
 CONFIG_RTE_EVENT_MAX_QUEUES_PER_DEV=64
+CONFIG_RTE_EVENT_TIMER_ADAPTER_NUM_MAX=32
 
 #
 # Compile PMD for skeleton event device
diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c
index 6672fd8..0847547 100644
--- a/drivers/event/sw/sw_evdev.c
+++ b/drivers/event/sw/sw_evdev.c
@@ -464,6 +464,22 @@ sw_eth_rx_adapter_caps_get(const struct rte_eventdev *dev,
 	return 0;
 }
 
+static int
+sw_timer_adapter_caps_get(const struct rte_eventdev *dev,
+			  uint64_t flags,
+			  uint32_t *caps,
+			  const struct rte_event_timer_adapter_ops **ops)
+{
+	RTE_SET_USED(dev);
+	RTE_SET_USED(flags);
+	*caps = 0;
+
+	/* Use default SW ops */
+	*ops = NULL;
+
+	return 0;
+}
+
 static void
 sw_info_get(struct rte_eventdev *dev, struct rte_event_dev_info *info)
 {
@@ -791,6 +807,8 @@ sw_probe(struct rte_vdev_device *vdev)
 
 			.eth_rx_adapter_caps_get = sw_eth_rx_adapter_caps_get,
 
+			.timer_adapter_caps_get = sw_timer_adapter_caps_get,
+
 			.xstats_get = sw_xstats_get,
 			.xstats_get_names = sw_xstats_get_names,
 			.xstats_get_by_name = sw_xstats_get_by_name,
diff --git a/lib/librte_eventdev/Makefile b/lib/librte_eventdev/Makefile
index 549b182..8b16e3f 100644
--- a/lib/librte_eventdev/Makefile
+++ b/lib/librte_eventdev/Makefile
@@ -20,6 +20,7 @@ LDLIBS += -lrte_eal -lrte_ring -lrte_ethdev -lrte_hash
 SRCS-y += rte_eventdev.c
 SRCS-y += rte_event_ring.c
 SRCS-y += rte_event_eth_rx_adapter.c
+SRCS-y += rte_event_timer_adapter.c
 
 # export include files
 SYMLINK-y-include += rte_eventdev.h
@@ -29,6 +30,7 @@ SYMLINK-y-include += rte_eventdev_pmd_vdev.h
 SYMLINK-y-include += rte_event_ring.h
 SYMLINK-y-include += rte_event_eth_rx_adapter.h
 SYMLINK-y-include += rte_event_timer_adapter.h
+SYMLINK-y-include += rte_event_timer_adapter_pmd.h
 
 # versioning export map
 EXPORT_MAP := rte_eventdev_version.map
diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
new file mode 100644
index 0000000..711d6b9
--- /dev/null
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -0,0 +1,459 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation.
+ * All rights reserved.
+ */
+
+#include <string.h>
+
+#include <rte_memzone.h>
+#include <rte_memory.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+
+#include "rte_eventdev.h"
+#include "rte_eventdev_pmd.h"
+#include "rte_event_timer_adapter.h"
+#include "rte_event_timer_adapter_pmd.h"
+
+#define DATA_MZ_NAME_MAX_LEN 64
+#define DATA_MZ_NAME_FORMAT "rte_event_timer_adapter_data_%d"
+
+static int evtim_logtype;
+
+static struct rte_event_timer_adapter adapters[RTE_EVENT_TIMER_ADAPTER_NUM_MAX];
+
+static inline int
+adapter_valid(const struct rte_event_timer_adapter *adapter)
+{
+	return adapter != NULL && adapter->allocated == 1;
+}
+
+#define EVTIM_LOG(level, logtype, ...) \
+	rte_log(RTE_LOG_ ## level, logtype, \
+		RTE_FMT("EVTIMER: %s() line %u: " RTE_FMT_HEAD(__VA_ARGS__,) \
+			"\n", __func__, __LINE__, RTE_FMT_TAIL(__VA_ARGS__,)))
+
+#define EVTIM_LOG_ERR(...) EVTIM_LOG(ERR, evtim_logtype, __VA_ARGS__)
+
+#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
+#define EVTIM_LOG_DBG(...) \
+	EVTIM_LOG(DEBUG, evtim_logtype, __VA_ARGS__)
+#else
+#define EVTIM_LOG_DBG(...) (void)0
+#endif
+
+#define ADAPTER_VALID_OR_ERR_RET(adapter, retval) do { \
+	if (!adapter_valid(adapter))		       \
+		return retval;			       \
+} while (0)
+
+#define FUNC_PTR_OR_ERR_RET(func, errval) do { \
+	if ((func) == NULL)		       \
+		return errval;		       \
+} while (0)
+
+#define FUNC_PTR_OR_NULL_RET_WITH_ERRNO(func, errval) do { \
+	if ((func) == NULL) {				   \
+		rte_errno = errval;			   \
+		return NULL;				   \
+	}						   \
+} while (0)
+
+static int
+default_port_conf_cb(uint16_t id, uint8_t event_dev_id, uint8_t *event_port_id,
+		     void *conf_arg)
+{
+	struct rte_event_timer_adapter *adapter;
+	struct rte_eventdev *dev;
+	struct rte_event_dev_config dev_conf;
+	struct rte_event_port_conf *port_conf, def_port_conf = {0};
+	int started;
+	uint8_t port_id;
+	uint8_t dev_id;
+	int ret;
+
+	RTE_SET_USED(event_dev_id);
+
+	adapter = &adapters[id];
+	dev = &rte_eventdevs[adapter->data->event_dev_id];
+	dev_id = dev->data->dev_id;
+	dev_conf = dev->data->dev_conf;
+
+	started = dev->data->dev_started;
+	if (started)
+		rte_event_dev_stop(dev_id);
+
+	port_id = dev_conf.nb_event_ports;
+	dev_conf.nb_event_ports += 1;
+	ret = rte_event_dev_configure(dev_id, &dev_conf);
+	if (ret < 0) {
+		EVTIM_LOG_ERR("failed to configure event dev %u\n", dev_id);
+		if (started)
+			if (rte_event_dev_start(dev_id))
+				return -EIO;
+
+		return ret;
+	}
+
+	if (conf_arg != NULL)
+		port_conf = conf_arg;
+	else {
+		port_conf = &def_port_conf;
+		ret = rte_event_port_default_conf_get(dev_id, port_id,
+						      port_conf);
+		if (ret < 0)
+			return ret;
+	}
+
+	ret = rte_event_port_setup(dev_id, port_id, port_conf);
+	if (ret < 0) {
+		EVTIM_LOG_ERR("failed to setup event port %u on event dev %u\n",
+			      port_id, dev_id);
+		return ret;
+	}
+
+	*event_port_id = port_id;
+
+	if (started)
+		ret = rte_event_dev_start(dev_id);
+
+	return ret;
+}
+
+struct rte_event_timer_adapter * __rte_experimental
+rte_event_timer_adapter_create(const struct rte_event_timer_adapter_conf *conf)
+{
+	return rte_event_timer_adapter_create_ext(conf, default_port_conf_cb,
+						  NULL);
+}
+
+struct rte_event_timer_adapter * __rte_experimental
+rte_event_timer_adapter_create_ext(
+		const struct rte_event_timer_adapter_conf *conf,
+		rte_event_timer_adapter_port_conf_cb_t conf_cb,
+		void *conf_arg)
+{
+	uint16_t adapter_id;
+	struct rte_event_timer_adapter *adapter;
+	const struct rte_memzone *mz;
+	char mz_name[DATA_MZ_NAME_MAX_LEN];
+	int n, ret;
+	struct rte_eventdev *dev;
+
+	if (conf == NULL) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	/* Check eventdev ID */
+	if (!rte_event_pmd_is_valid_dev(conf->event_dev_id)) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+	dev = &rte_eventdevs[conf->event_dev_id];
+
+	adapter_id = conf->timer_adapter_id;
+
+	/* Check that adapter_id is in range */
+	if (adapter_id >= RTE_EVENT_TIMER_ADAPTER_NUM_MAX) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	/* Check adapter ID not already allocated */
+	adapter = &adapters[adapter_id];
+	if (adapter->allocated) {
+		rte_errno = EEXIST;
+		return NULL;
+	}
+
+	/* Create shared data area. */
+	n = snprintf(mz_name, sizeof(mz_name), DATA_MZ_NAME_FORMAT, adapter_id);
+	if (n >= (int)sizeof(mz_name)) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+	mz = rte_memzone_reserve(mz_name,
+				 sizeof(struct rte_event_timer_adapter_data),
+				 conf->socket_id, 0);
+	if (mz == NULL)
+		/* rte_errno set by rte_memzone_reserve */
+		return NULL;
+
+	adapter->data = mz->addr;
+	memset(adapter->data, 0, sizeof(struct rte_event_timer_adapter_data));
+
+	adapter->data->mz = mz;
+	adapter->data->event_dev_id = conf->event_dev_id;
+	adapter->data->id = adapter_id;
+	adapter->data->socket_id = conf->socket_id;
+	adapter->data->conf = *conf;  /* copy conf structure */
+
+	/* Query eventdev PMD for timer adapter capabilities and ops */
+	ret = dev->dev_ops->timer_adapter_caps_get(dev,
+						   adapter->data->conf.flags,
+						   &adapter->data->caps,
+						   &adapter->ops);
+	if (ret < 0) {
+		rte_errno = -ret;
+		goto free_memzone;
+	}
+
+	if (!(adapter->data->caps &
+	      RTE_EVENT_TIMER_ADAPTER_CAP_INTERNAL_PORT)) {
+		FUNC_PTR_OR_NULL_RET_WITH_ERRNO(conf_cb, EINVAL);
+		ret = conf_cb(adapter->data->id, adapter->data->event_dev_id,
+			      &adapter->data->event_port_id, conf_arg);
+		if (ret < 0) {
+			rte_errno = -ret;
+			goto free_memzone;
+		}
+	}
+
+	/* Allow driver to do some setup */
+	FUNC_PTR_OR_NULL_RET_WITH_ERRNO(adapter->ops->init, ENOTSUP);
+	ret = adapter->ops->init(adapter);
+	if (ret < 0) {
+		rte_errno = -ret;
+		goto free_memzone;
+	}
+
+	/* Set fast-path function pointers */
+	adapter->arm_burst = adapter->ops->arm_burst;
+	adapter->arm_tmo_tick_burst = adapter->ops->arm_tmo_tick_burst;
+	adapter->cancel_burst = adapter->ops->cancel_burst;
+
+	adapter->allocated = 1;
+
+	return adapter;
+
+free_memzone:
+	rte_memzone_free(adapter->data->mz);
+	return NULL;
+}
+
+int __rte_experimental
+rte_event_timer_adapter_get_info(const struct rte_event_timer_adapter *adapter,
+		struct rte_event_timer_adapter_info *adapter_info)
+{
+	ADAPTER_VALID_OR_ERR_RET(adapter, -EINVAL);
+
+	if (adapter->ops->get_info)
+		/* let driver set values it knows */
+		adapter->ops->get_info(adapter, adapter_info);
+
+	/* Set common values */
+	adapter_info->conf = adapter->data->conf;
+	adapter_info->event_dev_port_id = adapter->data->event_port_id;
+	adapter_info->caps = adapter->data->caps;
+
+	return 0;
+}
+
+int __rte_experimental
+rte_event_timer_adapter_start(const struct rte_event_timer_adapter *adapter)
+{
+	int ret;
+
+	ADAPTER_VALID_OR_ERR_RET(adapter, -EINVAL);
+	FUNC_PTR_OR_ERR_RET(adapter->ops->start, -EINVAL);
+
+	ret = adapter->ops->start(adapter);
+	if (ret < 0)
+		return ret;
+
+	adapter->data->started = 1;
+
+	return 0;
+}
+
+int __rte_experimental
+rte_event_timer_adapter_stop(const struct rte_event_timer_adapter *adapter)
+{
+	int ret;
+
+	ADAPTER_VALID_OR_ERR_RET(adapter, -EINVAL);
+	FUNC_PTR_OR_ERR_RET(adapter->ops->stop, -EINVAL);
+
+	if (adapter->data->started == 0) {
+		EVTIM_LOG_ERR("event timer adapter %hu already stopped",
+			      adapter->data->id);
+		return 0;
+	}
+
+	ret = adapter->ops->stop(adapter);
+	if (ret < 0)
+		return ret;
+
+	adapter->data->started = 0;
+
+	return 0;
+}
+
+struct rte_event_timer_adapter * __rte_experimental
+rte_event_timer_adapter_lookup(uint16_t adapter_id)
+{
+	char name[DATA_MZ_NAME_MAX_LEN];
+	const struct rte_memzone *mz;
+	struct rte_event_timer_adapter_data *data;
+	struct rte_event_timer_adapter *adapter;
+	int ret;
+	struct rte_eventdev *dev;
+
+	if (adapters[adapter_id].allocated)
+		return &adapters[adapter_id]; /* Adapter is already loaded */
+
+	snprintf(name, DATA_MZ_NAME_MAX_LEN, DATA_MZ_NAME_FORMAT, adapter_id);
+	mz = rte_memzone_lookup(name);
+	if (mz == NULL) {
+		rte_errno = ENOENT;
+		return NULL;
+	}
+
+	data = mz->addr;
+
+	adapter = &adapters[data->id];
+	adapter->data = data;
+
+	dev = &rte_eventdevs[adapter->data->event_dev_id];
+
+	/* Query eventdev PMD for timer adapter capabilities and ops */
+	ret = dev->dev_ops->timer_adapter_caps_get(dev,
+						   adapter->data->conf.flags,
+						   &adapter->data->caps,
+						   &adapter->ops);
+	if (ret < 0) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	/* Set fast-path function pointers */
+	adapter->arm_burst = adapter->ops->arm_burst;
+	adapter->arm_tmo_tick_burst = adapter->ops->arm_tmo_tick_burst;
+	adapter->cancel_burst = adapter->ops->cancel_burst;
+
+	adapter->allocated = 1;
+
+	return adapter;
+}
+
+int __rte_experimental
+rte_event_timer_adapter_free(struct rte_event_timer_adapter *adapter)
+{
+	int ret;
+
+	ADAPTER_VALID_OR_ERR_RET(adapter, -EINVAL);
+	FUNC_PTR_OR_ERR_RET(adapter->ops->uninit, -EINVAL);
+
+	if (adapter->data->started == 1) {
+		EVTIM_LOG_ERR("event timer adapter %hu must be stopped "
+			      "before freeing", adapter->data->id);
+		return -EBUSY;
+	}
+
+	/* free impl priv data */
+	ret = adapter->ops->uninit(adapter);
+	if (ret < 0)
+		return ret;
+
+	/* free shared data area */
+	ret = rte_memzone_free(adapter->data->mz);
+	if (ret < 0)
+		return ret;
+
+	adapter->data = NULL;
+	adapter->allocated = 0;
+
+	return 0;
+}
+
+int __rte_experimental
+rte_event_timer_adapter_service_id_get(struct rte_event_timer_adapter *adapter,
+				       uint32_t *service_id)
+{
+	ADAPTER_VALID_OR_ERR_RET(adapter, -EINVAL);
+
+	if (adapter->data->service_inited && service_id != NULL)
+		*service_id = adapter->data->service_id;
+
+	return adapter->data->service_inited ? 0 : -ESRCH;
+}
+
+int __rte_experimental
+rte_event_timer_adapter_stats_get(struct rte_event_timer_adapter *adapter,
+				  struct rte_event_timer_adapter_stats *stats)
+{
+	ADAPTER_VALID_OR_ERR_RET(adapter, -EINVAL);
+	FUNC_PTR_OR_ERR_RET(adapter->ops->stats_get, -EINVAL);
+	if (stats == NULL)
+		return -EINVAL;
+
+	return adapter->ops->stats_get(adapter, stats);
+}
+
+int __rte_experimental
+rte_event_timer_adapter_stats_reset(struct rte_event_timer_adapter *adapter)
+{
+	ADAPTER_VALID_OR_ERR_RET(adapter, -EINVAL);
+	FUNC_PTR_OR_ERR_RET(adapter->ops->stats_reset, -EINVAL);
+	return adapter->ops->stats_reset(adapter);
+}
+
+void __rte_experimental
+rte_event_timer_init(struct rte_event_timer *evtim)
+{
+	evtim->ev.op = RTE_EVENT_OP_NEW;
+	evtim->ev.event_type = RTE_EVENT_TYPE_TIMER;
+	evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
+}
+
+int __rte_experimental
+rte_event_timer_arm_burst(const struct rte_event_timer_adapter *adapter,
+			  struct rte_event_timer **evtims,
+			  uint16_t nb_evtims)
+{
+#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
+	ADAPTER_VALID_OR_ERR_RET(adapter, -EINVAL);
+	FUNC_PTR_OR_ERR_RET(adapter->arm_burst, -EINVAL);
+#endif
+
+	return adapter->arm_burst(adapter, evtims, nb_evtims);
+}
+
+int __rte_experimental
+rte_event_timer_arm_tmo_tick_burst(
+			const struct rte_event_timer_adapter *adapter,
+			struct rte_event_timer **evtims,
+			const uint64_t timeout_ticks,
+			const uint16_t nb_evtims)
+{
+#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
+	ADAPTER_VALID_OR_ERR_RET(adapter, -EINVAL);
+	FUNC_PTR_OR_ERR_RET(adapter->arm_tmo_tick_burst, -EINVAL);
+#endif
+
+	return adapter->arm_tmo_tick_burst(adapter, evtims, timeout_ticks,
+					   nb_evtims);
+}
+
+int __rte_experimental
+rte_event_timer_cancel_burst(const struct rte_event_timer_adapter *adapter,
+			     struct rte_event_timer **evtims,
+			     uint16_t nb_evtims)
+{
+#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
+	ADAPTER_VALID_OR_ERR_RET(adapter, -EINVAL);
+	FUNC_PTR_OR_ERR_RET(adapter->cancel_burst, -EINVAL);
+#endif
+
+	return adapter->cancel_burst(adapter, evtims, nb_evtims);
+}
+
+RTE_INIT(event_timer_adapter_init_log);
+static void
+event_timer_adapter_init_log(void)
+{
+	evtim_logtype = rte_log_register("lib.eventdev.adapter.timer");
+	if (evtim_logtype >= 0)
+		rte_log_set_level(evtim_logtype, RTE_LOG_NOTICE);
+}
diff --git a/lib/librte_eventdev/rte_event_timer_adapter_pmd.h b/lib/librte_eventdev/rte_event_timer_adapter_pmd.h
new file mode 100644
index 0000000..db044c8
--- /dev/null
+++ b/lib/librte_eventdev/rte_event_timer_adapter_pmd.h
@@ -0,0 +1,150 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation.
+ * All rights reserved.
+ */
+
+#ifndef __RTE_EVENT_TIMER_ADAPTER_PMD_H__
+#define __RTE_EVENT_TIMER_ADAPTER_PMD_H__
+
+/**
+ * @file
+ * RTE Event Timer Adapter API (PMD Side)
+ *
+ * @note
+ * This file provides implementation helpers for internal use by PMDs.  They
+ * are not intended to be exposed to applications and are not subject to ABI
+ * versioning.
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "rte_event_timer_adapter.h"
+
+/*
+ * Definitions of functions exported by an event timer adapter implementation
+ * through *rte_event_timer_adapter_ops* structure supplied in the
+ * *rte_event_timer_adapter* structure associated with an event timer adapter.
+ */
+
+typedef int (*rte_event_timer_adapter_init_t)(
+		struct rte_event_timer_adapter *adapter);
+/**< @internal Event timer adapter implementation setup */
+typedef int (*rte_event_timer_adapter_uninit_t)(
+		struct rte_event_timer_adapter *adapter);
+/**< @internal Event timer adapter implementation teardown */
+typedef int (*rte_event_timer_adapter_start_t)(
+		const struct rte_event_timer_adapter *adapter);
+/**< @internal Start running event timer adapter */
+typedef int (*rte_event_timer_adapter_stop_t)(
+		const struct rte_event_timer_adapter *adapter);
+/**< @internal Stop running event timer adapter */
+typedef void (*rte_event_timer_adapter_get_info_t)(
+		const struct rte_event_timer_adapter *adapter,
+		struct rte_event_timer_adapter_info *adapter_info);
+/**< @internal Get contextual information for event timer adapter */
+typedef int (*rte_event_timer_adapter_stats_get_t)(
+		const struct rte_event_timer_adapter *adapter,
+		struct rte_event_timer_adapter_stats *stats);
+/**< @internal Get statistics for event timer adapter */
+typedef int (*rte_event_timer_adapter_stats_reset_t)(
+		const struct rte_event_timer_adapter *adapter);
+/**< @internal Reset statistics for event timer adapter */
+typedef int (*rte_event_timer_arm_burst_t)(
+		const struct rte_event_timer_adapter *adapter,
+		struct rte_event_timer **tims,
+		uint16_t nb_tims);
+/**< @internal Enable event timers to enqueue timer events upon expiry */
+typedef int (*rte_event_timer_arm_tmo_tick_burst_t)(
+		const struct rte_event_timer_adapter *adapter,
+		struct rte_event_timer **tims,
+		uint64_t timeout_tick,
+		uint16_t nb_tims);
+/**< @internal Enable event timers with common expiration time */
+typedef int (*rte_event_timer_cancel_burst_t)(
+		const struct rte_event_timer_adapter *adapter,
+		struct rte_event_timer **tims,
+		uint16_t nb_tims);
+/**< @internal Prevent event timers from enqueuing timer events */
+
+/**
+ * @internal Structure containing the functions exported by an event timer
+ * adapter implementation.
+ */
+struct rte_event_timer_adapter_ops {
+	rte_event_timer_adapter_init_t		init;  /**< Set up adapter */
+	rte_event_timer_adapter_uninit_t	uninit;/**< Tear down adapter */
+	rte_event_timer_adapter_start_t		start; /**< Start adapter */
+	rte_event_timer_adapter_stop_t		stop;  /**< Stop adapter */
+	rte_event_timer_adapter_get_info_t	get_info;
+	/**< Get info from driver */
+	rte_event_timer_adapter_stats_get_t	stats_get;
+	/**< Get adapter statistics */
+	rte_event_timer_adapter_stats_reset_t	stats_reset;
+	/**< Reset adapter statistics */
+	rte_event_timer_arm_burst_t		arm_burst;
+	/**< Arm one or more event timers */
+	rte_event_timer_arm_tmo_tick_burst_t	arm_tmo_tick_burst;
+	/**< Arm event timers with same expiration time */
+	rte_event_timer_cancel_burst_t		cancel_burst;
+	/**< Cancel one or more event timers */
+};
+
+/**
+ * @internal Adapter data; structure to be placed in shared memory to be
+ * accessible by various processes in a multi-process configuration.
+ */
+struct rte_event_timer_adapter_data {
+	uint8_t id;
+	/**< Event timer adapter ID */
+	uint8_t event_dev_id;
+	/**< Event device ID */
+	uint32_t socket_id;
+	/**< Socket ID where memory is allocated */
+	uint8_t event_port_id;
+	/**< Optional: event port ID used when the inbuilt port is absent */
+	const struct rte_memzone *mz;
+	/**< Event timer adapter memzone pointer */
+	struct rte_event_timer_adapter_conf conf;
+	/**< Configuration used to configure the adapter. */
+	uint32_t caps;
+	/**< Adapter capabilities */
+	void *adapter_priv;
+	/**< Timer adapter private data*/
+	uint8_t service_inited;
+	/**< Service initialization state */
+	uint32_t service_id;
+	/**< Service ID*/
+
+	RTE_STD_C11
+	uint8_t started : 1;
+	/**< Flag to indicate adapter started. */
+} __rte_cache_aligned;
+
+/**
+ * @internal Data structure associated with each event timer adapter.
+ */
+struct rte_event_timer_adapter {
+	rte_event_timer_arm_burst_t arm_burst;
+	/**< Pointer to driver arm_burst function. */
+	rte_event_timer_arm_tmo_tick_burst_t arm_tmo_tick_burst;
+	/**< Pointer to driver arm_tmo_tick_burst function. */
+	rte_event_timer_cancel_burst_t cancel_burst;
+	/**< Pointer to driver cancel function. */
+	struct rte_event_timer_adapter_data *data;
+	/**< Pointer to shared adapter data */
+	const struct rte_event_timer_adapter_ops *ops;
+	/**< Functions exported by adapter driver */
+
+	RTE_STD_C11
+	uint8_t allocated : 1;
+	/**< Flag to indicate that this adapter has been allocated */
+} __rte_cache_aligned;
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __RTE_EVENT_TIMER_ADAPTER_PMD_H__ */
diff --git a/lib/librte_eventdev/rte_eventdev.h b/lib/librte_eventdev/rte_eventdev.h
index f9ad71e..888bcf1 100644
--- a/lib/librte_eventdev/rte_eventdev.h
+++ b/lib/librte_eventdev/rte_eventdev.h
@@ -1046,6 +1046,9 @@ struct rte_event {
  * @see struct rte_event_eth_rx_adapter_queue_conf::rx_queue_flags
  */
 
+#define RTE_EVENT_TIMER_ADAPTER_CAP_INTERNAL_PORT (1ULL << 1)
+/**< This flag is set when the timer mechanism is in HW. */
+
 /**
  * Retrieve the event device's ethdev Rx adapter capabilities for the
  * specified ethernet port
diff --git a/lib/librte_eventdev/rte_eventdev_pmd.h b/lib/librte_eventdev/rte_eventdev_pmd.h
index 31343b5..0e37f1c 100644
--- a/lib/librte_eventdev/rte_eventdev_pmd.h
+++ b/lib/librte_eventdev/rte_eventdev_pmd.h
@@ -26,6 +26,7 @@ extern "C" {
 #include <rte_malloc.h>
 
 #include "rte_eventdev.h"
+#include "rte_event_timer_adapter_pmd.h"
 
 /* Logging Macros */
 #define RTE_EDEV_LOG_ERR(...) \
@@ -449,6 +450,37 @@ typedef int (*eventdev_eth_rx_adapter_caps_get_t)
 struct rte_event_eth_rx_adapter_queue_conf *queue_conf;
 
 /**
+ * Retrieve the event device's timer adapter capabilities, as well as the ops
+ * structure that an event timer adapter should call through to enter the
+ * driver
+ *
+ * @param dev
+ *   Event device pointer
+ *
+ * @param flags
+ *   Flags that can be used to determine how to select an event timer
+ *   adapter ops structure
+ *
+ * @param[out] caps
+ *   A pointer to memory filled with event timer adapter capabilities.
+ *
+ * @param[out] ops
+ *   A pointer to the ops pointer to set with the address of the desired ops
+ *   structure
+ *
+ * @return
+ *   - 0: Success, driver provides event timer adapter capabilities for the
+ *	event device.
+ *   - <0: Error code returned by the driver function.
+ *
+ */
+typedef int (*eventdev_timer_adapter_caps_get_t)(
+				const struct rte_eventdev *dev,
+				uint64_t flags,
+				uint32_t *caps,
+				const struct rte_event_timer_adapter_ops **ops);
+
+/**
  * Add ethernet Rx queues to event device. This callback is invoked if
  * the caps returned from rte_eventdev_eth_rx_adapter_caps_get(, eth_port_id)
  * has RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT set.
@@ -640,6 +672,9 @@ struct rte_eventdev_ops {
 	eventdev_eth_rx_adapter_stats_reset eth_rx_adapter_stats_reset;
 	/**< Reset ethernet Rx stats */
 
+	eventdev_timer_adapter_caps_get_t timer_adapter_caps_get;
+	/**< Get timer adapter capabilities */
+
 	eventdev_selftest dev_selftest;
 	/**< Start eventdev Selftest */
 };
diff --git a/lib/librte_eventdev/rte_eventdev_version.map b/lib/librte_eventdev/rte_eventdev_version.map
index 2aef470..345b0b1 100644
--- a/lib/librte_eventdev/rte_eventdev_version.map
+++ b/lib/librte_eventdev/rte_eventdev_version.map
@@ -74,3 +74,22 @@ DPDK_18.02 {
 
 	rte_event_dev_selftest;
 } DPDK_17.11;
+
+EXPERIMENTAL {
+	global:
+
+	rte_event_timer_adapter_create;
+	rte_event_timer_adapter_create_ext;
+	rte_event_timer_adapter_free;
+	rte_event_timer_adapter_get_info;
+	rte_event_timer_adapter_lookup;
+	rte_event_timer_adapter_service_id_get;
+	rte_event_timer_adapter_start;
+	rte_event_timer_adapter_stats_get;
+	rte_event_timer_adapter_stats_reset;
+	rte_event_timer_adapter_stop;
+	rte_event_timer_init;
+	rte_event_timer_arm_burst;
+	rte_event_timer_arm_tmo_tick_burst;
+	rte_event_timer_cancel_burst;
+} DPDK_18.02;
-- 
2.6.4
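
The ops-table pattern used by the patch above can be sketched in plain C. This is only an illustration under simplifying assumptions: every name prefixed with toy_ or sw_ is hypothetical, and the timers are reduced to plain integers. It shows control-path calls going through the ops structure while the fast-path pointer is copied into the adapter object, which is presumably why the adapter structure carries both an ops pointer and separate burst function pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the structures in the patch. */
struct toy_adapter;

typedef int (*toy_start_t)(const struct toy_adapter *adapter);
typedef int (*toy_arm_burst_t)(const struct toy_adapter *adapter,
		uint64_t *tims, uint16_t nb_tims);

/* Functions exported by one adapter implementation. */
struct toy_adapter_ops {
	toy_start_t start;
	toy_arm_burst_t arm_burst;
};

struct toy_adapter {
	toy_arm_burst_t arm_burst;	/* fast-path pointer copied from ops */
	const struct toy_adapter_ops *ops;
	uint8_t started;
};

/* A trivial "software" implementation of the two operations. */
static int
sw_start(const struct toy_adapter *adapter)
{
	(void)adapter;
	return 0;
}

static int
sw_arm_burst(const struct toy_adapter *adapter, uint64_t *tims,
		uint16_t nb_tims)
{
	uint16_t i;

	(void)adapter;
	for (i = 0; i < nb_tims; i++)
		tims[i] = 1;	/* mark each timer as armed */
	return nb_tims;
}

static const struct toy_adapter_ops sw_ops = {
	.start = sw_start,
	.arm_burst = sw_arm_burst,
};

/* Bind an implementation: keep the ops table, copy the fast-path entry. */
static void
toy_adapter_bind(struct toy_adapter *adapter,
		const struct toy_adapter_ops *ops)
{
	adapter->ops = ops;
	adapter->arm_burst = ops->arm_burst;
	adapter->started = 0;
}

/* Control path: one extra indirection through the ops table. */
static int
toy_adapter_start(struct toy_adapter *adapter)
{
	int rc;

	if (adapter->ops == NULL || adapter->ops->start == NULL)
		return -1;
	rc = adapter->ops->start(adapter);
	if (rc == 0)
		adapter->started = 1;
	return rc;
}

/* Fast path: call the copied pointer directly. */
static inline int
toy_arm_burst(const struct toy_adapter *adapter, uint64_t *tims,
		uint16_t nb_tims)
{
	return adapter->arm_burst(adapter, tims, nb_tims);
}
```

A caller would bind an implementation once with toy_adapter_bind(&adapter, &sw_ops), start it through toy_adapter_start(), and then arm timers through toy_arm_burst() without touching the ops table again.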

^ permalink raw reply	[relevance 2%]

* Re: [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI
  2018-03-08 21:34  4%             ` Thomas Monjalon
@ 2018-03-09  0:18  4%               ` Neil Horman
  0 siblings, 0 replies; 200+ results
From: Neil Horman @ 2018-03-09  0:18 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ferruh Yigit, John McNamara, Marko Kovacevic, dev, Luca Boccassi,
	Christian Ehrhardt

On Thu, Mar 08, 2018 at 10:34:14PM +0100, Thomas Monjalon wrote:
> 08/03/2018 20:40, Neil Horman:
> > On Thu, Mar 08, 2018 at 05:04:01PM +0100, Thomas Monjalon wrote:
> > > 08/03/2018 16:35, Neil Horman:
> > > > On Thu, Mar 08, 2018 at 04:17:00PM +0100, Thomas Monjalon wrote:
> > > > > 08/03/2018 12:43, Ferruh Yigit:
> > > > > > On 3/8/2018 8:05 AM, Thomas Monjalon wrote:
> > > > > > > 07/03/2018 18:44, Ferruh Yigit:
> > > > > > > >> Now that the experimental API process is defined, do we still need
> > > > > > > >> the RTE_NEXT_ABI config and process, which have similar targets?
> > > > > > > 
> > > > > > > They are different targets.
> > > > > > > Experimental API is always enabled but may be avoided by applications.
> > > > > > > Next ABI can be used to break ABI without notice and disabled to keep
> > > > > > > old ABI compatibility. It is almost never used because it is preferred
> > > > > > > to keep ABI compatibility with rte_compat macros, or wait a deprecation
> > > > > > > period after notice.
> > > > > > 
> > > > > > OK, I see.
> > > > > > 
> > > > > > Shouldn't we disable it by default at least? Otherwise anyone who is not
> > > > > > paying attention to this config option will get an ABI/API break.
> > > > > 
> > > > > Yes I think you are right, it can be disabled by default.
> > > > > 
> > > > I would agree, there seems to be overlap here, and the experimental tagging can
> > > > cover what the NEXT_API flag is meant to do.  It can be removed I think.
> > > 
> > > It is not NEXT_API but NEXT_ABI.
> > Sorry, typo, though I'm sure you got that, since the former doesn't exist,
> > right?
> > > Why do you think it overlaps experimental API tagging?
> > 
> > I assert that because the compat lib has macros to map common symbols to
> > version-specific ones.  That is to say, if you change a data structure, you can
> > set up the API calls that use said structure such that version 1 of the symbol
> > maps to an internal function that uses the old structure, while version 2 maps
> > to an internal function that uses the new structure.
> > 
> > That is to say, if you're planning on introducing ABI changes, the experimental
> > API tagging can be used to implement what the NEXT_ABI macro does.
> 
> It is a different usage.
> Experimental API tagging is for new functions.
> rte_compat is used to avoid breaking the ABI when changing old code.
> NEXT_ABI has been used in the past to disable an ABI breakage, which was
> not possible to mitigate with rte_compat because it impacted too many functions.
> 
That's not entirely true.  It _is_ used to manage ABI changes when backwards
compatibility needs to be preserved. It _can_be_ used for experimental ABI
management.  That is to say, if you want to modify an existing ABI symbol, you
can do so by writing a new function, and then exporting the new function as the
old symbol with the @EXPERIMENTAL version.  Not saying we have to do that, but
we certainly can, and can eliminate NEXT_ABI in the process.

> I am not saying that I like NEXT_ABI, but it could be useful exceptionally.
> 
Well, if the consensus is that it should be kept, it's no skin off my nose, but
the discussion was around removing NEXT_ABI, and I was copied, so I thought I'd
add my $0.02

Neil
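
The mapping described earlier in this thread (version 1 of a symbol bound to a function that takes the old structure, version 2 bound to one that takes the new structure) can be imitated in plain C. This is only a sketch of the idea: the real mechanism is ELF symbol versioning done by the linker through the rte_compat macros and the version map, not a runtime lookup, and every name below (cfg_v1, cfg_v2, cfg_apply_*) is hypothetical.

```c
#include <stdint.h>

/* Old and new layouts of the same logical structure (both hypothetical). */
struct cfg_v1 {
	uint16_t nb_queues;
};

struct cfg_v2 {
	uint16_t nb_queues;
	uint16_t nb_desc;	/* field added by the ABI change */
};

/* Internal function kept for binaries built against the old layout. */
static int
cfg_apply_v1(const void *cfg)
{
	const struct cfg_v1 *c = cfg;

	return c->nb_queues;
}

/* Internal function for binaries built against the new layout. */
static int
cfg_apply_v2(const void *cfg)
{
	const struct cfg_v2 *c = cfg;

	return c->nb_queues * c->nb_desc;
}

typedef int (*cfg_apply_t)(const void *cfg);

/*
 * Stand-in for symbol resolution: a caller linked against ABI version 1
 * ends up in cfg_apply_v1(), everyone else in cfg_apply_v2().
 */
static cfg_apply_t
cfg_apply_resolve(unsigned int abi_version)
{
	return abi_version < 2 ? cfg_apply_v1 : cfg_apply_v2;
}
```

The point of the sketch is that both implementations stay in the library at the same time, so an old binary keeps working while new code gets the new layout.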

> 
> 

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v4] ethdev: return named opaque type instead of void pointer
       [not found]       ` <20180309123651.GB19004@hmswarspite.think-freely.org>
@ 2018-03-09 13:00  0%     ` Ferruh Yigit
  2018-03-09 15:16  0%       ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2018-03-09 13:00 UTC (permalink / raw)
  To: Neil Horman; +Cc: John McNamara, Marko Kovacevic, Thomas Monjalon, dev

On 3/9/2018 12:36 PM, Neil Horman wrote:
> On Fri, Mar 09, 2018 at 11:25:31AM +0000, Ferruh Yigit wrote:
>> "struct rte_eth_rxtx_callback" is defined as internal data structure and
>> used as named opaque type.
>>
>> So the functions that are adding callbacks can return objects in this
>> type instead of void pointer.
>>
>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
>> ---
>> v2:
>> * keep using struct * in parameters, instead add callback functions
>> return struct rte_eth_rxtx_callback pointer.
>>
>> v4:
>> * Remove deprecation notice. LIBABIVER already increased in this release
>> ---
>>  doc/guides/rel_notes/deprecation.rst |  7 -------
>>  lib/librte_ether/rte_ethdev.c        |  6 +++---
>>  lib/librte_ether/rte_ethdev.h        | 13 ++++++++-----
>>  3 files changed, 11 insertions(+), 15 deletions(-)
>>
> This doesn't quite make sense to me.  If rte_eth_rxtx_callback is defined as an
> internal data structure, then it shouldn't be used as part of the prototype for
> an exported function, as the structure will then no longer be an internal data
> structure, but rather part of the public ABI.

"struct rte_eth_rxtx_callback" is an internal data structure, and applications
should not access its elements.

"struct rte_eth_rxtx_callback;" is declared in the public header, so applications
can use it as an opaque type.

Both the "add" and "remove" APIs could use "void *" and cast it internally. But
the inconsistency was that the "add" APIs return "void *" while the "remove"
APIs require a parameter of type "struct rte_eth_rxtx_callback *".

While unifying the usage, "struct rte_eth_rxtx_callback *" was preferred over
"void *", because a named opaque type documents the intention/usage better.

Thanks,
ferruh
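
A minimal sketch of the named opaque type pattern discussed in this message, assuming a library that hides the structure definition from applications. All toy_ names are hypothetical and are not the ethdev API; the "public" and "private" parts are shown in one file only to keep the example self-contained.

```c
#include <stddef.h>
#include <stdlib.h>

/* Public header part: a forward declaration only, no members visible. */
struct toy_rxtx_callback;

typedef int (*toy_cb_fn)(int pkt_count);

struct toy_rxtx_callback *toy_add_rx_callback(toy_cb_fn fn);
int toy_remove_rx_callback(struct toy_rxtx_callback *cb);
int toy_run_rx_callback(const struct toy_rxtx_callback *cb, int pkt_count);

/* Library-private part: the full definition lives only here. */
struct toy_rxtx_callback {
	toy_cb_fn fn;
};

/* "add" returns the named opaque type rather than void *. */
struct toy_rxtx_callback *
toy_add_rx_callback(toy_cb_fn fn)
{
	struct toy_rxtx_callback *cb;

	if (fn == NULL)
		return NULL;
	cb = malloc(sizeof(*cb));
	if (cb != NULL)
		cb->fn = fn;
	return cb;
}

/* "remove" takes the same type, so the compiler checks the pairing. */
int
toy_remove_rx_callback(struct toy_rxtx_callback *cb)
{
	if (cb == NULL)
		return -1;
	free(cb);
	return 0;
}

int
toy_run_rx_callback(const struct toy_rxtx_callback *cb, int pkt_count)
{
	return cb->fn(pkt_count);
}

/* Example user callback: pretend half of the packets are dropped. */
static int
drop_half(int pkt_count)
{
	return pkt_count / 2;
}
```

With "void *" on the add side, passing an unrelated pointer to the remove function would compile silently; with the named type the compiler can warn about the mismatch.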

> 
> Neil
> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v4] ethdev: return named opaque type instead of void pointer
  2018-03-09 13:00  0%     ` Ferruh Yigit
@ 2018-03-09 15:16  0%       ` Neil Horman
  2018-03-09 15:45  0%         ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Neil Horman @ 2018-03-09 15:16 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: John McNamara, Marko Kovacevic, Thomas Monjalon, dev

On Fri, Mar 09, 2018 at 01:00:35PM +0000, Ferruh Yigit wrote:
> On 3/9/2018 12:36 PM, Neil Horman wrote:
> > On Fri, Mar 09, 2018 at 11:25:31AM +0000, Ferruh Yigit wrote:
> >> "struct rte_eth_rxtx_callback" is defined as internal data structure and
> >> used as named opaque type.
> >>
> >> So the functions that are adding callbacks can return objects in this
> >> type instead of void pointer.
> >>
> >> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> >> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
> >> ---
> >> v2:
> >> * keep using struct * in parameters, instead add callback functions
> >> return struct rte_eth_rxtx_callback pointer.
> >>
> >> v4:
> >> * Remove deprecation notice. LIBABIVER already increased in this release
> >> ---
> >>  doc/guides/rel_notes/deprecation.rst |  7 -------
> >>  lib/librte_ether/rte_ethdev.c        |  6 +++---
> >>  lib/librte_ether/rte_ethdev.h        | 13 ++++++++-----
> >>  3 files changed, 11 insertions(+), 15 deletions(-)
> >>
> > This doesn't quite make sense to me.  If rte_eth_rxtx_callback is defined as an
> > internal data structure, then it shouldn't be used as part of the prototype for
> > an exported function, as the structure will then no longer be an internal data
> > structure, but rather part of the public ABI.
> 
> "struct rte_eth_rxtx_callback" is an internal data structure, and applications
> should not access its elements.
> 
> "struct rte_eth_rxtx_callback;" is declared in the public header, so applications
> can use it as an opaque type.
> 
> Both the "add" and "remove" APIs could use "void *" and cast it internally. But
> the inconsistency was that the "add" APIs return "void *" while the "remove"
> APIs require a parameter of type "struct rte_eth_rxtx_callback *".
> 
> While unifying the usage, "struct rte_eth_rxtx_callback *" was preferred over
> "void *", because a named opaque type documents the intention/usage better.
> 
> Thanks,
> ferruh
> 
I get what you're saying about rte_eth_rxtx_callback being an internal
structure (or its intent is to be an internal structure), but it doesn't seem to
hold up to the header file layout.  rte_eth_rxtx_callback is defined in
rte_ethdev_core.h which according to the makefile, is listed as a symlinked
file, and therefore available for external applications to include.  This
negates the intended opaque nature of the struct.  I think before you do this,
you want to rectify that.

Neil

> > 
> > Neil
> > 
> 
> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v4] ethdev: return named opaque type instead of void pointer
  2018-03-09 15:16  0%       ` Neil Horman
@ 2018-03-09 15:45  0%         ` Ferruh Yigit
  2018-03-09 19:06  0%           ` Neil Horman
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2018-03-09 15:45 UTC (permalink / raw)
  To: Neil Horman; +Cc: John McNamara, Marko Kovacevic, Thomas Monjalon, dev

On 3/9/2018 3:16 PM, Neil Horman wrote:
> On Fri, Mar 09, 2018 at 01:00:35PM +0000, Ferruh Yigit wrote:
>> On 3/9/2018 12:36 PM, Neil Horman wrote:
>>> On Fri, Mar 09, 2018 at 11:25:31AM +0000, Ferruh Yigit wrote:
>>>> "struct rte_eth_rxtx_callback" is defined as internal data structure and
>>>> used as named opaque type.
>>>>
>>>> So the functions that are adding callbacks can return objects in this
>>>> type instead of void pointer.
>>>>
>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>>>> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
>>>> ---
>>>> v2:
>>>> * keep using struct * in parameters, instead add callback functions
>>>> return struct rte_eth_rxtx_callback pointer.
>>>>
>>>> v4:
>>>> * Remove deprecation notice. LIBABIVER already increased in this release
>>>> ---
>>>>  doc/guides/rel_notes/deprecation.rst |  7 -------
>>>>  lib/librte_ether/rte_ethdev.c        |  6 +++---
>>>>  lib/librte_ether/rte_ethdev.h        | 13 ++++++++-----
>>>>  3 files changed, 11 insertions(+), 15 deletions(-)
>>>>
>>> This doesn't quite make sense to me.  If rte_eth_rxtx_callback is defined as an
>>> internal data structure, then it shouldn't be used as part of the prototype for
>>> an exported function, as the structure will then no longer be an internal data
>>> structure, but rather part of the public ABI.
>>
>> "struct rte_eth_rxtx_callback" is an internal data structure, and applications
>> should not access its elements.
>>
>> "struct rte_eth_rxtx_callback;" is declared in the public header, so applications
>> can use it as an opaque type.
>>
>> Both the "add" and "remove" APIs could use "void *" and cast it internally. But
>> the inconsistency was that the "add" APIs return "void *" while the "remove"
>> APIs require a parameter of type "struct rte_eth_rxtx_callback *".
>>
>> While unifying the usage, "struct rte_eth_rxtx_callback *" was preferred over
>> "void *", because a named opaque type documents the intention/usage better.
>>
>> Thanks,
>> ferruh
>>
> I get what you're saying about rte_eth_rxtx_callback being an internal
> structure (or its intent is to be an internal structure), but it doesn't seem to
> hold up to the header file layout.  rte_eth_rxtx_callback is defined in
> rte_ethdev_core.h which according to the makefile, is listed as a symlinked
> file, and therefore available for external applications to include.  This
> negates the intended opaque nature of the struct.  I think before you do this,
> you want to rectify that.

The intention is to make "struct rte_eth_rxtx_callback" internal, but as you said
it is available to applications. The same holds for all data structures in
rte_ethdev_core.h.

Unfortunately it can't be truly internal, because inline functions in the public
header use these structures, and we can't change the inline functions because of
performance concerns.

Since we can't make the structure truly internal, we can't really prevent
applications from accessing the internals; this would be the same with "void *".
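
To illustrate why an inline function in a public header forces the structure definition to be visible, here is a small sketch. The names toy_dev, toy_rx_burst and fake_rx are hypothetical and much simpler than the real ethdev structures.

```c
#include <stdint.h>

/*
 * This definition has to sit in the "public" header: the inline function
 * below dereferences its members, so the compiler needs the full layout
 * in every application translation unit that includes the header.
 */
struct toy_dev {
	uint16_t (*rx_burst)(void *queue, uint16_t nb_pkts);
	void *queue;
};

/*
 * Inline fast path: compiled into the application itself, which avoids a
 * function call into the library but bakes the layout of struct toy_dev
 * into the application binary.
 */
static inline uint16_t
toy_rx_burst(const struct toy_dev *dev, uint16_t nb_pkts)
{
	return dev->rx_burst(dev->queue, nb_pkts);
}

/* Fake driver: the queue is just a counter of available packets. */
static uint16_t
fake_rx(void *queue, uint16_t nb_pkts)
{
	uint16_t avail = *(uint16_t *)queue;

	return nb_pkts < avail ? nb_pkts : avail;
}
```

Replacing toy_rx_burst() with a regular exported function would allow struct toy_dev to become a forward declaration only, at the cost of an extra call on every burst, which is the performance trade-off mentioned above.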

> 
> Neil
> 
>>>
>>> Neil
>>>
>>
>>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 18.05 v4] eal: add function to return number of detected sockets
  2018-03-08 14:38  0%         ` Burakov, Anatoly
@ 2018-03-09 16:32  0%           ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2018-03-09 16:32 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: dev

On Thu, Mar 08, 2018 at 02:38:37PM +0000, Burakov, Anatoly wrote:
> On 08-Mar-18 12:12 PM, Bruce Richardson wrote:
> > On Wed, Feb 07, 2018 at 09:58:36AM +0000, Anatoly Burakov wrote:
> > > During lcore scan, find maximum socket ID and store it. This will
> > > break the ABI, so bump ABI version.
> > > 
> > > Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> > > ---
> > > 
> > > Notes:
> > >      v4:
> > >      - Remove backwards ABI compatibility, bump ABI instead
> > >      v3:
> > >      - Added ABI compatibility
> > >      v2:
> > >      - checkpatch changes
> > >      - check socket before deciding if the core is not to be used
> > > 
> > >   lib/librte_eal/bsdapp/eal/Makefile        |  2 +-
> > >   lib/librte_eal/common/eal_common_lcore.c  | 37 +++++++++++++++++++++----------
> > >   lib/librte_eal/common/include/rte_eal.h   |  1 +
> > >   lib/librte_eal/common/include/rte_lcore.h |  8 +++++++
> > >   lib/librte_eal/linuxapp/eal/Makefile      |  2 +-
> > >   lib/librte_eal/rte_eal_version.map        |  9 +++++++-
> > >   6 files changed, 44 insertions(+), 15 deletions(-)
> > > 
> > Breaking the ABI is the best way to implement this change, and given the
> > deprecation was previously announced I'm ok with that.
> > 
> > Question: we are ok assuming that the socket numbers are sequential, or
> > nearly so, and knowing the maximum socket number seen is a good
> > approximation of the actual physical sockets? I know in terms of cores
> > on a system, the core id's often jump - are there systems where the
> > socket numbers do too?
> > 
> > /Bruce
> > 
> 
> I am not aware of any system that would jump sockets like that. I'm open to
> corrections, however :)
> 
> -- 
In the absence of any corrections, I think this is fine to have.

Acked-by: Bruce Richardson <bruce.richardson@intel.com>
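
The question raised in this thread about gaps in socket numbering can be made concrete with a small sketch. It assumes the socket count is derived as "maximum socket ID seen + 1", which is an assumption about the patch rather than something quoted here; the function name and the input array are hypothetical.

```c
#include <stddef.h>

/*
 * Derive a socket count from the per-lcore socket IDs by taking the
 * maximum ID seen and adding one. Returns 0 when there are no lcores.
 */
static unsigned int
socket_count_from_max_id(const unsigned int *lcore_socket, size_t nb_lcores)
{
	unsigned int max_id = 0;
	size_t i;

	if (lcore_socket == NULL || nb_lcores == 0)
		return 0;

	for (i = 0; i < nb_lcores; i++)
		if (lcore_socket[i] > max_id)
			max_id = lcore_socket[i];

	return max_id + 1;
}
```

For lcores on sockets {0, 0, 1, 1} this gives 2, which matches the number of populated sockets. For a hypothetical system numbering its sockets {0, 0, 2, 2} it gives 3, so the value is an upper bound rather than an exact count whenever IDs are not contiguous.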

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v1 1/5] bpf: add BPF loading and execution framework
  @ 2018-03-09 16:42  2% ` Konstantin Ananyev
  0 siblings, 0 replies; 200+ results
From: Konstantin Ananyev @ 2018-03-09 16:42 UTC (permalink / raw)
  To: dev; +Cc: Konstantin Ananyev

librte_bpf provides a framework to load and execute eBPF bytecode
inside user-space DPDK-based applications.
It supports a basic set of features from the eBPF spec
(https://www.kernel.org/doc/Documentation/networking/filter.txt).

Not currently supported features:
 - JIT
 - cBPF
 - tail-pointer call
 - eBPF MAP
 - skb

It also adds a dependency on libelf.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 config/common_base                 |   5 +
 config/common_linuxapp             |   1 +
 lib/Makefile                       |   2 +
 lib/librte_bpf/Makefile            |  30 +++
 lib/librte_bpf/bpf.c               |  48 ++++
 lib/librte_bpf/bpf_exec.c          | 452 +++++++++++++++++++++++++++++++++++++
 lib/librte_bpf/bpf_impl.h          |  37 +++
 lib/librte_bpf/bpf_load.c          | 380 +++++++++++++++++++++++++++++++
 lib/librte_bpf/bpf_validate.c      |  55 +++++
 lib/librte_bpf/rte_bpf.h           | 158 +++++++++++++
 lib/librte_bpf/rte_bpf_version.map |  12 +
 mk/rte.app.mk                      |   2 +
 12 files changed, 1182 insertions(+)
 create mode 100644 lib/librte_bpf/Makefile
 create mode 100644 lib/librte_bpf/bpf.c
 create mode 100644 lib/librte_bpf/bpf_exec.c
 create mode 100644 lib/librte_bpf/bpf_impl.h
 create mode 100644 lib/librte_bpf/bpf_load.c
 create mode 100644 lib/librte_bpf/bpf_validate.c
 create mode 100644 lib/librte_bpf/rte_bpf.h
 create mode 100644 lib/librte_bpf/rte_bpf_version.map

diff --git a/config/common_base b/config/common_base
index ad03cf433..2205b684f 100644
--- a/config/common_base
+++ b/config/common_base
@@ -823,3 +823,8 @@ CONFIG_RTE_APP_CRYPTO_PERF=y
 # Compile the eventdev application
 #
 CONFIG_RTE_APP_EVENTDEV=y
+
+#
+# Compile librte_bpf
+#
+CONFIG_RTE_LIBRTE_BPF=n
diff --git a/config/common_linuxapp b/config/common_linuxapp
index ff98f2355..7b4a0ce7d 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -10,6 +10,7 @@ CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=y
 CONFIG_RTE_EAL_IGB_UIO=y
 CONFIG_RTE_EAL_VFIO=y
 CONFIG_RTE_KNI_KMOD=y
+CONFIG_RTE_LIBRTE_BPF=y
 CONFIG_RTE_LIBRTE_KNI=y
 CONFIG_RTE_LIBRTE_PMD_KNI=y
 CONFIG_RTE_LIBRTE_VHOST=y
diff --git a/lib/Makefile b/lib/Makefile
index ec965a606..a4a2329f9 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -97,6 +97,8 @@ DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
 DIRS-$(CONFIG_RTE_LIBRTE_GSO) += librte_gso
 DEPDIRS-librte_gso := librte_eal librte_mbuf librte_ether librte_net
 DEPDIRS-librte_gso += librte_mempool
+DIRS-$(CONFIG_RTE_LIBRTE_BPF) += librte_bpf
+DEPDIRS-librte_bpf := librte_eal librte_mempool librte_mbuf librte_ether
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_bpf/Makefile b/lib/librte_bpf/Makefile
new file mode 100644
index 000000000..e0f434e77
--- /dev/null
+++ b/lib/librte_bpf/Makefile
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_bpf.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+LDLIBS += -lrte_net -lrte_eal
+LDLIBS += -lrte_mempool -lrte_ring
+LDLIBS += -lrte_mbuf -lrte_ethdev
+LDLIBS += -lelf
+
+EXPORT_MAP := rte_bpf_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf.c
+SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf_exec.c
+SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf_load.c
+SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf_validate.c
+
+# install header files
+SYMLINK-$(CONFIG_RTE_LIBRTE_BPF)-include += rte_bpf.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_bpf/bpf.c b/lib/librte_bpf/bpf.c
new file mode 100644
index 000000000..4727d2251
--- /dev/null
+++ b/lib/librte_bpf/bpf.c
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_common.h>
+#include <rte_eal.h>
+
+#include "bpf_impl.h"
+
+__rte_experimental void
+rte_bpf_destroy(struct rte_bpf *bpf)
+{
+	if (bpf != NULL) {
+		if (bpf->jit.func != NULL)
+			munmap(bpf->jit.func, bpf->jit.sz);
+		munmap(bpf, bpf->sz);
+	}
+}
+
+__rte_experimental int
+rte_bpf_get_jit(const struct rte_bpf *bpf, struct rte_bpf_jit *jit)
+{
+	if (bpf == NULL || jit == NULL)
+		return -EINVAL;
+
+	jit[0] = bpf->jit;
+	return 0;
+}
+
+int
+bpf_jit(struct rte_bpf *bpf)
+{
+	int32_t rc;
+
+	rc = -ENOTSUP;
+
+	if (rc != 0)
+		RTE_LOG(WARNING, USER1, "%s(%p) failed, error code: %d;\n",
+			__func__, bpf, rc);
+	return rc;
+}
diff --git a/lib/librte_bpf/bpf_exec.c b/lib/librte_bpf/bpf_exec.c
new file mode 100644
index 000000000..f1c1d3be3
--- /dev/null
+++ b/lib/librte_bpf/bpf_exec.c
@@ -0,0 +1,452 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_debug.h>
+#include <rte_memory.h>
+#include <rte_eal.h>
+#include <rte_byteorder.h>
+
+#include "bpf_impl.h"
+
+#define BPF_JMP_UNC(ins)	((ins) += (ins)->off)
+
+#define BPF_JMP_CND_REG(reg, ins, op, type)	\
+	((ins) += \
+		((type)(reg)[(ins)->dst_reg] op (type)(reg)[(ins)->src_reg]) ? \
+		(ins)->off : 0)
+
+#define BPF_JMP_CND_IMM(reg, ins, op, type)	\
+	((ins) += \
+		((type)(reg)[(ins)->dst_reg] op (type)(ins)->imm) ? \
+		(ins)->off : 0)
+
+#define BPF_NEG_ALU(reg, ins, type)	\
+	((reg)[(ins)->dst_reg] = (type)(-(reg)[(ins)->dst_reg]))
+
+#define BPF_MOV_ALU_REG(reg, ins, type)	\
+	((reg)[(ins)->dst_reg] = (type)(reg)[(ins)->src_reg])
+
+#define BPF_OP_ALU_REG(reg, ins, op, type)	\
+	((reg)[(ins)->dst_reg] = \
+		(type)(reg)[(ins)->dst_reg] op (type)(reg)[(ins)->src_reg])
+
+#define BPF_MOV_ALU_IMM(reg, ins, type)	\
+	((reg)[(ins)->dst_reg] = (type)(ins)->imm)
+
+#define BPF_OP_ALU_IMM(reg, ins, op, type)	\
+	((reg)[(ins)->dst_reg] = \
+		(type)(reg)[(ins)->dst_reg] op (type)(ins)->imm)
+
+#define BPF_DIV_ZERO_CHECK(bpf, reg, ins, type) do { \
+	if ((type)(reg)[(ins)->src_reg] == 0) { \
+		RTE_LOG(ERR, USER1, \
+			"%s(%p): division by 0 at pc: %#zx;\n", \
+			__func__, bpf, \
+			(uintptr_t)(ins) - (uintptr_t)(bpf)->prm.ins); \
+		return 0; \
+	} \
+} while (0)
+
+#define BPF_LD_REG(reg, ins, type)	\
+	((reg)[(ins)->dst_reg] = \
+		*(type *)(uintptr_t)((reg)[(ins)->src_reg] + (ins)->off))
+
+#define BPF_ST_IMM(reg, ins, type)	\
+	(*(type *)(uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off) = \
+		(type)(ins)->imm)
+
+#define BPF_ST_REG(reg, ins, type)	\
+	(*(type *)(uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off) = \
+		(type)(reg)[(ins)->src_reg])
+
+#define BPF_ST_XADD_REG(reg, ins, tp)	\
+	(rte_atomic##tp##_add((rte_atomic##tp##_t *) \
+		(uintptr_t)((reg)[(ins)->dst_reg] + (ins)->off), \
+		(reg)[(ins)->src_reg]))
+
+static inline void
+bpf_alu_be(uint64_t reg[MAX_BPF_REG], const struct bpf_insn *ins)
+{
+	uint64_t *v;
+
+	v = reg + ins->dst_reg;
+	switch (ins->imm) {
+	case 16:
+		*v = rte_cpu_to_be_16(*v);
+		break;
+	case 32:
+		*v = rte_cpu_to_be_32(*v);
+		break;
+	case 64:
+		*v = rte_cpu_to_be_64(*v);
+		break;
+	}
+}
+
+static inline void
+bpf_alu_le(uint64_t reg[MAX_BPF_REG], const struct bpf_insn *ins)
+{
+	uint64_t *v;
+
+	v = reg + ins->dst_reg;
+	switch (ins->imm) {
+	case 16:
+		*v = rte_cpu_to_le_16(*v);
+		break;
+	case 32:
+		*v = rte_cpu_to_le_32(*v);
+		break;
+	case 64:
+		*v = rte_cpu_to_le_64(*v);
+		break;
+	}
+}
+
+static inline uint64_t
+bpf_exec(const struct rte_bpf *bpf, uint64_t reg[MAX_BPF_REG])
+{
+	const struct bpf_insn *ins;
+
+	for (ins = bpf->prm.ins; ; ins++) {
+		switch (ins->code) {
+		/* 32 bit ALU IMM operations */
+		case (BPF_ALU | BPF_ADD | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, +, uint32_t);
+			break;
+		case (BPF_ALU | BPF_SUB | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, -, uint32_t);
+			break;
+		case (BPF_ALU | BPF_AND | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, &, uint32_t);
+			break;
+		case (BPF_ALU | BPF_OR | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, |, uint32_t);
+			break;
+		case (BPF_ALU | BPF_LSH | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, <<, uint32_t);
+			break;
+		case (BPF_ALU | BPF_RSH | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, >>, uint32_t);
+			break;
+		case (BPF_ALU | BPF_XOR | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, ^, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MUL | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, *, uint32_t);
+			break;
+		case (BPF_ALU | BPF_DIV | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, /, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MOD | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, %, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MOV | BPF_K):
+			BPF_MOV_ALU_IMM(reg, ins, uint32_t);
+			break;
+		/* 32 bit ALU REG operations */
+		case (BPF_ALU | BPF_ADD | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, +, uint32_t);
+			break;
+		case (BPF_ALU | BPF_SUB | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, -, uint32_t);
+			break;
+		case (BPF_ALU | BPF_AND | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, &, uint32_t);
+			break;
+		case (BPF_ALU | BPF_OR | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, |, uint32_t);
+			break;
+		case (BPF_ALU | BPF_LSH | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, <<, uint32_t);
+			break;
+		case (BPF_ALU | BPF_RSH | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, >>, uint32_t);
+			break;
+		case (BPF_ALU | BPF_XOR | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, ^, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MUL | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, *, uint32_t);
+			break;
+		case (BPF_ALU | BPF_DIV | BPF_X):
+			BPF_DIV_ZERO_CHECK(bpf, reg, ins, uint32_t);
+			BPF_OP_ALU_REG(reg, ins, /, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MOD | BPF_X):
+			BPF_DIV_ZERO_CHECK(bpf, reg, ins, uint32_t);
+			BPF_OP_ALU_REG(reg, ins, %, uint32_t);
+			break;
+		case (BPF_ALU | BPF_MOV | BPF_X):
+			BPF_MOV_ALU_REG(reg, ins, uint32_t);
+			break;
+		case (BPF_ALU | BPF_NEG):
+			BPF_NEG_ALU(reg, ins, uint32_t);
+			break;
+		case (BPF_ALU | BPF_END | BPF_TO_BE):
+			bpf_alu_be(reg, ins);
+			break;
+		case (BPF_ALU | BPF_END | BPF_TO_LE):
+			bpf_alu_le(reg, ins);
+			break;
+		/* 64 bit ALU IMM operations */
+		case (BPF_ALU64 | BPF_ADD | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, +, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_SUB | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, -, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_AND | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, &, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_OR | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, |, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_LSH | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, <<, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_RSH | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, >>, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_ARSH | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, >>, int64_t);
+			break;
+		case (BPF_ALU64 | BPF_XOR | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, ^, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MUL | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, *, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_DIV | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, /, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MOD | BPF_K):
+			BPF_OP_ALU_IMM(reg, ins, %, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MOV | BPF_K):
+			BPF_MOV_ALU_IMM(reg, ins, uint64_t);
+			break;
+		/* 64 bit ALU REG operations */
+		case (BPF_ALU64 | BPF_ADD | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, +, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_SUB | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, -, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_AND | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, &, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_OR | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, |, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_LSH | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, <<, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_RSH | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, >>, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_ARSH | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, >>, int64_t);
+			break;
+		case (BPF_ALU64 | BPF_XOR | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, ^, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MUL | BPF_X):
+			BPF_OP_ALU_REG(reg, ins, *, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_DIV | BPF_X):
+			BPF_DIV_ZERO_CHECK(bpf, reg, ins, uint64_t);
+			BPF_OP_ALU_REG(reg, ins, /, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MOD | BPF_X):
+			BPF_DIV_ZERO_CHECK(bpf, reg, ins, uint64_t);
+			BPF_OP_ALU_REG(reg, ins, %, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_MOV | BPF_X):
+			BPF_MOV_ALU_REG(reg, ins, uint64_t);
+			break;
+		case (BPF_ALU64 | BPF_NEG):
+			BPF_NEG_ALU(reg, ins, uint64_t);
+			break;
+		/* load instructions */
+		case (BPF_LDX | BPF_MEM | BPF_B):
+			BPF_LD_REG(reg, ins, uint8_t);
+			break;
+		case (BPF_LDX | BPF_MEM | BPF_H):
+			BPF_LD_REG(reg, ins, uint16_t);
+			break;
+		case (BPF_LDX | BPF_MEM | BPF_W):
+			BPF_LD_REG(reg, ins, uint32_t);
+			break;
+		case (BPF_LDX | BPF_MEM | BPF_DW):
+			BPF_LD_REG(reg, ins, uint64_t);
+			break;
+		/* load 64 bit immediate value */
+		case (BPF_LD | BPF_IMM | BPF_DW):
+			reg[ins->dst_reg] = (uint32_t)ins[0].imm |
+				(uint64_t)(uint32_t)ins[1].imm << 32;
+			ins++;
+			break;
+		/* store instructions */
+		case (BPF_STX | BPF_MEM | BPF_B):
+			BPF_ST_REG(reg, ins, uint8_t);
+			break;
+		case (BPF_STX | BPF_MEM | BPF_H):
+			BPF_ST_REG(reg, ins, uint16_t);
+			break;
+		case (BPF_STX | BPF_MEM | BPF_W):
+			BPF_ST_REG(reg, ins, uint32_t);
+			break;
+		case (BPF_STX | BPF_MEM | BPF_DW):
+			BPF_ST_REG(reg, ins, uint64_t);
+			break;
+		case (BPF_ST | BPF_MEM | BPF_B):
+			BPF_ST_IMM(reg, ins, uint8_t);
+			break;
+		case (BPF_ST | BPF_MEM | BPF_H):
+			BPF_ST_IMM(reg, ins, uint16_t);
+			break;
+		case (BPF_ST | BPF_MEM | BPF_W):
+			BPF_ST_IMM(reg, ins, uint32_t);
+			break;
+		case (BPF_ST | BPF_MEM | BPF_DW):
+			BPF_ST_IMM(reg, ins, uint64_t);
+			break;
+		/* atomic add instructions */
+		case (BPF_STX | BPF_XADD | BPF_W):
+			BPF_ST_XADD_REG(reg, ins, 32);
+			break;
+		case (BPF_STX | BPF_XADD | BPF_DW):
+			BPF_ST_XADD_REG(reg, ins, 64);
+			break;
+		/* jump instructions */
+		case (BPF_JMP | BPF_JA):
+			BPF_JMP_UNC(ins);
+			break;
+		/* jump IMM instructions */
+		case (BPF_JMP | BPF_JEQ | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, ==, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JNE | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, !=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JGT | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, >, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JLT | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, <, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JGE | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, >=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JLE | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, <=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JSGT | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, >, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSLT | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, <, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSGE | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, >=, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSLE | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, <=, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSET | BPF_K):
+			BPF_JMP_CND_IMM(reg, ins, &, uint64_t);
+			break;
+		/* jump REG instructions */
+		case (BPF_JMP | BPF_JEQ | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, ==, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JNE | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, !=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JGT | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, >, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JLT | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, <, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JGE | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, >=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JLE | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, <=, uint64_t);
+			break;
+		case (BPF_JMP | BPF_JSGT | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, >, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSLT | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, <, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSGE | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, >=, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSLE | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, <=, int64_t);
+			break;
+		case (BPF_JMP | BPF_JSET | BPF_X):
+			BPF_JMP_CND_REG(reg, ins, &, uint64_t);
+			break;
+		/* call instructions */
+		case (BPF_JMP | BPF_CALL):
+			reg[BPF_REG_0] = bpf->prm.xsym[ins->imm].func(
+				reg[BPF_REG_1], reg[BPF_REG_2], reg[BPF_REG_3],
+				reg[BPF_REG_4], reg[BPF_REG_5]);
+			break;
+		/* return instruction */
+		case (BPF_JMP | BPF_EXIT):
+			return reg[BPF_REG_0];
+		default:
+			RTE_LOG(ERR, USER1,
+				"%s(%p): invalid opcode %#x at pc: %#zx;\n",
+				__func__, bpf, ins->code,
+				(uintptr_t)ins - (uintptr_t)bpf->prm.ins);
+			return 0;
+		}
+	}
+
+	/* should never be reached */
+	RTE_VERIFY(0);
+	return 0;
+}
+
+__rte_experimental uint32_t
+rte_bpf_exec_burst(const struct rte_bpf *bpf, void *ctx[], uint64_t rc[],
+	uint32_t num)
+{
+	uint32_t i;
+	uint64_t reg[MAX_BPF_REG];
+	uint64_t stack[MAX_BPF_STACK_SIZE / sizeof(uint64_t)];
+
+	for (i = 0; i != num; i++) {
+
+		reg[BPF_REG_1] = (uintptr_t)ctx[i];
+		reg[BPF_REG_10] = (uintptr_t)(stack + RTE_DIM(stack));
+
+		rc[i] = bpf_exec(bpf, reg);
+	}
+
+	return i;
+}
+
+__rte_experimental uint64_t
+rte_bpf_exec(const struct rte_bpf *bpf, void *ctx)
+{
+	uint64_t rc;
+
+	rte_bpf_exec_burst(bpf, &ctx, &rc, 1);
+	return rc;
+}
diff --git a/lib/librte_bpf/bpf_impl.h b/lib/librte_bpf/bpf_impl.h
new file mode 100644
index 000000000..f09417088
--- /dev/null
+++ b/lib/librte_bpf/bpf_impl.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef _BPF_H_
+#define _BPF_H_
+
+#include <rte_bpf.h>
+#include <sys/mman.h>
+#include <linux/bpf.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define MAX_BPF_STACK_SIZE	0x200
+
+struct rte_bpf {
+	struct rte_bpf_prm prm;
+	struct rte_bpf_jit jit;
+	size_t sz;
+	uint32_t stack_sz;
+};
+
+extern int bpf_validate(struct rte_bpf *bpf);
+
+extern int bpf_jit(struct rte_bpf *bpf);
+
+#ifdef RTE_ARCH_X86_64
+extern int bpf_jit_x86(struct rte_bpf *);
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _BPF_H_ */
diff --git a/lib/librte_bpf/bpf_load.c b/lib/librte_bpf/bpf_load.c
new file mode 100644
index 000000000..6ced9c640
--- /dev/null
+++ b/lib/librte_bpf/bpf_load.c
@@ -0,0 +1,380 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <inttypes.h>
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/queue.h>
+#include <fcntl.h>
+
+#include <libelf.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_debug.h>
+#include <rte_memory.h>
+#include <rte_eal.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+
+#include "bpf_impl.h"
+
+static uint32_t
+bpf_find_xsym(const char *sn, enum rte_bpf_xtype type,
+	const struct rte_bpf_xsym fp[], uint32_t fn)
+{
+	uint32_t i;
+
+	if (sn == NULL || fp == NULL)
+		return UINT32_MAX;
+
+	for (i = 0; i != fn; i++) {
+		if (fp[i].type == type && strcmp(sn, fp[i].name) == 0)
+			break;
+	}
+
+	return (i != fn) ? i : UINT32_MAX;
+}
+
+/*
+ * update BPF code at offset *ofs* with a proper address(index) for external
+ * symbol *sn*
+ */
+static int
+resolve_xsym(const char *sn, size_t ofs, struct bpf_insn *ins, size_t ins_sz,
+	const struct rte_bpf_prm *prm)
+{
+	uint32_t idx, fidx;
+	enum rte_bpf_xtype type;
+
+	if (ofs % sizeof(ins[0]) != 0 || ofs >= ins_sz)
+		return -EINVAL;
+
+	idx = ofs / sizeof(ins[0]);
+	if (ins[idx].code == (BPF_JMP | BPF_CALL))
+		type = RTE_BPF_XTYPE_FUNC;
+	else if (ins[idx].code == (BPF_LD | BPF_IMM | BPF_DW) &&
+			ofs < ins_sz - sizeof(ins[idx]))
+		type = RTE_BPF_XTYPE_VAR;
+	else
+		return -EINVAL;
+
+	fidx = bpf_find_xsym(sn, type, prm->xsym, prm->nb_xsym);
+	if (fidx == UINT32_MAX)
+		return -ENOENT;
+
+	/* for function we just need an index in our xsym table */
+	if (type == RTE_BPF_XTYPE_FUNC)
+		ins[idx].imm = fidx;
+	/* for variable we need to store its absolute address */
+	else {
+		ins[idx].imm = (uintptr_t)prm->xsym[fidx].var;
+		ins[idx + 1].imm = (uintptr_t)prm->xsym[fidx].var >> 32;
+	}
+
+	return 0;
+}
+
+static int
+check_elf_header(const Elf64_Ehdr * eh)
+{
+	const char *err;
+
+	err = NULL;
+
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+	if (eh->e_ident[EI_DATA] != ELFDATA2LSB)
+#else
+	if (eh->e_ident[EI_DATA] != ELFDATA2MSB)
+#endif
+		err = "not native byte order";
+	else if (eh->e_ident[EI_OSABI] != ELFOSABI_NONE)
+		err = "unexpected OS ABI";
+	else if (eh->e_type != ET_REL)
+		err = "unexpected ELF type";
+	else if (eh->e_machine != EM_NONE && eh->e_machine != EM_BPF)
+		err = "unexpected machine type";
+
+	if (err != NULL) {
+		RTE_LOG(ERR, USER1, "%s(): %s\n", __func__, err);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/*
+ * helper function, find executable section by name.
+ */
+static int
+find_elf_code(Elf *elf, const char *section, Elf_Data **psd, size_t *pidx)
+{
+	Elf_Scn *sc;
+	const Elf64_Ehdr *eh;
+	const Elf64_Shdr *sh;
+	Elf_Data *sd;
+	const char *sn;
+	int32_t rc;
+
+	eh = elf64_getehdr(elf);
+	if (eh == NULL) {
+		rc = elf_errno();
+		RTE_LOG(ERR, USER1, "%s(%p, %s) error code: %d(%s)\n",
+			__func__, elf, section, rc, elf_errmsg(rc));
+		return -EINVAL;
+	}
+
+	if (check_elf_header(eh) != 0)
+		return -EINVAL;
+
+	/* find given section by name */
+	for (sc = elf_nextscn(elf, NULL); sc != NULL;
+			sc = elf_nextscn(elf, sc)) {
+		sh = elf64_getshdr(sc);
+		sn = elf_strptr(elf, eh->e_shstrndx, sh->sh_name);
+		if (sn != NULL && strcmp(section, sn) == 0 &&
+				sh->sh_type == SHT_PROGBITS &&
+				sh->sh_flags == (SHF_ALLOC | SHF_EXECINSTR))
+			break;
+	}
+
+	sd = elf_getdata(sc, NULL);
+	if (sd == NULL || sd->d_size == 0 ||
+			sd->d_size % sizeof(struct bpf_insn) != 0) {
+		rc = elf_errno();
+		RTE_LOG(ERR, USER1, "%s(%p, %s) error code: %d(%s)\n",
+			__func__, elf, section, rc, elf_errmsg(rc));
+		return -EINVAL;
+	}
+
+	*psd = sd;
+	*pidx = elf_ndxscn(sc);
+	return 0;
+}
+
+/*
+ * helper function to process data from relocation table.
+ */
+static int
+process_reloc(Elf *elf, size_t sym_idx, Elf64_Rel *re, size_t re_sz,
+	struct bpf_insn *ins, size_t ins_sz, const struct rte_bpf_prm *prm)
+{
+	int32_t rc;
+	uint32_t i, n;
+	size_t ofs, sym;
+	const char *sn;
+	const Elf64_Ehdr *eh;
+	Elf_Scn *sc;
+	const Elf_Data *sd;
+	Elf64_Sym *sm;
+
+	eh = elf64_getehdr(elf);
+
+	/* get symtable by section index */
+	sc = elf_getscn(elf, sym_idx);
+	sd = elf_getdata(sc, NULL);
+	if (sd == NULL)
+		return -EINVAL;
+	sm = sd->d_buf;
+
+	n = re_sz / sizeof(re[0]);
+	for (i = 0; i != n; i++) {
+
+		ofs = re[i].r_offset;
+
+		/* retrieve index in the symtable */
+		sym = ELF64_R_SYM(re[i].r_info);
+		if (sym * sizeof(sm[0]) >= sd->d_size)
+			return -EINVAL;
+
+		sn = elf_strptr(elf, eh->e_shstrndx, sm[sym].st_name);
+
+		rc = resolve_xsym(sn, ofs, ins, ins_sz, prm);
+		if (rc != 0) {
+			RTE_LOG(ERR, USER1,
+				"resolve_xsym(%s, %zu) error code: %d\n",
+				sn, ofs, rc);
+			return rc;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * helper function, find relocation information (if any)
+ * and update bpf code.
+ */
+static int
+elf_reloc_code(Elf *elf, Elf_Data *ed, size_t sidx,
+	const struct rte_bpf_prm *prm)
+{
+	Elf64_Rel *re;
+	Elf_Scn *sc;
+	const Elf64_Shdr *sh;
+	const Elf_Data *sd;
+	int32_t rc;
+
+	rc = 0;
+
+	/* walk through all sections */
+	for (sc = elf_nextscn(elf, NULL); sc != NULL && rc == 0;
+			sc = elf_nextscn(elf, sc)) {
+
+		sh = elf64_getshdr(sc);
+
+		/* relocation data for our code section */
+		if (sh->sh_type == SHT_REL && sh->sh_info == sidx) {
+			sd = elf_getdata(sc, NULL);
+			if (sd == NULL || sd->d_size == 0 ||
+					sd->d_size % sizeof(re[0]) != 0)
+				return -EINVAL;
+			rc = process_reloc(elf, sh->sh_link,
+				sd->d_buf, sd->d_size, ed->d_buf, ed->d_size,
+				prm);
+		}
+	}
+
+	return rc;
+}
+
+static struct rte_bpf *
+bpf_load(const struct rte_bpf_prm *prm)
+{
+	uint8_t *buf;
+	struct rte_bpf *bpf;
+	size_t sz, bsz, insz, xsz;
+
+	xsz =  prm->nb_xsym * sizeof(prm->xsym[0]);
+	insz = prm->nb_ins * sizeof(prm->ins[0]);
+	bsz = sizeof(bpf[0]);
+	sz = insz + xsz + bsz;
+
+	buf = mmap(NULL, sz, PROT_READ | PROT_WRITE,
+		MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (buf == MAP_FAILED)
+		return NULL;
+
+	bpf = (void *)buf;
+	bpf->sz = sz;
+
+	memcpy(&bpf->prm, prm, sizeof(bpf->prm));
+
+	memcpy(buf + bsz, prm->xsym, xsz);
+	memcpy(buf + bsz + xsz, prm->ins, insz);
+
+	bpf->prm.xsym = (void *)(buf + bsz);
+	bpf->prm.ins = (void *)(buf + bsz + xsz);
+
+	return bpf;
+}
+
+__rte_experimental struct rte_bpf *
+rte_bpf_load(const struct rte_bpf_prm *prm)
+{
+	struct rte_bpf *bpf;
+	int32_t rc;
+
+	if (prm == NULL || prm->ins == NULL) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	bpf = bpf_load(prm);
+	if (bpf == NULL) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+
+	rc = bpf_validate(bpf);
+	if (rc == 0) {
+		bpf_jit(bpf);
+		if (mprotect(bpf, bpf->sz, PROT_READ) != 0)
+			rc = -ENOMEM;
+	}
+
+	if (rc != 0) {
+		rte_bpf_destroy(bpf);
+		rte_errno = -rc;
+		return NULL;
+	}
+
+	return bpf;
+}
+
+static struct rte_bpf *
+bpf_load_elf(const struct rte_bpf_prm *prm, int32_t fd, const char *section)
+{
+	Elf *elf;
+	Elf_Data *sd;
+	size_t sidx;
+	int32_t rc;
+	struct rte_bpf *bpf;
+	struct rte_bpf_prm np;
+
+	elf_version(EV_CURRENT);
+	elf = elf_begin(fd, ELF_C_READ, NULL);
+
+	rc = find_elf_code(elf, section, &sd, &sidx);
+	if (rc == 0)
+		rc = elf_reloc_code(elf, sd, sidx, prm);
+
+	if (rc == 0) {
+		np = prm[0];
+		np.ins = sd->d_buf;
+		np.nb_ins = sd->d_size / sizeof(struct bpf_insn);
+		bpf = rte_bpf_load(&np);
+	} else {
+		bpf = NULL;
+		rte_errno = -rc;
+	}
+
+	elf_end(elf);
+	return bpf;
+}
+
+__rte_experimental struct rte_bpf *
+rte_bpf_elf_load(const struct rte_bpf_prm *prm, const char *fname,
+	const char *sname)
+{
+	int32_t fd, rc;
+	struct rte_bpf *bpf;
+
+	if (prm == NULL || fname == NULL || sname == NULL) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	fd = open(fname, O_RDONLY);
+	if (fd < 0) {
+		rc = errno;
+		RTE_LOG(ERR, USER1, "%s(%s) error code: %d(%s)\n",
+			__func__, fname, rc, strerror(rc));
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	bpf = bpf_load_elf(prm, fd, sname);
+	close(fd);
+
+	if (bpf == NULL) {
+		RTE_LOG(ERR, USER1,
+			"%s(fname=\"%s\", sname=\"%s\") failed, "
+			"error code: %d\n",
+			__func__, fname, sname, rte_errno);
+		return NULL;
+	}
+
+	RTE_LOG(INFO, USER1, "%s(fname=\"%s\", sname=\"%s\") "
+		"successfully creates %p(jit={.func=%p,.sz=%zu});\n",
+		__func__, fname, sname, bpf, bpf->jit.func, bpf->jit.sz);
+	return bpf;
+}
diff --git a/lib/librte_bpf/bpf_validate.c b/lib/librte_bpf/bpf_validate.c
new file mode 100644
index 000000000..7c1267cbd
--- /dev/null
+++ b/lib/librte_bpf/bpf_validate.c
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdint.h>
+#include <inttypes.h>
+
+#include <rte_common.h>
+#include <rte_eal.h>
+
+#include "bpf_impl.h"
+
+/*
+ * dummy one for now, need more work.
+ */
+int
+bpf_validate(struct rte_bpf *bpf)
+{
+	int32_t rc, ofs, stack_sz;
+	uint32_t i, op, dr;
+	const struct bpf_insn *ins;
+
+	rc = 0;
+	stack_sz = 0;
+	for (i = 0; i != bpf->prm.nb_ins; i++) {
+
+		ins = bpf->prm.ins + i;
+		op = ins->code;
+		dr = ins->dst_reg;
+		ofs = ins->off;
+
+		if ((BPF_CLASS(op) == BPF_STX || BPF_CLASS(op) == BPF_ST) &&
+				dr == BPF_REG_10) {
+			ofs -= sizeof(uint64_t);
+			stack_sz = RTE_MIN(ofs, stack_sz);
+		}
+	}
+
+	if (stack_sz != 0) {
+		stack_sz = -stack_sz;
+		if (stack_sz > MAX_BPF_STACK_SIZE)
+			rc = -ERANGE;
+		else
+			bpf->stack_sz = stack_sz;
+	}
+
+	if (rc != 0)
+		RTE_LOG(ERR, USER1, "%s(%p) failed, error code: %d;\n",
+			__func__, bpf, rc);
+	return rc;
+}
diff --git a/lib/librte_bpf/rte_bpf.h b/lib/librte_bpf/rte_bpf.h
new file mode 100644
index 000000000..efee35ad4
--- /dev/null
+++ b/lib/librte_bpf/rte_bpf.h
@@ -0,0 +1,158 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef _RTE_BPF_H_
+#define _RTE_BPF_H_
+
+#include <rte_common.h>
+#include <rte_mbuf.h>
+#include <linux/bpf.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Possible types for external symbols.
+ */
+enum rte_bpf_xtype {
+	RTE_BPF_XTYPE_FUNC, /**< function */
+	RTE_BPF_XTYPE_VAR, /**< variable */
+	RTE_BPF_XTYPE_NUM
+};
+
+/**
+ * Definition for external symbols available in the BPF program.
+ */
+struct rte_bpf_xsym {
+	const char *name;        /**< name */
+	enum rte_bpf_xtype type; /**< type */
+	union {
+		uint64_t (*func)(uint64_t, uint64_t, uint64_t,
+				uint64_t, uint64_t);
+		void *var;
+	}; /**< value */
+};
+
+/**
+ * Possible BPF program types.
+ */
+enum rte_bpf_prog_type {
+	RTE_BPF_PROG_TYPE_UNSPEC = BPF_PROG_TYPE_UNSPEC,
+	/**< input is a pointer to raw data */
+	RTE_BPF_PROG_TYPE_MBUF,
+	/**< input is a pointer to rte_mbuf */
+};
+
+/**
+ * Input parameters for loading eBPF code.
+ */
+struct rte_bpf_prm {
+	const struct bpf_insn *ins; /**< array of eBPF instructions */
+	uint32_t nb_ins;            /**< number of instructions in ins */
+	const struct rte_bpf_xsym *xsym;
+	/**< array of external symbols that eBPF code is allowed to reference */
+	uint32_t nb_xsym; /**< number of elements in xsym */
+	enum rte_bpf_prog_type prog_type; /**< eBPF program type */
+};
+
+/**
+ * Information about eBPF code compiled into the native ISA.
+ */
+struct rte_bpf_jit {
+	uint64_t (*func)(void *);
+	size_t sz;
+};
+
+struct rte_bpf;
+
+/**
+ * De-allocate all memory used by this eBPF execution context.
+ *
+ * @param bpf
+ *   BPF handle to destroy.
+ */
+void rte_bpf_destroy(struct rte_bpf *bpf);
+
+/**
+ * Create a new eBPF execution context and load given BPF code into it.
+ *
+ * @param prm
+ *  Parameters used to create and initialise the BPF execution context.
+ * @return
+ *   BPF handle that is used in future BPF operations,
+ *   or NULL on error, with error code set in rte_errno.
+ *   Possible rte_errno errors include:
+ *   - EINVAL - invalid parameter passed to function
+ *   - ENOMEM - can't reserve enough memory
+ */
+struct rte_bpf *rte_bpf_load(const struct rte_bpf_prm *prm);
+
+/**
+ * Create a new eBPF execution context and load BPF code from given ELF
+ * file into it.
+ *
+ * @param prm
+ *  Parameters used to create and initialise the BPF execution context.
+ * @param fname
+ *  Pathname for an ELF file.
+ * @param sname
+ *  Name of the executable section within the file to load.
+ * @return
+ *   BPF handle that is used in future BPF operations,
+ *   or NULL on error, with error code set in rte_errno.
+ *   Possible rte_errno errors include:
+ *   - EINVAL - invalid parameter passed to function
+ *   - ENOMEM - can't reserve enough memory
+ */
+struct rte_bpf *rte_bpf_elf_load(const struct rte_bpf_prm *prm,
+	const char *fname, const char *sname);
+
+/**
+ * Execute given BPF bytecode.
+ *
+ * @param bpf
+ *   handle for the BPF code to execute.
+ * @param ctx
+ *   pointer to input context.
+ * @return
+ *   BPF execution return value.
+ */
+uint64_t rte_bpf_exec(const struct rte_bpf *bpf, void *ctx);
+
+/**
+ * Execute given BPF bytecode over a set of input contexts.
+ *
+ * @param bpf
+ *   handle for the BPF code to execute.
+ * @param ctx
+ *   array of pointers to the input contexts.
+ * @param rc
+ *   array of return values (one per input).
+ * @param num
+ *   number of elements in ctx[] (and rc[]).
+ * @return
+ *   number of successfully processed inputs.
+ */
+uint32_t rte_bpf_exec_burst(const struct rte_bpf *bpf, void *ctx[],
+	uint64_t rc[], uint32_t num);
+
+/**
+ * Provide information about natively compiled code for given BPF handle.
+ *
+ * @param bpf
+ *   handle for the BPF code.
+ * @param jit
+ *   pointer to the rte_bpf_jit structure to be filled with related data.
+ * @return
+ *   - -EINVAL if the parameters are invalid.
+ *   - Zero if operation completed successfully.
+ */
+int rte_bpf_get_jit(const struct rte_bpf *bpf, struct rte_bpf_jit *jit);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BPF_H_ */
diff --git a/lib/librte_bpf/rte_bpf_version.map b/lib/librte_bpf/rte_bpf_version.map
new file mode 100644
index 000000000..ff65144df
--- /dev/null
+++ b/lib/librte_bpf/rte_bpf_version.map
@@ -0,0 +1,12 @@
+EXPERIMENTAL {
+	global:
+
+	rte_bpf_destroy;
+	rte_bpf_elf_load;
+	rte_bpf_exec;
+	rte_bpf_exec_burst;
+	rte_bpf_get_jit;
+	rte_bpf_load;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 3eb41d176..fb41c77d2 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -83,6 +83,8 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_POWER)          += -lrte_power
 _LDLIBS-$(CONFIG_RTE_LIBRTE_TIMER)          += -lrte_timer
 _LDLIBS-$(CONFIG_RTE_LIBRTE_EFD)            += -lrte_efd
 
+_LDLIBS-$(CONFIG_RTE_LIBRTE_BPF)            += -lrte_bpf -lelf
+
 _LDLIBS-y += --whole-archive
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CFGFILE)        += -lrte_cfgfile
-- 
2.13.6


* Re: [dpdk-dev] [PATCH v4] ethdev: return named opaque type instead of void pointer
  2018-03-09 15:45  0%         ` Ferruh Yigit
@ 2018-03-09 19:06  0%           ` Neil Horman
  0 siblings, 0 replies; 200+ results
From: Neil Horman @ 2018-03-09 19:06 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: John McNamara, Marko Kovacevic, Thomas Monjalon, dev

On Fri, Mar 09, 2018 at 03:45:49PM +0000, Ferruh Yigit wrote:
> On 3/9/2018 3:16 PM, Neil Horman wrote:
> > On Fri, Mar 09, 2018 at 01:00:35PM +0000, Ferruh Yigit wrote:
> >> On 3/9/2018 12:36 PM, Neil Horman wrote:
> >>> On Fri, Mar 09, 2018 at 11:25:31AM +0000, Ferruh Yigit wrote:
> >>>> "struct rte_eth_rxtx_callback" is defined as an internal data structure and
> >>>> used as named opaque type.
> >>>>
> >>>> So the functions that are adding callbacks can return objects in this
> >>>> type instead of void pointer.
> >>>>
> >>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> >>>> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
> >>>> ---
> >>>> v2:
> >>>> * keep using struct * in parameters, instead add callback functions
> >>>> return struct rte_eth_rxtx_callback pointer.
> >>>>
> >>>> v4:
> >>>> * Remove deprecation notice. LIBABIVER already increased in this release
> >>>> ---
> >>>>  doc/guides/rel_notes/deprecation.rst |  7 -------
> >>>>  lib/librte_ether/rte_ethdev.c        |  6 +++---
> >>>>  lib/librte_ether/rte_ethdev.h        | 13 ++++++++-----
> >>>>  3 files changed, 11 insertions(+), 15 deletions(-)
> >>>>
> >>> This doesn't quite make sense to me.  If rte_eth_rxtx_callback is defined as an
> >>> internal data structure, then it shouldn't be used as part of the prototype for
> >>> an exported function, as the structure will then no longer be an internal data
> >>> structure, but rather part of the public ABI.
> >>
> >> "struct rte_eth_rxtx_callback" is an internal data structure, and applications
> >> should not access elements of this structure.
> >>
> >> "struct rte_eth_rxtx_callback;" is defined in the public header, so applications
> >> can use it as opaque type.
> >>
> >> It is possible for both the "add" and "remove" APIs to use "void *" and have the
> >> API itself cast it. But the inconsistency was that the "add" APIs return "void *"
> >> while the "remove" APIs require a parameter of type "struct rte_eth_rxtx_callback *".
> >>
> >> While unifying the usage, "struct rte_eth_rxtx_callback *" was preferred over
> >> "void *", because a named opaque type documents the intention/usage better.
> >>
> >> Thanks,
> >> ferruh
> >>
> > I get what you're saying about rte_eth_rxtx_callback being an internal
> > structure (or its intent is to be an internal structure), but it doesn't seem to
> > hold up to the header file layout.  rte_eth_rxtx_callback is defined in
> > rte_ethdev_core.h which according to the makefile, is listed as a symlinked
> > file, and therefore available for external applications to include.  This
> > negates the intended opaque nature of the struct.  I think before you do this,
> > you want to rectify that.
> 
> The intention is to make "struct rte_eth_rxtx_callback" internal, but as you said it
> is available to applications. This is the same for all data structures in
> rte_ethdev_core.h
> 
Well...yes.  That's what I said

> Unfortunately it can't be truly internal because inline functions in the public
> header use them. And we can't change the inline functions because of performance
> concerns.
> 
I'm sorry, that's not OK with me.  Just declaring a data structure to be
internal-only without enforcing that is asking for applications to mangle
internal data, and there's no reason it can't be fixed (and done without
sacrificing performance).

> Since we can't make the structure truly internal, we can't really prevent
> applications from accessing the internals; this is the same if you use "void *".
> 
Just typedef a void pointer to some rte_ethdev_cb_handle_t type and pass that
back and forth instead.  That at least hides the fact that you are using a
non-opaque structure from user applications without some intentional casting.  You
can further lock the call down by declaring the handles const so that no one
tries to dereference or modify them without generating a warning.
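
A minimal sketch of the handle idea described above. All names here
(demo_cb_handle_t, demo_cb_fn_t, demo_add_cb, demo_remove_cb, demo_pass_all)
are hypothetical and chosen for illustration only; they are not ethdev
symbols, and the fixed-size table is just an assumption to keep the example
self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Public view: applications only ever see this const handle type. */
typedef const void *demo_cb_handle_t;	/* hypothetical name */

typedef uint16_t (*demo_cb_fn_t)(uint16_t nb_pkts, void *param);

/* Private view: the real layout would live in the library's .c file only. */
struct demo_cb {
	demo_cb_fn_t fn;
	void *param;
	int in_use;
};

#define DEMO_CB_MAX 4
static struct demo_cb demo_cb_tbl[DEMO_CB_MAX];

/* Example callback, present only to exercise the sketch. */
uint16_t
demo_pass_all(uint16_t nb_pkts, void *param)
{
	(void)param;
	return nb_pkts;
}

/* Register a callback; returns an opaque handle, or NULL on failure. */
demo_cb_handle_t
demo_add_cb(demo_cb_fn_t fn, void *param)
{
	size_t i;

	if (fn == NULL)
		return NULL;

	for (i = 0; i != DEMO_CB_MAX; i++) {
		if (demo_cb_tbl[i].in_use == 0) {
			demo_cb_tbl[i].fn = fn;
			demo_cb_tbl[i].param = param;
			demo_cb_tbl[i].in_use = 1;
			return &demo_cb_tbl[i];
		}
	}
	return NULL;
}

/* Remove by handle; only the library maps the handle back to its slot. */
int
demo_remove_cb(demo_cb_handle_t h)
{
	size_t i;

	for (i = 0; i != DEMO_CB_MAX; i++) {
		if (h == (const void *)&demo_cb_tbl[i] &&
				demo_cb_tbl[i].in_use != 0) {
			demo_cb_tbl[i].in_use = 0;
			return 0;
		}
	}
	return -1;
}
```

Because the handle is a pointer to const void, an application cannot
dereference it or write through it without an explicit cast, which is the
kind of intentional step mentioned above.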

Neil

> > 
> > Neil
> > 
> >>>
> >>> Neil
> >>>
> >>
> >>
> 
> 


* [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver
  2018-01-23 13:15  2% ` [dpdk-dev] [RFC v2 00/17] " Andrew Rybchenko
  2018-01-31 16:44  0%   ` Olivier Matz
@ 2018-03-10 15:39  3%   ` Andrew Rybchenko
  2018-03-10 15:39  7%     ` [dpdk-dev] [PATCH v1 1/9] mempool: add op to calculate memory size to be allocated Andrew Rybchenko
                       ` (5 more replies)
  1 sibling, 6 replies; 200+ results
From: Andrew Rybchenko @ 2018-03-10 15:39 UTC (permalink / raw)
  To: dev
  Cc: Olivier MATZ, Santosh Shukla, Jerin Jacob, Hemant Agrawal,
	Shreyansh Jain

The initial patch series [1] is split into two to simplify processing.
The second series relies on this one and will add bucket mempool driver
and related ops.

The patch series has generic enhancements suggested by Olivier.
Basically it adds driver callbacks to calculate the required memory size and
to populate objects using a provided memory area. This allows removing the
so-called capability flags used before to tell generic code how to
allocate and slice allocated memory into mempool objects.
The clean-up which removes get_capabilities and register_memory_area is
not strictly required, but I think it is the right thing to do.
Existing mempool drivers are updated.
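
To illustrate what a "populate" style callback does, here is a simplified
sketch that slices a provided memory area into fixed-size objects and reports
each one through a per-object callback. The names (demo_populate,
demo_obj_cb_t, demo_count_obj) and the parameter list are assumptions made
for this example; they are not the signatures introduced by the series.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-object notification: virtual address plus IO address. */
typedef void (*demo_obj_cb_t)(void *opaque, void *obj, uint64_t iova);

/* Example callback that just counts the objects it is given. */
void
demo_count_obj(void *opaque, void *obj, uint64_t iova)
{
	(void)obj;
	(void)iova;
	*(uint32_t *)opaque += 1;
}

/*
 * Slice [vaddr, vaddr + len) into objects of elt_sz bytes, at most max_objs.
 * Returns the number of objects populated, or -1 on invalid arguments.
 */
int
demo_populate(size_t elt_sz, uint32_t max_objs, void *vaddr, uint64_t iova,
	size_t len, demo_obj_cb_t obj_cb, void *opaque)
{
	size_t off;
	uint32_t n;

	if (elt_sz == 0 || vaddr == NULL || obj_cb == NULL)
		return -1;

	off = 0;
	for (n = 0; n != max_objs && len - off >= elt_sz; n++) {
		obj_cb(opaque, (char *)vaddr + off, iova + off);
		off += elt_sz;
	}

	return (int)n;
}
```

A driver with special layout needs (for example, objects grouped into
buckets) could place objects differently inside the same area, which is why
the slicing is left to a callback rather than described by flags.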

I've kept rte_mempool_populate_iova_tab() intact since it does not seem to
be directly related to the XMEM API functions.

It breaks the ABI since it changes rte_mempool_ops. It also removes
rte_mempool_ops_register_memory_area() and
rte_mempool_ops_get_capabilities() since the corresponding callbacks are
removed.

Internal global functions are not listed in the map file since they are not
part of the external API.

[1] http://dpdk.org/ml/archives/dev/2018-January/088698.html

RFCv1 -> RFCv2:
  - add driver ops to calculate required memory size and populate
    mempool objects, remove extra flags which were required before
    to control it
  - transition of octeontx and dpaa drivers to the new callbacks
  - change info API so that the API user can get the contiguous block
    size from the driver
  - remove get_capabilities (not required any more and may be
    substituted with more in info get API)
  - remove register_memory_area since it is substituted with
    populate callback which can do more
  - use SPDX tags
  - avoid all objects affinity to single lcore
  - fix bucket get_count
  - deprecate XMEM API
  - avoid introduction of a new function to flush cache
  - fix NO_CACHE_ALIGN case in bucket mempool

RFCv2 -> v1:
  - split the series in two
  - squash octeontx patches which implement calc_mem_size and populate
    callbacks into the patch which removes get_capabilities since it is
    the easiest way to untangle the tangle of tightly related library
    functions and flags advertised by the driver
  - consistently name default callbacks
  - move default callbacks to dedicated file
  - see detailed description in patches

Andrew Rybchenko (7):
  mempool: add op to calculate memory size to be allocated
  mempool: add op to populate objects using provided memory
  mempool: remove callback to get capabilities
  mempool: deprecate xmem functions
  mempool/octeontx: prepare to remove register memory area op
  mempool/dpaa: prepare to remove register memory area op
  mempool: remove callback to register memory area

Artem V. Andreev (2):
  mempool: ensure the mempool is initialized before populating
  mempool: support flushing the default cache of the mempool

 doc/guides/rel_notes/deprecation.rst            |  12 +-
 doc/guides/rel_notes/release_18_05.rst          |  32 ++-
 drivers/mempool/dpaa/dpaa_mempool.c             |  13 +-
 drivers/mempool/octeontx/rte_mempool_octeontx.c |  64 ++++--
 lib/librte_mempool/Makefile                     |   3 +-
 lib/librte_mempool/meson.build                  |   5 +-
 lib/librte_mempool/rte_mempool.c                | 159 +++++++--------
 lib/librte_mempool/rte_mempool.h                | 260 +++++++++++++++++-------
 lib/librte_mempool/rte_mempool_ops.c            |  37 ++--
 lib/librte_mempool/rte_mempool_ops_default.c    |  51 +++++
 lib/librte_mempool/rte_mempool_version.map      |  11 +-
 test/test/test_mempool.c                        |  31 ---
 12 files changed, 437 insertions(+), 241 deletions(-)
 create mode 100644 lib/librte_mempool/rte_mempool_ops_default.c

-- 
2.7.4


* [dpdk-dev] [PATCH v1 1/9] mempool: add op to calculate memory size to be allocated
  2018-03-10 15:39  3%   ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Andrew Rybchenko
@ 2018-03-10 15:39  7%     ` Andrew Rybchenko
  2018-03-11 12:51  0%       ` santosh
  2018-03-10 15:39  6%     ` [dpdk-dev] [PATCH v1 2/9] mempool: add op to populate objects using provided memory Andrew Rybchenko
                       ` (4 subsequent siblings)
  5 siblings, 1 reply; 200+ results
From: Andrew Rybchenko @ 2018-03-10 15:39 UTC (permalink / raw)
  To: dev; +Cc: Olivier MATZ

The size of the memory chunk required to populate mempool objects depends
on how objects are stored in memory. Different mempool drivers
may have different requirements, and a new operation allows calculating
the memory size in accordance with driver requirements and
advertising requirements on minimum memory chunk size and alignment
in a generic way.
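
As a rough illustration of such an operation, the sketch below computes a
memory size for a number of objects and reports a minimum chunk size and
alignment through output-only parameters. The function name
(demo_calc_mem_size), the DEMO_CACHE_LINE constant and the sizing policy are
assumptions for this example only; the default callback added by the patch
may behave differently.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64	/* assumed cache line size */

/*
 * Return the number of bytes needed for obj_num objects of elt_sz bytes.
 * pg_shift == 0 means the memory is treated as one contiguous area;
 * otherwise objects are packed into pages of (1 << pg_shift) bytes.
 * min_chunk_size and align are output only.
 */
size_t
demo_calc_mem_size(size_t elt_sz, uint32_t obj_num, uint32_t pg_shift,
	size_t *min_chunk_size, size_t *align)
{
	size_t pg_sz, obj_per_page, pages;

	/* a single object must always fit into one contiguous chunk */
	*min_chunk_size = elt_sz;

	if (elt_sz == 0 || obj_num == 0) {
		*align = DEMO_CACHE_LINE;
		return 0;
	}

	if (pg_shift == 0) {
		*align = DEMO_CACHE_LINE;
		return elt_sz * obj_num;
	}

	pg_sz = (size_t)1 << pg_shift;
	*align = pg_sz;

	if (elt_sz > pg_sz) {
		/* object larger than a page: round each one up to whole pages */
		pages = (elt_sz + pg_sz - 1) / pg_sz;
		return pages * pg_sz * obj_num;
	}

	/* objects never cross a page boundary, so count whole pages */
	obj_per_page = pg_sz / elt_sz;
	pages = (obj_num + obj_per_page - 1) / obj_per_page;
	return pages * pg_sz;
}
```

For instance, with 100-byte objects and 4 KiB pages, 40 objects fit per page,
so 100 objects need 3 pages (12288 bytes) under this policy.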

Bump ABI version since the patch breaks it.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
RFCv2 -> v1:
 - move default calc_mem_size callback to rte_mempool_ops_default.c
 - add ABI changes to release notes
 - name default callback consistently: rte_mempool_op_<callback>_default()
 - bump ABI version since it is the first patch which breaks ABI
 - describe default callback behaviour in detail
 - avoid introduction of internal function to cope with deprecation
   (keep it to deprecation patch)
 - move cache-line or page boundary chunk alignment to default callback
 - highlight that min_chunk_size and align parameters are output only

 doc/guides/rel_notes/deprecation.rst         |  3 +-
 doc/guides/rel_notes/release_18_05.rst       |  7 ++-
 lib/librte_mempool/Makefile                  |  3 +-
 lib/librte_mempool/meson.build               |  5 +-
 lib/librte_mempool/rte_mempool.c             | 43 +++++++--------
 lib/librte_mempool/rte_mempool.h             | 80 +++++++++++++++++++++++++++-
 lib/librte_mempool/rte_mempool_ops.c         | 18 +++++++
 lib/librte_mempool/rte_mempool_ops_default.c | 38 +++++++++++++
 lib/librte_mempool/rte_mempool_version.map   |  8 +++
 9 files changed, 177 insertions(+), 28 deletions(-)
 create mode 100644 lib/librte_mempool/rte_mempool_ops_default.c

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 6594585..e02d4ca 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -72,8 +72,7 @@ Deprecation Notices
 
   - removal of ``get_capabilities`` mempool ops and related flags.
   - substitute ``register_memory_area`` with ``populate`` ops.
-  - addition of new ops to customize required memory chunk calculation,
-    customize objects population and allocate contiguous
+  - addition of new ops to customize objects population and allocate contiguous
     block of objects if underlying driver supports it.
 
 * mbuf: The control mbuf API will be removed in v18.05. The impacted
diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
index f2525bb..59583ea 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -80,6 +80,11 @@ ABI Changes
    Also, make sure to start the actual text at the margin.
    =========================================================
 
+* **Changed rte_mempool_ops structure.**
+
+  A new callback ``calc_mem_size`` has been added to ``rte_mempool_ops``
+  to allow to customize required memory size calculation.
+
 
 Removed Items
 -------------
@@ -152,7 +157,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_latencystats.so.1
      librte_lpm.so.2
      librte_mbuf.so.3
-     librte_mempool.so.3
+   + librte_mempool.so.4
    + librte_meter.so.2
      librte_metrics.so.1
      librte_net.so.1
diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index 24e735a..072740f 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -11,11 +11,12 @@ LDLIBS += -lrte_eal -lrte_ring
 
 EXPORT_MAP := rte_mempool_version.map
 
-LIBABIVER := 3
+LIBABIVER := 4
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_ops.c
+SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_ops_default.c
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_MEMPOOL)-include := rte_mempool.h
 
diff --git a/lib/librte_mempool/meson.build b/lib/librte_mempool/meson.build
index 7a4f3da..9e3b527 100644
--- a/lib/librte_mempool/meson.build
+++ b/lib/librte_mempool/meson.build
@@ -1,7 +1,8 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017 Intel Corporation
 
-version = 2
-sources = files('rte_mempool.c', 'rte_mempool_ops.c')
+version = 4
+sources = files('rte_mempool.c', 'rte_mempool_ops.c',
+		'rte_mempool_ops_default.c')
 headers = files('rte_mempool.h')
 deps += ['ring']
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 54f7f4b..3bfb36e 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -544,39 +544,33 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	unsigned int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
-	size_t size, total_elt_sz, align, pg_sz, pg_shift;
+	ssize_t mem_size;
+	size_t align, pg_sz, pg_shift;
 	rte_iova_t iova;
 	unsigned mz_id, n;
-	unsigned int mp_flags;
 	int ret;
 
 	/* mempool must not be populated */
 	if (mp->nb_mem_chunks != 0)
 		return -EEXIST;
 
-	/* Get mempool capabilities */
-	mp_flags = 0;
-	ret = rte_mempool_ops_get_capabilities(mp, &mp_flags);
-	if ((ret < 0) && (ret != -ENOTSUP))
-		return ret;
-
-	/* update mempool capabilities */
-	mp->flags |= mp_flags;
-
 	if (rte_eal_has_hugepages()) {
 		pg_shift = 0; /* not needed, zone is physically contiguous */
 		pg_sz = 0;
-		align = RTE_CACHE_LINE_SIZE;
 	} else {
 		pg_sz = getpagesize();
 		pg_shift = rte_bsf32(pg_sz);
-		align = pg_sz;
 	}
 
-	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
 	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
-		size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift,
-						mp->flags);
+		size_t min_chunk_size;
+
+		mem_size = rte_mempool_ops_calc_mem_size(mp, n, pg_shift,
+				&min_chunk_size, &align);
+		if (mem_size < 0) {
+			ret = mem_size;
+			goto fail;
+		}
 
 		ret = snprintf(mz_name, sizeof(mz_name),
 			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
@@ -585,7 +579,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		mz = rte_memzone_reserve_aligned(mz_name, size,
+		mz = rte_memzone_reserve_aligned(mz_name, mem_size,
 			mp->socket_id, mz_flags, align);
 		/* not enough memory, retry with the biggest zone we have */
 		if (mz == NULL)
@@ -596,6 +590,12 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
+		if (mz->len < min_chunk_size) {
+			rte_memzone_free(mz);
+			ret = -ENOMEM;
+			goto fail;
+		}
+
 		if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
 			iova = RTE_BAD_IOVA;
 		else
@@ -628,13 +628,14 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 static size_t
 get_anon_size(const struct rte_mempool *mp)
 {
-	size_t size, total_elt_sz, pg_sz, pg_shift;
+	size_t size, pg_sz, pg_shift;
+	size_t min_chunk_size;
+	size_t align;
 
 	pg_sz = getpagesize();
 	pg_shift = rte_bsf32(pg_sz);
-	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
-	size = rte_mempool_xmem_size(mp->size, total_elt_sz, pg_shift,
-					mp->flags);
+	size = rte_mempool_ops_calc_mem_size(mp, mp->size, pg_shift,
+					     &min_chunk_size, &align);
 
 	return size;
 }
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 8b1b7f7..0151f6c 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -399,6 +399,56 @@ typedef int (*rte_mempool_get_capabilities_t)(const struct rte_mempool *mp,
 typedef int (*rte_mempool_ops_register_memory_area_t)
 (const struct rte_mempool *mp, char *vaddr, rte_iova_t iova, size_t len);
 
+/**
+ * Calculate memory size required to store given number of objects.
+ *
+ * @param[in] mp
+ *   Pointer to the memory pool.
+ * @param[in] obj_num
+ *   Number of objects.
+ * @param[in] pg_shift
+ *   LOG2 of the physical pages size. If set to 0, ignore page boundaries.
+ * @param[out] min_chunk_size
+ *   Location for minimum size of the memory chunk which may be used to
+ *   store memory pool objects.
+ * @param[out] align
+ *   Location with required memory chunk alignment.
+ * @return
+ *   Required memory size aligned at page boundary.
+ */
+typedef ssize_t (*rte_mempool_calc_mem_size_t)(const struct rte_mempool *mp,
+		uint32_t obj_num,  uint32_t pg_shift,
+		size_t *min_chunk_size, size_t *align);
+
+/**
+ * Default way to calculate memory size required to store given number of
+ * objects.
+ *
+ * If page boundaries may be ignored, it is just a product of total
+ * object size including header and trailer and number of objects.
+ * Otherwise, it is a number of pages required to store given number of
+ * objects without crossing page boundary.
+ *
+ * Note that if object size is bigger than page size, then it assumes
+ * that pages are grouped in subsets of physically continuous pages big
+ * enough to store at least one object.
+ *
+ * If mempool driver requires object addresses to be block size aligned
+ * (MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS), space for one extra element is
+ * reserved to be able to meet the requirement.
+ *
+ * Minimum size of memory chunk is either all required space, if
+ * capabilities say that whole memory area must be physically contiguous
+ * (MEMPOOL_F_CAPA_PHYS_CONTIG), or a maximum of the page size and total
+ * element size.
+ *
+ * Required memory chunk alignment is a maximum of page size and cache
+ * line size.
+ */
+ssize_t rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp,
+		uint32_t obj_num, uint32_t pg_shift,
+		size_t *min_chunk_size, size_t *align);
+
 /** Structure defining mempool operations structure */
 struct rte_mempool_ops {
 	char name[RTE_MEMPOOL_OPS_NAMESIZE]; /**< Name of mempool ops struct. */
@@ -415,6 +465,11 @@ struct rte_mempool_ops {
 	 * Notify new memory area to mempool
 	 */
 	rte_mempool_ops_register_memory_area_t register_memory_area;
+	/**
+	 * Optional callback to calculate memory size required to
+	 * store specified number of objects.
+	 */
+	rte_mempool_calc_mem_size_t calc_mem_size;
 } __rte_cache_aligned;
 
 #define RTE_MEMPOOL_MAX_OPS_IDX 16  /**< Max registered ops structs */
@@ -564,6 +619,29 @@ rte_mempool_ops_register_memory_area(const struct rte_mempool *mp,
 				char *vaddr, rte_iova_t iova, size_t len);
 
 /**
+ * @internal wrapper for mempool_ops calc_mem_size callback.
+ * API to calculate size of memory required to store specified number of
+ * objects.
+ *
+ * @param[in] mp
+ *   Pointer to the memory pool.
+ * @param[in] obj_num
+ *   Number of objects.
+ * @param[in] pg_shift
+ *   LOG2 of the physical pages size. If set to 0, ignore page boundaries.
+ * @param[out] min_chunk_size
+ *   Location for minimum size of the memory chunk which may be used to
+ *   store memory pool objects.
+ * @param[out] align
+ *   Location with required memory chunk alignment.
+ * @return
+ *   Required memory size aligned at page boundary.
+ */
+ssize_t rte_mempool_ops_calc_mem_size(const struct rte_mempool *mp,
+				      uint32_t obj_num, uint32_t pg_shift,
+				      size_t *min_chunk_size, size_t *align);
+
+/**
  * @internal wrapper for mempool_ops free callback.
  *
  * @param mp
@@ -1533,7 +1611,7 @@ uint32_t rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  * of objects. Assume that the memory buffer will be aligned at page
  * boundary.
  *
- * Note that if object size is bigger then page size, then it assumes
+ * Note that if object size is bigger than page size, then it assumes
  * that pages are grouped in subsets of physically continuous pages big
  * enough to store at least one object.
  *
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
index 0732255..26908cc 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -59,6 +59,7 @@ rte_mempool_register_ops(const struct rte_mempool_ops *h)
 	ops->get_count = h->get_count;
 	ops->get_capabilities = h->get_capabilities;
 	ops->register_memory_area = h->register_memory_area;
+	ops->calc_mem_size = h->calc_mem_size;
 
 	rte_spinlock_unlock(&rte_mempool_ops_table.sl);
 
@@ -123,6 +124,23 @@ rte_mempool_ops_register_memory_area(const struct rte_mempool *mp, char *vaddr,
 	return ops->register_memory_area(mp, vaddr, iova, len);
 }
 
+/* wrapper to calculate memory size required to store objects */
+ssize_t
+rte_mempool_ops_calc_mem_size(const struct rte_mempool *mp,
+				uint32_t obj_num, uint32_t pg_shift,
+				size_t *min_chunk_size, size_t *align)
+{
+	struct rte_mempool_ops *ops;
+
+	ops = rte_mempool_get_ops(mp->ops_index);
+
+	if (ops->calc_mem_size == NULL)
+		return rte_mempool_op_calc_mem_size_default(mp, obj_num,
+				pg_shift, min_chunk_size, align);
+
+	return ops->calc_mem_size(mp, obj_num, pg_shift, min_chunk_size, align);
+}
+
 /* sets mempool ops previously registered by rte_mempool_register_ops. */
 int
 rte_mempool_set_ops_byname(struct rte_mempool *mp, const char *name,
diff --git a/lib/librte_mempool/rte_mempool_ops_default.c b/lib/librte_mempool/rte_mempool_ops_default.c
new file mode 100644
index 0000000..57fe79b
--- /dev/null
+++ b/lib/librte_mempool/rte_mempool_ops_default.c
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2016 Intel Corporation.
+ * Copyright(c) 2016 6WIND S.A.
+ * Copyright(c) 2018 Solarflare Communications Inc.
+ */
+
+#include <rte_mempool.h>
+
+ssize_t
+rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp,
+				     uint32_t obj_num, uint32_t pg_shift,
+				     size_t *min_chunk_size, size_t *align)
+{
+	unsigned int mp_flags;
+	int ret;
+	size_t total_elt_sz;
+	size_t mem_size;
+
+	/* Get mempool capabilities */
+	mp_flags = 0;
+	ret = rte_mempool_ops_get_capabilities(mp, &mp_flags);
+	if ((ret < 0) && (ret != -ENOTSUP))
+		return ret;
+
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+
+	mem_size = rte_mempool_xmem_size(obj_num, total_elt_sz, pg_shift,
+					 mp->flags | mp_flags);
+
+	if (mp_flags & MEMPOOL_F_CAPA_PHYS_CONTIG)
+		*min_chunk_size = mem_size;
+	else
+		*min_chunk_size = RTE_MAX((size_t)1 << pg_shift, total_elt_sz);
+
+	*align = RTE_MAX((size_t)RTE_CACHE_LINE_SIZE, (size_t)1 << pg_shift);
+
+	return mem_size;
+}
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 62b76f9..e2a054b 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -51,3 +51,11 @@ DPDK_17.11 {
 	rte_mempool_populate_iova_tab;
 
 } DPDK_16.07;
+
+DPDK_18.05 {
+	global:
+
+	rte_mempool_op_calc_mem_size_default;
+
+} DPDK_17.11;
+
-- 
2.7.4

^ permalink raw reply	[relevance 7%]
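
For readers following the thread, the sizing rules described in the
doxygen comment of rte_mempool_op_calc_mem_size_default() can be
sketched outside of DPDK as below. This is a minimal illustration, not
the code from the patch: the sketch_* names and the 64-byte cache line
are assumptions, the page arithmetic mirrors what rte_mempool_xmem_size()
is understood to do, and the capability-flag branch
(MEMPOOL_F_CAPA_PHYS_CONTIG) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>	/* ssize_t */

/* Assumed cache line size; DPDK takes RTE_CACHE_LINE_SIZE from the build. */
#define SKETCH_CACHE_LINE_SIZE 64

static size_t
sketch_max(size_t a, size_t b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical stand-in for the default calc_mem_size callback.
 * total_elt_sz is header + element + trailer size of one object.
 * pg_shift == 0 means page boundaries are ignored.
 */
static ssize_t
sketch_calc_mem_size(size_t total_elt_sz, uint32_t obj_num, uint32_t pg_shift,
		     size_t *min_chunk_size, size_t *align)
{
	size_t pg_sz = (size_t)1 << pg_shift;
	size_t mem_size;

	assert(min_chunk_size != NULL && align != NULL);

	if (total_elt_sz == 0) {
		mem_size = 0;
	} else if (pg_shift == 0) {
		/* no page constraints: plain product of size and count */
		mem_size = total_elt_sz * obj_num;
	} else if (total_elt_sz > pg_sz) {
		/* object larger than a page: round each one up to pages */
		size_t pages_per_obj = (total_elt_sz + pg_sz - 1) / pg_sz;

		mem_size = pages_per_obj * pg_sz * obj_num;
	} else {
		/* objects must not cross a page boundary */
		size_t obj_per_page = pg_sz / total_elt_sz;
		size_t pg_num = (obj_num + obj_per_page - 1) / obj_per_page;

		mem_size = pg_num << pg_shift;
	}

	/* outputs only: smallest usable chunk and its alignment */
	*min_chunk_size = sketch_max(pg_sz, total_elt_sz);
	*align = sketch_max(SKETCH_CACHE_LINE_SIZE, pg_sz);

	return (ssize_t)mem_size;
}
```

With 100-byte objects and 4 KiB pages (pg_shift = 12), 40 objects fit
per page, so 100 objects need 3 pages. With pg_shift = 0 the result is
simply the object size multiplied by the object count, and the
alignment falls back to the cache line size.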

* [dpdk-dev] [PATCH v1 2/9] mempool: add op to populate objects using provided memory
  2018-03-10 15:39  3%   ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Andrew Rybchenko
  2018-03-10 15:39  7%     ` [dpdk-dev] [PATCH v1 1/9] mempool: add op to calculate memory size to be allocated Andrew Rybchenko
@ 2018-03-10 15:39  6%     ` Andrew Rybchenko
  2018-03-10 15:39  6%     ` [dpdk-dev] [PATCH v1 3/9] mempool: remove callback to get capabilities Andrew Rybchenko
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2018-03-10 15:39 UTC (permalink / raw)
  To: dev; +Cc: Olivier MATZ

The callback allows customizing how objects are stored in the
memory chunk. A default implementation of the callback, which simply
places objects one after another, is available.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
RFCv2 -> v1:
 - advertise ABI changes in release notes
 - use consistent name for default callback:
   rte_mempool_op_<callback>_default()
 - add opaque data pointer to populated object callback
 - move default callback to dedicated file

 doc/guides/rel_notes/deprecation.rst         |  2 +-
 doc/guides/rel_notes/release_18_05.rst       |  2 +
 lib/librte_mempool/rte_mempool.c             | 23 +++----
 lib/librte_mempool/rte_mempool.h             | 90 ++++++++++++++++++++++++++++
 lib/librte_mempool/rte_mempool_ops.c         | 21 +++++++
 lib/librte_mempool/rte_mempool_ops_default.c | 24 ++++++++
 lib/librte_mempool/rte_mempool_version.map   |  1 +
 7 files changed, 148 insertions(+), 15 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index e02d4ca..c06fc67 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -72,7 +72,7 @@ Deprecation Notices
 
   - removal of ``get_capabilities`` mempool ops and related flags.
   - substitute ``register_memory_area`` with ``populate`` ops.
-  - addition of new ops to customize objects population and allocate contiguous
+  - addition of new op to allocate contiguous
     block of objects if underlying driver supports it.
 
 * mbuf: The control mbuf API will be removed in v18.05. The impacted
diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
index 59583ea..abaefe5 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -84,6 +84,8 @@ ABI Changes
 
   A new callback ``calc_mem_size`` has been added to ``rte_mempool_ops``
   to allow to customize required memory size calculation.
+  A new callback ``populate`` has been added to ``rte_mempool_ops``
+  to allow to customize objects population.
 
 
 Removed Items
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 3bfb36e..ed0e982 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -99,7 +99,8 @@ static unsigned optimize_object_size(unsigned obj_size)
 }
 
 static void
-mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
+mempool_add_elem(struct rte_mempool *mp, __rte_unused void *opaque,
+		 void *obj, rte_iova_t iova)
 {
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
@@ -116,9 +117,6 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
 	tlr = __mempool_get_trailer(obj);
 	tlr->cookie = RTE_MEMPOOL_TRAILER_COOKIE;
 #endif
-
-	/* enqueue in ring */
-	rte_mempool_ops_enqueue_bulk(mp, &obj, 1);
 }
 
 /* call obj_cb() for each mempool element */
@@ -396,16 +394,13 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 	else
 		off = RTE_PTR_ALIGN_CEIL(vaddr, RTE_CACHE_LINE_SIZE) - vaddr;
 
-	while (off + total_elt_sz <= len && mp->populated_size < mp->size) {
-		off += mp->header_size;
-		if (iova == RTE_BAD_IOVA)
-			mempool_add_elem(mp, (char *)vaddr + off,
-				RTE_BAD_IOVA);
-		else
-			mempool_add_elem(mp, (char *)vaddr + off, iova + off);
-		off += mp->elt_size + mp->trailer_size;
-		i++;
-	}
+	if (off > len)
+		return -EINVAL;
+
+	i = rte_mempool_ops_populate(mp, mp->size - mp->populated_size,
+		(char *)vaddr + off,
+		(iova == RTE_BAD_IOVA) ? RTE_BAD_IOVA : (iova + off),
+		len - off, mempool_add_elem, NULL);
 
 	/* not enough room to store one object */
 	if (i == 0)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 0151f6c..49083bd 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -449,6 +449,63 @@ ssize_t rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp,
 		uint32_t obj_num, uint32_t pg_shift,
 		size_t *min_chunk_size, size_t *align);
 
+/**
+ * Function to be called for each populated object.
+ *
+ * @param[in] mp
+ *   A pointer to the mempool structure.
+ * @param[in] opaque
+ *   An opaque pointer passed to iterator.
+ * @param[in] vaddr
+ *   Object virtual address.
+ * @param[in] iova
+ *   Input/output virtual address of the object or RTE_BAD_IOVA.
+ */
+typedef void (rte_mempool_populate_obj_cb_t)(struct rte_mempool *mp,
+		void *opaque, void *vaddr, rte_iova_t iova);
+
+/**
+ * Populate memory pool objects using provided memory chunk.
+ *
+ * Populated objects should be enqueued to the pool, e.g. using
+ * rte_mempool_ops_enqueue_bulk().
+ *
+ * If the given IO address is unknown (iova = RTE_BAD_IOVA),
+ * the chunk doesn't need to be physically contiguous (only virtually),
+ * and allocated objects may span two pages.
+ *
+ * @param[in] mp
+ *   A pointer to the mempool structure.
+ * @param[in] max_objs
+ *   Maximum number of objects to be populated.
+ * @param[in] vaddr
+ *   The virtual address of memory that should be used to store objects.
+ * @param[in] iova
+ *   The IO address
+ * @param[in] len
+ *   The length of memory in bytes.
+ * @param[in] obj_cb
+ *   Callback function to be executed for each populated object.
+ * @param[in] obj_cb_arg
+ *   An opaque pointer passed to the callback function.
+ * @return
+ *   The number of objects added on success.
+ *   On error, no objects are populated and a negative errno is returned.
+ */
+typedef int (*rte_mempool_populate_t)(struct rte_mempool *mp,
+		unsigned int max_objs,
+		void *vaddr, rte_iova_t iova, size_t len,
+		rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg);
+
+/**
+ * Default way to populate memory pool objects using provided memory
+ * chunk: just slice objects one by one.
+ */
+int rte_mempool_op_populate_default(struct rte_mempool *mp,
+		unsigned int max_objs,
+		void *vaddr, rte_iova_t iova, size_t len,
+		rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg);
+
 /** Structure defining mempool operations structure */
 struct rte_mempool_ops {
 	char name[RTE_MEMPOOL_OPS_NAMESIZE]; /**< Name of mempool ops struct. */
@@ -470,6 +527,11 @@ struct rte_mempool_ops {
 	 * store specified number of objects.
 	 */
 	rte_mempool_calc_mem_size_t calc_mem_size;
+	/**
+	 * Optional callback to populate mempool objects using
+	 * provided memory chunk.
+	 */
+	rte_mempool_populate_t populate;
 } __rte_cache_aligned;
 
 #define RTE_MEMPOOL_MAX_OPS_IDX 16  /**< Max registered ops structs */
@@ -642,6 +704,34 @@ ssize_t rte_mempool_ops_calc_mem_size(const struct rte_mempool *mp,
 				      size_t *min_chunk_size, size_t *align);
 
 /**
+ * @internal wrapper for mempool_ops populate callback.
+ *
+ * Populate memory pool objects using provided memory chunk.
+ *
+ * @param[in] mp
+ *   A pointer to the mempool structure.
+ * @param[in] max_objs
+ *   Maximum number of objects to be populated.
+ * @param[in] vaddr
+ *   The virtual address of memory that should be used to store objects.
+ * @param[in] iova
+ *   The IO address
+ * @param[in] len
+ *   The length of memory in bytes.
+ * @param[in] obj_cb
+ *   Callback function to be executed for each populated object.
+ * @param[in] obj_cb_arg
+ *   An opaque pointer passed to the callback function.
+ * @return
+ *   The number of objects added on success.
+ *   On error, no objects are populated and a negative errno is returned.
+ */
+int rte_mempool_ops_populate(struct rte_mempool *mp, unsigned int max_objs,
+			     void *vaddr, rte_iova_t iova, size_t len,
+			     rte_mempool_populate_obj_cb_t *obj_cb,
+			     void *obj_cb_arg);
+
+/**
  * @internal wrapper for mempool_ops free callback.
  *
  * @param mp
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
index 26908cc..1a7f39f 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -60,6 +60,7 @@ rte_mempool_register_ops(const struct rte_mempool_ops *h)
 	ops->get_capabilities = h->get_capabilities;
 	ops->register_memory_area = h->register_memory_area;
 	ops->calc_mem_size = h->calc_mem_size;
+	ops->populate = h->populate;
 
 	rte_spinlock_unlock(&rte_mempool_ops_table.sl);
 
@@ -141,6 +142,26 @@ rte_mempool_ops_calc_mem_size(const struct rte_mempool *mp,
 	return ops->calc_mem_size(mp, obj_num, pg_shift, min_chunk_size, align);
 }
 
+/* wrapper to populate memory pool objects using provided memory chunk */
+int
+rte_mempool_ops_populate(struct rte_mempool *mp, unsigned int max_objs,
+				void *vaddr, rte_iova_t iova, size_t len,
+				rte_mempool_populate_obj_cb_t *obj_cb,
+				void *obj_cb_arg)
+{
+	struct rte_mempool_ops *ops;
+
+	ops = rte_mempool_get_ops(mp->ops_index);
+
+	if (ops->populate == NULL)
+		return rte_mempool_op_populate_default(mp, max_objs, vaddr,
+						       iova, len, obj_cb,
+						       obj_cb_arg);
+
+	return ops->populate(mp, max_objs, vaddr, iova, len, obj_cb,
+			     obj_cb_arg);
+}
+
 /* sets mempool ops previously registered by rte_mempool_register_ops. */
 int
 rte_mempool_set_ops_byname(struct rte_mempool *mp, const char *name,
diff --git a/lib/librte_mempool/rte_mempool_ops_default.c b/lib/librte_mempool/rte_mempool_ops_default.c
index 57fe79b..57295f7 100644
--- a/lib/librte_mempool/rte_mempool_ops_default.c
+++ b/lib/librte_mempool/rte_mempool_ops_default.c
@@ -36,3 +36,27 @@ rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp,
 
 	return mem_size;
 }
+
+int
+rte_mempool_op_populate_default(struct rte_mempool *mp, unsigned int max_objs,
+		void *vaddr, rte_iova_t iova, size_t len,
+		rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg)
+{
+	size_t total_elt_sz;
+	size_t off;
+	unsigned int i;
+	void *obj;
+
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+
+	for (off = 0, i = 0; off + total_elt_sz <= len && i < max_objs; i++) {
+		off += mp->header_size;
+		obj = (char *)vaddr + off;
+		obj_cb(mp, obj_cb_arg, obj,
+		       (iova == RTE_BAD_IOVA) ? RTE_BAD_IOVA : (iova + off));
+		rte_mempool_ops_enqueue_bulk(mp, &obj, 1);
+		off += mp->elt_size + mp->trailer_size;
+	}
+
+	return i;
+}
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index e2a054b..90e79ec 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -56,6 +56,7 @@ DPDK_18.05 {
 	global:
 
 	rte_mempool_op_calc_mem_size_default;
+	rte_mempool_op_populate_default;
 
 } DPDK_17.11;
 
-- 
2.7.4

^ permalink raw reply	[relevance 6%]
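
The "slice objects one by one" behaviour of the default populate
callback can be tried without DPDK using the sketch below. It is an
illustration under stated assumptions: struct sketch_pool, the
sketch_* functions and SKETCH_BAD_IOVA are hypothetical stand-ins for
struct rte_mempool, the real callbacks and RTE_BAD_IOVA, and the
enqueue step done by the real default callback is only marked with a
comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for RTE_BAD_IOVA: the IO address is unknown. */
#define SKETCH_BAD_IOVA UINT64_MAX

/* Hypothetical subset of the pool layout fields used by the loop. */
struct sketch_pool {
	size_t header_size;
	size_t elt_size;
	size_t trailer_size;
};

/* Per-object callback, shaped like rte_mempool_populate_obj_cb_t. */
typedef void (sketch_obj_cb_t)(struct sketch_pool *mp, void *opaque,
			       void *vaddr, uint64_t iova);

/* Example opaque state: counts objects and remembers the last one. */
struct sketch_record {
	unsigned int count;
	void *last_obj;
	uint64_t last_iova;
};

static void
sketch_record_cb(struct sketch_pool *mp, void *opaque, void *vaddr,
		 uint64_t iova)
{
	struct sketch_record *rec = opaque;

	(void)mp;
	rec->count++;
	rec->last_obj = vaddr;
	rec->last_iova = iova;
}

/* Slice [vaddr, vaddr + len) into at most max_objs objects. */
static int
sketch_populate(struct sketch_pool *mp, unsigned int max_objs,
		void *vaddr, uint64_t iova, size_t len,
		sketch_obj_cb_t *obj_cb, void *obj_cb_arg)
{
	size_t total_elt_sz = mp->header_size + mp->elt_size +
			      mp->trailer_size;
	size_t off;
	unsigned int i;

	assert(obj_cb != NULL);

	for (off = 0, i = 0; off + total_elt_sz <= len && i < max_objs; i++) {
		off += mp->header_size;	/* object starts after its header */
		obj_cb(mp, obj_cb_arg, (char *)vaddr + off,
		       iova == SKETCH_BAD_IOVA ? SKETCH_BAD_IOVA : iova + off);
		/* the real default callback enqueues the object here */
		off += mp->elt_size + mp->trailer_size;
	}

	return (int)i;
}
```

With an 8-byte header, 100-byte element and 4-byte trailer, a
1000-byte chunk holds 8 objects, since the ninth would end at byte
1008. The opaque pointer is what lets the caller keep state such as
struct sketch_record without globals, which is the reason the v1
changelog mentions adding it to the per-object callback.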

* [dpdk-dev] [PATCH v1 3/9] mempool: remove callback to get capabilities
  2018-03-10 15:39  3%   ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Andrew Rybchenko
  2018-03-10 15:39  7%     ` [dpdk-dev] [PATCH v1 1/9] mempool: add op to calculate memory size to be allocated Andrew Rybchenko
  2018-03-10 15:39  6%     ` [dpdk-dev] [PATCH v1 2/9] mempool: add op to populate objects using provided memory Andrew Rybchenko
@ 2018-03-10 15:39  6%     ` Andrew Rybchenko
  2018-03-10 15:39  5%     ` [dpdk-dev] [PATCH v1 4/9] mempool: deprecate xmem functions Andrew Rybchenko
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2018-03-10 15:39 UTC (permalink / raw)
  To: dev; +Cc: Olivier MATZ

The callback was introduced to let generic code know the octeontx
mempool driver requirements: use a single physically contiguous
memory chunk to store all objects and align object addresses to
the total object size. Now these requirements are met using the new
callbacks to calculate the required memory chunk size and to populate
objects using the provided memory chunk.

These capability flags are not used anywhere else.

Restricting capabilities to flags is not generic and is likely to
be insufficient to describe mempool driver features. If required
in the future, an API which returns structured information may be
added.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
RFCv2 -> v1:
 - squash mempool/octeontx patches to add calc_mem_size and populate
   callbacks to this one in order to avoid breakages in the middle of
   patchset
 - advertise API changes in release notes

 doc/guides/rel_notes/deprecation.rst            |  1 -
 doc/guides/rel_notes/release_18_05.rst          | 11 +++++
 drivers/mempool/octeontx/rte_mempool_octeontx.c | 59 +++++++++++++++++++++----
 lib/librte_mempool/rte_mempool.c                | 44 ++----------------
 lib/librte_mempool/rte_mempool.h                | 52 +---------------------
 lib/librte_mempool/rte_mempool_ops.c            | 14 ------
 lib/librte_mempool/rte_mempool_ops_default.c    | 15 +------
 lib/librte_mempool/rte_mempool_version.map      |  1 -
 8 files changed, 68 insertions(+), 129 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index c06fc67..4deed9a 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -70,7 +70,6 @@ Deprecation Notices
 
   The following changes are planned:
 
-  - removal of ``get_capabilities`` mempool ops and related flags.
   - substitute ``register_memory_area`` with ``populate`` ops.
   - addition of new op to allocate contiguous
     block of objects if underlying driver supports it.
diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
index abaefe5..c50f26c 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -66,6 +66,14 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =========================================================
 
+* **Removed mempool capability flags and related functions.**
+
+  Flags ``MEMPOOL_F_CAPA_PHYS_CONTIG`` and
+  ``MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS`` were used by octeontx mempool
+  driver to customize generic mempool library behaviour.
+  Now the new driver callbacks ``calc_mem_size`` and ``populate`` may be
+  used to achieve it without specific knowledge in the generic code.
+
 
 ABI Changes
 -----------
@@ -86,6 +94,9 @@ ABI Changes
   to allow to customize required memory size calculation.
   A new callback ``populate`` has been added to ``rte_mempool_ops``
   to allow to customize objects population.
+  Callback ``get_capabilities`` has been removed from ``rte_mempool_ops``
+  since its features are covered by ``calc_mem_size`` and ``populate``
+  callbacks.
 
 
 Removed Items
diff --git a/drivers/mempool/octeontx/rte_mempool_octeontx.c b/drivers/mempool/octeontx/rte_mempool_octeontx.c
index d143d05..f2c4f6a 100644
--- a/drivers/mempool/octeontx/rte_mempool_octeontx.c
+++ b/drivers/mempool/octeontx/rte_mempool_octeontx.c
@@ -126,14 +126,29 @@ octeontx_fpavf_get_count(const struct rte_mempool *mp)
 	return octeontx_fpa_bufpool_free_count(pool);
 }
 
-static int
-octeontx_fpavf_get_capabilities(const struct rte_mempool *mp,
-				unsigned int *flags)
+static ssize_t
+octeontx_fpavf_calc_mem_size(const struct rte_mempool *mp,
+			     uint32_t obj_num, uint32_t pg_shift,
+			     size_t *min_chunk_size, size_t *align)
 {
-	RTE_SET_USED(mp);
-	*flags |= (MEMPOOL_F_CAPA_PHYS_CONTIG |
-			MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS);
-	return 0;
+	ssize_t mem_size;
+
+	/*
+	 * Simply need space for one more object to be able to
+	 * fulfil alignment requirements.
+	 */
+	mem_size = rte_mempool_op_calc_mem_size_default(mp, obj_num + 1,
+							pg_shift,
+							min_chunk_size, align);
+	if (mem_size >= 0) {
+		/*
+		 * Memory area which contains objects must be physically
+		 * contiguous.
+		 */
+		*min_chunk_size = mem_size;
+	}
+
+	return mem_size;
 }
 
 static int
@@ -150,6 +165,33 @@ octeontx_fpavf_register_memory_area(const struct rte_mempool *mp,
 	return octeontx_fpavf_pool_set_range(pool_bar, len, vaddr, gpool);
 }
 
+static int
+octeontx_fpavf_populate(struct rte_mempool *mp, unsigned int max_objs,
+			void *vaddr, rte_iova_t iova, size_t len,
+			rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg)
+{
+	size_t total_elt_sz;
+	size_t off;
+
+	if (iova == RTE_BAD_IOVA)
+		return -EINVAL;
+
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+
+	/* align object start address to a multiple of total_elt_sz */
+	off = total_elt_sz - ((uintptr_t)vaddr % total_elt_sz);
+
+	if (len < off)
+		return -EINVAL;
+
+	vaddr = (char *)vaddr + off;
+	iova += off;
+	len -= off;
+
+	return rte_mempool_op_populate_default(mp, max_objs, vaddr, iova, len,
+					       obj_cb, obj_cb_arg);
+}
+
 static struct rte_mempool_ops octeontx_fpavf_ops = {
 	.name = "octeontx_fpavf",
 	.alloc = octeontx_fpavf_alloc,
@@ -157,8 +199,9 @@ static struct rte_mempool_ops octeontx_fpavf_ops = {
 	.enqueue = octeontx_fpavf_enqueue,
 	.dequeue = octeontx_fpavf_dequeue,
 	.get_count = octeontx_fpavf_get_count,
-	.get_capabilities = octeontx_fpavf_get_capabilities,
 	.register_memory_area = octeontx_fpavf_register_memory_area,
+	.calc_mem_size = octeontx_fpavf_calc_mem_size,
+	.populate = octeontx_fpavf_populate,
 };
 
 MEMPOOL_REGISTER_OPS(octeontx_fpavf_ops);
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index ed0e982..fdcda45 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -208,15 +208,9 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  */
 size_t
 rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift,
-		      unsigned int flags)
+		      __rte_unused unsigned int flags)
 {
 	size_t obj_per_page, pg_num, pg_sz;
-	unsigned int mask;
-
-	mask = MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS | MEMPOOL_F_CAPA_PHYS_CONTIG;
-	if ((flags & mask) == mask)
-		/* alignment need one additional object */
-		elt_num += 1;
 
 	if (total_elt_sz == 0)
 		return 0;
@@ -240,18 +234,12 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift,
 ssize_t
 rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t elt_num,
 	size_t total_elt_sz, const rte_iova_t iova[], uint32_t pg_num,
-	uint32_t pg_shift, unsigned int flags)
+	uint32_t pg_shift, __rte_unused unsigned int flags)
 {
 	uint32_t elt_cnt = 0;
 	rte_iova_t start, end;
 	uint32_t iova_idx;
 	size_t pg_sz = (size_t)1 << pg_shift;
-	unsigned int mask;
-
-	mask = MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS | MEMPOOL_F_CAPA_PHYS_CONTIG;
-	if ((flags & mask) == mask)
-		/* alignment need one additional object */
-		elt_num += 1;
 
 	/* if iova is NULL, assume contiguous memory */
 	if (iova == NULL) {
@@ -330,8 +318,6 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 	rte_iova_t iova, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
 	void *opaque)
 {
-	unsigned total_elt_sz;
-	unsigned int mp_capa_flags;
 	unsigned i = 0;
 	size_t off;
 	struct rte_mempool_memhdr *memhdr;
@@ -354,27 +340,6 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 	if (mp->populated_size >= mp->size)
 		return -ENOSPC;
 
-	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
-
-	/* Get mempool capabilities */
-	mp_capa_flags = 0;
-	ret = rte_mempool_ops_get_capabilities(mp, &mp_capa_flags);
-	if ((ret < 0) && (ret != -ENOTSUP))
-		return ret;
-
-	/* update mempool capabilities */
-	mp->flags |= mp_capa_flags;
-
-	/* Detect pool area has sufficient space for elements */
-	if (mp_capa_flags & MEMPOOL_F_CAPA_PHYS_CONTIG) {
-		if (len < total_elt_sz * mp->size) {
-			RTE_LOG(ERR, MEMPOOL,
-				"pool area %" PRIx64 " not enough\n",
-				(uint64_t)len);
-			return -ENOSPC;
-		}
-	}
-
 	memhdr = rte_zmalloc("MEMPOOL_MEMHDR", sizeof(*memhdr), 0);
 	if (memhdr == NULL)
 		return -ENOMEM;
@@ -386,10 +351,7 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 	memhdr->free_cb = free_cb;
 	memhdr->opaque = opaque;
 
-	if (mp_capa_flags & MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS)
-		/* align object start address to a multiple of total_elt_sz */
-		off = total_elt_sz - ((uintptr_t)vaddr % total_elt_sz);
-	else if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
+	if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
 		off = RTE_PTR_ALIGN_CEIL(vaddr, 8) - vaddr;
 	else
 		off = RTE_PTR_ALIGN_CEIL(vaddr, RTE_CACHE_LINE_SIZE) - vaddr;
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 49083bd..cd3b229 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -245,24 +245,6 @@ struct rte_mempool {
 #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
 #define MEMPOOL_F_POOL_CREATED   0x0010 /**< Internal: pool is created. */
 #define MEMPOOL_F_NO_PHYS_CONTIG 0x0020 /**< Don't need physically contiguous objs. */
-/**
- * This capability flag is advertised by a mempool handler, if the whole
- * memory area containing the objects must be physically contiguous.
- * Note: This flag should not be passed by application.
- */
-#define MEMPOOL_F_CAPA_PHYS_CONTIG 0x0040
-/**
- * This capability flag is advertised by a mempool handler. Used for a case
- * where mempool driver wants object start address(vaddr) aligned to block
- * size(/ total element size).
- *
- * Note:
- * - This flag should not be passed by application.
- *   Flag used for mempool driver only.
- * - Mempool driver must also set MEMPOOL_F_CAPA_PHYS_CONTIG flag along with
- *   MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS.
- */
-#define MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS 0x0080
 
 /**
  * @internal When debug is enabled, store some statistics.
@@ -388,12 +370,6 @@ typedef int (*rte_mempool_dequeue_t)(struct rte_mempool *mp,
 typedef unsigned (*rte_mempool_get_count)(const struct rte_mempool *mp);
 
 /**
- * Get the mempool capabilities.
- */
-typedef int (*rte_mempool_get_capabilities_t)(const struct rte_mempool *mp,
-		unsigned int *flags);
-
-/**
  * Notify new memory area to mempool.
  */
 typedef int (*rte_mempool_ops_register_memory_area_t)
@@ -433,13 +409,7 @@ typedef ssize_t (*rte_mempool_calc_mem_size_t)(const struct rte_mempool *mp,
  * that pages are grouped in subsets of physically continuous pages big
  * enough to store at least one object.
  *
- * If mempool driver requires object addresses to be block size aligned
- * (MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS), space for one extra element is
- * reserved to be able to meet the requirement.
- *
- * Minimum size of memory chunk is either all required space, if
- * capabilities say that whole memory area must be physically contiguous
- * (MEMPOOL_F_CAPA_PHYS_CONTIG), or a maximum of the page size and total
+ * Minimum size of memory chunk is a maximum of the page size and total
  * element size.
  *
  * Required memory chunk alignment is a maximum of page size and cache
@@ -515,10 +485,6 @@ struct rte_mempool_ops {
 	rte_mempool_dequeue_t dequeue;   /**< Dequeue an object. */
 	rte_mempool_get_count get_count; /**< Get qty of available objs. */
 	/**
-	 * Get the mempool capabilities
-	 */
-	rte_mempool_get_capabilities_t get_capabilities;
-	/**
 	 * Notify new memory area to mempool
 	 */
 	rte_mempool_ops_register_memory_area_t register_memory_area;
@@ -644,22 +610,6 @@ unsigned
 rte_mempool_ops_get_count(const struct rte_mempool *mp);
 
 /**
- * @internal wrapper for mempool_ops get_capabilities callback.
- *
- * @param mp [in]
- *   Pointer to the memory pool.
- * @param flags [out]
- *   Pointer to the mempool flags.
- * @return
- *   - 0: Success; The mempool driver has advertised his pool capabilities in
- *   flags param.
- *   - -ENOTSUP - doesn't support get_capabilities ops (valid case).
- *   - Otherwise, pool create fails.
- */
-int
-rte_mempool_ops_get_capabilities(const struct rte_mempool *mp,
-					unsigned int *flags);
-/**
  * @internal wrapper for mempool_ops register_memory_area callback.
  * API to notify the mempool handler when a new memory area is added to pool.
  *
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
index 1a7f39f..6ac669a 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -57,7 +57,6 @@ rte_mempool_register_ops(const struct rte_mempool_ops *h)
 	ops->enqueue = h->enqueue;
 	ops->dequeue = h->dequeue;
 	ops->get_count = h->get_count;
-	ops->get_capabilities = h->get_capabilities;
 	ops->register_memory_area = h->register_memory_area;
 	ops->calc_mem_size = h->calc_mem_size;
 	ops->populate = h->populate;
@@ -99,19 +98,6 @@ rte_mempool_ops_get_count(const struct rte_mempool *mp)
 	return ops->get_count(mp);
 }
 
-/* wrapper to get external mempool capabilities. */
-int
-rte_mempool_ops_get_capabilities(const struct rte_mempool *mp,
-					unsigned int *flags)
-{
-	struct rte_mempool_ops *ops;
-
-	ops = rte_mempool_get_ops(mp->ops_index);
-
-	RTE_FUNC_PTR_OR_ERR_RET(ops->get_capabilities, -ENOTSUP);
-	return ops->get_capabilities(mp, flags);
-}
-
 /* wrapper to notify new memory area to external mempool */
 int
 rte_mempool_ops_register_memory_area(const struct rte_mempool *mp, char *vaddr,
diff --git a/lib/librte_mempool/rte_mempool_ops_default.c b/lib/librte_mempool/rte_mempool_ops_default.c
index 57295f7..3defc15 100644
--- a/lib/librte_mempool/rte_mempool_ops_default.c
+++ b/lib/librte_mempool/rte_mempool_ops_default.c
@@ -11,26 +11,15 @@ rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp,
 				     uint32_t obj_num, uint32_t pg_shift,
 				     size_t *min_chunk_size, size_t *align)
 {
-	unsigned int mp_flags;
-	int ret;
 	size_t total_elt_sz;
 	size_t mem_size;
 
-	/* Get mempool capabilities */
-	mp_flags = 0;
-	ret = rte_mempool_ops_get_capabilities(mp, &mp_flags);
-	if ((ret < 0) && (ret != -ENOTSUP))
-		return ret;
-
 	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
 
 	mem_size = rte_mempool_xmem_size(obj_num, total_elt_sz, pg_shift,
-					 mp->flags | mp_flags);
+					 mp->flags);
 
-	if (mp_flags & MEMPOOL_F_CAPA_PHYS_CONTIG)
-		*min_chunk_size = mem_size;
-	else
-		*min_chunk_size = RTE_MAX((size_t)1 << pg_shift, total_elt_sz);
+	*min_chunk_size = RTE_MAX((size_t)1 << pg_shift, total_elt_sz);
 
 	*align = RTE_MAX((size_t)RTE_CACHE_LINE_SIZE, (size_t)1 << pg_shift);
 
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 90e79ec..42ca4df 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -45,7 +45,6 @@ DPDK_16.07 {
 DPDK_17.11 {
 	global:
 
-	rte_mempool_ops_get_capabilities;
 	rte_mempool_ops_register_memory_area;
 	rte_mempool_populate_iova;
 	rte_mempool_populate_iova_tab;
-- 
2.7.4


* [dpdk-dev] [PATCH v1 4/9] mempool: deprecate xmem functions
  2018-03-10 15:39  3%   ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Andrew Rybchenko
                       ` (2 preceding siblings ...)
  2018-03-10 15:39  6%     ` [dpdk-dev] [PATCH v1 3/9] mempool: remove callback to get capabilities Andrew Rybchenko
@ 2018-03-10 15:39  5%     ` Andrew Rybchenko
  2018-03-10 15:39  8%     ` [dpdk-dev] [PATCH v1 7/9] mempool: remove callback to register memory area Andrew Rybchenko
  2018-03-19 17:03  0%     ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Olivier Matz
  5 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2018-03-10 15:39 UTC (permalink / raw)
  To: dev; +Cc: Olivier MATZ

Move rte_mempool_xmem_size() code to internal helper function
since it is required in two places: deprecated rte_mempool_xmem_size()
and non-deprecated rte_mempool_op_calc_mem_size_default().
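
As a rough illustration of what the shared helper computes, here is a
standalone sketch (not the code from this patch). The names
calc_mem_size_sketch() and align_ceil() are made up for the example. The
behaviour is an assumption based on the helper's doc comment: a pg_shift of 0
ignores page boundaries, otherwise no object crosses a page boundary, and an
object bigger than a page is assumed to live in a group of contiguous pages.

```c
#include <stddef.h>
#include <stdint.h>

/* Round v up to a multiple of align (align must be a power of two). */
size_t
align_ceil(size_t v, size_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/* Hypothetical stand-in for the internal helper; illustration only. */
size_t
calc_mem_size_sketch(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
{
	size_t obj_per_page, pg_num, pg_sz;

	if (total_elt_sz == 0)
		return 0;

	/* pg_shift == 0: ignore page boundaries, pack objects back to back */
	if (pg_shift == 0)
		return total_elt_sz * elt_num;

	pg_sz = (size_t)1 << pg_shift;
	obj_per_page = pg_sz / total_elt_sz;

	/* object bigger than a page: each object takes whole pages */
	if (obj_per_page == 0)
		return align_ceil(total_elt_sz, pg_sz) * elt_num;

	/* otherwise count the pages needed when no object crosses a page */
	pg_num = (elt_num + obj_per_page - 1) / obj_per_page;
	return pg_num << pg_shift;
}
```

For example, 100 objects of 64 bytes on 4 KiB pages (pg_shift = 12) fit 64 per
page, so two pages (8192 bytes) are required.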

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
RFCv2 -> v1:
 - advertise deprecation in release notes
 - factor out default memory size calculation into non-deprecated
   internal function to avoid usage of deprecated function internally
 - remove test for deprecated functions to address build issue because
   of usage of deprecated functions (it is easy to allow usage of
   deprecated function in Makefile, but very complicated in meson)

 doc/guides/rel_notes/deprecation.rst         |  7 -------
 doc/guides/rel_notes/release_18_05.rst       | 10 +++++++++
 lib/librte_mempool/rte_mempool.c             | 19 ++++++++++++++---
 lib/librte_mempool/rte_mempool.h             | 25 ++++++++++++++++++++++
 lib/librte_mempool/rte_mempool_ops_default.c |  4 ++--
 test/test/test_mempool.c                     | 31 ----------------------------
 6 files changed, 53 insertions(+), 43 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 4deed9a..473330d 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -60,13 +60,6 @@ Deprecation Notices
   - ``rte_eal_mbuf_default_mempool_ops``
 
 * mempool: several API and ABI changes are planned in v18.05.
-  The following functions, introduced for Xen, which is not supported
-  anymore since v17.11, are hard to use, not used anywhere else in DPDK.
-  Therefore they will be deprecated in v18.05 and removed in v18.08:
-
-  - ``rte_mempool_xmem_create``
-  - ``rte_mempool_xmem_size``
-  - ``rte_mempool_xmem_usage``
 
   The following changes are planned:
 
diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
index c50f26c..0244f91 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -74,6 +74,16 @@ API Changes
   Now the new driver callbacks ``calc_mem_size`` and ``populate`` may be
   used to achieve it without specific knowledge in the generic code.
 
+* **Deprecated mempool xmem functions.**
+
+  The following functions, introduced for Xen, which is not supported
+  anymore since v17.11, are hard to use, not used anywhere else in DPDK.
+  Therefore they were deprecated in v18.05 and will be removed in v18.08:
+
+  - ``rte_mempool_xmem_create``
+  - ``rte_mempool_xmem_size``
+  - ``rte_mempool_xmem_usage``
+
 
 ABI Changes
 -----------
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index fdcda45..b57ba2a 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -204,11 +204,13 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 
 
 /*
- * Calculate maximum amount of memory required to store given number of objects.
+ * Internal function to calculate required memory chunk size shared
+ * by default implementation of the corresponding callback and
+ * deprecated external function.
  */
 size_t
-rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift,
-		      __rte_unused unsigned int flags)
+rte_mempool_calc_mem_size_helper(uint32_t elt_num, size_t total_elt_sz,
+				 uint32_t pg_shift)
 {
 	size_t obj_per_page, pg_num, pg_sz;
 
@@ -228,6 +230,17 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift,
 }
 
 /*
+ * Calculate maximum amount of memory required to store given number of objects.
+ */
+size_t
+rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift,
+		      __rte_unused unsigned int flags)
+{
+	return rte_mempool_calc_mem_size_helper(elt_num, total_elt_sz,
+						pg_shift);
+}
+
+/*
  * Calculate how much memory would be actually required with the
  * given memory footprint to store required number of elements.
  */
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index cd3b229..ebfc95c 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -420,6 +420,28 @@ ssize_t rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp,
 		size_t *min_chunk_size, size_t *align);
 
 /**
+ * @internal Helper function to calculate memory size required to store
+ * specified number of objects in assumption that the memory buffer will
+ * be aligned at page boundary.
+ *
+ * Note that if object size is bigger than page size, then it assumes
+ * that pages are grouped in subsets of physically continuous pages big
+ * enough to store at least one object.
+ *
+ * @param elt_num
+ *   Number of elements.
+ * @param total_elt_sz
+ *   The size of each element, including header and trailer, as returned
+ *   by rte_mempool_calc_obj_size().
+ * @param pg_shift
+ *   LOG2 of the physical pages size. If set to 0, ignore page boundaries.
+ * @return
+ *   Required memory size aligned at page boundary.
+ */
+size_t rte_mempool_calc_mem_size_helper(uint32_t elt_num, size_t total_elt_sz,
+		uint32_t pg_shift);
+
+/**
  * Function to be called for each populated object.
  *
  * @param[in] mp
@@ -905,6 +927,7 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
  *   The pointer to the new allocated mempool, on success. NULL on error
  *   with rte_errno set appropriately. See rte_mempool_create() for details.
  */
+__rte_deprecated
 struct rte_mempool *
 rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		unsigned cache_size, unsigned private_data_size,
@@ -1667,6 +1690,7 @@ uint32_t rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  * @return
  *   Required memory size aligned at page boundary.
  */
+__rte_deprecated
 size_t rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz,
 	uint32_t pg_shift, unsigned int flags);
 
@@ -1698,6 +1722,7 @@ size_t rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz,
  *   buffer is too small, return a negative value whose absolute value
  *   is the actual number of elements that can be stored in that buffer.
  */
+__rte_deprecated
 ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num,
 	size_t total_elt_sz, const rte_iova_t iova[], uint32_t pg_num,
 	uint32_t pg_shift, unsigned int flags);
diff --git a/lib/librte_mempool/rte_mempool_ops_default.c b/lib/librte_mempool/rte_mempool_ops_default.c
index 3defc15..fd63ca1 100644
--- a/lib/librte_mempool/rte_mempool_ops_default.c
+++ b/lib/librte_mempool/rte_mempool_ops_default.c
@@ -16,8 +16,8 @@ rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp,
 
 	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
 
-	mem_size = rte_mempool_xmem_size(obj_num, total_elt_sz, pg_shift,
-					 mp->flags);
+	mem_size = rte_mempool_calc_mem_size_helper(obj_num, total_elt_sz,
+						    pg_shift);
 
 	*min_chunk_size = RTE_MAX((size_t)1 << pg_shift, total_elt_sz);
 
diff --git a/test/test/test_mempool.c b/test/test/test_mempool.c
index 63f921e..8d29af2 100644
--- a/test/test/test_mempool.c
+++ b/test/test/test_mempool.c
@@ -444,34 +444,6 @@ test_mempool_same_name_twice_creation(void)
 	return 0;
 }
 
-/*
- * Basic test for mempool_xmem functions.
- */
-static int
-test_mempool_xmem_misc(void)
-{
-	uint32_t elt_num, total_size;
-	size_t sz;
-	ssize_t usz;
-
-	elt_num = MAX_KEEP;
-	total_size = rte_mempool_calc_obj_size(MEMPOOL_ELT_SIZE, 0, NULL);
-	sz = rte_mempool_xmem_size(elt_num, total_size, MEMPOOL_PG_SHIFT_MAX,
-					0);
-
-	usz = rte_mempool_xmem_usage(NULL, elt_num, total_size, 0, 1,
-		MEMPOOL_PG_SHIFT_MAX, 0);
-
-	if (sz != (size_t)usz)  {
-		printf("failure @ %s: rte_mempool_xmem_usage(%u, %u) "
-			"returns: %#zx, while expected: %#zx;\n",
-			__func__, elt_num, total_size, sz, (size_t)usz);
-		return -1;
-	}
-
-	return 0;
-}
-
 static void
 walk_cb(struct rte_mempool *mp, void *userdata __rte_unused)
 {
@@ -596,9 +568,6 @@ test_mempool(void)
 	if (test_mempool_same_name_twice_creation() < 0)
 		goto err;
 
-	if (test_mempool_xmem_misc() < 0)
-		goto err;
-
 	/* test the stack handler */
 	if (test_mempool_basic(mp_stack, 1) < 0)
 		goto err;
-- 
2.7.4


* [dpdk-dev] [PATCH v1 7/9] mempool: remove callback to register memory area
  2018-03-10 15:39  3%   ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Andrew Rybchenko
                       ` (3 preceding siblings ...)
  2018-03-10 15:39  5%     ` [dpdk-dev] [PATCH v1 4/9] mempool: deprecate xmem functions Andrew Rybchenko
@ 2018-03-10 15:39  8%     ` Andrew Rybchenko
  2018-03-19 17:03  0%     ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Olivier Matz
  5 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2018-03-10 15:39 UTC (permalink / raw)
  To: dev; +Cc: Olivier MATZ

The callback is not required any more since there is a new callback
to populate objects using provided memory area which provides
the same information.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
---
RFCv2 -> v1:
 - advertise ABI changes in release notes

 doc/guides/rel_notes/deprecation.rst       |  1 -
 doc/guides/rel_notes/release_18_05.rst     |  2 ++
 lib/librte_mempool/rte_mempool.c           |  5 -----
 lib/librte_mempool/rte_mempool.h           | 31 ------------------------------
 lib/librte_mempool/rte_mempool_ops.c       | 14 --------------
 lib/librte_mempool/rte_mempool_version.map |  1 -
 6 files changed, 2 insertions(+), 52 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 473330d..5301259 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -63,7 +63,6 @@ Deprecation Notices
 
   The following changes are planned:
 
-  - substitute ``register_memory_area`` with ``populate`` ops.
   - addition of new op to allocate contiguous
     block of objects if underlying driver supports it.
 
diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
index 0244f91..9d40db1 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -107,6 +107,8 @@ ABI Changes
   Callback ``get_capabilities`` has been removed from ``rte_mempool_ops``
   since its features are covered by ``calc_mem_size`` and ``populate``
   callbacks.
+  Callback ``register_memory_area`` has been removed from ``rte_mempool_ops``
+  since the new callback ``populate`` may be used instead of it.
 
 
 Removed Items
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index b57ba2a..844d907 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -344,11 +344,6 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 		mp->flags |= MEMPOOL_F_POOL_CREATED;
 	}
 
-	/* Notify memory area to mempool */
-	ret = rte_mempool_ops_register_memory_area(mp, vaddr, iova, len);
-	if (ret != -ENOTSUP && ret < 0)
-		return ret;
-
 	/* mempool is already populated */
 	if (mp->populated_size >= mp->size)
 		return -ENOSPC;
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index ebfc95c..5f63f86 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -370,12 +370,6 @@ typedef int (*rte_mempool_dequeue_t)(struct rte_mempool *mp,
 typedef unsigned (*rte_mempool_get_count)(const struct rte_mempool *mp);
 
 /**
- * Notify new memory area to mempool.
- */
-typedef int (*rte_mempool_ops_register_memory_area_t)
-(const struct rte_mempool *mp, char *vaddr, rte_iova_t iova, size_t len);
-
-/**
  * Calculate memory size required to store given number of objects.
  *
  * @param[in] mp
@@ -507,10 +501,6 @@ struct rte_mempool_ops {
 	rte_mempool_dequeue_t dequeue;   /**< Dequeue an object. */
 	rte_mempool_get_count get_count; /**< Get qty of available objs. */
 	/**
-	 * Notify new memory area to mempool
-	 */
-	rte_mempool_ops_register_memory_area_t register_memory_area;
-	/**
 	 * Optional callback to calculate memory size required to
 	 * store specified number of objects.
 	 */
@@ -632,27 +622,6 @@ unsigned
 rte_mempool_ops_get_count(const struct rte_mempool *mp);
 
 /**
- * @internal wrapper for mempool_ops register_memory_area callback.
- * API to notify the mempool handler when a new memory area is added to pool.
- *
- * @param mp
- *   Pointer to the memory pool.
- * @param vaddr
- *   Pointer to the buffer virtual address.
- * @param iova
- *   Pointer to the buffer IO address.
- * @param len
- *   Pool size.
- * @return
- *   - 0: Success;
- *   - -ENOTSUP - doesn't support register_memory_area ops (valid error case).
- *   - Otherwise, rte_mempool_populate_phys fails thus pool create fails.
- */
-int
-rte_mempool_ops_register_memory_area(const struct rte_mempool *mp,
-				char *vaddr, rte_iova_t iova, size_t len);
-
-/**
  * @internal wrapper for mempool_ops calc_mem_size callback.
  * API to calculate size of memory required to store specified number of
  * object.
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
index 6ac669a..ea9be1e 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -57,7 +57,6 @@ rte_mempool_register_ops(const struct rte_mempool_ops *h)
 	ops->enqueue = h->enqueue;
 	ops->dequeue = h->dequeue;
 	ops->get_count = h->get_count;
-	ops->register_memory_area = h->register_memory_area;
 	ops->calc_mem_size = h->calc_mem_size;
 	ops->populate = h->populate;
 
@@ -99,19 +98,6 @@ rte_mempool_ops_get_count(const struct rte_mempool *mp)
 }
 
 /* wrapper to notify new memory area to external mempool */
-int
-rte_mempool_ops_register_memory_area(const struct rte_mempool *mp, char *vaddr,
-					rte_iova_t iova, size_t len)
-{
-	struct rte_mempool_ops *ops;
-
-	ops = rte_mempool_get_ops(mp->ops_index);
-
-	RTE_FUNC_PTR_OR_ERR_RET(ops->register_memory_area, -ENOTSUP);
-	return ops->register_memory_area(mp, vaddr, iova, len);
-}
-
-/* wrapper to notify new memory area to external mempool */
 ssize_t
 rte_mempool_ops_calc_mem_size(const struct rte_mempool *mp,
 				uint32_t obj_num, uint32_t pg_shift,
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 42ca4df..f539a5a 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -45,7 +45,6 @@ DPDK_16.07 {
 DPDK_17.11 {
 	global:
 
-	rte_mempool_ops_register_memory_area;
 	rte_mempool_populate_iova;
 	rte_mempool_populate_iova_tab;
 
-- 
2.7.4


* Re: [dpdk-dev] [PATCH v1 1/9] mempool: add op to calculate memory size to be allocated
  2018-03-10 15:39  7%     ` [dpdk-dev] [PATCH v1 1/9] mempool: add op to calculate memory size to be allocated Andrew Rybchenko
@ 2018-03-11 12:51  0%       ` santosh
  2018-03-12  6:53  0%         ` Andrew Rybchenko
  0 siblings, 1 reply; 200+ results
From: santosh @ 2018-03-11 12:51 UTC (permalink / raw)
  To: Andrew Rybchenko, dev; +Cc: Olivier MATZ

Hi Andrew,


On Saturday 10 March 2018 09:09 PM, Andrew Rybchenko wrote:
> Size of memory chunk required to populate mempool objects depends
> on how objects are stored in the memory. Different mempool drivers
> may have different requirements, and a new operation makes it possible to
> calculate memory size in accordance with driver requirements and
> advertise requirements on minimum memory chunk size and alignment
> in a generic way.
>
> Bump ABI version since the patch breaks it.
>
> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
> ---
> RFCv2 -> v1:
>  - move default calc_mem_size callback to rte_mempool_ops_default.c
>  - add ABI changes to release notes
>  - name default callback consistently: rte_mempool_op_<callback>_default()
>  - bump ABI version since it is the first patch which breaks ABI
>  - describe default callback behaviour in details
>  - avoid introduction of internal function to cope with deprecation
>    (keep it to deprecation patch)
>  - move cache-line or page boundary chunk alignment to default callback
>  - highlight that min_chunk_size and align parameters are output only
>
[...]

> diff --git a/lib/librte_mempool/rte_mempool_ops_default.c b/lib/librte_mempool/rte_mempool_ops_default.c
> new file mode 100644
> index 0000000..57fe79b
> --- /dev/null
> +++ b/lib/librte_mempool/rte_mempool_ops_default.c
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2016 Intel Corporation.
> + * Copyright(c) 2016 6WIND S.A.
> + * Copyright(c) 2018 Solarflare Communications Inc.
> + */
> +
> +#include <rte_mempool.h>
> +
> +ssize_t
> +rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp,
> +				     uint32_t obj_num, uint32_t pg_shift,
> +				     size_t *min_chunk_size, size_t *align)
> +{
> +	unsigned int mp_flags;
> +	int ret;
> +	size_t total_elt_sz;
> +	size_t mem_size;
> +
> +	/* Get mempool capabilities */
> +	mp_flags = 0;
> +	ret = rte_mempool_ops_get_capabilities(mp, &mp_flags);
> +	if ((ret < 0) && (ret != -ENOTSUP))
> +		return ret;
> +
> +	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
> +
> +	mem_size = rte_mempool_xmem_size(obj_num, total_elt_sz, pg_shift,
> +					 mp->flags | mp_flags);
> +

Looks ok to me except a nit:
is the (mp->flags | mp_flags) expression meant to show that
mp_flags holds driver-specific flags such as BLK_ALIGN while
mp->flags holds application-specific flags? If not, why not
simply do:
mp->flags |= mp_flags.

Thanks.
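
For reference, a small sketch of the difference between the two forms. The
struct and flag names below are invented for illustration and are not DPDK
definitions. Note that in the quoted function mp is a pointer to const, so an
in-place "mp->flags |= mp_flags" would not compile there unless the const
qualifier were dropped.

```c
#include <stdint.h>

/* Hypothetical stand-ins, not the real rte_mempool definitions. */
struct pool_sketch {
	unsigned int flags;
};

#define SKETCH_F_APP 0x0001u /* flag passed by the application */
#define SKETCH_F_DRV 0x0040u /* capability reported by the driver */

/* Combine both flag sets for a single call; the pool is left untouched. */
unsigned int
effective_flags(const struct pool_sketch *mp, unsigned int drv_flags)
{
	return mp->flags | drv_flags;
}

/* Persist driver flags in the pool; this requires a non-const pointer. */
void
merge_flags(struct pool_sketch *mp, unsigned int drv_flags)
{
	mp->flags |= drv_flags;
}
```

The first form keeps application flags and driver capabilities apart after the
call returns; the second makes the driver flags visible to every later reader
of the pool's flags field.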


* Re: [dpdk-dev] [PATCH v2 1/2] eventdev: add device stop flush callback
  @ 2018-03-12  6:25  3%   ` Jerin Jacob
  2018-03-12 14:30  3%     ` Eads, Gage
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2018-03-12  6:25 UTC (permalink / raw)
  To: Gage Eads
  Cc: dev, harry.van.haaren, hemant.agrawal, bruce.richardson,
	santosh.shukla, nipun.gupta

-----Original Message-----
> When an event device is stopped, it drains all event queues. These events
> may contain pointers, so to prevent memory leaks eventdev now supports a
> user-provided flush callback that is called during the queue drain process.
> This callback is stored in process memory, so the callback must be
> registered by any process that may call rte_event_dev_stop().
> 
> This commit also clarifies the behavior of rte_event_dev_stop().
> 
> This follows this mailing list discussion:
> http://dpdk.org/ml/archives/dev/2018-January/087484.html
> 
> Signed-off-by: Gage Eads <gage.eads@intel.com>
> ---
> v2: allow a NULL callback pointer to unregister the callback
> 
>  lib/librte_eventdev/rte_eventdev.c           | 17 +++++++++
>  lib/librte_eventdev/rte_eventdev.h           | 55 +++++++++++++++++++++++++++-
>  lib/librte_eventdev/rte_eventdev_version.map |  6 +++
>  3 files changed, 76 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eventdev/rte_eventdev.c b/lib/librte_eventdev/rte_eventdev.c
> index 851a119..1aacb7b 100644
> --- a/lib/librte_eventdev/rte_eventdev.c
> +++ b/lib/librte_eventdev/rte_eventdev.c
> @@ -1123,6 +1123,23 @@ rte_event_dev_start(uint8_t dev_id)
>  	return 0;
>  }
>  
> +typedef void (*eventdev_stop_flush_t)(uint8_t dev_id, struct rte_event event,
> +		void *arg);
> +/**< Callback function called during rte_event_dev_stop(), invoked once per
> + * flushed event.
> + */
> +
>  #define RTE_EVENTDEV_NAME_MAX_LEN	(64)
>  /**< @internal Max length of name of event PMD */
>  
> @@ -1176,6 +1194,11 @@ struct rte_eventdev {
>  	event_dequeue_burst_t dequeue_burst;
>  	/**< Pointer to PMD dequeue burst function. */
>  
> +	eventdev_stop_flush_t dev_stop_flush;
> +	/**< Optional, user-provided event flush function */
> +	void *dev_stop_flush_arg;
> +	/**< User-provided argument for event flush function */
> +

I think we can move these additions to the internal rte_eventdev_data
structure. The reasons are:
1) Changes to "struct rte_eventdev" would call for an ABI change.
2) We can keep "struct rte_eventdev" for fast path functions only;
slow path functions can take the additional indirection.
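
A minimal sketch of the layout suggested above, assuming the callback and its
argument live in a per-device data structure reached through one extra pointer.
All names prefixed with sketch_ are hypothetical and are not the eventdev API.
The sketch also ignores multi-process concerns: if the data structure were in
memory shared between processes, a function pointer stored there might not be
valid in every process.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DEVS 4

struct sketch_event {
	void *ptr; /* e.g. a buffer that must be released on flush */
};

typedef void (*sketch_stop_flush_t)(uint8_t dev_id, struct sketch_event ev,
		void *arg);

/* Slow-path state: reached through one extra indirection. */
struct sketch_dev_data {
	sketch_stop_flush_t stop_flush;
	void *stop_flush_arg;
};

/* Fast-path structure: its layout does not change when data grows. */
struct sketch_dev {
	struct sketch_dev_data *data;
};

static struct sketch_dev_data sketch_data[SKETCH_MAX_DEVS];
static struct sketch_dev sketch_devs[SKETCH_MAX_DEVS];

static struct sketch_dev *
sketch_dev_get(uint8_t dev_id)
{
	if (dev_id >= SKETCH_MAX_DEVS)
		return NULL;
	sketch_devs[dev_id].data = &sketch_data[dev_id];
	return &sketch_devs[dev_id];
}

/* Register the callback; a NULL callback unregisters it. */
int
sketch_stop_flush_callback_register(uint8_t dev_id, sketch_stop_flush_t cb,
		void *userdata)
{
	struct sketch_dev *dev = sketch_dev_get(dev_id);

	if (dev == NULL)
		return -EINVAL;
	dev->data->stop_flush = cb;
	dev->data->stop_flush_arg = userdata;
	return 0;
}

/* What a stop routine would do for each event drained from a queue. */
void
sketch_flush_one(uint8_t dev_id, struct sketch_event ev)
{
	struct sketch_dev *dev = sketch_dev_get(dev_id);

	if (dev != NULL && dev->data->stop_flush != NULL)
		dev->data->stop_flush(dev_id, ev, dev->data->stop_flush_arg);
}

/* Example callback: counts flushed events through the user argument. */
void
sketch_count_cb(uint8_t dev_id, struct sketch_event ev, void *arg)
{
	(void)dev_id;
	(void)ev;
	*(int *)arg += 1;
}
```

With this arrangement only the data structure grows when a slow-path field is
added, at the cost of one more pointer dereference on the stop path.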

>  	struct rte_eventdev_data *data;
>  	/**< Pointer to device data */
>  	const struct rte_eventdev_ops *dev_ops;
> @@ -1822,6 +1845,34 @@ rte_event_dev_xstats_reset(uint8_t dev_id,
>   */
>  int rte_event_dev_selftest(uint8_t dev_id);
>  
> +/**
> + * Registers a callback function to be invoked during rte_event_dev_stop() for
> + * each flushed event. This function can be used to properly dispose of queued
> + * events, for example events containing memory pointers.
> + *
> + * The callback function is only registered for the calling process. The
> + * callback function must be registered in every process that can call
> + * rte_event_dev_stop().
> + *
> + * To unregister a callback, call this function with a NULL callback pointer.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param callback
> + *   Callback function invoked once per flushed event.
> + * @param userdata
> + *   Argument supplied to callback.
> + *
> + * @return
> + *  - 0 on success.
> + *  - -EINVAL if *dev_id* is invalid
> + *
> + * @see rte_event_dev_stop()
> + */
> +int
> +rte_event_dev_stop_flush_callback_register(uint8_t dev_id,
> +		eventdev_stop_flush_t callback, void *userdata);
> +
IMO, it would be better to place this function near rte_event_dev_stop().

Other than the above minor changes, it looks good to me.


* Re: [dpdk-dev] [PATCH v1 1/9] mempool: add op to calculate memory size to be allocated
  2018-03-11 12:51  0%       ` santosh
@ 2018-03-12  6:53  0%         ` Andrew Rybchenko
  0 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2018-03-12  6:53 UTC (permalink / raw)
  To: santosh, dev; +Cc: Olivier MATZ

On 03/11/2018 03:51 PM, santosh wrote:
> Hi Andrew,
>
>
> On Saturday 10 March 2018 09:09 PM, Andrew Rybchenko wrote:
>> Size of memory chunk required to populate mempool objects depends
>> on how objects are stored in the memory. Different mempool drivers
>> may have different requirements and a new operation allows to
>> calculate memory size in accordance with driver requirements and
>> advertise requirements on minimum memory chunk size and alignment
>> in a generic way.
>>
>> Bump ABI version since the patch breaks it.
>>
>> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
>> ---
>> RFCv2 -> v1:
>>   - move default calc_mem_size callback to rte_mempool_ops_default.c
>>   - add ABI changes to release notes
>>   - name default callback consistently: rte_mempool_op_<callback>_default()
>>   - bump ABI version since it is the first patch which breaks ABI
>>   - describe default callback behaviour in details
>>   - avoid introduction of internal function to cope with deprecation
>>     (keep it to deprecation patch)
>>   - move cache-line or page boundary chunk alignment to default callback
>>   - highlight that min_chunk_size and align parameters are output only
>>
> [...]
>
>> diff --git a/lib/librte_mempool/rte_mempool_ops_default.c b/lib/librte_mempool/rte_mempool_ops_default.c
>> new file mode 100644
>> index 0000000..57fe79b
>> --- /dev/null
>> +++ b/lib/librte_mempool/rte_mempool_ops_default.c
>> @@ -0,0 +1,38 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2016 Intel Corporation.
>> + * Copyright(c) 2016 6WIND S.A.
>> + * Copyright(c) 2018 Solarflare Communications Inc.
>> + */
>> +
>> +#include <rte_mempool.h>
>> +
>> +ssize_t
>> +rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp,
>> +				     uint32_t obj_num, uint32_t pg_shift,
>> +				     size_t *min_chunk_size, size_t *align)
>> +{
>> +	unsigned int mp_flags;
>> +	int ret;
>> +	size_t total_elt_sz;
>> +	size_t mem_size;
>> +
>> +	/* Get mempool capabilities */
>> +	mp_flags = 0;
>> +	ret = rte_mempool_ops_get_capabilities(mp, &mp_flags);
>> +	if ((ret < 0) && (ret != -ENOTSUP))
>> +		return ret;
>> +
>> +	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
>> +
>> +	mem_size = rte_mempool_xmem_size(obj_num, total_elt_sz, pg_shift,
>> +					 mp->flags | mp_flags);
>> +
> Looks ok to me except a nit:
> (mp->flags | mp_flags) style expression is to differentiate that
> mp_flags holds driver specific flag like BLK_ALIGN and mp->flags
> has appl specific flags.. is it so? If not then why not simply
> do like:
> mp->flags |= mp_flags.

In fact it does not matter much, since the code is removed in patch 3.
Here it is required just for consistency. Also, the mp argument is const,
so its members cannot be modified.
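
To illustrate the const point, here is a minimal sketch. The struct and
function names are hypothetical stand-ins, not the real rte_mempool
definitions; only the const-qualified pointer is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_mempool; only one field is modelled. */
struct sketch_pool {
	unsigned int flags; /* application-supplied flags */
};

/*
 * The pool is passed as a pointer to const, so "mp->flags |= drv_flags"
 * would be rejected by the compiler. The driver flags are OR-ed into a
 * temporary value instead and the pool itself is left untouched.
 */
static unsigned int
sketch_effective_flags(const struct sketch_pool *mp, unsigned int drv_flags)
{
	assert(mp != NULL);
	return mp->flags | drv_flags;
}
```

Calling sketch_effective_flags() with pool flags 0x1 and driver flags 0x4
yields 0x5 while the pool keeps 0x1.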

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v1 3/6] net/mlx5: add a function to rdma-core glue
  @ 2018-03-12  9:13  3%   ` Nélio Laranjeiro
  0 siblings, 0 replies; 200+ results
From: Nélio Laranjeiro @ 2018-03-12  9:13 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: wenzhuo.lu, jingjing.wu, adrien.mazarguil, olivier.matz, dev

On Fri, Mar 09, 2018 at 05:25:29PM -0800, Yongseok Koh wrote:
> mlx5dv_create_wq() is added.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_glue.c | 9 +++++++++
>  drivers/net/mlx5/mlx5_glue.h | 4 ++++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
> index 1c4396ada..e33fc76b5 100644
> --- a/drivers/net/mlx5/mlx5_glue.c
> +++ b/drivers/net/mlx5/mlx5_glue.c
> @@ -287,6 +287,14 @@ mlx5_glue_dv_create_cq(struct ibv_context *context,
>  	return mlx5dv_create_cq(context, cq_attr, mlx5_cq_attr);
>  }
>  
> +static struct ibv_wq *
> +mlx5_glue_dv_create_wq(struct ibv_context *context,
> +		       struct ibv_wq_init_attr *wq_attr,
> +		       struct mlx5dv_wq_init_attr *mlx5_wq_attr)
> +{
> +	return mlx5dv_create_wq(context, wq_attr, mlx5_wq_attr);
> +}
> +
>  static int
>  mlx5_glue_dv_query_device(struct ibv_context *ctx,
>  			  struct mlx5dv_context *attrs_out)
> @@ -347,6 +355,7 @@ const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
>  	.port_state_str = mlx5_glue_port_state_str,
>  	.cq_ex_to_cq = mlx5_glue_cq_ex_to_cq,
>  	.dv_create_cq = mlx5_glue_dv_create_cq,
> +	.dv_create_wq = mlx5_glue_dv_create_wq,
>  	.dv_query_device = mlx5_glue_dv_query_device,
>  	.dv_set_context_attr = mlx5_glue_dv_set_context_attr,
>  	.dv_init_obj = mlx5_glue_dv_init_obj,
> diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
> index b5efee3b6..21a713961 100644
> --- a/drivers/net/mlx5/mlx5_glue.h
> +++ b/drivers/net/mlx5/mlx5_glue.h
> @@ -100,6 +100,10 @@ struct mlx5_glue {
>  		(struct ibv_context *context,
>  		 struct ibv_cq_init_attr_ex *cq_attr,
>  		 struct mlx5dv_cq_init_attr *mlx5_cq_attr);
> +	struct ibv_wq *(*dv_create_wq)
> +		(struct ibv_context *context,
> +		 struct ibv_wq_init_attr *wq_attr,
> +		 struct mlx5dv_wq_init_attr *mlx5_wq_attr);
>  	int (*dv_query_device)(struct ibv_context *ctx_in,
>  			       struct mlx5dv_context *attrs_out);
>  	int (*dv_set_context_attr)(struct ibv_context *ibv_ctx,
> -- 
> 2.11.0
 
You missed updating the GLUE ABI version; it must be updated.

Regards,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2 1/2] eventdev: add device stop flush callback
  2018-03-12  6:25  3%   ` Jerin Jacob
@ 2018-03-12 14:30  3%     ` Eads, Gage
  2018-03-12 14:38  0%       ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Eads, Gage @ 2018-03-12 14:30 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: dev, Van Haaren, Harry, hemant.agrawal, Richardson, Bruce,
	santosh.shukla, nipun.gupta



> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Monday, March 12, 2018 1:25 AM
> To: Eads, Gage <gage.eads@intel.com>
> Cc: dev@dpdk.org; Van Haaren, Harry <harry.van.haaren@intel.com>;
> hemant.agrawal@nxp.com; Richardson, Bruce <bruce.richardson@intel.com>;
> santosh.shukla@caviumnetworks.com; nipun.gupta@nxp.com
> Subject: Re: [PATCH v2 1/2] eventdev: add device stop flush callback
> 
> -----Original Message-----
> > When an event device is stopped, it drains all event queues. These
> > events may contain pointers, so to prevent memory leaks eventdev now
> > supports a user-provided flush callback that is called during the queue drain
> process.
> > This callback is stored in process memory, so the callback must be
> > registered by any process that may call rte_event_dev_stop().
> >
> > This commit also clarifies the behavior of rte_event_dev_stop().
> >
> > This follows this mailing list discussion:
> > http://dpdk.org/ml/archives/dev/2018-January/087484.html
> >
> > Signed-off-by: Gage Eads <gage.eads@intel.com>
> > ---
> > v2: allow a NULL callback pointer to unregister the callback
> >
> >  lib/librte_eventdev/rte_eventdev.c           | 17 +++++++++
> >  lib/librte_eventdev/rte_eventdev.h           | 55
> +++++++++++++++++++++++++++-
> >  lib/librte_eventdev/rte_eventdev_version.map |  6 +++
> >  3 files changed, 76 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_eventdev/rte_eventdev.c
> > b/lib/librte_eventdev/rte_eventdev.c
> > index 851a119..1aacb7b 100644
> > --- a/lib/librte_eventdev/rte_eventdev.c
> > +++ b/lib/librte_eventdev/rte_eventdev.c
> > @@ -1123,6 +1123,23 @@ rte_event_dev_start(uint8_t dev_id)
> >  	return 0;
> >  }
> >
> > +typedef void (*eventdev_stop_flush_t)(uint8_t dev_id, struct rte_event event,
> > +		void *arg);
> > +/**< Callback function called during rte_event_dev_stop(), invoked
> > +once per
> > + * flushed event.
> > + */
> > +
> >  #define RTE_EVENTDEV_NAME_MAX_LEN	(64)
> >  /**< @internal Max length of name of event PMD */
> >
> > @@ -1176,6 +1194,11 @@ struct rte_eventdev {
> >  	event_dequeue_burst_t dequeue_burst;
> >  	/**< Pointer to PMD dequeue burst function. */
> >
> > +	eventdev_stop_flush_t dev_stop_flush;
> > +	/**< Optional, user-provided event flush function */
> > +	void *dev_stop_flush_arg;
> > +	/**< User-provided argument for event flush function */
> > +
> 
> I think, we can move this additions to the internal rte_eventdev_data structure.
> Reasons are
> 1) Changes to "struct rte_eventdev" would call for ABI change
> 2) We can keep "struct rte_eventdev" only for fast path functions, slow path
> functions can have additional redirection.
> 

Good points -- I hadn't considered the ABI impact of modifying rte_eventdev. rte_eventdev_data is in shared memory, though, so it's not multi-process friendly for function pointers. How about putting it in rte_eventdev_ops?

> >  	struct rte_eventdev_data *data;
> >  	/**< Pointer to device data */
> >  	const struct rte_eventdev_ops *dev_ops; @@ -1822,6 +1845,34 @@
> > rte_event_dev_xstats_reset(uint8_t dev_id,
> >   */
> >  int rte_event_dev_selftest(uint8_t dev_id);
> >
> > +/**
> > + * Registers a callback function to be invoked during
> > +rte_event_dev_stop() for
> > + * each flushed event. This function can be used to properly dispose
> > +of queued
> > + * events, for example events containing memory pointers.
> > + *
> > + * The callback function is only registered for the calling process.
> > +The
> > + * callback function must be registered in every process that can
> > +call
> > + * rte_event_dev_stop().
> > + *
> > + * To unregister a callback, call this function with a NULL callback pointer.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param callback
> > + *   Callback function invoked once per flushed event.
> > + * @param userdata
> > + *   Argument supplied to callback.
> > + *
> > + * @return
> > + *  - 0 on success.
> > + *  - -EINVAL if *dev_id* is invalid
> > + *
> > + * @see rte_event_dev_stop()
> > + */
> > +int
> > +rte_event_dev_stop_flush_callback_register(uint8_t dev_id,
> > +		eventdev_stop_flush_t callback, void *userdata);
> > +
> IMO, It would be better if we place this function near to rte_event_dev_stop().
> 
> Other than above minor changes, It looks good to me.

Ok, will address in v3.

Thanks,
Gage

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v2 1/2] eventdev: add device stop flush callback
  2018-03-12 14:30  3%     ` Eads, Gage
@ 2018-03-12 14:38  0%       ` Jerin Jacob
  0 siblings, 0 replies; 200+ results
From: Jerin Jacob @ 2018-03-12 14:38 UTC (permalink / raw)
  To: Eads, Gage
  Cc: dev, Van Haaren, Harry, hemant.agrawal, Richardson, Bruce,
	santosh.shukla, nipun.gupta

-----Original Message-----
> Date: Mon, 12 Mar 2018 14:30:49 +0000
> From: "Eads, Gage" <gage.eads@intel.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> CC: "dev@dpdk.org" <dev@dpdk.org>, "Van Haaren, Harry"
>  <harry.van.haaren@intel.com>, "hemant.agrawal@nxp.com"
>  <hemant.agrawal@nxp.com>, "Richardson, Bruce"
>  <bruce.richardson@intel.com>, "santosh.shukla@caviumnetworks.com"
>  <santosh.shukla@caviumnetworks.com>, "nipun.gupta@nxp.com"
>  <nipun.gupta@nxp.com>
> Subject: RE: [PATCH v2 1/2] eventdev: add device stop flush callback
> 
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Monday, March 12, 2018 1:25 AM
> > To: Eads, Gage <gage.eads@intel.com>
> > Cc: dev@dpdk.org; Van Haaren, Harry <harry.van.haaren@intel.com>;
> > hemant.agrawal@nxp.com; Richardson, Bruce <bruce.richardson@intel.com>;
> > santosh.shukla@caviumnetworks.com; nipun.gupta@nxp.com
> > Subject: Re: [PATCH v2 1/2] eventdev: add device stop flush callback
> > 
> > -----Original Message-----
> > > When an event device is stopped, it drains all event queues. These
> > > events may contain pointers, so to prevent memory leaks eventdev now
> > > supports a user-provided flush callback that is called during the queue drain
> > process.
> > > This callback is stored in process memory, so the callback must be
> > > registered by any process that may call rte_event_dev_stop().
> > >
> > > This commit also clarifies the behavior of rte_event_dev_stop().
> > >
> > > This follows this mailing list discussion:
> > > http://dpdk.org/ml/archives/dev/2018-January/087484.html
> > >
> > > Signed-off-by: Gage Eads <gage.eads@intel.com>
> > > ---
> > > v2: allow a NULL callback pointer to unregister the callback
> > >
> > >  lib/librte_eventdev/rte_eventdev.c           | 17 +++++++++
> > >  lib/librte_eventdev/rte_eventdev.h           | 55
> > +++++++++++++++++++++++++++-
> > >  lib/librte_eventdev/rte_eventdev_version.map |  6 +++
> > >  3 files changed, 76 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/librte_eventdev/rte_eventdev.c
> > > b/lib/librte_eventdev/rte_eventdev.c
> > > index 851a119..1aacb7b 100644
> > > --- a/lib/librte_eventdev/rte_eventdev.c
> > > +++ b/lib/librte_eventdev/rte_eventdev.c
> > > @@ -1123,6 +1123,23 @@ rte_event_dev_start(uint8_t dev_id)
> > >  	return 0;
> > >  }
> > >
> > > +typedef void (*eventdev_stop_flush_t)(uint8_t dev_id, struct rte_event event,
> > > +		void *arg);
> > > +/**< Callback function called during rte_event_dev_stop(), invoked
> > > +once per
> > > + * flushed event.
> > > + */
> > > +
> > >  #define RTE_EVENTDEV_NAME_MAX_LEN	(64)
> > >  /**< @internal Max length of name of event PMD */
> > >
> > > @@ -1176,6 +1194,11 @@ struct rte_eventdev {
> > >  	event_dequeue_burst_t dequeue_burst;
> > >  	/**< Pointer to PMD dequeue burst function. */
> > >
> > > +	eventdev_stop_flush_t dev_stop_flush;
> > > +	/**< Optional, user-provided event flush function */
> > > +	void *dev_stop_flush_arg;
> > > +	/**< User-provided argument for event flush function */
> > > +
> > 
> > I think, we can move this additions to the internal rte_eventdev_data structure.
> > Reasons are
> > 1) Changes to "struct rte_eventdev" would call for ABI change
> > 2) We can keep "struct rte_eventdev" only for fast path functions, slow path
> > functions can have additional redirection.
> > 
> 
> Good points -- I hadn't considered the ABI impact of modifying rte_eventdev. rte_eventdev_data is in shared memory, though, so it's not multi-process friendly for function pointers. How about putting it in rte_eventdev_ops?

Yes. It makes sense to move it to rte_eventdev_ops, but we need to take care
of updating those function pointers in the secondary process.
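
For reference, a rough sketch of why the callback has to live in
per-process storage. Everything below is hypothetical (names, the device
limit, and a plain uint64_t standing in for struct rte_event); it is not
the eventdev implementation, only the registration semantics described in
the patch: -EINVAL on a bad dev_id and a NULL callback to unregister.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DEVS 8 /* hypothetical limit for this sketch */

/* Hypothetical callback type; the patch passes a struct rte_event. */
typedef void (*sketch_flush_cb_t)(uint8_t dev_id, uint64_t event, void *arg);

struct sketch_flush_slot {
	sketch_flush_cb_t cb;
	void *arg;
};

/*
 * Static storage belongs to each process's own address space, so a
 * function pointer stored here is only meaningful in the process that
 * registered it. Every process that may stop the device must register.
 */
static struct sketch_flush_slot sketch_slots[SKETCH_MAX_DEVS];

/* Register a callback, or unregister it by passing cb == NULL. */
static int
sketch_flush_cb_register(uint8_t dev_id, sketch_flush_cb_t cb, void *arg)
{
	if (dev_id >= SKETCH_MAX_DEVS)
		return -EINVAL;
	sketch_slots[dev_id].cb = cb;
	sketch_slots[dev_id].arg = (cb != NULL) ? arg : NULL;
	return 0;
}

/* Invoke the callback once per flushed event; return how many calls were made. */
static unsigned int
sketch_flush(uint8_t dev_id, const uint64_t *events, unsigned int n)
{
	unsigned int i;

	assert(dev_id < SKETCH_MAX_DEVS);
	if (sketch_slots[dev_id].cb == NULL)
		return 0;
	for (i = 0; i < n; i++)
		sketch_slots[dev_id].cb(dev_id, events[i], sketch_slots[dev_id].arg);
	return n;
}

/* Example callback: counts flushed events through the user argument. */
static void
sketch_count_cb(uint8_t dev_id, uint64_t event, void *arg)
{
	(void)dev_id;
	(void)event;
	(*(unsigned int *)arg)++;
}
```

A secondary process that never calls the register function would see an
empty slot and its flushed events would not be passed to any callback.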

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [RFC] Switch device offload with DPDK
@ 2018-03-12 15:55  1% Adrien Mazarguil
  0 siblings, 0 replies; 200+ results
From: Adrien Mazarguil @ 2018-03-12 15:55 UTC (permalink / raw)
  To: dev
  Cc: Alex Rosenbaum, Ferruh Yigit, Thomas Monjalon, Shahaf Shuler,
	Doherty, Declan, Qi Zhang, Alejandro Lucero

Hi All,

(I tried to CC key people part of related discussions but likely missed
many, feel free to add them back)

The following RFC formalizes what has been discussed so far regarding switch
offload control using rte_flow through port/VF representors. It doesn't
bring new ideas besides the "transfer" attribute used internally to
implement port representors themselves and reuses some from past and current
threads or RFCs (see related anchors [2][3][4][5][7][8][9][10]).

It is meant to be converted without much hassle into proper DPDK
documentation, hence the verbosity; however, it all boils down to this:

- Port representors shall exist as additional ethdevs [2] created by PMDs
  according to devargs [4] discussed elsewhere; out of scope for this RFC.

- Besides enabling basic management of these ethdevs (e.g. MAC
  configuration), they should support rte_flow rules.

- Flow rules can target a different DPDK port ID [7] in order for matching
  traffic to be automatically forwarded without going through software.

- Forwarding may involve transformation such as VXLAN encap/decap [8][9].

---

===============================
Switch device offload with DPDK
===============================

Rationale
=========

Network adapters with multiple physical ports and/or SR-IOV capabilities
usually support the offload of traffic steering rules between their virtual
functions (VFs), physical functions (PFs) and ports.

As with standard Ethernet switches, this involves a combination of
automatic MAC learning and manual configuration. For most purposes it is
managed by the host system and fully transparent to users and applications.

On the other hand, applications typically found on hypervisors that process
layer 2 (L2) traffic (such as OVS) need to steer traffic themselves
according to their own criteria.

Without a standard software interface to manage traffic steering rules
between VFs, PFs and the various physical ports of a given device,
applications cannot take advantage of these offloads; software processing is
mandatory even for traffic which ends up re-injected into the device it
originates from.

This document describes how such steering rules can be configured through
the DPDK flow API (**rte_flow**), with emphasis on the SR-IOV use case
(PF/VF steering) using a single physical port for clarity; however, the same
logic applies to any number of ports without necessarily involving SR-IOV.

Port representors
=================

In many cases, traffic steering rules cannot be determined in advance;
applications usually have to process a bit of traffic in software before
thinking about offloading specific flows to hardware.

Applications therefore need the ability to receive and inject traffic to
various device endpoints (other VFs, PFs or physical ports) before
connecting them together. Device drivers must provide means to hook the
"other end" of these endpoints and to refer to them when configuring flow
rules.

This role is left to so-called "port representors" (also known as "VF
representors" in the specific context of VFs), which are to DPDK what the
Ethernet switch device driver model (**switchdev**) [1]_ is to Linux, and
which can be thought of as a software "patch panel" front-end for applications.

- DPDK port representors are implemented as additional virtual Ethernet
  device (**ethdev**) instances [2]_, spawned on an as-needed basis through
  configuration parameters [3]_ [4]_ by the driver of the underlying
  device.

- As virtual devices, they may be more limited than their physical
  counterparts, for instance by exposing only a subset of device
  configuration callbacks and/or by not necessarily having Rx/Tx capability.

- Among other things, they can be used to assign MAC addresses to the
  resource they represent.

- Applications can tell port representors apart by checking their device
  information structure which contains dedicated fields [5]_ describing
  parent/child device or group relationship (exact API remains to be
  defined).

.. [1] `Ethernet switch device driver model (switchdev)
       <https://www.kernel.org/doc/Documentation/networking/switchdev.txt>`_

.. [2] `[RFC 0/5] Port Representor for control and monitoring of VF devices
       <http://dpdk.org/ml/archives/dev/2017-December/084639.html>`_

.. [3] `[PATCH v4 0/5] lib: add Port Representors
       <http://dpdk.org/ml/archives/dev/2018-January/086598.html>`_

.. [4] `doc: document the new devargs syntax
       <http://dpdk.org/ml/archives/dev/2018-January/087416.html>`_

.. [5] `doc: announce ABI change to support VF representors
       <http://dpdk.org/ml/archives/dev/2018-February/090958.html>`_

Basic SR-IOV
============

"Basic" in the sense that it is not managed by applications, which
nonetheless expect traffic to flow between the various endpoints and the
outside as if everything was linked by an Ethernet hub.

The following diagram pictures a setup involving a device with one PF, two
VFs and one shared physical port::

    .-------------.                 .-------------. .-------------.
    | hypervisor  |                 |    VM 1     | |    VM 2     |
    | application |                 | application | | application |
    `--+----------'                 `----------+--' `--+----------'
       |                                       |       |
 .-----+-----.                                 |       |
 | port_id 3 |                                 |       |
 `-----+-----'                                 |       |
       |                                       |       |
     .-+--.                                .---+--. .--+---.
     | PF |                                | VF 1 | | VF 2 |
     `-+--'                                `---+--' `--+---'
       |                                       |       |
       `---------.     .-----------------------'       |
                 |     |     .-------------------------'
                 |     |     |
              .--+-----+-----+--.
              | interconnection |
              `--------+--------'
                       |
                  .----+-----.
                  | physical |
                  |  port 0  |
                  `----------'

- A DPDK application running on the hypervisor owns the PF device, which is
  arbitrarily assigned port index 3.

- Both VFs are assigned to VMs and used by unknown applications; they may be
  DPDK-based or anything else.

- Interconnection is not necessarily done through a true Ethernet switch and
  may not even exist as a separate entity. The role of this block is to show
  that something brings PF, VFs and physical ports together and enables
  communication between them, with a number of built-in restrictions.

Subsequent sections in this document describe means for DPDK applications
running on the hypervisor to freely assign specific flows between PF, VFs
and physical ports based on traffic properties, by managing this
interconnection.

Controlled SR-IOV
=================

Initialization
--------------

When a DPDK application gets assigned a PF device and is deliberately not
started in `basic SR-IOV`_ mode, any traffic coming from physical ports is
received by PF according to default rules, while VFs remain isolated::

    .-------------.                 .-------------. .-------------.
    | hypervisor  |                 |    VM 1     | |    VM 2     |
    | application |                 | application | | application |
    `--+----------'                 `----------+--' `--+----------'
       |                                       |       |
 .-----+-----.                                 |       |
 | port_id 3 |                                 |       |
 `-----+-----'                                 |       |
       |                                       |       |
     .-+--.                                .---+--. .--+---.
     | PF |                                | VF 1 | | VF 2 |
     `-+--'                                `------' `------'
       |
       `-----.
             |
          .--+----------------------.
          | managed interconnection |
          `------------+------------'
                       |
                  .----+-----.
                  | physical |
                  |  port 0  |
                  `----------'

In this mode, interconnection must be configured by the application to
enable VF communication, for instance by explicitly directing traffic with a
given destination MAC address to VF 1 and allowing traffic with the same
source MAC address to come out of it.

For this to work, hypervisor applications need a way to refer to either VF 1
or VF 2 in addition to the PF. This is addressed by `VF representors`_.

VF representors
---------------

VF representors are virtual but standard DPDK network devices (albeit with
limited capabilities) created by PMDs when managing a PF device.

Since they represent VF instances used by other applications, configuring
them (e.g. assigning a MAC address or setting up promiscuous mode) affects
interconnection accordingly. If supported, they may also be used as two-way
communication ports with VFs (assuming **switchdev** topology)::

    .-------------.                 .-------------. .-------------.
    | hypervisor  |                 |    VM 1     | |    VM 2     |
    | application |                 | application | | application |
    `--+---+---+--'                 `----------+--' `--+----------'
       |   |   |                               |       |
       |   |   `-------------------.           |       |
       |   `---------.             |           |       |
       |             |             |           |       |
 .-----+-----. .-----+-----. .-----+-----.     |       |
 | port_id 3 | | port_id 4 | | port_id 5 |     |       |
 `-----+-----' `-----+-----' `-----+-----'     |       |
       |             |             |           |       |
     .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
     | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
     `-+--'    `-----+-----' `-----+-----' `---+--' `--+---'
       |             |             |           |       |
       |             |   .---------'           |       |
       `-----.       |   |   .-----------------'       |
             |       |   |   |   .---------------------'
             |       |   |   |   |
          .--+-------+---+---+---+--.
          | managed interconnection |
          `------------+------------'
                       |
                  .----+-----.
                  | physical |
                  |  port 0  |
                  `----------'

- VF representors are assigned arbitrary port indices 4 and 5 in the
  hypervisor application and are respectively associated with VF 1 and VF 2.

- They can't be dissociated; even if VF 1 and VF 2 were not connected,
  representors could still be used for configuration.

- In this context, port index 3 can be thought of as a representor for physical
  port 0.

As previously described, the "interconnection" block represents a logical
concept. Interconnection occurs when hardware configuration enables traffic
flows from one place to another (e.g. physical port 0 to VF 1) according to
some criteria.

This is discussed in more detail in `traffic steering`_.

Traffic steering
----------------

In the following diagram, each meaningful traffic origin or endpoint as seen
by the hypervisor application is tagged with a unique letter from A to F::

    .-------------.                 .-------------. .-------------.
    | hypervisor  |                 |    VM 1     | |    VM 2     |
    | application |                 | application | | application |
    `--+---+---+--'                 `----------+--' `--+----------'
       |   |   |                               |       |
       |   |   `-------------------.           |       |
       |   `---------.             |           |       |
       |             |             |           |       |
 .----(A)----. .----(B)----. .----(C)----.     |       |
 | port_id 3 | | port_id 4 | | port_id 5 |     |       |
 `-----+-----' `-----+-----' `-----+-----'     |       |
       |             |             |           |       |
     .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
     | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
     `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
       |             |             |           |       |
       |             |   .---------'           |       |
       `-----.       |   |   .-----------------'       |
             |       |   |   |   .---------------------'
             |       |   |   |   |
          .--+-------+---+---+---+--.
          | managed interconnection |
          `------------+------------'
                       |
                  .---(F)----.
                  | physical |
                  |  port 0  |
                  `----------'

- **A**: PF device.
- **B**: port representor for VF 1.
- **C**: port representor for VF 2.
- **D**: VF 1 proper.
- **E**: VF 2 proper.
- **F**: physical port.

Although uncommon, some devices do not enforce a one to one mapping between
PF and physical ports. For instance, by default all ports of **mlx4**
adapters are available to all their PF/VF instances, in which case
additional ports appear next to **F** in the above diagram.

Assuming no interconnection is provided by default in this mode, setting up
a `basic SR-IOV`_ configuration involving physical port 0 could be broken
down as:

PF:

- **A to F**: let everything through.
- **F to A**: PF MAC as destination.

VF 1:

- **A to D**, **E to D** and **F to D**: VF 1 MAC as destination.
- **D to A**: VF 1 MAC as source and PF MAC as destination.
- **D to E**: VF 1 MAC as source and VF 2 MAC as destination.
- **D to F**: VF 1 MAC as source.

VF 2:

- **A to E**, **D to E** and **F to E**: VF 2 MAC as destination.
- **E to A**: VF 2 MAC as source and PF MAC as destination.
- **E to D**: VF 2 MAC as source and VF 1 MAC as destination.
- **E to F**: VF 2 MAC as source.
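
The rule list above can be sketched as a small predicate. This is only an
illustrative model under stated assumptions, not rte_flow code: the
endpoint and MAC enumerations and the steering_allows() helper are
hypothetical names, and MAC addresses are reduced to small integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Endpoints from the diagram that carry traffic in this setup. */
enum sketch_endpoint {
	EP_A, /* PF */
	EP_D, /* VF 1 */
	EP_E, /* VF 2 */
	EP_F  /* physical port */
};

/* Hypothetical MAC identifiers; real code would compare 6-byte addresses. */
enum sketch_mac { MAC_PF = 1, MAC_VF1, MAC_VF2, MAC_OTHER };

/* MAC owned by an endpoint, or 0 for the physical port, which owns none. */
static int
owned_mac(enum sketch_endpoint ep)
{
	switch (ep) {
	case EP_A: return MAC_PF;
	case EP_D: return MAC_VF1;
	case EP_E: return MAC_VF2;
	default:   return 0;
	}
}

/* Return true when the listed rules let a frame go from one endpoint to another. */
static bool
steering_allows(enum sketch_endpoint from, enum sketch_endpoint to,
		int smac, int dmac)
{
	assert(smac != 0 && dmac != 0);
	if (from == to)
		return false;
	/* A to F lets everything through; A to D/E needs the VF MAC as destination. */
	if (from == EP_A)
		return to == EP_F || dmac == owned_mac(to);
	/* F to A/D/E needs the target's own MAC as destination. */
	if (from == EP_F)
		return dmac == owned_mac(to);
	/* From a VF, the source MAC must always be the VF's own MAC. */
	if (smac != owned_mac(from))
		return false;
	/* VF to F has no destination constraint; VF to PF/VF needs the target MAC. */
	return to == EP_F || dmac == owned_mac(to);
}
```

For example, a frame from **D** to **E** is accepted only when its source
is the VF 1 MAC and its destination is the VF 2 MAC, while any frame from
**A** to **F** is accepted.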

Devices may additionally support advanced matching criteria such as
IPv4/IPv6 addresses or TCP/UDP ports.

The combination of matching criteria with target endpoints fits well with
**rte_flow** [6]_, which expresses flow rules as combinations of patterns
and actions.

Enhancing **rte_flow** with the ability to make flow rules match and target
these endpoints provides a standard interface to manage their
interconnection without introducing new concepts and a whole new API to
implement them. This is described in `flow API (rte_flow)`_.

.. [6] `Generic flow API (rte_flow)
       <http://dpdk.org/doc/guides/prog_guide/rte_flow.html>`_

Flow API (rte_flow)
===================

Extensions
----------

Compared to creating a brand new dedicated interface, **rte_flow** was
deemed flexible enough to manage representor traffic with only minor
extensions:

- Using physical ports, PF, VF or port representors as targets.

- Affecting traffic that is not necessarily addressed to the DPDK port ID a
  flow rule is associated with (e.g. forcing VF traffic redirection to PF).

For advanced uses:

- Rule-based packet counters.

- The ability to combine several identical actions for traffic duplication
  (e.g. VF representor in addition to a physical port).

- Dedicated actions for traffic encapsulation / decapsulation before
  reaching an endpoint.

The extensions described in the following sections follow up on Qi Zhang's
original RFC [7]_.

.. [7] `rte_flow extension for vSwitch acceleration
       <http://dpdk.org/ml/archives/dev/2017-December/084598.html>`_

Traffic direction
-----------------

From an application standpoint, "ingress" and "egress" flow rule attributes
apply to the DPDK port ID they are associated with. They select a traffic
direction for matching patterns, but have no impact on actions.

When matching traffic coming from or going to a different place than the
immediate port ID a flow rule is associated with, these attributes keep
their meaning while applying to the chosen origin, as highlighted by the
following diagram::

    .-------------.                 .-------------. .-------------.
    | hypervisor  |                 |    VM 1     | |    VM 2     |
    | application |                 | application | | application |
    `--+---+---+--'                 `----------+--' `--+----------'
       |   |   |                               |       |
       |   |   `-------------------.           |       |
       |   `---------.             |           |       |
       | ^           | ^           | ^         |       |
       | | ingress   | | ingress   | | ingress |       |
       | | egress    | | egress    | | egress  |       |
       | v           | v           | v         |       |
 .----(A)----. .----(B)----. .----(C)----.     |       |
 | port_id 3 | | port_id 4 | | port_id 5 |     |       |
 `-----+-----' `-----+-----' `-----+-----'     |       |
       |             |             |           |       |
     .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
     | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
     `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
       |             |             |         ^ |       | ^
       |             |             |  egress | |       | | egress
       |             |             | ingress | |       | | ingress
       |             |   .---------'         v |       | v
       `-----.       |   |   .-----------------'       |
             |       |   |   |   .---------------------'
             |       |   |   |   |
          .--+-------+---+---+---+--.
          | managed interconnection |
          `------------+------------'
                     ^ |
             ingress | |
              egress | |
                     v |
                  .---(F)----.
                  | physical |
                  |  port 0  |
                  `----------'

Ingress and egress are defined as relative to the application creating the
flow rule.

For instance, matching traffic sent by VM 2 would be done through an ingress
flow rule on VF 2 (**E**). Likewise for incoming traffic on physical port
(**F**). This also applies to **C** and **A** respectively.

Transferring traffic
--------------------

Without port representors
~~~~~~~~~~~~~~~~~~~~~~~~~

`Traffic direction`_ describes how an application could match traffic coming
from or going to a specific place reachable from a DPDK port ID. This makes
sense when the traffic in question is normally seen (i.e. sent or received)
by the application creating the flow rule (e.g. as in "redirect all traffic
coming from VF 1 to local queue 6").

However this does not force such traffic to take a specific route. Creating
a flow rule on **A** matching traffic coming from **D** is only meaningful
if it can be received by **A** in the first place, otherwise doing so simply
has no effect.

A new flow rule attribute named "transfer" is necessary for that. Combining
it with "ingress" or "egress" and a specific origin requests a flow rule to
be applied at the lowest level::

          ingress only           :       ingress + transfer
                                 :
 .-------------. .-------------. : .-------------. .-------------.
 | hypervisor  | |    VM 1     | : | hypervisor  | |    VM 1     |
 | application | | application | : | application | | application |
 `------+------' `--+----------' : `------+------' `--+----------'
        |           | | traffic  :        |           | | traffic
  .----(A)----.     | v          :  .----(A)----.     | v
  | port_id 3 |     |            :  | port_id 3 |     |
  `-----+-----'     |            :  `-----+-----'     |
        |           |            :        | ^         |
        |           |            :        | | traffic |
      .-+--.    .---+--.         :      .-+--.    .---+--.
      | PF |    | VF 1 |         :      | PF |    | VF 1 |
      `-+--'    `--(D)-'         :      `-+--'    `--(D)-'
        |           | | traffic  :        | ^         | | traffic
        |           | v          :        | | traffic | v
     .--+-----------+--.         :     .--+-----------+--.
     | interconnection |         :     | interconnection |
     `--------+--------'         :     `--------+--------'
              | | traffic        :              |
              | v                :              |
         .---(F)----.            :         .---(F)----.
         | physical |            :         | physical |
         |  port 0  |            :         |  port 0  |
         `----------'            :         `----------'

With "ingress" only, traffic is matched on **A** thus still goes to physical
port **F** by default::

 testpmd> flow create 3 ingress pattern vf id is 1 / end
              actions queue index 6 / end

With "ingress + transfer", traffic is matched on **D** and is therefore
successfully assigned to queue 6 on **A**::

 testpmd> flow create 3 ingress transfer pattern vf id is 1 / end
              actions queue index 6 / end

With port representors
~~~~~~~~~~~~~~~~~~~~~~

When port representors exist, implicit flow rules with the "transfer"
attribute (described in `without port representors`_) are assumed to
exist between them and their represented resources. These may be immutable.

In this case, traffic is received by default through the representor and
neither the "transfer" attribute nor traffic origin in flow rule patterns
are necessary. They simply have to be created on the representor port
directly and may target a different representor as described in `PORT_ID
action`_.

Implicit traffic flow with port representor::

    .-------------.   .-------------.
    | hypervisor  |   |    VM 1     |
    | application |   | application |
    `--+-------+--'   `----------+--'
       |       | ^               | | traffic
       |       | | traffic       | v
       |       `-----.           |
       |             |           |
 .----(A)----. .----(B)----.     |
 | port_id 3 | | port_id 4 |     |
 `-----+-----' `-----+-----'     |
       |             |           |
     .-+--.    .-----+-----. .---+--.
     | PF |    | VF 1 rep. | | VF 1 |
     `-+--'    `-----+-----' `--(D)-'
       |             |           |
    .--|-------------|-----------|--.
    |  |             |           |  |
    |  |             `-----------'  |
    |  |              <-- traffic   |
    `--|----------------------------'
       |
  .---(F)----.
  | physical |
  |  port 0  |
  `----------'

Pattern items and actions
-------------------------

``PORT`` pattern item
~~~~~~~~~~~~~~~~~~~~~

Matches traffic originating from (ingress) or going to (egress) a physical
port of the underlying device.

Using this pattern item without specifying a port index matches the physical
port associated with the current DPDK port ID by default. As described in
`traffic steering`_, specifying it should be rarely needed.

- Matches **F** in `traffic steering`_.

``PORT`` action
~~~~~~~~~~~~~~~

Directs matching traffic to a given physical port index.

- Targets **F** in `traffic steering`_.

``PORT_ID`` pattern item
~~~~~~~~~~~~~~~~~~~~~~~~

Matches traffic originating from (ingress) or going to (egress) a given DPDK
port ID.

Normally only supported if the port ID in question is known by the
underlying PMD and related to the device the flow rule is created against.

This must not be confused with the `PORT pattern item`_ which refers to the
physical port of a device. ``PORT_ID`` refers to a ``struct rte_eth_dev``
object on the application side (also known as "port representor" depending
on the kind of underlying device).

- Matches **A**, **B** or **C** in `traffic steering`_.

``PORT_ID`` action
~~~~~~~~~~~~~~~~~~

Directs matching traffic to a given DPDK port ID.

Same restrictions as `PORT_ID pattern item`_.

- Targets **A**, **B** or **C** in `traffic steering`_.

``PF`` pattern item
~~~~~~~~~~~~~~~~~~~

Matches traffic originating from (ingress) or going to (egress) the physical
function of the current device.

If supported, should work even if the physical function is not managed by
the application and thus not associated with a DPDK port ID. Its behavior is
otherwise similar to `PORT_ID pattern item`_ using PF port ID.

- Matches **A** in `traffic steering`_.

``PF`` action
~~~~~~~~~~~~~

Directs matching traffic to the physical function of the current device.

Same restrictions as `PF pattern item`_.

- Targets **A** in `traffic steering`_.

``VF`` pattern item
~~~~~~~~~~~~~~~~~~~

Matches traffic originating from (ingress) or going to (egress) a given
virtual function of the current device.

If supported, should work even if the virtual function is not managed by
the application and thus not associated with a DPDK port ID. Its behavior is
otherwise similar to `PORT_ID pattern item`_ using VF port ID.

Note this pattern item does not match VF representor traffic which, as
separate entities, should be addressed through their own port IDs.

- Matches **D** or **E** in `traffic steering`_.

``VF`` action
~~~~~~~~~~~~~

Directs matching traffic to a given virtual function of the current device.

Same restrictions as `VF pattern item`_.

- Targets **D** or **E** in `traffic steering`_.

``*_ENCAP`` actions
~~~~~~~~~~~~~~~~~~~

These actions are named according to the protocol they encapsulate traffic
with (e.g. ``VXLAN_ENCAP``) and using specific parameters (e.g. VNI for
VXLAN).

While they modify traffic and can be used multiple times (order matters),
unlike `PORT_ID action`_ and friends, they have no impact on steering.

As described in `actions order and repetition`_ this means they are useless
if used alone in an action list: the resulting traffic gets dropped unless
they are combined with either ``PASSTHRU`` or other endpoint-targeting
actions.

All these are under discussion in the context of adding support for tunnel
endpoint (TEP) [8]_ [9]_.

.. [8] `[RFC] tunnel endpoint hw acceleration enablement
       <http://dpdk.org/ml/archives/dev/2017-December/084676.html>`_

.. [9] `ethdev: Additions to rte_flows to support vTEP encap/decap offload
       <http://dpdk.org/ml/archives/dev/2018-March/092378.html>`_

``*_DECAP`` actions
~~~~~~~~~~~~~~~~~~~

They perform the reverse of `*_ENCAP actions`_ by popping protocol headers
from traffic instead of pushing them. They can be used multiple times as
well.

Note that using these actions on non-matching traffic results in undefined
behavior. It is recommended to match the protocol headers to decapsulate on
the pattern side of a flow rule in order to use these actions or otherwise
make sure only matching traffic goes through.

Actions order and repetition
----------------------------

Flow rules are currently restricted to at most a single action of each
supported type, performed in an unpredictable order (or all at once). To
repeat actions in a predictable fashion, applications have to make rules
pass-through and use priority levels.

It's now clear that PMD support for chaining multiple non-terminating flow
rules of varying priority levels is prohibitively difficult to implement
compared to simply allowing multiple identical actions performed in a
defined order by a single flow rule.

- This change is required to support protocol encapsulation offloads and the
  ability to perform them multiple times (e.g. VLAN then VXLAN).

- It makes the ``DUP`` action redundant since multiple ``QUEUE`` actions can
  be combined for duplication.

- The (non-)terminating property of actions must be discarded. Instead, flow
  rules themselves must be considered terminating by default (i.e. dropping
  traffic if there is no specific target) unless a ``PASSTHRU`` action is
  also specified.
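
The terminating-by-default semantics above can be pictured with a small,
self-contained C model. This is only an illustrative sketch: the action
names, ``struct rule_result`` and ``evaluate_actions()`` are hypothetical
and not part of the rte_flow API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of action types for illustration only. */
enum action_type {
	ACTION_PASSTHRU,
	ACTION_QUEUE,
	ACTION_PORT_ID,
	ACTION_VLAN_ENCAP,
	ACTION_VXLAN_ENCAP
};

struct rule_result {
	unsigned int targets;      /* endpoint-targeting actions seen */
	unsigned int encap_levels; /* headers pushed, in list order */
	bool passthru;             /* PASSTHRU was requested */
	bool dropped;              /* no target and no PASSTHRU */
};

static struct rule_result
evaluate_actions(const enum action_type *actions, size_t n)
{
	struct rule_result res = { 0, 0, false, false };
	size_t i;

	/* Actions are processed strictly in the order they are listed. */
	for (i = 0; i < n; i++) {
		switch (actions[i]) {
		case ACTION_QUEUE:
		case ACTION_PORT_ID:
			res.targets++;      /* repeating these duplicates traffic */
			break;
		case ACTION_VLAN_ENCAP:
		case ACTION_VXLAN_ENCAP:
			res.encap_levels++; /* modifies traffic, no steering */
			break;
		case ACTION_PASSTHRU:
			res.passthru = true;
			break;
		}
	}

	/* A rule is terminating: without a target it drops traffic. */
	res.dropped = (res.targets == 0 && !res.passthru);
	return res;
}
```

In this model a list holding only an encapsulation action is reported as
dropped, while two ``QUEUE`` actions yield two targets, which is the
duplication case that makes ``DUP`` redundant.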

This change was announced [10]_ on the DPDK mailing list.

.. [10] `doc: announce API change for flow actions
	<http://dpdk.org/ml/archives/dev/2018-February/090989.html>`_

Examples
--------

This section provides practical examples based on the established Testpmd
flow command syntax [11]_, in the context described in `traffic steering`_::

    .-------------.                 .-------------. .-------------.
    | hypervisor  |                 |    VM 1     | |    VM 2     |
    | application |                 | application | | application |
    `--+---+---+--'                 `----------+--' `--+----------'
       |   |   |                               |       |
       |   |   `-------------------.           |       |
       |   `---------.             |           |       |
       |             |             |           |       |
 .----(A)----. .----(B)----. .----(C)----.     |       |
 | port_id 3 | | port_id 4 | | port_id 5 |     |       |
 `-----+-----' `-----+-----' `-----+-----'     |       |
       |             |             |           |       |
     .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
     | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
     `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
       |             |             |           |       |
       |             |   .---------'           |       |
       `-----.       |   |   .-----------------'       |
             |       |   |   |   .---------------------'
             |       |   |   |   |
          .--|-------|---|---|---|--.
          |  |       |   `---|---'  |
          |  |       `-------'      |
          |  `---------.            |
          `------------|------------'
                       |
                  .---(F)----.
                  | physical |
                  |  port 0  |
                  `----------'

By default, PF (**A**) can communicate with the physical port it is
associated with (**F**), while VF 1 (**D**) and VF 2 (**E**) are isolated
and restricted to communicate with the hypervisor application through their
respective representors (**B** and **C**) if supported.

Examples in subsequent sections apply to hypervisor applications only and
are based on port representors **A**, **B** and **C**.

.. [11] `Flow syntax
	<http://dpdk.org/doc/guides/testpmd_app_ug/testpmd_funcs.html#flow-syntax>`_

Associating VF 1 with physical port 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Assign all port traffic (**F**) to VF 1 (**D**) indiscriminately through
their representors::

 flow create 3 ingress pattern / end actions port_id id 4 / end
 flow create 4 ingress pattern / end actions port_id id 3 / end

More practical example with MAC address restrictions::

 flow create 3 ingress
     pattern eth dst is {VF 1 MAC} / end
     actions port_id id 4 / end
 flow create 4 ingress
     pattern eth src is {VF 1 MAC} / end
     actions port_id id 3 / end

Sharing broadcasts
~~~~~~~~~~~~~~~~~~

From outside to PF and VFs::

 flow create 3 ingress
     pattern eth dst is ff:ff:ff:ff:ff:ff / end
     actions port_id id 3 / port_id id 4 / port_id id 5 / end

Note ``port_id id 3`` is necessary otherwise only VFs would receive matching
traffic.

From PF to outside and VFs::

 flow create 3 egress
     pattern eth dst is ff:ff:ff:ff:ff:ff / end
     actions port / port_id id 4 / port_id id 5 / end

From VFs to outside and PF::

 flow create 4 ingress
     pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 1 MAC} / end
     actions port_id id 3 / port_id id 5 / end
 flow create 5 ingress
     pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 2 MAC} / end
     actions port_id id 3 / port_id id 4 / end

Similar ``33:33:*`` rules based on known MAC addresses should be added for
IPv6 traffic.

Encapsulating VF 2 traffic in VXLAN
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Assuming pass-through flow rules are supported::

 flow create 5 ingress
     pattern eth / end
     actions vxlan_encap vni 42 / passthru / end
 flow create 5 egress
     pattern vxlan vni is 42 / end
     actions vxlan_decap / passthru / end

Here ``passthru`` is needed since as described in `actions order and
repetition`_, flow rules are otherwise terminating; if supported, a rule
without a target endpoint will drop traffic.

Without pass-through support, ingress encapsulation on the destination
endpoint might not be supported and the action list must provide one::

 flow create 5 ingress
      pattern eth src is {VF 2 MAC} / end
      actions vxlan_encap vni 42 / port_id id 3 / end
 flow create 3 ingress
      pattern vxlan vni is 42 / end
      actions vxlan_decap / port_id id 5 / end

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[relevance 1%]

* Re: [dpdk-dev] [PATCH v2 1/2] Add RIB library
  @ 2018-03-14 11:09  4%   ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2018-03-14 11:09 UTC (permalink / raw)
  To: Medvedkin Vladimir; +Cc: dev

On Wed, Feb 21, 2018 at 09:44:54PM +0000, Medvedkin Vladimir wrote:
> RIB is an alternative to current LPM library.
> It solves the following problems
>  - Increases the speed of control plane operations against lpm such as
>    adding/deleting routes
>  - Adds abstraction from dataplane algorithms, so it is possible to add
>    different ip route lookup algorithms such as DXR/poptrie/lpc-trie/etc
>    in addition to current dir24_8
>  - It is possible to keep user defined application specific additional
>    information in struct rte_rib_node which represents route entry.
>    It can be next hop/set of next hops (i.e. active and feasible),
>    pointers to link rte_rib_node based on some criteria (i.e. next_hop),
>    plenty of additional control plane information.
>  - For dir24_8 implementation it is possible to remove rte_lpm_tbl_entry.depth
>    field that helps to save 6 bits.
>  - Also new dir24_8 implementation supports different next_hop sizes
>    (1/2/4/8 bytes per next hop)
>  - Removed RTE_LPM_LOOKUP_SUCCESS to save 1 bit and to eliminate ternary operator.
>    Instead it returns special default value if there is no route.
> 
> Signed-off-by: Medvedkin Vladimir <medvedkinv@gmail.com>
> ---
>  config/common_base                 |   6 +
>  doc/api/doxy-api.conf              |   1 +
>  lib/Makefile                       |   2 +
>  lib/librte_rib/Makefile            |  22 ++
>  lib/librte_rib/rte_dir24_8.c       | 482 +++++++++++++++++++++++++++++++++
>  lib/librte_rib/rte_dir24_8.h       | 116 ++++++++
>  lib/librte_rib/rte_rib.c           | 526 +++++++++++++++++++++++++++++++++++++
>  lib/librte_rib/rte_rib.h           | 322 +++++++++++++++++++++++
>  lib/librte_rib/rte_rib_version.map |  18 ++
>  mk/rte.app.mk                      |   1 +
>  10 files changed, 1496 insertions(+)
>  create mode 100644 lib/librte_rib/Makefile
>  create mode 100644 lib/librte_rib/rte_dir24_8.c
>  create mode 100644 lib/librte_rib/rte_dir24_8.h
>  create mode 100644 lib/librte_rib/rte_rib.c
>  create mode 100644 lib/librte_rib/rte_rib.h
>  create mode 100644 lib/librte_rib/rte_rib_version.map
> 

First pass review comments. For now just reviewed the main public header
file rte_rib.h. Later reviews will cover the other files as best I can.

/Bruce

<snip>
> diff --git a/lib/librte_rib/rte_rib.h b/lib/librte_rib/rte_rib.h
> new file mode 100644
> index 0000000..6eac8fb
> --- /dev/null
> +++ b/lib/librte_rib/rte_rib.h
> @@ -0,0 +1,322 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
> + */
> +
> +#ifndef _RTE_RIB_H_
> +#define _RTE_RIB_H_
> +
> +/**
> + * @file
> + * Compressed trie implementation for Longest Prefix Match
> + */
> +
> +/** @internal Macro to enable/disable run-time checks. */
> +#if defined(RTE_LIBRTE_RIB_DEBUG)
> +#define RTE_RIB_RETURN_IF_TRUE(cond, retval) do {	\
> +	if (cond)					\
> +		return retval;				\
> +} while (0)
> +#else
> +#define RTE_RIB_RETURN_IF_TRUE(cond, retval)
> +#endif

use RTE_ASSERT?

> +
> +#define RTE_RIB_VALID_NODE	1

should there be an INVALID_NODE macro?

> +#define RTE_RIB_GET_NXT_ALL	0
> +#define RTE_RIB_GET_NXT_COVER	1
> +
> +#define RTE_RIB_INVALID_ROUTE	0
> +#define RTE_RIB_VALID_ROUTE	1
> +
> +/** Max number of characters in RIB name. */
> +#define RTE_RIB_NAMESIZE	64
> +
> +/** Maximum depth value possible for IPv4 RIB. */
> +#define RTE_RIB_MAXDEPTH	32

I think we should have IPv4 in the name here. Will it not be extended to
support IPv6 in future?

> +
> +/**
> + * Macro to check if prefix1 {key1/depth1}
> + * is covered by prefix2 {key2/depth2}
> + */
> +#define RTE_RIB_IS_COVERED(key1, depth1, key2, depth2)			\
> +	((((key1 ^ key2) & (uint32_t)(UINT64_MAX << (32 - depth2))) == 0)\
> +		&& (depth1 > depth2))
Neat check!

Any particular reason for using UINT64_MAX here rather than UINT32_MAX?
I think you can avoid the casting and have a slightly shorter mask by
changing "(uint32_t)(UINT64_MAX << (32 - depth2))" to 
"~(UINT32_MAX >> depth2)"
I'd also suggest for readability putting the second check first, and,
for maintainability, using an inline function rather than a macro.
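
A minimal sketch of that suggestion, assuming depths are always in the
range 0..32; the name rib_is_covered() is a placeholder and not part of the
patch. Note that shifting a 32-bit value by 32 is undefined in C, which is
possibly why the original used UINT64_MAX. With the depth check done first,
depth2 is known to be below 32 before the mask is computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if prefix1 {key1/depth1} is covered by prefix2
 * {key2/depth2}, i.e. prefix1 is a strictly more specific prefix.
 */
static inline bool
rib_is_covered(uint32_t key1, uint8_t depth1, uint32_t key2, uint8_t depth2)
{
	assert(depth1 <= 32 && depth2 <= 32);

	/*
	 * Depth check first: it guarantees depth2 < 32 below, so the
	 * 32-bit shift count is always valid.
	 */
	if (depth1 <= depth2)
		return false;

	/* Mask keeping the top depth2 bits; 0 when depth2 == 0. */
	uint32_t mask = ~(UINT32_MAX >> depth2);

	return ((key1 ^ key2) & mask) == 0;
}
```

For example, 10.1.0.0/16 is covered by 10.0.0.0/8, while a prefix is never
reported as covered by itself.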

> +
> +/** @internal Macro to get next node in tree*/
> +#define RTE_RIB_GET_NXT_NODE(node, key)					\
> +	((key & (1 << (31 - node->depth))) ? node->right : node->left)
> +/** @internal Macro to check if node is right child*/
> +#define RTE_RIB_IS_RIGHT_NODE(node)	(node->parent->right == node)

Again, consider inline fns rather than macros.
For the latter macro, rather than doing additional pointer derefs to
parent, can you also get if it's a right node by using:
"(node->key & (1 << (32 - node->depth)))"? 

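As an illustration of the first macro rewritten as an inline function,
here is a sketch that assumes a reduced node layout; struct rib_node and
rib_get_nxt_node() are placeholder names, not the ones from the patch.
Using an unsigned constant also avoids shifting a signed 1 into the sign
bit when the node depth is 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical node layout: only the fields used here. */
struct rib_node {
	struct rib_node *left;
	struct rib_node *right;
	uint32_t key;
	uint8_t depth;
};

/*
 * Pick the child to descend into when looking for key: test the first
 * key bit below the node's prefix (bit 31 is the most significant one).
 */
static inline struct rib_node *
rib_get_nxt_node(const struct rib_node *node, uint32_t key)
{
	/* A /32 node has no bit left to branch on. */
	assert(node != NULL && node->depth < 32);

	/* Unsigned constant: 1 << 31 would overflow a signed int. */
	return (key & (UINT32_C(1) << (31 - node->depth))) ?
		node->right : node->left;
}
```

Unlike the macro, the function evaluates its arguments exactly once and
gets type checking on both parameters.
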
> +
> +
> +struct rte_rib_node {
> +	struct rte_rib_node *left;
> +	struct rte_rib_node *right;
> +	struct rte_rib_node *parent;
> +	uint32_t	key;
> +	uint8_t		depth;
> +	uint8_t		flag;
> +	uint64_t	nh;
> +	uint64_t	ext[0];
> +};
> +
> +struct rte_rib;
> +
> +/** Type of FIB struct*/
> +enum rte_rib_type {
> +	RTE_RIB_DIR24_8_1B,
> +	RTE_RIB_DIR24_8_2B,
> +	RTE_RIB_DIR24_8_4B,
> +	RTE_RIB_DIR24_8_8B,
> +	RTE_RIB_TYPE_MAX
> +};

If the plan is to support multiple underlying fib types and algorithms
under the rib library, would it not be better to separate out the
algorithm part from the data storage part? So have the type just be
DIR_24_8, and have the 1, 2, 4 or 8 specified separately.
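
Roughly what I have in mind, as a sketch only; every name below
(rib_fib_type, rib_nh_size, rib_fib_conf, rib_nh_bytes) is made up for
illustration and does not exist in the patch:

```c
#include <stddef.h>

/* Lookup algorithm; new entries can be added without touching sizes. */
enum rib_fib_type {
	RIB_FIB_DIR24_8,
	RIB_FIB_TYPE_MAX
};

/* Width of a next hop entry in the dataplane table. */
enum rib_nh_size {
	RIB_NH_SZ_1B,
	RIB_NH_SZ_2B,
	RIB_NH_SZ_4B,
	RIB_NH_SZ_8B,
	RIB_NH_SZ_MAX
};

/* Algorithm and storage width are configured independently. */
struct rib_fib_conf {
	enum rib_fib_type type;
	enum rib_nh_size nh_sz;
};

/* Bytes per next hop for a given size selector, or 0 if out of range. */
static inline size_t
rib_nh_bytes(enum rib_nh_size nh_sz)
{
	if (nh_sz < RIB_NH_SZ_1B || nh_sz >= RIB_NH_SZ_MAX)
		return 0;
	return (size_t)1 << nh_sz;
}
```

That way a second algorithm only adds one enum value instead of four.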

> +
> +enum rte_rib_op {
> +	RTE_RIB_ADD,
> +	RTE_RIB_DEL
> +};
> +
> +/** RIB nodes allocation type */
> +enum rte_rib_alloc_type {
> +	RTE_RIB_MALLOC,
> +	RTE_RIB_MEMPOOL,
> +	RTE_RIB_ALLOC_MAX
> +};

Not sure you need this any more. Malloc allocations and mempool
allocations are now pretty much the same thing.

> +
> +typedef int (*rte_rib_modify_fn_t)(struct rte_rib *rib, uint32_t key,
> +	uint8_t depth, uint64_t next_hop, enum rte_rib_op op);

Do you anticipate more ops in future than just add and delete? If not,
why not just split this function into two and drop the op enum?

> +typedef int (*rte_rib_tree_lookup_fn_t)(void *fib, const uint32_t *ips,
> +	uint64_t *next_hops, const unsigned n);
> +typedef struct rte_rib_node *(*rte_rib_alloc_node_fn_t)(struct rte_rib *rib);
> +typedef void (*rte_rib_free_node_fn_t)(struct rte_rib *rib,
> +	struct rte_rib_node *node);
> +
> +struct rte_rib {
> +	char name[RTE_RIB_NAMESIZE];
> +	/*pointer to rib trie*/
> +	struct rte_rib_node	*trie;
> +	/*pointer to dataplane struct*/
> +	void	*fib;
> +	/*prefix modification*/
> +	rte_rib_modify_fn_t	modify;
> +	/* Bulk lookup fn*/
> +	rte_rib_tree_lookup_fn_t	lookup;
> +	/*alloc trie element*/
> +	rte_rib_alloc_node_fn_t	alloc_node;
> +	/*free trie element*/
> +	rte_rib_free_node_fn_t	free_node;
> +	struct rte_mempool	*node_pool;
> +	uint32_t		cur_nodes;
> +	uint32_t		cur_routes;
> +	int			max_nodes;
> +	int			node_sz;
> +	enum rte_rib_type	type;
> +	enum rte_rib_alloc_type	alloc_type;
> +};
> +
> +/** RIB configuration structure */
> +struct rte_rib_conf {
> +	enum rte_rib_type	type;
> +	enum rte_rib_alloc_type	alloc_type;
> +	int	max_nodes;
> +	size_t	node_sz;
> +	uint64_t def_nh;
> +};
> +
> +/**
> + * Lookup an IP into the RIB structure
> + *
> + * @param rib
> + *  RIB object handle
> + * @param key
> + *  IP to be looked up in the RIB
> + * @return
> + *  pointer to struct rte_rib_node on success,
> + *  NULL otherwise
> + */
> +struct rte_rib_node *
> +rte_rib_tree_lookup(struct rte_rib *rib, uint32_t key);
> +
> +/**
> + * Lookup less specific route into the RIB structure
> + *
> + * @param ent
> + *  Pointer to struct rte_rib_node that represents target route
> + * @return
> + *  pointer to struct rte_rib_node that represents
> + *  less specific route on success,
> + *  NULL otherwise
> + */
> +struct rte_rib_node *
> +rte_rib_tree_lookup_parent(struct rte_rib_node *ent);
> +
> +/**
> + * Lookup prefix into the RIB structure
> + *
> + * @param rib
> + *  RIB object handle
> + * @param key
> + *  net to be looked up in the RIB
> + * @param depth
> + *  prefix length
> + * @return
> + *  pointer to struct rte_rib_node on success,
> + *  NULL otherwise
> + */
> +struct rte_rib_node *
> +rte_rib_tree_lookup_exact(struct rte_rib *rib, uint32_t key, uint8_t depth);

Can you explain the difference between this and regular lookup, and how
they would be used. I don't think the names convey the differences
sufficiently, and so we should look to rename one or both to be clearer.

> +
> +/**
> + * Retrieve next more specific prefix from the RIB
s/more/most/

> + * that is covered by key/depth supernet
> + *
> + * @param rib
> + *  RIB object handle
> + * @param key
> + *  net address of supernet prefix that covers returned more specific prefixes
> + * @param depth
> + *  supernet prefix length
> + * @param cur
> + *   pointer to the last returned prefix to get next prefix
> + *   or
> + *   NULL to get first more specific prefix
> + * @param flag
> + *  -RTE_RIB_GET_NXT_ALL
> + *   get all prefixes from subtrie

By all prefixes do you mean more specific, i.e. the final prefix?

> + *  -RTE_RIB_GET_NXT_COVER
> + *   get only first more specific prefix even if it have more specifics
> + * @return
> + *  pointer to the next more specific prefix
> + *  or
> + *  NULL if there is no prefixes left
> + */
> +struct rte_rib_node *
> +rte_rib_tree_get_nxt(struct rte_rib *rib, uint32_t key, uint8_t depth,
> +	struct rte_rib_node *cur, int flag);
> +
> +/**
> + * Remove prefix from the RIB
> + *
> + * @param rib
> + *  RIB object handle
> + * @param key
> + *  net to be removed from the RIB
> + * @param depth
> + *  prefix length
> + */
> +void
> +rte_rib_tree_remove(struct rte_rib *rib, uint32_t key, uint8_t depth);
> +
> +/**
> + * Insert prefix into the RIB
> + *
> + * @param rib
> + *  RIB object handle
> + * @param key
> + *  net to be inserted to the RIB
> + * @param depth
> + *  prefix length
> + * @return
> + *  pointer to new rte_rib_node on success
> + *  NULL otherwise
> + */
> +struct rte_rib_node *
> +rte_rib_tree_insert(struct rte_rib *rib, uint32_t key, uint8_t depth);
> +
> +/**
> + * Create RIB
> + *
> + * @param name
> + *  RIB name
> + * @param socket_id
> + *  NUMA socket ID for RIB table memory allocation
> + * @param conf
> + *  Structure containing the configuration
> + * @return
> + *  Handle to RIB object on success
> + *  NULL otherwise with rte_errno set to an appropriate values.
> + */
> +struct rte_rib *
> +rte_rib_create(const char *name, int socket_id, struct rte_rib_conf *conf);
> +
> +/**
> + * Find an existing RIB object and return a pointer to it.
> + *
> + * @param name
> + *  Name of the rib object as passed to rte_rib_create()
> + * @return
> + *  Pointer to rib object or NULL if object not found with rte_errno
> + *  set appropriately. Possible rte_errno values include:
> + *   - ENOENT - required entry not available to return.
> + */
> +struct rte_rib *
> +rte_rib_find_existing(const char *name);
> +
> +/**
> + * Free an RIB object.
> + *
> + * @param rib
> + *   RIB object handle
> + * @return
> + *   None
> + */
> +void
> +rte_rib_free(struct rte_rib *rib);
> +
> +/**
> + * Add a rule to the RIB.
> + *
> + * @param rib
> + *   RIB object handle
> + * @param ip
> + *   IP of the rule to be added to the RIB
> + * @param depth
> + *   Depth of the rule to be added to the RIB
> + * @param next_hop
> + *   Next hop of the rule to be added to the RIB
> + * @return
> + *   0 on success, negative value otherwise
> + */
> +int
> +rte_rib_add(struct rte_rib *rib, uint32_t ip, uint8_t depth, uint64_t next_hop);
> +
> +/**
> + * Delete a rule from the RIB.
> + *
> + * @param rib
> + *   RIB object handle
> + * @param ip
> + *   IP of the rule to be deleted from the RIB
> + * @param depth
> + *   Depth of the rule to be deleted from the RIB
> + * @return
> + *   0 on success, negative value otherwise
> + */
> +int
> +rte_rib_delete(struct rte_rib *rib, uint32_t ip, uint8_t depth);
> +
> +/**
> + * Lookup multiple IP addresses in an FIB. This may be implemented as a
> + * macro, so the address of the function should not be used.
> + *
> + * @param RIB
> + *   RIB object handle
> + * @param ips
> + *   Array of IPs to be looked up in the FIB
> + * @param next_hops
> + *   Next hop of the most specific rule found for IP.
> + *   This is an array of eight byte values.
> + *   If the lookup for the given IP failed, then corresponding element would
> + *   contain default value, see description of then next parameter.
> + * @param n
> + *   Number of elements in ips (and next_hops) array to lookup. This should be a
> + *   compile time constant, and divisible by 8 for best performance.
> + * @param defv
> + *   Default value to populate into corresponding element of hop[] array,
> + *   if lookup would fail.
> + *  @return
> + *   -EINVAL for incorrect arguments, otherwise 0
> + */
> +#define rte_rib_fib_lookup_bulk(rib, ips, next_hops, n)	\
> +	rib->lookup(rib->fib, ips, next_hops, n)

My main thought here is whether this needs to be a macro at all?
Given that it takes a full burst of addresses in a single go, how much
performance would actually be lost by making this a regular function in
the C file?
If we do convert this to a regular function, then a lot of the structure
definitions above - most importantly, the rib structure itself - can
probably be moved to a private header file and not exposed to
applications at all. This will make ABI compatibility a *lot* easier, as
the structures can be changed without affecting the public ABI.
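
To illustrate the idea, a sketch of the opaque-handle layout, shown as a
single file for brevity. All names here (struct rib, rib_create(),
rib_lookup_bulk(), dummy_lookup()) are hypothetical stand-ins rather than
the functions from the patch, and the dataplane is a placeholder that
always returns the default next hop.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* ---- public header: applications only see an incomplete type ---- */
struct rib;

struct rib *rib_create(uint64_t def_nh);
void rib_free(struct rib *rib);
int rib_lookup_bulk(const struct rib *rib, const uint32_t *ips,
		uint64_t *next_hops, unsigned int n);

/* ---- private C file: layout can change without affecting the ABI ---- */
typedef int (*rib_lookup_fn_t)(const void *fib, const uint32_t *ips,
		uint64_t *next_hops, unsigned int n);

struct rib {
	void *fib;              /* dataplane structure */
	rib_lookup_fn_t lookup; /* algorithm-specific bulk lookup */
	uint64_t def_nh;        /* value returned when no route matches */
};

/* Placeholder dataplane: every lookup misses and yields the default. */
static int
dummy_lookup(const void *fib, const uint32_t *ips, uint64_t *next_hops,
		unsigned int n)
{
	const uint64_t *def_nh = fib;
	unsigned int i;

	(void)ips;
	for (i = 0; i < n; i++)
		next_hops[i] = *def_nh;
	return 0;
}

struct rib *
rib_create(uint64_t def_nh)
{
	struct rib *rib = calloc(1, sizeof(*rib));

	if (rib == NULL)
		return NULL;
	rib->def_nh = def_nh;
	rib->fib = &rib->def_nh;
	rib->lookup = dummy_lookup;
	return rib;
}

void
rib_free(struct rib *rib)
{
	free(rib);
}

/* One extra call per burst, so its cost is spread over n lookups. */
int
rib_lookup_bulk(const struct rib *rib, const uint32_t *ips,
		uint64_t *next_hops, unsigned int n)
{
	if (rib == NULL || ips == NULL || next_hops == NULL)
		return -EINVAL;
	return rib->lookup(rib->fib, ips, next_hops, n);
}
```

Whether the extra function call is measurable would need benchmarking, but
it is paid once per burst rather than once per address.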

/Bruce

> +
> +#endif /* _RTE_RIB_H_ */

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
  @ 2018-03-14 12:31  0%     ` Ananyev, Konstantin
  2018-03-15  3:13  0%       ` Zhang, Qi Z
  0 siblings, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2018-03-14 12:31 UTC (permalink / raw)
  To: Zhang, Qi Z, thomas
  Cc: dev, Xing, Beilei, Wu, Jingjing, Lu, Wenzhuo, Zhang, Qi Z

Hi Qi,

> 
> The patch lets the ethdev driver expose a capability flag through
> rte_eth_dev_info_get when it supports deferred queue configuration;
> based on that flag, rte_eth_[rx|tx]_queue_setup can decide whether to
> continue setting up the queue or just return failure when the device is
> already started.
> 
> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> ---
>  doc/guides/nics/features.rst  |  8 ++++++++
>  lib/librte_ether/rte_ethdev.c | 30 ++++++++++++++++++------------
>  lib/librte_ether/rte_ethdev.h | 11 +++++++++++
>  3 files changed, 37 insertions(+), 12 deletions(-)
> 
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index 1b4fb979f..36ad21a1f 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -892,7 +892,15 @@ Documentation describes performance values.
> 
>  See ``dpdk.org/doc/perf/*``.
> 
> +.. _nic_features_queue_deferred_setup_capabilities:
> 
> +Queue deferred setup capabilities
> +---------------------------------
> +
> +Supports queue setup / release after device started.
> +
> +* **[provides] rte_eth_dev_info**: ``deferred_queue_config_capa:DEV_DEFERRED_RX_QUEUE_SETUP,DEV_DEFERRED_TX_QUEUE_SETUP,DEV_DEFERRED_RX_QUEUE_RELEASE,DEV_DEFERRED_TX_QUEUE_RELEASE``.
> +* **[related]  API**: ``rte_eth_dev_info_get()``.
> 
>  .. _nic_features_other:
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index a6ce2a5ba..6c906c4df 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -1425,12 +1425,6 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
>  		return -EINVAL;
>  	}
> 
> -	if (dev->data->dev_started) {
> -		RTE_PMD_DEBUG_TRACE(
> -		    "port %d must be stopped to allow configuration\n", port_id);
> -		return -EBUSY;
> -	}
> -
>  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get, -ENOTSUP);
>  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_setup, -ENOTSUP);
> 
> @@ -1474,10 +1468,19 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
>  		return -EINVAL;
>  	}
> 
> +	if (dev->data->dev_started &&
> +		!(dev_info.deferred_queue_config_capa &
> +			DEV_DEFERRED_RX_QUEUE_SETUP))
> +		return -EINVAL;
> +

I think now you have to check here that the queue is stopped.
Otherwise you might attempt to reconfigure a running queue.
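
Something along these lines, as a standalone sketch rather than actual
ethdev code: rx_queue_setup_check() and enum queue_state are hypothetical,
the capability field is assumed to be a 64-bit mask, and returning -EBUSY
for a running queue is my assumption (the patch itself only uses -EINVAL).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Capability bit as defined by the patch. */
#define DEV_DEFERRED_RX_QUEUE_SETUP 0x00000001

/* Hypothetical queue states standing in for the per-queue state. */
enum queue_state {
	QUEUE_STATE_STOPPED,
	QUEUE_STATE_STARTED
};

/*
 * Decide whether Rx queue setup may proceed.
 * Returns 0 if allowed, a negative errno value otherwise.
 */
static int
rx_queue_setup_check(bool dev_started, uint64_t deferred_capa,
		enum queue_state state)
{
	/* Device stopped: setup is always allowed, as before the patch. */
	if (!dev_started)
		return 0;

	/* Device started: the PMD must advertise deferred setup. */
	if (!(deferred_capa & DEV_DEFERRED_RX_QUEUE_SETUP))
		return -EINVAL;

	/* Never reconfigure a queue that is still running. */
	if (state != QUEUE_STATE_STOPPED)
		return -EBUSY;

	return 0;
}
```

The same three-step check would apply to the TX path with the TX flag.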


>  	rxq = dev->data->rx_queues;
>  	if (rxq[rx_queue_id]) {
>  		RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
>  					-ENOTSUP);

I don't think it is *that* straightforward.
rx_queue_setup() parameters can imply a different rx function (and related device settings)
than those already set up by the previous queue_setup()/dev_start.
So I think you need to do one of 2 things:
1. rework ethdev layer to introduce a separate rx function (and related settings) for each queue.
2. at rx_queue_setup() if it is invoked after dev_start - check that given queue settings wouldn't
contradict the current device settings (rx function, etc.).
If they do - return an error.

From my perspective - 1) is a better choice though it requires more work, and possibly ABI breakage.
I did some work in that direction as RFC:
http://dpdk.org/dev/patchwork/patch/31866/

2) might also be possible, but looks a bit clumsy, as rx_queue_setup() might now fail even with
valid parameters - it all depends on previous queue configurations.

The same applies to TX.


> +		if (dev->data->dev_started &&
> +			!(dev_info.deferred_queue_config_capa &
> +				DEV_DEFERRED_RX_QUEUE_RELEASE))
> +			return -EINVAL;
>  		(*dev->dev_ops->rx_queue_release)(rxq[rx_queue_id]);
>  		rxq[rx_queue_id] = NULL;
>  	}
> @@ -1573,12 +1576,6 @@ rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
>  		return -EINVAL;
>  	}
> 
> -	if (dev->data->dev_started) {
> -		RTE_PMD_DEBUG_TRACE(
> -		    "port %d must be stopped to allow configuration\n", port_id);
> -		return -EBUSY;
> -	}
> -
>  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get, -ENOTSUP);
>  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_setup, -ENOTSUP);
> 
> @@ -1596,10 +1593,19 @@ rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
>  		return -EINVAL;
>  	}
> 
> +	if (dev->data->dev_started &&
> +		!(dev_info.deferred_queue_config_capa &
> +			DEV_DEFERRED_TX_QUEUE_SETUP))
> +		return -EINVAL;
> +
>  	txq = dev->data->tx_queues;
>  	if (txq[tx_queue_id]) {
>  		RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release,
>  					-ENOTSUP);
> +		if (dev->data->dev_started &&
> +			!(dev_info.deferred_queue_config_capa &
> +				DEV_DEFERRED_TX_QUEUE_RELEASE))
> +			return -EINVAL;
>  		(*dev->dev_ops->tx_queue_release)(txq[tx_queue_id]);
>  		txq[tx_queue_id] = NULL;
>  	}
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 036153306..410e58c50 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -981,6 +981,15 @@ struct rte_eth_conf {
>   */
>  #define DEV_TX_OFFLOAD_SECURITY         0x00020000
> 
> +#define DEV_DEFERRED_RX_QUEUE_SETUP 0x00000001
> +/**< Deferred setup rx queue */
> +#define DEV_DEFERRED_TX_QUEUE_SETUP 0x00000002
> +/**< Deferred setup tx queue */
> +#define DEV_DEFERRED_RX_QUEUE_RELEASE 0x00000004
> +/**< Deferred release rx queue */
> +#define DEV_DEFERRED_TX_QUEUE_RELEASE 0x00000008
> +/**< Deferred release tx queue */
> +

I don't think we need flags for both setup and release.
If runtime setup is supported, surely dynamic release should be supported too.
Also, RUNTIME_RX_QUEUE_SETUP probably sounds a bit better.
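
A minimal sketch of the single-flag idea, for illustration only. The names
DEV_RUNTIME_RX_QUEUE_SETUP, DEV_RUNTIME_TX_QUEUE_SETUP and
runtime_queue_setup_allowed() are hypothetical and are not existing ethdev
definitions; one bit per direction covers both setup and release:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical capability bits: one per direction, each covering both
 * queue setup and queue release after dev_start. */
#define DEV_RUNTIME_RX_QUEUE_SETUP 0x00000001ULL
#define DEV_RUNTIME_TX_QUEUE_SETUP 0x00000002ULL

/* Hypothetical helper: may a queue be (re)configured right now?
 * Returns 0 if allowed, -EINVAL otherwise. */
static int
runtime_queue_setup_allowed(int dev_started, uint64_t capa, uint64_t flag)
{
	if (!dev_started)
		return 0; /* a stopped port can always be configured */
	return (capa & flag) ? 0 : -EINVAL;
}
```

With this shape, the release path inside queue_setup() would not need its own
capability check.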

Konstantin

>  /*
>   * If new Tx offload capabilities are defined, they also must be
>   * mentioned in rte_tx_offload_names in rte_ethdev.c file.
> @@ -1029,6 +1038,8 @@ struct rte_eth_dev_info {
>  	/** Configured number of rx/tx queues */
>  	uint16_t nb_rx_queues; /**< Number of RX queues. */
>  	uint16_t nb_tx_queues; /**< Number of TX queues. */
> +	uint64_t deferred_queue_config_capa;
> +	/**< queues can be setup/release after dev_start (DEV_DEFERRED_). */
>  };
> 
>  /**
> --
> 2.13.6


* Re: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
  2018-03-14 12:31  0%     ` Ananyev, Konstantin
@ 2018-03-15  3:13  0%       ` Zhang, Qi Z
  2018-03-15 13:16  0%         ` Ananyev, Konstantin
  0 siblings, 1 reply; 200+ results
From: Zhang, Qi Z @ 2018-03-15  3:13 UTC (permalink / raw)
  To: Ananyev, Konstantin, thomas; +Cc: dev, Xing, Beilei, Wu, Jingjing, Lu, Wenzhuo

Hi Konstantin:

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Wednesday, March 14, 2018 8:32 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net
> Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
> 
> Hi Qi,
> 
> >
> > The patch let etherdev driver expose the capability flag through
> > rte_eth_dev_info_get when it support deferred queue configuraiton,
> > then base on the flag rte_eth_[rx|tx]_queue_setup could decide
> > continue to setup the queue or just return fail when device already
> > started.
> >
> > Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> > ---
> >  doc/guides/nics/features.rst  |  8 ++++++++
> > lib/librte_ether/rte_ethdev.c | 30 ++++++++++++++++++------------
> > lib/librte_ether/rte_ethdev.h | 11 +++++++++++
> >  3 files changed, 37 insertions(+), 12 deletions(-)
> >
> > diff --git a/doc/guides/nics/features.rst
> > b/doc/guides/nics/features.rst index 1b4fb979f..36ad21a1f 100644
> > --- a/doc/guides/nics/features.rst
> > +++ b/doc/guides/nics/features.rst
> > @@ -892,7 +892,15 @@ Documentation describes performance values.
> >
> >  See ``dpdk.org/doc/perf/*``.
> >
> > +.. _nic_features_queue_deferred_setup_capabilities:
> >
> > +Queue deferred setup capabilities
> > +---------------------------------
> > +
> > +Supports queue setup / release after device started.
> > +
> > +* **[provides] rte_eth_dev_info**:
> >
> ``deferred_queue_config_capa:DEV_DEFERRED_RX_QUEUE_SETUP,DEV_DEFE
> RRED_
> > TX_QUEUE_SETUP,DEV_DEFERRED_RX_QUEUE_RELE
> > ASE,DEV_DEFERRED_TX_QUEUE_RELEASE``.
> > +* **[related]  API**: ``rte_eth_dev_info_get()``.
> >
> >  .. _nic_features_other:
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> > b/lib/librte_ether/rte_ethdev.c index a6ce2a5ba..6c906c4df 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -1425,12 +1425,6 @@ rte_eth_rx_queue_setup(uint16_t port_id,
> uint16_t rx_queue_id,
> >  		return -EINVAL;
> >  	}
> >
> > -	if (dev->data->dev_started) {
> > -		RTE_PMD_DEBUG_TRACE(
> > -		    "port %d must be stopped to allow configuration\n", port_id);
> > -		return -EBUSY;
> > -	}
> > -
> >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> -ENOTSUP);
> >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_setup,
> -ENOTSUP);
> >
> > @@ -1474,10 +1468,19 @@ rte_eth_rx_queue_setup(uint16_t port_id,
> uint16_t rx_queue_id,
> >  		return -EINVAL;
> >  	}
> >
> > +	if (dev->data->dev_started &&
> > +		!(dev_info.deferred_queue_config_capa &
> > +			DEV_DEFERRED_RX_QUEUE_SETUP))
> > +		return -EINVAL;
> > +
> 
> I think now you have to check here that the queue is stopped.
> Otherwise you might attempt to reconfigure running queue.

I'm not sure it's necessary to make the application use a different API sequence for a deferred configure and a deferred re-configure.
Can we just call dev_ops->rx_queue_stop before rx_queue_release here?

> 
> 
> >  	rxq = dev->data->rx_queues;
> >  	if (rxq[rx_queue_id]) {
> >  		RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
> >  					-ENOTSUP);
> 
> I don't think it is *that* straightforward.
> rx_queue_setup() parameters can imply different rx function (and related
> device settings) that are already setuped by previous queue_setup()/dev_start.
> So I think you need to do one of 2 things:
> 1. rework ethdev layer to introduce a separate rx function (and related
> settings) for each queue.
> 2. at rx_queue_setup() if it is invoked after dev_start - check that given
> queue settings wouldn't contradict with current device settings  (rx function,
> etc.).
> If they do - return an error.
Yes, I think what we have here is option 2: dev_ops->rx_queue_setup will return failure if it conflicts with a previous setting.
I'm also thinking about option 1. The idea is to move the per-queue rx/tx function into the driver layer, so it will not break the existing API.

1. The driver can expose a capability like per_queue_rx or per_queue_tx.
2. The application can enable this capability via dev_config with rte_eth_conf.
3. If per_queue_rx is not enabled, nothing changes, so we are at option 2.
4. If per_queue_rx is enabled, the driver will set rx_pkt_burst to a hook function which redirects to a function pointer in a per-queue rx function table (I guess performance is impacted somewhat, but this is the cost if you want different offloads for different queues).
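
Step 4 could look roughly like the sketch below. It is a self-contained toy
model, not driver code: struct toy_rxq, per_queue_rx[], rx_simple(),
rx_scattered() and rx_hook() are all hypothetical names, and the burst
functions return dummy counts only so the dispatch can be observed.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUES 4

struct pkt; /* stand-in for struct rte_mbuf */

typedef uint16_t (*rx_burst_t)(void *rxq, struct pkt **pkts, uint16_t n);

/* Toy queue: only remembers its own index into the per-queue table. */
struct toy_rxq {
	uint16_t queue_id;
};

/* Per-queue rx function table, filled in at queue setup time. */
static rx_burst_t per_queue_rx[MAX_QUEUES];

/* Dummy burst functions; the return values are arbitrary markers. */
static uint16_t
rx_simple(void *rxq, struct pkt **pkts, uint16_t n)
{
	(void)rxq; (void)pkts;
	return n;
}

static uint16_t
rx_scattered(void *rxq, struct pkt **pkts, uint16_t n)
{
	(void)rxq; (void)pkts;
	return n / 2;
}

/* Hook installed as the port-wide rx_pkt_burst. It costs one extra
 * indirect call per burst; queue_id is assumed to be < MAX_QUEUES. */
static uint16_t
rx_hook(void *rxq, struct pkt **pkts, uint16_t n)
{
	struct toy_rxq *q = rxq;

	return per_queue_rx[q->queue_id](rxq, pkts, n);
}
```

The extra indirection in rx_hook() is the performance cost mentioned in
step 4.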

> 
> From my perspective - 1) is a better choice though it required more work,
> and possibly ABI breakage.
> I did some work in that direction as RFC:
> http://dpdk.org/dev/patchwork/patch/31866/

I will study this, thanks for the heads up.
> 
> 2) might be also possible, but looks a bit clumsy as rx_queue_setup() might
> now fail even with valid parameters - all depends on previous queue
> configurations.
> 
> Same story applies for TX.
> 
> 
> > +		if (dev->data->dev_started &&
> > +			!(dev_info.deferred_queue_config_capa &
> > +				DEV_DEFERRED_RX_QUEUE_RELEASE))
> > +			return -EINVAL;
> >  		(*dev->dev_ops->rx_queue_release)(rxq[rx_queue_id]);
> >  		rxq[rx_queue_id] = NULL;
> >  	}
> > @@ -1573,12 +1576,6 @@ rte_eth_tx_queue_setup(uint16_t port_id,
> uint16_t tx_queue_id,
> >  		return -EINVAL;
> >  	}
> >
> > -	if (dev->data->dev_started) {
> > -		RTE_PMD_DEBUG_TRACE(
> > -		    "port %d must be stopped to allow configuration\n", port_id);
> > -		return -EBUSY;
> > -	}
> > -
> >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> -ENOTSUP);
> >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_setup,
> -ENOTSUP);
> >
> > @@ -1596,10 +1593,19 @@ rte_eth_tx_queue_setup(uint16_t port_id,
> uint16_t tx_queue_id,
> >  		return -EINVAL;
> >  	}
> >
> > +	if (dev->data->dev_started &&
> > +		!(dev_info.deferred_queue_config_capa &
> > +			DEV_DEFERRED_TX_QUEUE_SETUP))
> > +		return -EINVAL;
> > +
> >  	txq = dev->data->tx_queues;
> >  	if (txq[tx_queue_id]) {
> >  		RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release,
> >  					-ENOTSUP);
> > +		if (dev->data->dev_started &&
> > +			!(dev_info.deferred_queue_config_capa &
> > +				DEV_DEFERRED_TX_QUEUE_RELEASE))
> > +			return -EINVAL;
> >  		(*dev->dev_ops->tx_queue_release)(txq[tx_queue_id]);
> >  		txq[tx_queue_id] = NULL;
> >  	}
> > diff --git a/lib/librte_ether/rte_ethdev.h
> > b/lib/librte_ether/rte_ethdev.h index 036153306..410e58c50 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -981,6 +981,15 @@ struct rte_eth_conf {
> >   */
> >  #define DEV_TX_OFFLOAD_SECURITY         0x00020000
> >
> > +#define DEV_DEFERRED_RX_QUEUE_SETUP 0x00000001
> > +/**< Deferred setup rx queue */
> > +#define DEV_DEFERRED_TX_QUEUE_SETUP 0x00000002
> > +/**< Deferred setup tx queue */
> > +#define DEV_DEFERRED_RX_QUEUE_RELEASE 0x00000004
> > +/**< Deferred release rx queue */
> > +#define DEV_DEFERRED_TX_QUEUE_RELEASE 0x00000008
> > +/**< Deferred release tx queue */
> > +
> 
> I don't think we do need flags for both setup a and release.
> If runtime setup is supported - surely dynamic release should be supported
> too.
> Also probably RUNTIME_RX_QUEUE_SETUP sounds a bit better.

Agree

Thanks
Qi

> 
> Konstantin
> 
> >  /*
> >   * If new Tx offload capabilities are defined, they also must be
> >   * mentioned in rte_tx_offload_names in rte_ethdev.c file.
> > @@ -1029,6 +1038,8 @@ struct rte_eth_dev_info {
> >  	/** Configured number of rx/tx queues */
> >  	uint16_t nb_rx_queues; /**< Number of RX queues. */
> >  	uint16_t nb_tx_queues; /**< Number of TX queues. */
> > +	uint64_t deferred_queue_config_capa;
> > +	/**< queues can be setup/release after dev_start (DEV_DEFERRED_). */
> >  };
> >
> >  /**
> > --
> > 2.13.6


* Re: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
  2018-03-15  3:13  0%       ` Zhang, Qi Z
@ 2018-03-15 13:16  0%         ` Ananyev, Konstantin
  2018-03-15 15:08  0%           ` Zhang, Qi Z
  0 siblings, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2018-03-15 13:16 UTC (permalink / raw)
  To: Zhang, Qi Z, thomas; +Cc: dev, Xing, Beilei, Wu, Jingjing, Lu, Wenzhuo

Hi Qi,

> -----Original Message-----
> From: Zhang, Qi Z
> Sent: Thursday, March 15, 2018 3:14 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; thomas@monjalon.net
> Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
> 
> Hi Konstantin:
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Wednesday, March 14, 2018 8:32 PM
> > To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net
> > Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> > <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang, Qi Z
> > <qi.z.zhang@intel.com>
> > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
> >
> > Hi Qi,
> >
> > >
> > > The patch let etherdev driver expose the capability flag through
> > > rte_eth_dev_info_get when it support deferred queue configuraiton,
> > > then base on the flag rte_eth_[rx|tx]_queue_setup could decide
> > > continue to setup the queue or just return fail when device already
> > > started.
> > >
> > > Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> > > ---
> > >  doc/guides/nics/features.rst  |  8 ++++++++
> > > lib/librte_ether/rte_ethdev.c | 30 ++++++++++++++++++------------
> > > lib/librte_ether/rte_ethdev.h | 11 +++++++++++
> > >  3 files changed, 37 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/doc/guides/nics/features.rst
> > > b/doc/guides/nics/features.rst index 1b4fb979f..36ad21a1f 100644
> > > --- a/doc/guides/nics/features.rst
> > > +++ b/doc/guides/nics/features.rst
> > > @@ -892,7 +892,15 @@ Documentation describes performance values.
> > >
> > >  See ``dpdk.org/doc/perf/*``.
> > >
> > > +.. _nic_features_queue_deferred_setup_capabilities:
> > >
> > > +Queue deferred setup capabilities
> > > +---------------------------------
> > > +
> > > +Supports queue setup / release after device started.
> > > +
> > > +* **[provides] rte_eth_dev_info**:
> > >
> > ``deferred_queue_config_capa:DEV_DEFERRED_RX_QUEUE_SETUP,DEV_DEFE
> > RRED_
> > > TX_QUEUE_SETUP,DEV_DEFERRED_RX_QUEUE_RELE
> > > ASE,DEV_DEFERRED_TX_QUEUE_RELEASE``.
> > > +* **[related]  API**: ``rte_eth_dev_info_get()``.
> > >
> > >  .. _nic_features_other:
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > b/lib/librte_ether/rte_ethdev.c index a6ce2a5ba..6c906c4df 100644
> > > --- a/lib/librte_ether/rte_ethdev.c
> > > +++ b/lib/librte_ether/rte_ethdev.c
> > > @@ -1425,12 +1425,6 @@ rte_eth_rx_queue_setup(uint16_t port_id,
> > uint16_t rx_queue_id,
> > >  		return -EINVAL;
> > >  	}
> > >
> > > -	if (dev->data->dev_started) {
> > > -		RTE_PMD_DEBUG_TRACE(
> > > -		    "port %d must be stopped to allow configuration\n", port_id);
> > > -		return -EBUSY;
> > > -	}
> > > -
> > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> > -ENOTSUP);
> > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_setup,
> > -ENOTSUP);
> > >
> > > @@ -1474,10 +1468,19 @@ rte_eth_rx_queue_setup(uint16_t port_id,
> > uint16_t rx_queue_id,
> > >  		return -EINVAL;
> > >  	}
> > >
> > > +	if (dev->data->dev_started &&
> > > +		!(dev_info.deferred_queue_config_capa &
> > > +			DEV_DEFERRED_RX_QUEUE_SETUP))
> > > +		return -EINVAL;
> > > +
> >
> > I think now you have to check here that the queue is stopped.
> > Otherwise you might attempt to reconfigure running queue.
> 
> I'm not sure if it's necessary to let application use different API sequence for a deferred configure and deferred re-configure.
> Can we just call dev_ops->rx_queue_stop before rx_queue_release here

I don't follow you here.
Let's say that inside queue_start() we currently do this check:

if (dev->data->rx_queue_state[rx_queue_id] != RTE_ETH_QUEUE_STATE_STOPPED)

Right now it is not possible to call queue_setup() without dev_stop() before it -
that's why we have the check if (dev->data->dev_started) in queue_setup() right now.
With your patch that is not the case anymore - the user is able to call queue_setup()
without stopping the whole device.
But the user still has to stop the queue.
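
A minimal sketch of such a check, as a self-contained toy model rather than
the real ethdev code. struct toy_dev_data, the QUEUE_STATE_* values and
rx_queue_setup_precheck() are hypothetical stand-ins, and -EBUSY is just one
possible error code:

```c
#include <errno.h>
#include <stdint.h>

/* Toy stand-ins for RTE_ETH_QUEUE_STATE_STOPPED / _STARTED. */
#define QUEUE_STATE_STOPPED 0
#define QUEUE_STATE_STARTED 1

#define MAX_QUEUES 4

struct toy_dev_data {
	int dev_started;
	uint8_t rx_queue_state[MAX_QUEUES];
};

/* Refuse to (re)configure an rx queue that is still running on a
 * started port; a stopped port or a stopped queue is fine. */
static int
rx_queue_setup_precheck(const struct toy_dev_data *data, uint16_t rx_queue_id)
{
	if (rx_queue_id >= MAX_QUEUES)
		return -EINVAL;
	if (data->dev_started &&
	    data->rx_queue_state[rx_queue_id] != QUEUE_STATE_STOPPED)
		return -EBUSY;
	return 0;
}
```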

> 
> >
> >
> > >  	rxq = dev->data->rx_queues;
> > >  	if (rxq[rx_queue_id]) {
> > >  		RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
> > >  					-ENOTSUP);
> >
> > I don't think it is *that* straightforward.
> > rx_queue_setup() parameters can imply different rx function (and related
> > device settings) that are already setuped by previous queue_setup()/dev_start.
> > So I think you need to do one of 2 things:
> > 1. rework ethdev layer to introduce a separate rx function (and related
> > settings) for each queue.
> > 2. at rx_queue_setup() if it is invoked after dev_start - check that given
> > queue settings wouldn't contradict with current device settings  (rx function,
> > etc.).
> > If they do - return an error.
> Yes, I think what we have is option 2 here, the dev_ops->rx_queue_setup will return fail if conflict with previous setting

Hmm, and what makes you think that?
As far as I know, that is not the case right now.
Let's say I do:
    ....
   rx_queue_setup(port=0,queue=0, mp=mb_size_2048);
   dev_start(port=0);
   ...
   rx_queue_setup(port=0,queue=1,mp=mb_size_1024);

If the current rx function doesn't support multi-segs, then the second rx_queue_setup() should fail.
Though I don't think that would happen with the current implementation.
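
The scenario above can be sketched as a toy model of the option 2 check.
Everything here is hypothetical (struct toy_port, toy_rx_queue_setup(), and
the rule that scatter is needed when max_rx_pkt_len exceeds the mbuf data
size); it only illustrates when the second call would have to fail:

```c
#include <errno.h>
#include <stdint.h>

struct toy_port {
	int dev_started;
	int rx_scatter_enabled;  /* can the selected rx function do multi-seg? */
	uint32_t max_rx_pkt_len;
};

/* Toy queue setup: before dev_start the rx function can still be
 * changed, so a scatter requirement is simply recorded. After
 * dev_start the rx function is fixed, so a conflicting request is
 * rejected. */
static int
toy_rx_queue_setup(struct toy_port *port, uint32_t mbuf_data_size)
{
	int needs_scatter = port->max_rx_pkt_len > mbuf_data_size;

	if (!port->dev_started) {
		port->rx_scatter_enabled |= needs_scatter;
		return 0;
	}
	if (needs_scatter && !port->rx_scatter_enabled)
		return -EINVAL;
	return 0;
}
```

For example, with an assumed max_rx_pkt_len of 1518, a 2048-byte buffer
configured before dev_start selects the single-segment path, and a later
setup with a 1024-byte buffer is then rejected.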

Same story for TX offloads, though it is probably not that critical, as for most Intel PMDs HW TX offloads will become per port in 18.05.

As far as I can see, you don't have either of these options implemented right now - that's the problem.

> I'm also thinking about option 1, the idea is to move per queue rx/tx function into driver layer, so it will not break existing API.
> 
> 1. driver can expose the capability like per_queue_rx or per_queue_tx
> 2. application can enable this capability by dev_config with rte_eth_conf
> 3, if per_queue_rx is not enable, nothing change, so we are at option 2
> 4. if per_queue_rx is enabled, driver will set rx_pkt_burst with a hook function which redirect to an function ptr in a per queue rx function
> tables ( I guess performance is impacted somehow, but this is the cost if you want different offload for different queue)

I don't think we need to overcomplicate things here.
It should be transparent to the user: the user just calls queue_setup(), and based on its input parameters
the PMD selects the function that fits best.
Pretty much what we have right now, just possibly with an array of functions (one per queue).
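
A rough sketch of that idea, under the assumption that the only input that
matters is whether a packet can span several buffers. struct toy_port,
toy_select_rx_burst(), rx_single_seg() and rx_multi_seg() are hypothetical
names, and the burst functions are empty stubs:

```c
#include <errno.h>
#include <stdint.h>

#define MAX_QUEUES 4

typedef uint16_t (*rx_burst_t)(void *rxq, void **pkts, uint16_t n);

/* Stub burst functions; a real PMD would receive packets here. */
static uint16_t
rx_single_seg(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts; (void)n;
	return 0;
}

static uint16_t
rx_multi_seg(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts; (void)n;
	return 0;
}

struct toy_port {
	uint32_t max_rx_pkt_len;
	rx_burst_t rx_burst[MAX_QUEUES]; /* one rx function per queue */
};

/* Called from queue setup: pick the function that fits this queue's
 * parameters without touching the other queues. */
static int
toy_select_rx_burst(struct toy_port *port, uint16_t qid,
		    uint32_t mbuf_data_size)
{
	if (qid >= MAX_QUEUES)
		return -EINVAL;
	port->rx_burst[qid] = (port->max_rx_pkt_len > mbuf_data_size) ?
		rx_multi_seg : rx_single_seg;
	return 0;
}
```

Because each queue owns its function pointer, a later queue_setup() never has
to fail just because of how another queue was configured.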

> 
> >
> > From my perspective - 1) is a better choice though it required more work,
> > and possibly ABI breakage.
> > I did some work in that direction as RFC:
> > http://dpdk.org/dev/patchwork/patch/31866/
> 
> I will learn this, thanks for the heads up.
> >
> > 2) might be also possible, but looks a bit clumsy as rx_queue_setup() might
> > now fail even with valid parameters - all depends on previous queue
> > configurations.
> >
> > Same story applies for TX.
> >
> >
> > > +		if (dev->data->dev_started &&
> > > +			!(dev_info.deferred_queue_config_capa &
> > > +				DEV_DEFERRED_RX_QUEUE_RELEASE))
> > > +			return -EINVAL;
> > >  		(*dev->dev_ops->rx_queue_release)(rxq[rx_queue_id]);
> > >  		rxq[rx_queue_id] = NULL;
> > >  	}
> > > @@ -1573,12 +1576,6 @@ rte_eth_tx_queue_setup(uint16_t port_id,
> > uint16_t tx_queue_id,
> > >  		return -EINVAL;
> > >  	}
> > >
> > > -	if (dev->data->dev_started) {
> > > -		RTE_PMD_DEBUG_TRACE(
> > > -		    "port %d must be stopped to allow configuration\n", port_id);
> > > -		return -EBUSY;
> > > -	}
> > > -
> > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> > -ENOTSUP);
> > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_setup,
> > -ENOTSUP);
> > >
> > > @@ -1596,10 +1593,19 @@ rte_eth_tx_queue_setup(uint16_t port_id,
> > uint16_t tx_queue_id,
> > >  		return -EINVAL;
> > >  	}
> > >
> > > +	if (dev->data->dev_started &&
> > > +		!(dev_info.deferred_queue_config_capa &
> > > +			DEV_DEFERRED_TX_QUEUE_SETUP))
> > > +		return -EINVAL;
> > > +
> > >  	txq = dev->data->tx_queues;
> > >  	if (txq[tx_queue_id]) {
> > >  		RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release,
> > >  					-ENOTSUP);
> > > +		if (dev->data->dev_started &&
> > > +			!(dev_info.deferred_queue_config_capa &
> > > +				DEV_DEFERRED_TX_QUEUE_RELEASE))
> > > +			return -EINVAL;
> > >  		(*dev->dev_ops->tx_queue_release)(txq[tx_queue_id]);
> > >  		txq[tx_queue_id] = NULL;
> > >  	}
> > > diff --git a/lib/librte_ether/rte_ethdev.h
> > > b/lib/librte_ether/rte_ethdev.h index 036153306..410e58c50 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -981,6 +981,15 @@ struct rte_eth_conf {
> > >   */
> > >  #define DEV_TX_OFFLOAD_SECURITY         0x00020000
> > >
> > > +#define DEV_DEFERRED_RX_QUEUE_SETUP 0x00000001
> > > +/**< Deferred setup rx queue */
> > > +#define DEV_DEFERRED_TX_QUEUE_SETUP 0x00000002
> > > +/**< Deferred setup tx queue */
> > > +#define DEV_DEFERRED_RX_QUEUE_RELEASE 0x00000004
> > > +/**< Deferred release rx queue */
> > > +#define DEV_DEFERRED_TX_QUEUE_RELEASE 0x00000008
> > > +/**< Deferred release tx queue */
> > > +
> >
> > I don't think we do need flags for both setup a and release.
> > If runtime setup is supported - surely dynamic release should be supported
> > too.
> > Also probably RUNTIME_RX_QUEUE_SETUP sounds a bit better.
> 
> Agree
> 
> Thanks
> Qi
> 
> >
> > Konstantin
> >
> > >  /*
> > >   * If new Tx offload capabilities are defined, they also must be
> > >   * mentioned in rte_tx_offload_names in rte_ethdev.c file.
> > > @@ -1029,6 +1038,8 @@ struct rte_eth_dev_info {
> > >  	/** Configured number of rx/tx queues */
> > >  	uint16_t nb_rx_queues; /**< Number of RX queues. */
> > >  	uint16_t nb_tx_queues; /**< Number of TX queues. */
> > > +	uint64_t deferred_queue_config_capa;
> > > +	/**< queues can be setup/release after dev_start (DEV_DEFERRED_). */
> > >  };
> > >
> > >  /**
> > > --
> > > 2.13.6


* Re: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
  2018-03-15 13:16  0%         ` Ananyev, Konstantin
@ 2018-03-15 15:08  0%           ` Zhang, Qi Z
  2018-03-15 15:38  0%             ` Ananyev, Konstantin
  0 siblings, 1 reply; 200+ results
From: Zhang, Qi Z @ 2018-03-15 15:08 UTC (permalink / raw)
  To: Ananyev, Konstantin, thomas; +Cc: dev, Xing, Beilei, Wu, Jingjing, Lu, Wenzhuo



> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Thursday, March 15, 2018 9:17 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net
> Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
> 
> Hi Qi,
> 
> > -----Original Message-----
> > From: Zhang, Qi Z
> > Sent: Thursday, March 15, 2018 3:14 AM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > thomas@monjalon.net
> > Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> > <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>
> > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue
> > setup
> >
> > Hi Konstantin:
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Wednesday, March 14, 2018 8:32 PM
> > > To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net
> > > Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang,
> > > Qi Z <qi.z.zhang@intel.com>
> > > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue
> > > setup
> > >
> > > Hi Qi,
> > >
> > > >
> > > > The patch let etherdev driver expose the capability flag through
> > > > rte_eth_dev_info_get when it support deferred queue configuraiton,
> > > > then base on the flag rte_eth_[rx|tx]_queue_setup could decide
> > > > continue to setup the queue or just return fail when device
> > > > already started.
> > > >
> > > > Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> > > > ---
> > > >  doc/guides/nics/features.rst  |  8 ++++++++
> > > > lib/librte_ether/rte_ethdev.c | 30 ++++++++++++++++++------------
> > > > lib/librte_ether/rte_ethdev.h | 11 +++++++++++
> > > >  3 files changed, 37 insertions(+), 12 deletions(-)
> > > >
> > > > diff --git a/doc/guides/nics/features.rst
> > > > b/doc/guides/nics/features.rst index 1b4fb979f..36ad21a1f 100644
> > > > --- a/doc/guides/nics/features.rst
> > > > +++ b/doc/guides/nics/features.rst
> > > > @@ -892,7 +892,15 @@ Documentation describes performance
> values.
> > > >
> > > >  See ``dpdk.org/doc/perf/*``.
> > > >
> > > > +.. _nic_features_queue_deferred_setup_capabilities:
> > > >
> > > > +Queue deferred setup capabilities
> > > > +---------------------------------
> > > > +
> > > > +Supports queue setup / release after device started.
> > > > +
> > > > +* **[provides] rte_eth_dev_info**:
> > > >
> > >
> ``deferred_queue_config_capa:DEV_DEFERRED_RX_QUEUE_SETUP,DEV_DEFE
> > > RRED_
> > > > TX_QUEUE_SETUP,DEV_DEFERRED_RX_QUEUE_RELE
> > > > ASE,DEV_DEFERRED_TX_QUEUE_RELEASE``.
> > > > +* **[related]  API**: ``rte_eth_dev_info_get()``.
> > > >
> > > >  .. _nic_features_other:
> > > >
> > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > b/lib/librte_ether/rte_ethdev.c index a6ce2a5ba..6c906c4df 100644
> > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > @@ -1425,12 +1425,6 @@ rte_eth_rx_queue_setup(uint16_t port_id,
> > > uint16_t rx_queue_id,
> > > >  		return -EINVAL;
> > > >  	}
> > > >
> > > > -	if (dev->data->dev_started) {
> > > > -		RTE_PMD_DEBUG_TRACE(
> > > > -		    "port %d must be stopped to allow configuration\n",
> port_id);
> > > > -		return -EBUSY;
> > > > -	}
> > > > -
> > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> > > -ENOTSUP);
> > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_setup,
> > > -ENOTSUP);
> > > >
> > > > @@ -1474,10 +1468,19 @@ rte_eth_rx_queue_setup(uint16_t
> port_id,
> > > uint16_t rx_queue_id,
> > > >  		return -EINVAL;
> > > >  	}
> > > >
> > > > +	if (dev->data->dev_started &&
> > > > +		!(dev_info.deferred_queue_config_capa &
> > > > +			DEV_DEFERRED_RX_QUEUE_SETUP))
> > > > +		return -EINVAL;
> > > > +
> > >
> > > I think now you have to check here that the queue is stopped.
> > > Otherwise you might attempt to reconfigure running queue.
> >
> > I'm not sure if it's necessary to let application use different API sequence
> for a deferred configure and deferred re-configure.
> > Can we just call dev_ops->rx_queue_stop before rx_queue_release here
> 
> I don't follow you here.
> Let say now inside queue_start() we do check:
> 
> if (dev->data->rx_queue_state[rx_queue_id] !=
> RTE_ETH_QUEUE_STATE_STOPPED)
> 
> Right now it is not possible to call queue_setup() without dev_stop() before
> it - that's why we have check if (dev->data->dev_started) in queue_setup()
> right now.
> Though with your patch it not the case anymore - user is able to call
> queue_setup() without stopping the whole device.
> But he still has to stop the queue.

> 
> >
> > >
> > >
> > > >  	rxq = dev->data->rx_queues;
> > > >  	if (rxq[rx_queue_id]) {
> > > >
> 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
> > > >  					-ENOTSUP);
> > >
> > > I don't think it is *that* straightforward.
> > > rx_queue_setup() parameters can imply different rx function (and
> > > related device settings) that are already setuped by previous
> > > queue_setup()/dev_start.
> > > So I think you need to do one of 2 things:
> > > 1. rework ethdev layer to introduce a separate rx function (and
> > > related
> > > settings) for each queue.
> > > 2. at rx_queue_setup() if it is invoked after dev_start - check that
> > > given queue settings wouldn't contradict with current device
> > > settings  (rx function, etc.).
> > > If they do - return an error.
> > Yes, I think what we have is option 2 here, the
> > dev_ops->rx_queue_setup will return fail if conflict with previous
> > setting
> 
> Hmm and what makes you think that?
> As I know it is not the case  right now.
> Let say I do:
>     ....
>    rx_queue_setup(port=0,queue=0, mp=mb_size_2048);
>    dev_start(port=0);
>    ...
>    rx_queue_setup(port=0,queue=1,mp=mb_size_1024);
> 
>  If current rx function doesn't support multi-segs then second
> rx_queue_setup() should fail.
>  Though I don't think that would happen with the current implementation.

Why do you think that would not happen? dev_ops->rx_queue_setup can fail, right?
I mean it's the responsibility of the low-level driver (i40e) to check for conflicts with the current settings.
> 
> Same story for TX offloads, though it probably not that critical, as for most
> Intel PMDs HW TX offloads will become per port in 18.05.
> 
> As I can see you do have either of these options implemented right  now -
> that's the problem.
> 
> > I'm also thinking about option 1, the idea is to move per queue rx/tx
> function into driver layer, so it will not break existing API.
> >
> > 1. driver can expose the capability like per_queue_rx or per_queue_tx
> > 2. application can enable this capability by dev_config with
> > rte_eth_conf 3, if per_queue_rx is not enable, nothing change, so we
> > are at option 2 4. if per_queue_rx is enabled, driver will set
> > rx_pkt_burst with a hook function which redirect to an function ptr in
> > a per queue rx function tables ( I guess performance is impacted
> > somehow, but this is the cost if you want different offload for
> > different queue)
> 
> I don't think we need to overcomplicate things here.
> It should be transparent to the user - user just calls queue_setup() - based on
> its input parameters PMD selects a function that fits best.
> Pretty much what we have right now, just possibly have an array of functions
> (one per queue).

If we don't introduce a new capability or something like that, but just take per-queue functions as the default way,
does that mean we need to change all drivers to adapt to this?
Or do you mean something like below?

if (dev->rx_pkt_burst)
	/* default way */
else
	/* per queue function */

Regards
Qi

> 
> >
> > >
> > > From my perspective - 1) is a better choice though it required more
> > > work, and possibly ABI breakage.
> > > I did some work in that direction as RFC:
> > > http://dpdk.org/dev/patchwork/patch/31866/
> >
> > I will learn this, thanks for the heads up.
> > >
> > > 2) might be also possible, but looks a bit clumsy as
> > > rx_queue_setup() might now fail even with valid parameters - all
> > > depends on previous queue configurations.
> > >
> > > Same story applies for TX.
> > >
> > >
> > > > +		if (dev->data->dev_started &&
> > > > +			!(dev_info.deferred_queue_config_capa &
> > > > +				DEV_DEFERRED_RX_QUEUE_RELEASE))
> > > > +			return -EINVAL;
> > > >  		(*dev->dev_ops->rx_queue_release)(rxq[rx_queue_id]);
> > > >  		rxq[rx_queue_id] = NULL;
> > > >  	}
> > > > @@ -1573,12 +1576,6 @@ rte_eth_tx_queue_setup(uint16_t port_id,
> > > uint16_t tx_queue_id,
> > > >  		return -EINVAL;
> > > >  	}
> > > >
> > > > -	if (dev->data->dev_started) {
> > > > -		RTE_PMD_DEBUG_TRACE(
> > > > -		    "port %d must be stopped to allow configuration\n",
> port_id);
> > > > -		return -EBUSY;
> > > > -	}
> > > > -
> > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> > > -ENOTSUP);
> > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_setup,
> > > -ENOTSUP);
> > > >
> > > > @@ -1596,10 +1593,19 @@ rte_eth_tx_queue_setup(uint16_t
> port_id,
> > > uint16_t tx_queue_id,
> > > >  		return -EINVAL;
> > > >  	}
> > > >
> > > > +	if (dev->data->dev_started &&
> > > > +		!(dev_info.deferred_queue_config_capa &
> > > > +			DEV_DEFERRED_TX_QUEUE_SETUP))
> > > > +		return -EINVAL;
> > > > +
> > > >  	txq = dev->data->tx_queues;
> > > >  	if (txq[tx_queue_id]) {
> > > >
> 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release,
> > > >  					-ENOTSUP);
> > > > +		if (dev->data->dev_started &&
> > > > +			!(dev_info.deferred_queue_config_capa &
> > > > +				DEV_DEFERRED_TX_QUEUE_RELEASE))
> > > > +			return -EINVAL;
> > > >  		(*dev->dev_ops->tx_queue_release)(txq[tx_queue_id]);
> > > >  		txq[tx_queue_id] = NULL;
> > > >  	}
> > > > diff --git a/lib/librte_ether/rte_ethdev.h
> > > > b/lib/librte_ether/rte_ethdev.h index 036153306..410e58c50 100644
> > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > @@ -981,6 +981,15 @@ struct rte_eth_conf {
> > > >   */
> > > >  #define DEV_TX_OFFLOAD_SECURITY         0x00020000
> > > >
> > > > > +#define DEV_DEFERRED_RX_QUEUE_SETUP 0x00000001
> > > > > +/**< Deferred setup rx queue */
> > > > > +#define DEV_DEFERRED_TX_QUEUE_SETUP 0x00000002
> > > > > +/**< Deferred setup tx queue */
> > > > > +#define DEV_DEFERRED_RX_QUEUE_RELEASE 0x00000004
> > > > > +/**< Deferred release rx queue */
> > > > > +#define DEV_DEFERRED_TX_QUEUE_RELEASE 0x00000008
> > > > > +/**< Deferred release tx queue */
> > > > +
> > >
> > > > I don't think we need flags for both setup and release.
> > > If runtime setup is supported - surely dynamic release should be
> > > supported too.
> > > Also probably RUNTIME_RX_QUEUE_SETUP sounds a bit better.
> >
> > Agree
> >
> > Thanks
> > Qi
> >
> > >
> > > Konstantin
> > >
> > > >  /*
> > > >   * If new Tx offload capabilities are defined, they also must be
> > > >   * mentioned in rte_tx_offload_names in rte_ethdev.c file.
> > > > @@ -1029,6 +1038,8 @@ struct rte_eth_dev_info {
> > > >  	/** Configured number of rx/tx queues */
> > > >  	uint16_t nb_rx_queues; /**< Number of RX queues. */
> > > >  	uint16_t nb_tx_queues; /**< Number of TX queues. */
> > > > +	uint64_t deferred_queue_config_capa;
> > > > +	/**< queues can be setup/release after dev_start
> > > > +(DEV_DEFERRED_). */
> > > >  };
> > > >
> > > >  /**
> > > > --
> > > > 2.13.6


* Re: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
  2018-03-15 15:08  0%           ` Zhang, Qi Z
@ 2018-03-15 15:38  0%             ` Ananyev, Konstantin
  0 siblings, 0 replies; 200+ results
From: Ananyev, Konstantin @ 2018-03-15 15:38 UTC (permalink / raw)
  To: Zhang, Qi Z, thomas; +Cc: dev, Xing, Beilei, Wu, Jingjing, Lu, Wenzhuo



> -----Original Message-----
> From: Zhang, Qi Z
> Sent: Thursday, March 15, 2018 3:09 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; thomas@monjalon.net
> Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
> 
> 
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Thursday, March 15, 2018 9:17 PM
> > To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net
> > Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> > <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>
> > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
> >
> > Hi Qi,
> >
> > > -----Original Message-----
> > > From: Zhang, Qi Z
> > > Sent: Thursday, March 15, 2018 3:14 AM
> > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > thomas@monjalon.net
> > > Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>
> > > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue
> > > setup
> > >
> > > Hi Konstantin:
> > >
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin
> > > > Sent: Wednesday, March 14, 2018 8:32 PM
> > > > To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net
> > > > Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> > > > <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang,
> > > > Qi Z <qi.z.zhang@intel.com>
> > > > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue
> > > > setup
> > > >
> > > > Hi Qi,
> > > >
> > > > >
> > > > > > The patch lets an ethdev driver expose a capability flag through
> > > > > > rte_eth_dev_info_get when it supports deferred queue configuration;
> > > > > > based on the flag, rte_eth_[rx|tx]_queue_setup can decide whether to
> > > > > > continue to set up the queue or just return failure when the device
> > > > > > is already started.
> > > > >
> > > > > Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> > > > > ---
> > > > >  doc/guides/nics/features.rst  |  8 ++++++++
> > > > > lib/librte_ether/rte_ethdev.c | 30 ++++++++++++++++++------------
> > > > > lib/librte_ether/rte_ethdev.h | 11 +++++++++++
> > > > >  3 files changed, 37 insertions(+), 12 deletions(-)
> > > > >
> > > > > diff --git a/doc/guides/nics/features.rst
> > > > > b/doc/guides/nics/features.rst index 1b4fb979f..36ad21a1f 100644
> > > > > --- a/doc/guides/nics/features.rst
> > > > > +++ b/doc/guides/nics/features.rst
> > > > > @@ -892,7 +892,15 @@ Documentation describes performance
> > values.
> > > > >
> > > > >  See ``dpdk.org/doc/perf/*``.
> > > > >
> > > > > +.. _nic_features_queue_deferred_setup_capabilities:
> > > > >
> > > > > +Queue deferred setup capabilities
> > > > > +---------------------------------
> > > > > +
> > > > > +Supports queue setup / release after device started.
> > > > > +
> > > > > > +* **[provides] rte_eth_dev_info**: ``deferred_queue_config_capa:DEV_DEFERRED_RX_QUEUE_SETUP,DEV_DEFERRED_TX_QUEUE_SETUP,DEV_DEFERRED_RX_QUEUE_RELEASE,DEV_DEFERRED_TX_QUEUE_RELEASE``.
> > > > > +* **[related]  API**: ``rte_eth_dev_info_get()``.
> > > > >
> > > > >  .. _nic_features_other:
> > > > >
> > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > b/lib/librte_ether/rte_ethdev.c index a6ce2a5ba..6c906c4df 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > @@ -1425,12 +1425,6 @@ rte_eth_rx_queue_setup(uint16_t port_id,
> > > > uint16_t rx_queue_id,
> > > > >  		return -EINVAL;
> > > > >  	}
> > > > >
> > > > > -	if (dev->data->dev_started) {
> > > > > -		RTE_PMD_DEBUG_TRACE(
> > > > > -		    "port %d must be stopped to allow configuration\n",
> > port_id);
> > > > > -		return -EBUSY;
> > > > > -	}
> > > > > -
> > > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> > > > -ENOTSUP);
> > > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_setup,
> > > > -ENOTSUP);
> > > > >
> > > > > @@ -1474,10 +1468,19 @@ rte_eth_rx_queue_setup(uint16_t
> > port_id,
> > > > uint16_t rx_queue_id,
> > > > >  		return -EINVAL;
> > > > >  	}
> > > > >
> > > > > +	if (dev->data->dev_started &&
> > > > > +		!(dev_info.deferred_queue_config_capa &
> > > > > +			DEV_DEFERRED_RX_QUEUE_SETUP))
> > > > > +		return -EINVAL;
> > > > > +
> > > >
> > > > I think now you have to check here that the queue is stopped.
> > > > Otherwise you might attempt to reconfigure running queue.
> > >
> > > > I'm not sure if it's necessary to let the application use a different API
> > > > sequence for a deferred configure and a deferred re-configure.
> > > > Can we just call dev_ops->rx_queue_stop before rx_queue_release here?
> >
> > I don't follow you here.
> > Let say now inside queue_start() we do check:
> >
> > if (dev->data->rx_queue_state[rx_queue_id] !=
> > RTE_ETH_QUEUE_STATE_STOPPED)
> >
> > Right now it is not possible to call queue_setup() without dev_stop() before
> > it - that's why we have check if (dev->data->dev_started) in queue_setup()
> > right now.
> > > Though with your patch it is not the case anymore - the user is able to call
> > queue_setup() without stopping the whole device.
> > But he still has to stop the queue.
> 
> >
> > >
> > > >
> > > >
> > > > >  	rxq = dev->data->rx_queues;
> > > > >  	if (rxq[rx_queue_id]) {
> > > > >
> > 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
> > > > >  					-ENOTSUP);
> > > >
> > > > I don't think it is *that* straightforward.
> > > > > rx_queue_setup() parameters can imply a different rx function (and
> > > > > related device settings) from those already set up by the previous
> > > > > queue_setup()/dev_start.
> > > > > So I think you need to do one of 2 things:
> > > > > 1. rework ethdev layer to introduce a separate rx function (and
> > > > > related settings) for each queue.
> > > > 2. at rx_queue_setup() if it is invoked after dev_start - check that
> > > > given queue settings wouldn't contradict with current device
> > > > settings  (rx function, etc.).
> > > > If they do - return an error.
> > > Yes, I think what we have is option 2 here, the
> > > dev_ops->rx_queue_setup will return fail if conflict with previous
> > > setting
> >
> > Hmm and what makes you think that?
> > As I know it is not the case  right now.
> > Let say I do:
> >     ....
> >    rx_queue_setup(port=0,queue=0, mp=mb_size_2048);
> >    dev_start(port=0);
> >    ...
> >    rx_queue_setup(port=0,queue=1,mp=mb_size_1024);
> >
> >  If current rx function doesn't support multi-segs then second
> > rx_queue_setup() should fail.
> >  Though I don't think that would happen with the current implementation.
> 
> > Why do you think that would not happen? dev_ops->rx_queue_setup can fail, right?
> > I mean it's the responsibility of the low level driver (i40e) to check for conflicts with the current implementation.

Yes, it is the responsibility of the PMD, because only it knows its own logic of rx/tx function selection.
But I don't see such changes in i40e in your patch series.
Probably I missed them?
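
A minimal sketch of what such a PMD-side check could look like, written as
plain C rather than real driver code. Everything here is an assumption made
for illustration: struct sketch_port, its fields and
sketch_rx_queue_setup_check() are hypothetical names, not i40e or ethdev
symbols, and the only rule modelled is the multi-seg case from the
mb_size_2048 / mb_size_1024 example quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-port state; not the real PMD structures. */
struct sketch_port {
	bool started;            /* set once dev_start() has run */
	bool scattered_rx;       /* rx function picked at dev_start() handles multi-seg */
	uint32_t max_rx_pkt_len; /* largest frame the port is configured to receive */
};

/*
 * Return 0 if a queue using mbufs with buf_size bytes of data room can be
 * set up now, or -EINVAL if it would need a different rx function than the
 * one already selected for the running port.
 */
static int
sketch_rx_queue_setup_check(const struct sketch_port *port, uint32_t buf_size)
{
	bool needs_scatter;

	assert(port != NULL);
	needs_scatter = port->max_rx_pkt_len > buf_size;

	if (!port->started)
		return 0; /* rx function is selected later, at dev_start() */
	if (needs_scatter && !port->scattered_rx)
		return -EINVAL; /* conflicts with the current rx function */
	return 0;
}
```

With a started port whose rx function has no multi-seg support and a 1518
byte max frame, a 2048 byte buffer passes and a 1024 byte buffer is rejected.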

> >
> > Same story for TX offloads, though it probably not that critical, as for most
> > Intel PMDs HW TX offloads will become per port in 18.05.
> >
> > > As far as I can see, you don't have either of these options implemented
> > > right now - that's the problem.
> >
> > > > I'm also thinking about option 1, the idea is to move per queue rx/tx
> > > > function into driver layer, so it will not break existing API.
> > > >
> > > > 1. driver can expose the capability like per_queue_rx or per_queue_tx
> > > > 2. application can enable this capability by dev_config with
> > > >    rte_eth_conf
> > > > 3. if per_queue_rx is not enabled, nothing changes, so we are at
> > > >    option 2
> > > > 4. if per_queue_rx is enabled, driver will set rx_pkt_burst with a
> > > >    hook function which redirects to a function ptr in a per queue rx
> > > >    function table (I guess performance is impacted somehow, but this
> > > >    is the cost if you want different offloads for different queues)
> >
> > I don't think we need to overcomplicate things here.
> > It should be transparent to the user - user just calls queue_setup() - based on
> > its input parameters PMD selects a function that fits best.
> > Pretty much what we have right now, just possibly have an array of functions
> > (one per queue).
> 
> If we don't introduce a new capability or something like, but just take per queue functions as default way,
> does that mean, we need to change all drivers to adapt this?
> Or do you mean below?
> 
> If (dev->rx_pkt_burst)
> 	/* default way */
> else
> 	/* per queue function */

For me either way seems ok.
The second one is probably a bit easier, as no changes to the PMDs are required.
But again - the rte_ethdev layer might even fill the queue's rx_pkt_burst[] array
for the drivers that don't support it - just by copying dev->rx_pkt_burst into it.
Konstantin 
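
The fallback described above could be sketched roughly as follows. This is
only an illustration under stated assumptions: struct sketch_dev, the
queue_rx_burst[] array, SKETCH_MAX_QUEUES and the two placeholder rx
functions are hypothetical and do not exist in rte_ethdev; only the idea of
copying the device-wide rx_pkt_burst into unset per-queue slots comes from
the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_QUEUES 4 /* hypothetical limit for the example */

/* Same shape as an rx burst callback: queue, packet array, count. */
typedef uint16_t (*sketch_rx_burst_t)(void *rxq, void **pkts, uint16_t n);

/* Hypothetical device; the per-queue array is the proposed addition. */
struct sketch_dev {
	sketch_rx_burst_t rx_pkt_burst; /* device-wide function */
	sketch_rx_burst_t queue_rx_burst[SKETCH_MAX_QUEUES];
	void *rx_queues[SKETCH_MAX_QUEUES];
};

/* Placeholder rx functions: one "receives" n packets, the other at most 1. */
static uint16_t
sketch_rx_all(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts;
	return n;
}

static uint16_t
sketch_rx_one(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts;
	return n ? 1 : 0;
}

/* Generic layer: slots a driver left unset inherit the device-wide function. */
static void
sketch_fill_queue_rx_burst(struct sketch_dev *dev)
{
	for (size_t q = 0; q < SKETCH_MAX_QUEUES; q++)
		if (dev->queue_rx_burst[q] == NULL)
			dev->queue_rx_burst[q] = dev->rx_pkt_burst;
}

/* Burst entry point: always dispatch through the per-queue table. */
static uint16_t
sketch_rx_burst(struct sketch_dev *dev, uint16_t q, void **pkts, uint16_t n)
{
	assert(q < SKETCH_MAX_QUEUES);
	return dev->queue_rx_burst[q](dev->rx_queues[q], pkts, n);
}
```

A driver that knows nothing about per-queue functions leaves the array empty
and keeps working through the copied pointer; a driver that sets one slot
gets a different rx function on that queue only. The price is one extra
indirection per burst.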

> 
> Regards
> Qi
> 
> >
> > >
> > > >
> > > > From my perspective - 1) is a better choice though it required more
> > > > work, and possibly ABI breakage.
> > > > I did some work in that direction as RFC:
> > > > http://dpdk.org/dev/patchwork/patch/31866/
> > >
> > > I will learn this, thanks for the heads up.
> > > >
> > > > 2) might be also possible, but looks a bit clumsy as
> > > > rx_queue_setup() might now fail even with valid parameters - all
> > > > depends on previous queue configurations.
> > > >
> > > > Same story applies for TX.
> > > >
> > > >
> > > > > +		if (dev->data->dev_started &&
> > > > > +			!(dev_info.deferred_queue_config_capa &
> > > > > +				DEV_DEFERRED_RX_QUEUE_RELEASE))
> > > > > +			return -EINVAL;
> > > > >  		(*dev->dev_ops->rx_queue_release)(rxq[rx_queue_id]);
> > > > >  		rxq[rx_queue_id] = NULL;
> > > > >  	}
> > > > > @@ -1573,12 +1576,6 @@ rte_eth_tx_queue_setup(uint16_t port_id,
> > > > uint16_t tx_queue_id,
> > > > >  		return -EINVAL;
> > > > >  	}
> > > > >
> > > > > -	if (dev->data->dev_started) {
> > > > > -		RTE_PMD_DEBUG_TRACE(
> > > > > -		    "port %d must be stopped to allow configuration\n",
> > port_id);
> > > > > -		return -EBUSY;
> > > > > -	}
> > > > > -
> > > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> > > > -ENOTSUP);
> > > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_setup,
> > > > -ENOTSUP);
> > > > >
> > > > > @@ -1596,10 +1593,19 @@ rte_eth_tx_queue_setup(uint16_t
> > port_id,
> > > > uint16_t tx_queue_id,
> > > > >  		return -EINVAL;
> > > > >  	}
> > > > >
> > > > > +	if (dev->data->dev_started &&
> > > > > +		!(dev_info.deferred_queue_config_capa &
> > > > > +			DEV_DEFERRED_TX_QUEUE_SETUP))
> > > > > +		return -EINVAL;
> > > > > +
> > > > >  	txq = dev->data->tx_queues;
> > > > >  	if (txq[tx_queue_id]) {
> > > > >
> > 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release,
> > > > >  					-ENOTSUP);
> > > > > +		if (dev->data->dev_started &&
> > > > > +			!(dev_info.deferred_queue_config_capa &
> > > > > +				DEV_DEFERRED_TX_QUEUE_RELEASE))
> > > > > +			return -EINVAL;
> > > > >  		(*dev->dev_ops->tx_queue_release)(txq[tx_queue_id]);
> > > > >  		txq[tx_queue_id] = NULL;
> > > > >  	}
> > > > > diff --git a/lib/librte_ether/rte_ethdev.h
> > > > > b/lib/librte_ether/rte_ethdev.h index 036153306..410e58c50 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > @@ -981,6 +981,15 @@ struct rte_eth_conf {
> > > > >   */
> > > > >  #define DEV_TX_OFFLOAD_SECURITY         0x00020000
> > > > >
> > > > > > +#define DEV_DEFERRED_RX_QUEUE_SETUP 0x00000001
> > > > > > +/**< Deferred setup rx queue */
> > > > > > +#define DEV_DEFERRED_TX_QUEUE_SETUP 0x00000002
> > > > > > +/**< Deferred setup tx queue */
> > > > > > +#define DEV_DEFERRED_RX_QUEUE_RELEASE 0x00000004
> > > > > > +/**< Deferred release rx queue */
> > > > > > +#define DEV_DEFERRED_TX_QUEUE_RELEASE 0x00000008
> > > > > > +/**< Deferred release tx queue */
> > > > > +
> > > >
> > > > > I don't think we need flags for both setup and release.
> > > > If runtime setup is supported - surely dynamic release should be
> > > > supported too.
> > > > Also probably RUNTIME_RX_QUEUE_SETUP sounds a bit better.
> > >
> > > Agree
> > >
> > > Thanks
> > > Qi
> > >
> > > >
> > > > Konstantin
> > > >
> > > > >  /*
> > > > >   * If new Tx offload capabilities are defined, they also must be
> > > > >   * mentioned in rte_tx_offload_names in rte_ethdev.c file.
> > > > > @@ -1029,6 +1038,8 @@ struct rte_eth_dev_info {
> > > > >  	/** Configured number of rx/tx queues */
> > > > >  	uint16_t nb_rx_queues; /**< Number of RX queues. */
> > > > >  	uint16_t nb_tx_queues; /**< Number of TX queues. */
> > > > > +	uint64_t deferred_queue_config_capa;
> > > > > +	/**< queues can be setup/release after dev_start
> > > > > +(DEV_DEFERRED_). */
> > > > >  };
> > > > >
> > > > >  /**
> > > > > --
> > > > > 2.13.6


* Re: [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver
  2018-03-10 15:39  3%   ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Andrew Rybchenko
                       ` (4 preceding siblings ...)
  2018-03-10 15:39  8%     ` [dpdk-dev] [PATCH v1 7/9] mempool: remove callback to register memory area Andrew Rybchenko
@ 2018-03-19 17:03  0%     ` Olivier Matz
  5 siblings, 0 replies; 200+ results
From: Olivier Matz @ 2018-03-19 17:03 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, Santosh Shukla, Jerin Jacob, Hemant Agrawal, Shreyansh Jain

Hi Andrew,

Thank you for this nice rework.
Overall, the patchset looks good to me. I'm sending some comments
as replies to specific patches.

On Sat, Mar 10, 2018 at 03:39:33PM +0000, Andrew Rybchenko wrote:
> The initial patch series [1] is split into two to simplify processing.
> The second series relies on this one and will add bucket mempool driver
> and related ops.
> 
> The patch series has generic enhancements suggested by Olivier.
> Basically it adds driver callbacks to calculate required memory size and
> to populate objects using provided memory area. It allows to remove
> so-called capability flags used before to tell generic code how to
> allocate and slice allocated memory into mempool objects.
> The clean-up which removes get_capabilities and register_memory_area is
> not strictly required, but I think it is the right thing to do.
> Existing mempool drivers are updated.
> 
> I've kept rte_mempool_populate_iova_tab() intact since it does not seem
> to be directly related to the XMEM API functions.

The function rte_mempool_populate_iova_tab() (actually, it was
rte_mempool_populate_phys_tab()) was introduced to support XMEM
API. In my opinion, it can also be deprecated.

> It breaks ABI since it changes rte_mempool_ops. Also it removes
> rte_mempool_ops_register_memory_area() and
> rte_mempool_ops_get_capabilities() since corresponding callbacks are
> removed.
> 
> Internal global functions are not listed in the map file since they are
> not part of the external API.
> 
> [1] http://dpdk.org/ml/archives/dev/2018-January/088698.html
> 
> RFCv1 -> RFCv2:
>   - add driver ops to calculate required memory size and populate
>     mempool objects, remove extra flags which were required before
>     to control it
>   - transition of octeontx and dpaa drivers to the new callbacks
>   - change info API to get information from driver required to
>     API user to know contiguous block size
>   - remove get_capabilities (not required any more and may be
>     substituted with more in info get API)
>   - remove register_memory_area since it is substituted with
>     populate callback which can do more
>   - use SPDX tags
>   - avoid all objects affinity to single lcore
>   - fix bucket get_count
>   - deprecate XMEM API
>   - avoid introduction of a new function to flush cache
>   - fix NO_CACHE_ALIGN case in bucket mempool
> 
> RFCv2 -> v1:
>   - split the series in two
>   - squash octeontx patches which implement calc_mem_size and populate
>     callbacks into the patch which removes get_capabilities since it is
>     the easiest way to untangle the tangle of tightly related library
>     functions and flags advertised by the driver
>   - consistently name default callbacks
>   - move default callbacks to dedicated file
>   - see detailed description in patches
> 
> Andrew Rybchenko (7):
>   mempool: add op to calculate memory size to be allocated
>   mempool: add op to populate objects using provided memory
>   mempool: remove callback to get capabilities
>   mempool: deprecate xmem functions
>   mempool/octeontx: prepare to remove register memory area op
>   mempool/dpaa: prepare to remove register memory area op
>   mempool: remove callback to register memory area
> 
> Artem V. Andreev (2):
>   mempool: ensure the mempool is initialized before populating
>   mempool: support flushing the default cache of the mempool
> 
>  doc/guides/rel_notes/deprecation.rst            |  12 +-
>  doc/guides/rel_notes/release_18_05.rst          |  32 ++-
>  drivers/mempool/dpaa/dpaa_mempool.c             |  13 +-
>  drivers/mempool/octeontx/rte_mempool_octeontx.c |  64 ++++--
>  lib/librte_mempool/Makefile                     |   3 +-
>  lib/librte_mempool/meson.build                  |   5 +-
>  lib/librte_mempool/rte_mempool.c                | 159 +++++++--------
>  lib/librte_mempool/rte_mempool.h                | 260 +++++++++++++++++-------
>  lib/librte_mempool/rte_mempool_ops.c            |  37 ++--
>  lib/librte_mempool/rte_mempool_ops_default.c    |  51 +++++
>  lib/librte_mempool/rte_mempool_version.map      |  11 +-
>  test/test/test_mempool.c                        |  31 ---
>  12 files changed, 437 insertions(+), 241 deletions(-)
>  create mode 100644 lib/librte_mempool/rte_mempool_ops_default.c
> 
> -- 
> 2.7.4
> 



-- links below jump to the message on this page --
2017-09-11 13:39     [dpdk-dev] [PATCH] doc: announce ABI change for ring structure Olivier Matz
2017-12-08 14:14     ` Olivier MATZ
2017-12-08 17:01       ` Thomas Monjalon
2018-01-17 21:07  4%     ` Thomas Monjalon
2017-11-24 16:06     [dpdk-dev] [RFC PATCH 0/6] mempool: add bucket mempool driver Andrew Rybchenko
2017-12-14 13:36     ` Olivier MATZ
2018-01-17 15:03  0%   ` Andrew Rybchenko
2018-01-23 13:15  2% ` [dpdk-dev] [RFC v2 00/17] " Andrew Rybchenko
2018-01-31 16:44  0%   ` Olivier Matz
2018-03-10 15:39  3%   ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Andrew Rybchenko
2018-03-10 15:39  7%     ` [dpdk-dev] [PATCH v1 1/9] mempool: add op to calculate memory size to be allocated Andrew Rybchenko
2018-03-11 12:51  0%       ` santosh
2018-03-12  6:53  0%         ` Andrew Rybchenko
2018-03-10 15:39  6%     ` [dpdk-dev] [PATCH v1 2/9] mempool: add op to populate objects using provided memory Andrew Rybchenko
2018-03-10 15:39  6%     ` [dpdk-dev] [PATCH v1 3/9] mempool: remove callback to get capabilities Andrew Rybchenko
2018-03-10 15:39  5%     ` [dpdk-dev] [PATCH v1 4/9] mempool: deprecate xmem functions Andrew Rybchenko
2018-03-10 15:39  8%     ` [dpdk-dev] [PATCH v1 7/9] mempool: remove callback to register memory area Andrew Rybchenko
2018-03-19 17:03  0%     ` [dpdk-dev] [PATCH v1 0/9] mempool: prepare to add bucket driver Olivier Matz
2017-11-28 11:57     [dpdk-dev] [PATCH 0/5] ethdev: Port ownership Matan Azrad
2018-01-07  9:45     ` [dpdk-dev] [PATCH v2 0/6] ethdev: port ownership Matan Azrad
2018-01-07  9:45       ` [dpdk-dev] [PATCH v2 2/6] ethdev: add " Matan Azrad
2018-01-10 13:36         ` Ananyev, Konstantin
2018-01-10 16:58           ` Matan Azrad
2018-01-11 12:40             ` Ananyev, Konstantin
2018-01-11 14:51               ` Matan Azrad
2018-01-12  0:02                 ` Ananyev, Konstantin
2018-01-12  7:24                   ` Matan Azrad
2018-01-15 11:45                     ` Ananyev, Konstantin
2018-01-15 13:09                       ` Matan Azrad
2018-01-15 18:43                         ` Ananyev, Konstantin
2018-01-16  8:04                           ` Matan Azrad
2018-01-16 19:11                             ` Ananyev, Konstantin
2018-01-16 20:32                               ` Matan Azrad
2018-01-17 11:24                                 ` Ananyev, Konstantin
2018-01-17 12:05                                   ` Matan Azrad
2018-01-17 12:54                                     ` Ananyev, Konstantin
2018-01-17 13:10                                       ` Matan Azrad
2018-01-17 16:52                                         ` Ananyev, Konstantin
2018-01-17 20:34                                           ` Matan Azrad
2018-01-18 14:17                                             ` Ananyev, Konstantin
2018-01-18 14:26                                               ` Matan Azrad
2018-01-18 14:41                                                 ` Ananyev, Konstantin
2018-01-18 14:45  3%                                               ` Matan Azrad
2018-01-18 14:51  0%                                                 ` Ananyev, Konstantin
2018-01-18 15:00  0%                                                   ` Matan Azrad
2017-12-01 18:56     [dpdk-dev] [PATCH 0/4] dpdk: enhance EXPERIMENTAL api tagging Neil Horman
2017-12-13 15:17     ` [dpdk-dev] [PATCHv4 " Neil Horman
2017-12-13 15:17       ` [dpdk-dev] [PATCHv4 1/5] buildtools: Add tool to check EXPERIMENTAL api exports Neil Horman
2018-01-21 18:31  3%     ` Thomas Monjalon
2018-01-21 22:07  0%       ` Neil Horman
2017-12-13 15:17       ` [dpdk-dev] [PATCHv4 2/5] compat: Add __experimental macro Neil Horman
2018-01-21 18:37  4%     ` Thomas Monjalon
2017-12-13 15:17       ` [dpdk-dev] [PATCHv4 5/5] doc: Add ABI __experimental tag documentation Neil Horman
2018-01-21 20:14  4%     ` Thomas Monjalon
2018-01-12 11:49     ` [dpdk-dev] [PATCHv4 3/5] Makefiles: Add experimental tag check and warnings to trigger on use Ferruh Yigit
2018-01-12 12:44       ` Neil Horman
2018-01-21 18:54         ` Thomas Monjalon
2018-01-22  1:34  3%       ` Neil Horman
2018-01-22  1:37  0%         ` Thomas Monjalon
2018-01-22  1:48  4% ` [dpdk-dev] [PATCH 0/5] dpdk: enhance EXPERIMENTAL api tagging Neil Horman
2018-01-22  1:48  4%   ` [dpdk-dev] [[PATCH v5] 1/5] buildtools: Add tool to check EXPERIMENTAL api exports Neil Horman
2018-01-22  1:48  5%   ` [dpdk-dev] [[PATCH v5] 2/5] compat: Add __rte_experimental macro Neil Horman
2018-01-22  1:48 10%   ` [dpdk-dev] [[PATCH v5] 5/5] doc: Add ABI __experimental tag documentation Neil Horman
2018-01-23 10:35  4%     ` Mcnamara, John
2018-01-29 21:42  4%       ` Thomas Monjalon
2017-12-04 15:55     [dpdk-dev] [PATCH] relicense various bits of the dpdk Neil Horman
2018-02-01 12:19  8% ` [dpdk-dev] [PATCH v2] " Neil Horman
2018-02-01 12:49  0%   ` Hemant Agrawal
2017-12-22 12:41     [dpdk-dev] [PATCH v2] eal: add function to return number of detected sockets Anatoly Burakov
2018-01-16 17:53     ` [dpdk-dev] [PATCH] doc: add ABI change notice for numa_node_count in eal Anatoly Burakov
2018-01-23 10:39  4%   ` Mcnamara, John
2018-02-07 10:10  4%     ` Jerin Jacob
2018-02-09 14:42  4%       ` Bruce Richardson
2018-02-14  0:04  4%         ` Thomas Monjalon
2018-02-14 14:25  4%           ` Thomas Monjalon
2018-02-12 16:00  4%   ` Jonas Pfefferle
     [not found]     ` <cover.1517848624.git.anatoly.burakov@intel.com>
2018-02-05 16:37  8%   ` [dpdk-dev] [PATCH v3] eal: add function to return number of detected sockets Anatoly Burakov
2018-02-05 17:39  3%     ` Burakov, Anatoly
2018-02-05 22:45  0%       ` Thomas Monjalon
2018-02-06  9:28  0%         ` Burakov, Anatoly
2018-02-06  9:47  0%           ` Thomas Monjalon
2018-02-07  9:58  5%     ` [dpdk-dev] [PATCH 18.05 v4] Add " Anatoly Burakov
2018-02-07  9:58  5%     ` [dpdk-dev] [PATCH 18.05 v4] eal: add " Anatoly Burakov
2018-03-08 12:12  3%       ` Bruce Richardson
2018-03-08 14:38  0%         ` Burakov, Anatoly
2018-03-09 16:32  0%           ` Bruce Richardson
2018-01-08 10:00     [dpdk-dev] [PATCH v3] lib/librte_meter: add meter configuration profile Jasvinder Singh
2018-01-08 15:43     ` [dpdk-dev] [PATCH v4] " Jasvinder Singh
2018-02-19 21:12  3%   ` Thomas Monjalon
2018-01-11  0:20     [dpdk-dev] [PATCH v6 00/23] eventtimer: introduce event timer adapter Erik Gabriel Carrillo
2018-03-08 21:53     ` [dpdk-dev] [PATCH v7 0/7] " Erik Gabriel Carrillo
2018-03-08 21:54  2%   ` [dpdk-dev] [PATCH v7 2/7] eventtimer: add common code Erik Gabriel Carrillo
2018-01-12 10:27     [dpdk-dev] [PATCH v2] doc: ethdev ABI change deprecation notice Kirill Rybalchenko
2018-01-12 10:29     ` [dpdk-dev] [PATCH v3] " Kirill Rybalchenko
2018-01-12 14:38       ` Neil Horman
2018-02-13 12:09  4%     ` Ferruh Yigit
2018-02-13 13:21  4%       ` Olivier Matz
2018-02-14  0:14  4%         ` Thomas Monjalon
2018-02-14 17:18  4%           ` Thomas Monjalon
2018-01-12 20:45     [dpdk-dev] [PATCH 1/1] doc: announce API change to lcore role function Erik Gabriel Carrillo
2018-02-13 14:37  0% ` Ferruh Yigit
2018-02-13 14:43  0%   ` Van Haaren, Harry
2018-02-13 14:47  0%     ` Pavan Nikhilesh
2018-02-14  0:09  0% ` Thomas Monjalon
2018-02-14 10:59  0%   ` Thomas Monjalon
2018-01-12 21:01     [dpdk-dev] [PATCH 1/3] app/testpmd: Moved cmdline_flow to librte_cmdline Georgios Katsikas
2018-01-15  1:30     ` Lu, Wenzhuo
2018-01-16  8:39       ` Olivier Matz
2018-01-16  8:45         ` george.dit
2018-01-16  9:24           ` Olivier Matz
2018-01-16 14:31             ` Adrien Mazarguil
2018-01-16 14:54               ` george.dit
2018-01-16 17:54                 ` Adrien Mazarguil
2018-01-24 11:57  0%               ` george.dit
2018-01-15 11:51     [dpdk-dev] [PATCH 1/2] lib/cryptodev: add support to set session private data Abhinandan Gujjar
2018-01-15 12:18     ` Akhil Goyal
2018-01-16  6:09       ` Gujjar, Abhinandan S
2018-01-16  6:24         ` Akhil Goyal
2018-01-16  7:05           ` Gujjar, Abhinandan S
2018-01-16  7:26             ` Akhil Goyal
2018-01-16  9:03               ` Gujjar, Abhinandan S
2018-01-16  9:21                 ` Akhil Goyal
2018-01-16 11:36                   ` Gujjar, Abhinandan S
2018-01-16 12:00                     ` Akhil Goyal
2018-01-16 12:29                       ` Gujjar, Abhinandan S
2018-01-16 13:02                         ` Akhil Goyal
2018-01-17  6:35                           ` Gujjar, Abhinandan S
2018-01-17  9:46                             ` De Lara Guarch, Pablo
2018-01-17 10:05                               ` Gujjar, Abhinandan S
2018-01-17 10:52                                 ` Akhil Goyal
2018-01-18  6:52  4%                               ` Gujjar, Abhinandan S
2018-01-22  6:51  0%                                 ` Gujjar, Abhinandan S
2018-01-15 16:58     [dpdk-dev] [PATCH v2] ethdev: increase flow type limit from 32 to 64 Kirill Rybalchenko
2018-01-15 17:33     ` [dpdk-dev] [PATCH v3] " Kirill Rybalchenko
2018-01-17 16:56  0%   ` Ferruh Yigit
2018-01-18  9:24  0%     ` Rybalchenko, Kirill
2018-01-18 12:25  0%     ` Ferruh Yigit
2018-01-15 19:05     [dpdk-dev] [PATCH] checkpatches.sh: Add checks for ABI symbol addition Neil Horman
2018-01-16 18:22     ` [dpdk-dev] [PATCH v2] " Neil Horman
2018-01-21 20:29  9%   ` Thomas Monjalon
2018-01-22  1:54  4%     ` Neil Horman
2018-01-22  2:05  4%       ` Thomas Monjalon
2018-01-31 17:27  6% ` [dpdk-dev] [PATCH v3] " Neil Horman
2018-02-04 14:44  7%   ` Thomas Monjalon
2018-02-05 17:29  6% ` [dpdk-dev] [PATCH v4] " Neil Horman
2018-02-05 17:57  4%   ` Thomas Monjalon
2018-02-09 15:21  6% ` [dpdk-dev] [PATCH v5] " Neil Horman
2018-02-13 22:57  4%   ` Thomas Monjalon
2018-02-14 19:19  6% ` [dpdk-dev] [PATCH v6] " Neil Horman
2018-01-17 17:17 10% [dpdk-dev] [PATCH] doc: add deprecation notice for physmem layout function Anatoly Burakov
2018-01-18 10:32 13% ` [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes Anatoly Burakov
2018-01-23 10:36  0%   ` Mcnamara, John
2018-02-05 11:47  0%   ` Bruce Richardson
2018-02-07 10:11  0%     ` Jerin Jacob
2018-02-14 14:48  0%       ` Thomas Monjalon
2018-02-12 15:58  0%   ` Jonas Pfefferle
2018-02-13  0:24  0%   ` Yongseok Koh
2018-01-17 21:57     [dpdk-dev] [PATCH v3 2/6] ethdev: return named opaque type instead of void pointer Ferruh Yigit
2018-03-09 11:25     ` [dpdk-dev] [PATCH v4] " Ferruh Yigit
     [not found]       ` <20180309123651.GB19004@hmswarspite.think-freely.org>
2018-03-09 13:00  0%     ` Ferruh Yigit
2018-03-09 15:16  0%       ` Neil Horman
2018-03-09 15:45  0%         ` Ferruh Yigit
2018-03-09 19:06  0%           ` Neil Horman
2018-01-19 13:44     [dpdk-dev] [RFC 00/24] vhost: add virtio-vhost-user transport Stefan Hajnoczi
2018-01-19 13:44  2% ` [dpdk-dev] [RFC 15/24] vhost: add virtio pci framework Stefan Hajnoczi
2018-01-22  2:02  3% [dpdk-dev] [dpdk-announce] release candidate 18.02-rc1 Thomas Monjalon
2018-01-22 15:42  3% [dpdk-dev] [PATCH] build: make compat a universal dependency Bruce Richardson
2018-01-22 17:43  0% ` Luca Boccassi
2018-01-23  9:26  0%   ` Bruce Richardson
2018-01-23 10:00  0%   ` Bruce Richardson
2018-01-22 15:45     [dpdk-dev] [PATCH] net/octeontx: register fpa as platform HW mempool Pavan Nikhilesh
2018-01-31 19:51  4% ` Ferruh Yigit
2018-01-23  8:54     [dpdk-dev] [RFC v2, 1/2] cryptodev: add support to set session private data Abhinandan Gujjar
2018-01-24 19:46  4% ` De Lara Guarch, Pablo
2018-01-25  6:42  0%   ` Akhil Goyal
2018-01-25 15:37  0%   ` Gujjar, Abhinandan S
2018-01-31 13:40  0%     ` De Lara Guarch, Pablo
2018-01-23 13:23 13% [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool Andrew Rybchenko
2018-01-31 16:46  4% ` Olivier Matz
2018-02-01  6:40  4%   ` Jerin Jacob
2018-02-01 12:53  4%     ` Hemant Agrawal
2018-02-14 15:23  4%       ` Thomas Monjalon
2018-01-26  9:03  4% [dpdk-dev] [PATCH] doc: announce ABI change for crypto info struct Pablo de Lara
2018-01-26 10:44  4% ` Trahe, Fiona
2018-01-26 11:08  4%   ` De Lara Guarch, Pablo
2018-01-29  9:26  4%     ` Akhil Goyal
2018-01-30  7:55  4%       ` Verma, Shally
2018-01-30 11:21  4%         ` De Lara Guarch, Pablo
2018-01-30 11:53  4%           ` Verma, Shally
2018-02-02  9:07  4%             ` De Lara Guarch, Pablo
2018-02-02 10:52  4%               ` Verma, Shally
2018-01-30 11:37  4% ` Akhil Goyal
2018-01-30 12:14  7% ` [dpdk-dev] [PATCH v2 0/3] Cryptodev API/ABI deprecation notices Pablo de Lara
2018-01-30 12:14  4%   ` [dpdk-dev] [PATCH v2 1/3] doc: announce ABI change for crypto info struct Pablo de Lara
2018-02-13 11:45  4%   ` [dpdk-dev] [PATCH v2 0/3] Cryptodev API/ABI deprecation notices De Lara Guarch, Pablo
2018-02-01 19:47     [dpdk-dev] [PATCH 1/2] Revert "eal: fix default mempool ops" Hemant Agrawal
2018-02-01 19:56     ` Hemant Agrawal
2018-02-01 20:40  3%   ` Pavan Nikhilesh
2018-02-02  5:43  0%     ` Hemant Agrawal
2018-02-02  8:03 10% ` [dpdk-dev] [PATCH] doc: remove eal API for default mempool ops name Hemant Agrawal
2018-02-02  8:31 10%   ` [dpdk-dev] [PATCH v2] " Hemant Agrawal
2018-02-02 14:01  0%     ` Olivier Matz
2018-02-13 11:28  0%       ` Ferruh Yigit
2018-02-02 15:16  3% [dpdk-dev] [PATCH v1 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
2018-02-02 15:16  3% ` [dpdk-dev] [PATCH v1 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
2018-02-02 16:46  3% ` [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
2018-02-02 16:46  3%   ` [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
2018-02-04 14:29         ` Thomas Monjalon
2018-02-05 11:24           ` Adrien Mazarguil
2018-02-05 12:13             ` Marcelo Ricardo Leitner
2018-02-05 12:24               ` Van Haaren, Harry
2018-02-05 12:58                 ` Marcelo Ricardo Leitner
2018-02-05 13:44  3%               ` Adrien Mazarguil
2018-02-05 14:16  0%                 ` Thomas Monjalon
2018-02-05 14:33  0%                   ` Adrien Mazarguil
2018-02-05 14:37  0%                   ` Marcelo Ricardo Leitner
2018-02-05 14:59  0%                     ` Adrien Mazarguil
2018-02-05 15:29  0%                       ` Marcelo Ricardo Leitner
2018-02-05 15:54  4%                         ` Adrien Mazarguil
2018-02-05 17:06  0%                           ` Marcelo Ricardo Leitner
2018-02-06 11:06  4%                             ` Adrien Mazarguil
2018-02-02 16:52  0%   ` [dpdk-dev] [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Nélio Laranjeiro
2018-02-06 11:31  0%   ` Shahaf Shuler
2018-02-04  7:24  4% [dpdk-dev] [PATCH] doc: annouce ABI change for RSS configuraiton structure Xueming Li
2018-02-06  7:38  4% ` [dpdk-dev] [PATCH v2] doc: announce ABI change for RSS configuration structure Xueming Li
2018-02-13  6:52  4%   ` Shahaf Shuler
2018-02-13 11:27  4%     ` Ferruh Yigit
2018-02-13 12:10  4%       ` Jerin Jacob
2018-02-14 16:28  4%         ` Thomas Monjalon
2018-02-05 12:16     [dpdk-dev] [PATCH 0/8] vhost: input validation enhancements Stefan Hajnoczi
2018-02-05 12:16  3% ` [dpdk-dev] [PATCH 2/8] vhost: avoid enum fields in VhostUserMsg Stefan Hajnoczi
2018-02-06  9:47  0%   ` Maxime Coquelin
2018-02-07  9:26 11% [dpdk-dev] [PATCH v1] doc: update deprecation notice of rte_devargs Gaetan Rivet
2018-02-10 13:15     [dpdk-dev] [PATCH] eal: fix rte_errno values for IPC API Anatoly Burakov
2018-02-11  1:09     ` Tan, Jianfeng
2018-02-13 13:50       ` Thomas Monjalon
2018-02-13 14:16         ` Van Haaren, Harry
2018-02-13 15:39  3%       ` Bruce Richardson
2018-02-12  4:53     [dpdk-dev] [PATCH 0/4] deferred queue setup Qi Zhang
2018-03-02  4:13     ` [dpdk-dev] [PATCH v2 " Qi Zhang
2018-03-02  4:13       ` [dpdk-dev] [PATCH v2 1/4] ether: support " Qi Zhang
2018-03-14 12:31  0%     ` Ananyev, Konstantin
2018-03-15  3:13  0%       ` Zhang, Qi Z
2018-03-15 13:16  0%         ` Ananyev, Konstantin
2018-03-15 15:08  0%           ` Zhang, Qi Z
2018-03-15 15:38  0%             ` Ananyev, Konstantin
2018-02-13  8:14     [dpdk-dev] [PATCH v2] net/tap: add CRC stripping capability Ophir Munk
2018-02-13 16:35     ` Thomas Monjalon
2018-02-15 21:55  3%   ` Stephen Hemminger
2018-02-16 13:00  0%     ` Thomas Monjalon
2018-02-14 12:21  6% [dpdk-dev] [PATCH v1] doc: update release notes for 18.02 John McNamara
2018-02-14 13:50  6% ` [dpdk-dev] [PATCH v2] " John McNamara
2018-02-14 12:32  4% [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors Shahaf Shuler
2018-02-14 13:50  4% ` Thomas Monjalon
2018-02-14 13:54  4% ` Doherty, Declan
2018-02-14 14:50  4% ` Remy Horton
2018-02-14 15:27  4% ` Boccassi, Luca
2018-02-14 15:54  4%   ` Jerin Jacob
2018-02-14 16:50  4%     ` Thomas Monjalon
2018-02-14 15:37  4% [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules Adrien Mazarguil
2018-02-14 15:37  3% ` [dpdk-dev] [PATCH v1 3/4] doc: announce API change for flow RSS/RAW actions Adrien Mazarguil
2018-02-14 15:55  0% ` [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules Nélio Laranjeiro
2018-02-14 16:06  0%   ` Andrew Rybchenko
2018-02-14 19:11  3% [dpdk-dev] [dpdk-announce] DPDK 18.02 released Thomas Monjalon
2018-02-15 13:04  6% [dpdk-dev] [PATCH v1] doc: add template release notes for 18.05 John McNamara
2018-02-16 22:54  0% ` Carrillo, Erik G
2018-02-21 21:44     [dpdk-dev] [PATCH v2 0/2] lib/rib: Add Routing Information Base library Medvedkin Vladimir
2018-02-21 21:44     ` [dpdk-dev] [PATCH v2 1/2] Add RIB library Medvedkin Vladimir
2018-03-14 11:09  4%   ` Bruce Richardson
2018-02-22 12:15  8% [dpdk-dev] [PATCH] doc: fixing grammar Alejandro Lucero
2018-02-24 13:14     [dpdk-dev] [PATCH v2 0/7] crypto: add virtio poll mode driver Jay Zhou
2018-02-24 13:14  2% ` [dpdk-dev] [PATCH v2 1/7] crypto/virtio: add virtio related fundamental functions Jay Zhou
2018-02-27 10:29  3% [dpdk-dev] [PATCH] ethdev: remove versioning of ethdev filter control function Kirill Rybalchenko
2018-02-27 11:01  0% ` Ferruh Yigit
2018-02-27 13:45  3%   ` Thomas Monjalon
2018-02-27 14:18  7% ` [dpdk-dev] [PATCH v2] " Kirill Rybalchenko
2018-03-07 17:17  0%   ` Ferruh Yigit
2018-03-07 17:47  0%     ` Ferruh Yigit
2018-03-05 23:01     [dpdk-dev] [PATCH 1/2] eventdev: add device stop flush callback Gage Eads
2018-03-08 23:10     ` [dpdk-dev] [PATCH v2 " Gage Eads
2018-03-12  6:25  3%   ` Jerin Jacob
2018-03-12 14:30  3%     ` Eads, Gage
2018-03-12 14:38  0%       ` Jerin Jacob
2018-03-06 18:28  3% [dpdk-dev] [PATCH] eal: register rte_panic user callback Arnon Warshavsky
2018-03-07  8:32  0% ` Thomas Monjalon
2018-03-07  9:05  0%   ` Burakov, Anatoly
2018-03-07  9:59  0%     ` Thomas Monjalon
2018-03-07 11:29  0%       ` Burakov, Anatoly
2018-03-07 12:08  3% [dpdk-dev] [RFC PATCH v1 0/4] ethdev: add per-PMD tuning of RxTx parmeters Remy Horton
2018-03-07 17:44 23% [dpdk-dev] [RFC] config: remove RTE_NEXT_ABI Ferruh Yigit
2018-03-07 18:06  0% ` Luca Boccassi
2018-03-08  8:05  5% ` Thomas Monjalon
2018-03-08 11:43  3%   ` Ferruh Yigit
2018-03-08 15:17  0%     ` Thomas Monjalon
2018-03-08 15:35  0%       ` Neil Horman
2018-03-08 16:04  0%         ` Thomas Monjalon
2018-03-08 19:40  3%           ` Neil Horman
2018-03-08 21:34  4%             ` Thomas Monjalon
2018-03-09  0:18  4%               ` Neil Horman
2018-03-08  1:29     [dpdk-dev] [RFC PATCH 0/5] add framework to load and execute BPF code Konstantin Ananyev
2018-03-08  1:29  2% ` [dpdk-dev] [RFC PATCH 1/5] bpf: add BPF loading and execution framework Konstantin Ananyev
2018-03-09 16:42     [dpdk-dev] [PATCH v1 0/5] add framework to load and execute BPF code Konstantin Ananyev
2018-03-09 16:42  2% ` [dpdk-dev] [PATCH v1 1/5] bpf: add BPF loading and execution framework Konstantin Ananyev
2018-03-10  1:25     [dpdk-dev] [PATCH v1 0/6] net/mlx5: add Multi-Packet Rx support Yongseok Koh
2018-03-10  1:25     ` [dpdk-dev] [PATCH v1 3/6] net/mlx5: add a function to rdma-core glue Yongseok Koh
2018-03-12  9:13  3%   ` Nélio Laranjeiro
2018-03-12 15:55  1% [dpdk-dev] [RFC] Switch device offload with DPDK Adrien Mazarguil